Tools for estimating fake/non-prompt lepton backgrounds with the ATLAS detector at the LHC

Measurements and searches performed with the ATLAS detector at the CERN LHC often involve signatures with one or more prompt leptons. Such analyses are subject to 'fake/non-prompt' lepton backgrounds, in which a hadron, a lepton from a hadron decay, or an electron from a photon conversion satisfies the prompt-lepton selection criteria. These backgrounds often arise within a hadronic jet, because of particle decays in the showering process, particle misidentification, or particle interactions with the detector material. As it is challenging to model these processes with high accuracy in simulation, their estimation typically relies on data-driven methods. Three methods for carrying out this estimation are described, along with their implementation in ATLAS and their performance.


Introduction
Many measurements and searches for new physics performed with the ATLAS detector [1] at the Large Hadron Collider (LHC) at CERN require the presence of one or more leptons (electrons or muons) to indicate that a high-energy electroweak process occurred in the collision. Lepton candidates are reconstructed from signals in the inner tracker, calorimeters, and muon spectrometer [2,3]. Identification criteria are then applied to suppress candidates originating from physical objects other than leptons from the hard-scattering process of the event. These background candidates fall into two categories: (i) 'non-prompt leptons', from the semileptonic decay of hadrons or from photon (γ) conversions in detector material, and (ii) 'fake leptons', where the reconstructed object is not, in fact, due to a lepton. In contrast to the aforementioned categories, 'real leptons' are defined as electrons or muons produced either directly in the hard-scattering process or directly in the decay of a short-lived non-hadronic resonance (such as a W or Z boson).
The rates at which fake or non-prompt leptons are selected are difficult to model accurately in simulation. They can depend strongly on details of the physics simulation, including non-perturbative regions where the simulation is not expected to be reliable. They also depend on the modelling of the material composition and response of the detector. In addition, fake leptons can arise from the misidentification of a hadronic jet, which occurs with low probability per jet but at a significant rate in multi-jet events; the computing resources required to simulate these processes with a sufficient sample size would be prohibitive. Therefore, 'data-driven' approaches are commonly used to estimate these backgrounds.
To simplify the adoption of such methods, and to ensure that they are applied uniformly, a set of standard tools and prescriptions has been developed for use in ATLAS physics analyses that are subject to fake/non-prompt lepton backgrounds. The principles and performance of these tools are described in this paper. The motivation and mathematical basis of the methods are explained in Section 2; a description of the relevant features of the ATLAS detector is given in Section 3; the criteria used to select leptons are given in Section 4; the simulated signal and background processes, as well as the different processes that can lead to fake/non-prompt leptons, are discussed in Sections 5 and 6; the procedures used to measure the efficiencies for real and fake/non-prompt leptons are described in Section 7; the systematic uncertainties associated with the methods are described in Section 8; and Section 9 provides examples of the application of the fake/non-prompt lepton estimation methods in two published ATLAS physics analyses, with details that are not included in the existing publications.

Methods
The fake/non-prompt lepton estimation methods considered in this paper depend on defining two tiers of lepton selection criteria, called the 'baseline' and 'tight' criteria. The tight criteria are used to select the signal leptons in a physics analysis, while the baseline criteria accept all of the tight lepton candidates as well as an additional set of candidates with a higher rate of fake/non-prompt contributions. Candidates that satisfy the baseline criteria but not the tight criteria are called 'loose' leptons. If the two sets of criteria are chosen well, the fraction of real leptons in the baseline sample that satisfy the tight criteria will be substantially higher than the corresponding fraction for fake/non-prompt leptons. These fractions are called the 'real efficiency' (ε_r) and 'fake efficiency' (ε_f), respectively.
The ε_r values are generally taken from Monte Carlo (MC) simulated events that are corrected to account for differences between data and the simulation, while the ε_f values are typically measured in a data sample that is orthogonal to the one that is used for the data analysis, as detailed in Section 7.

Matrix method
With the efficiencies known, a simple counting of the numbers of lepton candidates that satisfy the tight and loose criteria provides an estimate for the number of fake/non-prompt leptons. In the simplest case, where an analysis selects signal events containing exactly one tight lepton candidate and no loose lepton candidates, the relationship between the numbers of tight and loose leptons observed in data and the composition of the sample in terms of real and fake/non-prompt leptons is:

N_t = ε_r N_b^r + ε_f N_b^f ,
N_l = (1 − ε_r) N_b^r + (1 − ε_f) N_b^f , (1)

where N_t and N_l are the numbers of events with tight and loose lepton candidates, and N_b^r and N_b^f are the unknown numbers of real and fake/non-prompt leptons in the baseline sample. In matrix notation, the relationship is given by:

N_tl = M N_b^rf , where N_tl = (N_t, N_l), N_b^rf = (N_b^r, N_b^f), and

M = | ε_r      ε_f     |
    | 1 − ε_r  1 − ε_f |   (2)

The fact that the unknown values (N_b^r and N_b^f) and the observed yields (N_t and N_l) are related via the matrix M gives rise to the name of this method: the 'matrix method'. Inversion of the matrix allows N_b^f to be determined:

N_b^f = [ ε_r N_l − (1 − ε_r) N_t ] / (ε_r − ε_f) . (3)

In the typical use case, the quantity of interest is the number of events in the tight sample where the lepton is fake/non-prompt, N_t^f. This is related to the number of such events in the baseline sample, N_b^f, by:

N_t^f = ε_f N_b^f . (4)

Similarly, the number of real leptons in the tight sample is

N_t^r = ε_r N_b^r , (5)

and these can be treated as elements of a column matrix N_t^rf. The fact that N_t appears in Eq. (1) means that information about the content of the analysis signal region is used in the estimate, and therefore an analysis is not completely blinded when using this approach.
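The single-lepton inversion can be sketched in a few lines of code. The efficiency values and event counts in the example are invented for illustration; they are not ATLAS measurements.

```python
# Sketch of the single-lepton matrix method, Eqs. (1)-(5).

def matrix_method(n_tight, n_loose, eff_r, eff_f):
    """Return (N_b^f, N_t^f): the fake/non-prompt yields in the baseline
    and tight samples, from the observed tight/loose event counts."""
    det = eff_r - eff_f                     # determinant of the 2x2 matrix M
    if det <= 0.0:
        raise ValueError("the method requires eff_r > eff_f")
    # Eq. (3): invert M to obtain the baseline fake yield
    n_base_fake = (eff_r * n_loose - (1.0 - eff_r) * n_tight) / det
    # Eq. (4): scale by the fake efficiency to obtain the tight fake yield
    return n_base_fake, eff_f * n_base_fake
```

For example, with 950 tight and 150 loose events and (ε_r, ε_f) = (0.9, 0.1), the estimate is N_b^f = 50 and N_t^f = 5.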
Generally, the values of ε_r and ε_f depend on the lepton candidate's momentum, its proximity to other objects, or other factors. Details of how these variations are accounted for in the estimation are given below.

Asymptotic matrix method
In this method, events in the baseline sample are considered one at a time, and a 'fake weight' w_i is defined for each event, corresponding to the two terms in the numerator of Eq. (3), scaled by ε_f as in Eq. (4):

w_i = −ε_f,i (1 − ε_r,i) / (ε_r,i − ε_f,i)   if the lepton candidate in event i is tight,
w_i = ε_f,i ε_r,i / (ε_r,i − ε_f,i)           if the lepton candidate in event i is loose,

where ε_r,i and ε_f,i are the values of ε_r and ε_f that are appropriate for lepton i. Since ε_r,i is always less than one, the weight for an event with a tight lepton candidate is negative. By extension, the total fake/non-prompt lepton background in the tight sample is then estimated by

N_t^f = Σ_i w_i . (6)

This approach is convenient since the w_i need only be calculated once and can then be stored with the event, allowing the distribution of the fake-lepton yield to be binned in any variable of interest, as well as a simple re-computation of N_t^f if the event selection is modified. One drawback is that, since the value of w_i may be negative, there is no guarantee that N_t^f will be positive. The value of N_t^f is also sensitive to fluctuations in the input ε_r,i and ε_f,i values.

The statistical uncertainty of N_t^f is given by:

σ(N_t^f) = √( Σ_i w_i² ) .

The method that makes use of the w_i is known as the 'asymptotic matrix method', since Eq. (6) is only valid in the asymptotic limit with a large number of events.
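The per-event weights and their quadrature sum can be sketched as follows; the efficiencies used in the example are illustrative placeholders.

```python
import math

# Per-event weights of the asymptotic matrix method, Eq. (6).

def fake_weight(is_tight, eff_r, eff_f):
    """Weight corresponding to the two terms of Eq. (3), scaled by eff_f."""
    det = eff_r - eff_f
    if is_tight:
        return -eff_f * (1.0 - eff_r) / det   # negative, since eff_r < 1
    return eff_f * eff_r / det

def fake_yield(events):
    """events: iterable of (is_tight, eff_r, eff_f) tuples.
    Returns (N_t^f, its statistical uncertainty)."""
    weights = [fake_weight(t, er, ef) for (t, er, ef) in events]
    n_tf = sum(weights)                                   # Eq. (6)
    return n_tf, math.sqrt(sum(w * w for w in weights))   # sqrt(sum w_i^2)
```

Summing the weights over a sample of 950 tight and 150 loose events with uniform (ε_r, ε_f) = (0.9, 0.1) reproduces the matrix-inversion result of 5 fake/non-prompt events.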

Poisson likelihood matrix method
In this method [4],1 the elements of N_t^rf are treated as free parameters, which are varied to maximise the likelihood of the observed N_tl values. By doing so, the Poisson-distributed nature of the N_tl values is taken into account, so there is no need to use the asymptotic approximation. In the fit, the N_t^rf values are converted to N_b^rf using Eqs. (4) and (5), where the entries in the matrix M are calculated using the averages of the prompt and fake/non-prompt lepton efficiencies in the baseline sample:

⟨ε_r⟩ = (1/N_b) Σ_{i=1}^{N_b} ε_r,i ,    ⟨ε_f⟩ = (1/N_b) Σ_{i=1}^{N_b} ε_f,i . (7)

The resulting N_b^rf values are used to obtain the expectation values for N_tl using Eq. (2). These expectation values are denoted by N_tl,exp. The N_t^rf parameters are adjusted (subject to the constraint that they must be non-negative) to maximise the joint Poisson likelihood

L = Π_i P[ N_i | N_i,exp ] , (8)

where the product is over the elements of N_tl; N_i and N_i,exp are the i-th elements of N_tl and N_tl,exp, respectively; and P[n | ν] is the Poisson probability for observing n events when ν are expected. The output of the fit consists of an estimate of the number of fake/non-prompt leptons and the uncertainty in this quantity, which is obtained by noting the values for which −ln L exceeds its minimum by 0.5.

1 An earlier variant of the likelihood matrix method is described in Ref. [5].
The primary advantages of the Poisson likelihood approach are that the result is constrained to be nonnegative, and the uncertainty is a better approximation to the range that gives 68% coverage, particularly in samples with few events. In addition, in some scenarios it can provide smaller statistical uncertainties than the asymptotic matrix method or the fake-factor method (described in Section 2.2). The main drawback is that the estimated yield must be calculated for the sample as a whole rather than from a sum of individual event weights, which complicates the process of producing a distribution of the fake-lepton yield, as required for any differential measurement (to do so, the likelihood must be applied in every bin of the distribution).
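A minimal sketch of the likelihood fit for the single-lepton case is given below. A crude grid scan over the non-negative parameters stands in for the MINUIT minimisation used in practice, and all efficiencies and yields are illustrative.

```python
import math

def neg_log_likelihood(ntr, ntf, n_t_obs, n_l_obs, avg_er, avg_ef):
    """-ln L from Eq. (8), dropping the constant ln(n!) terms.
    (ntr, ntf) are the tight real/fake yields; the expectations follow
    from Eqs. (2), (4) and (5) with sample-averaged efficiencies."""
    nt_exp = ntr + ntf
    nl_exp = ntr * (1.0 - avg_er) / avg_er + ntf * (1.0 - avg_ef) / avg_ef
    total = 0.0
    for n_obs, n_exp in ((n_t_obs, nt_exp), (n_l_obs, nl_exp)):
        n_exp = max(n_exp, 1e-12)              # guard against log(0)
        total += n_exp - n_obs * math.log(n_exp)
    return total

def fit_fake_yield(n_t_obs, n_l_obs, avg_er, avg_ef, grid=220):
    """Scan non-negative (N_t^r, N_t^f) on a grid and return the pair
    minimising -ln L; resolution is limited by the grid spacing."""
    n_max = n_t_obs + n_l_obs
    best = (float("inf"), 0.0, 0.0)
    for i in range(grid + 1):
        ntr = n_max * i / grid
        for j in range(grid + 1):
            ntf = n_max * j / grid
            val = neg_log_likelihood(ntr, ntf, n_t_obs, n_l_obs,
                                     avg_er, avg_ef)
            if val < best[0]:
                best = (val, ntr, ntf)
    return best[1], best[2]
```

For 950 tight and 150 loose observed events with ⟨ε_r⟩ = 0.9 and ⟨ε_f⟩ = 0.1, the fit prefers (N_t^r, N_t^f) ≈ (945, 5), matching the matrix inversion, since with two observations and two parameters the unconstrained likelihood maximum saturates the observed yields.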

Fake-factor method
The fact that real-lepton kinematic distributions and efficiencies are generally modelled well in simulation, and that scale factors can be applied to account for any differences observed between the values in simulation and in data control samples, leads to an alternative method that uses simulation, rather than the data, to measure the real-lepton contribution to the loose lepton sample.
The number of fake-lepton events in the loose sample is

N_l^f = (1 − ε_f) N_b^f ,

and the number of fake-lepton events in the tight sample is

N_t^f = θ N_l^f ,

where the 'fake factor' is defined as θ ≡ ε_f / (1 − ε_f). Thus, in the fake-factor method, the number of tight fake/non-prompt leptons for a given analysis can be computed using the fake factor θ, the total number of loose lepton candidates N_l, and the number of real leptons in the loose lepton sample N_l^r:

N_t^f = θ (N_l − N_l^r) ,

where the latter quantity, N_l^r, can be estimated using MC simulated samples, and its contribution subsequently subtracted from the quantity N_l observed in the data. In practice, the calculation is performed on an event-by-event basis to account for potential variations in θ due to properties of the lepton:

N_t^f = Σ_{i=1}^{N_l} θ_i − Σ_{j=1}^{N_l^MC} θ_j w_MC,j ,

where θ_i is the fake factor appropriate for lepton i, all sources of prompt leptons are considered in the sum over MC simulated events, N_l^MC is the number of MC events in the loose sample, and w_MC,j is the weight assigned to simulated event j, based on the cross-section of the simulated process and any corrections to the selection efficiency that may be needed to reflect the performance on data events.
The main advantage of the fake-factor method is that the result does not depend on N_t, i.e. the yield in the analysis signal region. This means that, unlike the matrix method, the fake-factor method can be applied while remaining 'blind' to the contents of the signal region. However, the method shares some drawbacks with the asymptotic matrix method, namely the possibility of N_t^f being negative, and sensitivity to fluctuations in the ε_f values.
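The event-by-event calculation can be sketched as below; the fake factor and the prompt-MC weights are illustrative inputs.

```python
# Sketch of the single-lepton fake-factor method: promote loose counts to
# tight with theta = eff_f / (1 - eff_f), subtracting the real-lepton
# contribution to the loose sample using MC weights.

def fake_factor(eff_f):
    return eff_f / (1.0 - eff_f)

def fake_factor_yield(data_thetas, mc_loose):
    """data_thetas: theta_i for each loose lepton candidate in data.
    mc_loose: (theta_j, w_mc_j) pairs for loose candidates in prompt-lepton MC.
    Returns the estimated tight fake/non-prompt yield."""
    return sum(data_thetas) - sum(theta * w for theta, w in mc_loose)
```

With ε_f = 0.1 (θ = 1/9), 150 loose candidates in data and a prompt-MC prediction of 105 real loose leptons, the estimate is N_t^f = 5.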

Generalisation for multi-lepton final states
The above methods can be generalised to cases where multiple baseline lepton candidates are considered in each event. For the matrix methods, this is done by increasing the dimensionality of M, N_tl, and N_(t,b)^rf to 2^n_b, where n_b is the number of baseline lepton candidates in each event. The estimated fake/non-prompt yield depends on the requirements of a particular analysis in three ways: first, on the requirement placed on the desired number n_t of tight lepton candidates per event; second, on whether or not events with additional loose lepton candidates are vetoed; and third, on the minimum number n_f of fake/non-prompt leptons defining the background to be evaluated with one of the data-driven methods.2 The consideration of n_f is reflected in the transition from the number of fake/non-prompt lepton events in the baseline sample to the number in the tight sample, in a generalisation of Eq. (4):

N_t^f = Σ_α C_α(ε_r, ε_f) N_b,α^rf ,

where the sum is over all combinations α of real and fake/non-prompt leptons that include at least n_f fake/non-prompt leptons, C_α is a function of the real and fake/non-prompt efficiencies that results in the required number of tight lepton candidates for a given set of real and fake/non-prompt leptons, and N_b,α^rf is the α-th element of N_b^rf. To address the requirements on n_t and the possible presence of additional loose lepton candidates, the analysis must consider all events in the baseline sample with lepton multiplicities up to the sum of the allowed numbers of tight and loose lepton candidates in the signal region.
As an example, consider the case where n_b = 2. If an analysis selects signal events containing exactly two tight lepton candidates and no loose lepton candidates, and the background with n_f ≥ 1 is being evaluated, then

N_t^f = ε_r,1 ε_f,2 N_b^{r1f2} + ε_f,1 ε_r,2 N_b^{f1r2} + ε_f,1 ε_f,2 N_b^{f1f2} ,

where N_b^{r1f2} is the number of events in the baseline sample where the first lepton candidate is real and the second is fake/non-prompt, and N_b^{f1r2} and N_b^{f1f2} are defined correspondingly. The ordering of the lepton candidates is typically according to p_T, but the method does not depend on the ordering used.
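The dilepton case can be sketched compactly by noting that the 4×4 matrix M factorises as a per-lepton Kronecker product, so its inverse does too. The yields and efficiencies below are illustrative.

```python
# Matrix method for n_b = 2, for a selection requiring exactly two tight
# lepton candidates and the background with n_f >= 1.

def inv2(eff_r, eff_f):
    """Inverse of the per-lepton 2x2 matrix [[e_r, e_f], [1-e_r, 1-e_f]]."""
    det = eff_r - eff_f
    return [[(1.0 - eff_f) / det, -eff_f / det],
            [-(1.0 - eff_r) / det, eff_r / det]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def dilepton_fake_yield(counts, er1, ef1, er2, ef2):
    """counts = (N_tt, N_tl, N_lt, N_ll), with the first lepton's tight/loose
    state as the major index. Returns the estimated number of
    two-tight-lepton events with at least one fake/non-prompt lepton."""
    minv = kron(inv2(er1, ef1), inv2(er2, ef2))
    # Baseline composition (N_rr, N_rf, N_fr, N_ff) from matrix inversion
    n_rr, n_rf, n_fr, n_ff = (
        sum(minv[i][j] * counts[j] for j in range(4)) for i in range(4))
    # Weight each fake-containing component by the probability that both
    # candidates satisfy the tight criteria
    return er1 * ef2 * n_rf + ef1 * er2 * n_fr + ef1 * ef2 * n_ff
```

For instance, a baseline composition of (800, 100, 60, 40) events in the (rr, rf, fr, ff) categories with uniform (ε_r, ε_f) = (0.9, 0.1) yields observed counts (662.8, 157.2, 125.2, 54.8) and an expected fake contribution of 14.8 events in the two-tight sample, which the inversion recovers.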
For an analysis with n_f ≥ 1 that accepts events with two tight lepton candidates or one tight and one loose lepton candidate, the expression would change to

N_t^f = (ε_r,1 + ε_f,2 − ε_r,1 ε_f,2) N_b^{r1f2} + (ε_f,1 + ε_r,2 − ε_f,1 ε_r,2) N_b^{f1r2} + (ε_f,1 + ε_f,2 − ε_f,1 ε_f,2) N_b^{f1f2} ,

where the additional terms are needed to account for the additional ways that fake/non-prompt leptons might satisfy the signal selection (and the terms involving products of the efficiencies are subtracted to avoid double-counting). For an analysis that imposes the same lepton candidate requirements but where only the background with n_f = 2 is being evaluated, the expression is:

N_t^f = (ε_f,1 + ε_f,2 − ε_f,1 ε_f,2) N_b^{f1f2} .

The methods can also be extended to cases where there are more than two levels of lepton selection criteria, or distinct categories of fake/non-prompt leptons, as in Ref. [6].

2 An example of a case where n_f is greater than one would be a dilepton analysis with backgrounds from both W(→ℓν)+jets events, where the jet gives rise to a fake/non-prompt lepton, and dijet events, where both jets give rise to fake/non-prompt leptons. The analysers may choose to estimate the first contribution from MC simulation and use data-driven methods for the second contribution; setting n_f = 2 for the data-driven approach is then required to avoid double-counting.
As with the matrix method, the fake-factor method can be generalised to higher lepton candidate multiplicities. In the dilepton final state, the number of events with two tight lepton candidates, of which at least one is fake/non-prompt, is:

N_t^f = θ_1 (N_{l1t2} − N_{l1t2}^{r1r2}) + θ_2 (N_{t1l2} − N_{t1l2}^{r1r2}) − θ_1 θ_2 (N_{l1l2} − N_{l1l2}^{r1r2}) , (9)

where N_{t(l)1 t(l)2} is the number of events where the first lepton candidate is tight (loose) and the second lepton candidate is tight (loose), N_{t(l)1 t(l)2}^{r1r2} is the contribution to N_{t(l)1 t(l)2} from events where both lepton candidates are real leptons, and θ_1 and θ_2 are the fake factors associated with the first and second lepton candidates, respectively.

However, the algebraic simplification that leads to Eq. (9), where the result depends simply on products of the fake factors and the observed tight and loose lepton candidate yields, with a correction term that depends only on events where all the leptons are real, does not hold for all possible event selections nor for all values of n_f; such a simplification is restricted to cases where the baseline and tight candidate lepton multiplicities are the same in all events, and where n_f = 1.
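The dilepton fake-factor expression of Eq. (9) can be sketched directly; the input yields, all-real MC subtractions and fake factors below are illustrative.

```python
# Sketch of Eq. (9). The all-real contributions (the r1r2 terms) would be
# taken from MC simulation in a real analysis.

def dilepton_fake_factor_yield(n_tl, n_lt, n_ll,
                               n_tl_rr, n_lt_rr, n_ll_rr,
                               theta1, theta2):
    """Estimated number of two-tight-lepton events with >= 1 fake lepton.
    theta1 (theta2) promotes a loose first (second) candidate to tight;
    the theta1*theta2 term removes double-counting of two-loose events."""
    return (theta1 * (n_lt - n_lt_rr)
            + theta2 * (n_tl - n_tl_rr)
            - theta1 * theta2 * (n_ll - n_ll_rr))
```

Using the same illustrative sample as above (uniform ε_f = 0.1, so θ = 1/9), the observed mixed and double-loose yields (157.2, 125.2, 54.8) with all-real contributions (72, 72, 8) give 14.8 fake/non-prompt events, in agreement with the matrix-method inversion.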

Use with weighted events
In some cases, such as for self-consistency tests using simulated events, it may be advantageous to weight the events that are input to the fake/non-prompt background estimate. This is straightforward for the fake-factor and asymptotic matrix methods, since the weight returned by the method for each event can simply be multiplied by the event weight w_evt. For the likelihood matrix method, this is handled by using the scaled Poisson distribution [7], in which the values of N_tl,exp and N_tl entering Eq. (8) are rescaled according to the effective number of events, so that the likelihood reflects the statistical power of the weighted sample.
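A standard definition of the effective event count for a weighted sample, stated here as an assumption about the prescription of Ref. [7], is n_eff = (Σ w)² / Σ w²:

```python
def effective_count(weights):
    """Effective number of events carried by a weighted sample,
    n_eff = (sum w)^2 / (sum w^2). For unit weights this equals the
    number of events; broad weight distributions reduce it."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2 if s2 > 0.0 else 0.0
```

Note that a uniform rescaling of all weights leaves n_eff unchanged; only the spread of the weights reduces the statistical power.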

Performance studies
MC simulations are used to assess the statistical performance of the methods. These simulations consist of pseudoexperiments that mimic scenarios that might occur in an actual analysis. Pseudoexperiments with sample sizes of 10 or 1000 dilepton events in the baseline sample are considered. The fraction of fake/non-prompt leptons in the baseline lepton sample is varied for each pseudoexperiment, with a uniform distribution between 0 and 100%. The values of ε_r and ε_f for each lepton are drawn from Gaussian distributions with specified means and widths (limits are imposed such that the values are always between zero and one, and ε_f is always at least 10% less than ε_r). Each lepton is randomly assigned as fake/non-prompt or real, in accord with the fraction of fake/non-prompt leptons assumed for the pseudoexperiment. Then each lepton is judged to either meet or fail the tight selection criteria, based on whether or not it is a real lepton and the values of ε_r and ε_f assigned to it. The set of simulated leptons is input to the data-driven algorithms, and the estimated fake yield and its statistical uncertainty are determined for each pseudoexperiment. These values are then compared with the expectation value for the number of fake/non-prompt lepton events in each pseudoexperiment, which is determined by the numbers of real and fake/non-prompt baseline leptons, the values of ε_r and ε_f for each lepton, and the value of n_f for the simulated analysis. For example, in a dilepton sample with n_f ≥ 1, the expectation value is:

⟨N_t^f⟩ = Σ_{events with r1f2} ε_r,1 ε_f,2 + Σ_{events with f1r2} ε_f,1 ε_r,2 + Σ_{events with f1f2} ε_f,1 ε_f,2 ,

where the sums run over baseline events with the indicated real/fake composition. For the likelihood matrix method, the likelihood maximisation is implemented using the MINUIT function minimisation package [8] to minimise the negative log-likelihood. The interface to MINUIT is provided by the TMinuit class in ROOT [9].
The fake-factor method requires a two-step process, where the contribution from real-lepton events is subtracted in the second step. In an actual physics analysis, this subtraction is done using MC simulation of the prompt-lepton contribution. For the simple MC simulation used here, the second step is modelled by running each pseudoexperiment a second time with the same parameters but a statistically independent sample of events, and running the fake-factor method only on the events that do not have fake/non-prompt leptons. The second sample has ten times more events than the primary sample, to be consistent with the usual case where the MC simulation sample has several times the number of events in the data sample. The result of this second run is then scaled down by a factor of ten and subtracted from the result obtained with the initial set of simulated events.
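A minimal pseudoexperiment generator in the spirit of these studies can be sketched as follows. The parameter values, and the exact way ε_f is kept below ε_r, are illustrative choices rather than the precise prescription of the studies.

```python
import random

def pseudoexperiment(n_events, mean_er=0.90, mean_ef=0.10, width=0.02,
                     rng=None):
    """Generate one pseudoexperiment: a list of
    (is_tight, eff_r, eff_f, is_fake) tuples for single-lepton events."""
    rng = rng or random.Random()
    fake_frac = rng.random()             # uniform fake fraction in [0, 1]
    events = []
    for _ in range(n_events):
        is_fake = rng.random() < fake_frac
        eff_r = min(max(rng.gauss(mean_er, width), 0.0), 1.0)
        # keep eff_f within [0, 1] and at least 0.1 below eff_r
        eff_f = min(max(rng.gauss(mean_ef, width), 0.0),
                    max(eff_r - 0.1, 0.0))
        # the lepton passes the tight criteria with the relevant efficiency
        is_tight = rng.random() < (eff_f if is_fake else eff_r)
        events.append((is_tight, eff_r, eff_f, is_fake))
    return events
```

The output tuples can be fed directly to any of the estimators, with the `is_fake` flag retained to compute the expectation value against which the estimates are compared.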
As an initial example, one can investigate the performance for dilepton events under conditions that are favourable for estimating the fake/non-prompt lepton background. This means that the samples are large (1000 events per pseudoexperiment), and the values of ε_r and ε_f are on average well separated (here, ε_r = 0.90 and ε_f = 0.10). The ratios of the estimated to true fake yields are shown versus ⟨N_t^f⟩ in Figure 1(a). The average statistical uncertainties in the estimates for each method are shown in Figure 1(b), and the fraction of pseudoexperiments in which the true fake yield lies within the uncertainty reported for each method is shown in Figure 1(c). The performance of the methods for dilepton analyses with low statistical precision (10 events per pseudoexperiment) is shown in Figure 2. Finally, to represent a more challenging situation, the case where there is less separation between the values of ε_r and ε_f (due, for example, to the application of stricter online lepton selection criteria that might be required when the LHC runs at higher luminosities) is also explored. Figure 3 shows the results when ε_r = 0.70 and ε_f = 0.30, and there are 10 events per pseudoexperiment.

Figure caption: The quantity ⟨N_t^f⟩ is the expectation value for the number of fake/non-prompt lepton events in each pseudoexperiment. Plot (a) shows the ratio of the estimated to expected fake-lepton yields, (b) shows the absolute uncertainty estimate for each method (for the likelihood method the average of the upward and downward uncertainties is taken), and (c) shows the fraction of pseudoexperiments where the true fake yield lies within the reported one-standard-deviation (1σ) range.

These studies show that all three methods give accurate estimates, with nearly equivalent performance, in high-statistics samples with a large separation between ε_r and ε_f (as shown in Figure 1). One notable feature is the dip in the uncertainty values for all three methods near ⟨N_t^f⟩ = 10 in Figure 1(b).
This occurs because there are two ways for the model to produce an expectation around 10: either a very low fake fraction in the baseline sample to start with, or a very large fake fraction, so that there are few real leptons and most of the background is from events with two fake/non-prompt leptons, which gives a minimum value of ε_f² · 1000 = 10 when ε_f = 0.10. These two processes have very different uncertainties. There is also a bias in the likelihood matrix method toward low values when the true number of fakes is large. This bias arises from the averaging of efficiencies over the entire baseline sample (see Eq. (7)) when in fact the efficiencies are on average different for real and fake leptons. Such differences in the averages occur randomly in the 'toy' MC tests, but can occur systematically in a physics analysis if, for example, the real and fake/non-prompt leptons have different kinematic distributions. The biases can be mitigated by binning the baseline sample in the variables for which the real-lepton and fake-lepton distributions may differ, and performing the likelihood fit separately in each bin. As an example of the effect of such a binning, the pseudoexperiments can be run with the results binned according to the values of ε_f. The effect of using two such bins in the value of ε_f for each lepton is shown in Figure 4.
When the situation becomes more challenging, such as in Figures 2 and 3, the characteristics of each method become more distinct. For low-statistics samples, the likelihood approach tends to exhibit a bias toward high values when the true number of fakes is small, a natural consequence of the fact that it cannot return negative values. The coverage of the true value by the estimated uncertainty is, however, still reasonable. A clear distinction between the precision of the methods also appears in Figures 2 and 3, where the likelihood approach has the smallest uncertainty, followed by the asymptotic matrix method and then the fake-factor method. This is because the likelihood approach considers lepton efficiencies averaged over the entire sample and is therefore less susceptible to event-by-event fluctuations in the values of ε_r and ε_f. Despite the differences between them, none of the approaches is incorrect, as shown by the statistical coverage plots: except in extreme cases, the confidence intervals built from the estimates and their statistical uncertainties do contain the true fake yield in at least 68% of pseudoexperiments. The visible overcoverage arises mostly because the uncertainties are computed under the assumption that all of the N_tl values are independent, while the pseudoexperiments were generated with a fixed number of baseline events each, which introduces some anticorrelation among the N_tl values.
When selecting which method to use, the analyser needs to consider the relative benefits and complexities of implementing the methods, along with the size of the uncertainty in the fake/non-prompt lepton background yield relative to other uncertainties in the analysis.

The ATLAS detector
While the above description of the matrix and fake-factor methods is general, the remainder of this paper discusses the application of these methods to ATLAS physics analyses, and therefore a brief description of the experimental apparatus follows.
The ATLAS detector [1] at the LHC covers nearly the entire solid angle around the collision point.3 It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic (EM) and hadronic calorimeters, and a muon spectrometer (MS) incorporating three large superconducting toroidal magnets. The inner-detector system is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range |η| < 2.5.
The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer (IBL) installed before Run 2 [10,11]. It is followed by the silicon microstrip tracker, which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to | | = 2.0. The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range |η| < 4.9. Within the region |η| < 3.2, EM calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering |η| < 1.8 to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within |η| < 1.7, and two copper/LAr hadronic endcap calorimeters. The solid-angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for EM and hadronic measurements, respectively.
The MS comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroids. The field integral of the toroids ranges between 2.0 and 6.0 Tm across most of the detector. A set of precision chambers covers the region |η| < 2.7 with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range |η| < 2.4 with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions. Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [12]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.
An extensive software suite [13] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Lepton selection criteria
Full descriptions of the electron and muon reconstruction algorithms and available selection criteria used in ATLAS are provided in Refs. [2] and [3], respectively. Here the features most relevant to the fake/non-prompt lepton background estimation are summarised briefly.
3 The pseudorapidity is defined in terms of the polar angle θ as η = −ln tan(θ/2). Angular distance is measured in units of ΔR ≡ √((Δη)² + (Δφ)²).

Electron reconstruction and identification
Electron candidates are reconstructed within |η| < 2.47 as tracks in the inner detector matched to energy clusters in the EM calorimeter. To separate true electrons from other unwanted reconstructed candidates, electron identification (ID) algorithms are used. These rely upon a set of variables that quantify the distribution of energy in the calorimeter, the quality of the spatial match between the calorimeter deposit and the associated track, and the transition-radiation signal in the TRT (see Table 1 of Ref. [2] for a complete list). Rather than placing individual requirements on these variables, they are combined into a likelihood discriminant based upon the probability density functions of the variables measured for prompt electrons in Z → e+e− events and for background candidates reconstructed in inclusive collision events.
Since different analyses have different requirements for electron selection efficiency and background rejection, several 'working points' (WPs) are defined by different threshold values of the likelihood discriminant. The likelihood threshold values are varied according to the E_T and |η| of the electron candidate, so that the selection efficiency varies smoothly with the electron E_T. The most commonly used ID WPs (and their average efficiencies4 measured for typical electroweak processes) are 'Loose' (93%), 'Medium' (88%) and 'Tight' (80%). In addition to the listed WPs, there is another ('LooseAndBLayer') WP that imposes the same requirement on the likelihood as the 'Loose' WP, but also requires that the electron track have a hit in the IBL, to suppress candidates originating from photon conversions.
Often, additional requirements are imposed on the impact parameter of the electron's track: the impact parameter in the transverse plane, d0, with respect to the centre of the beamspot must satisfy |d0| < 5σ(d0), where σ(d0) is its estimated uncertainty, while the longitudinal separation z0 between the point where d0 is measured and the chosen primary vertex of the event, multiplied by a moderating factor sin θ that accounts for the reduced z0 accuracy of more forward tracks, cannot exceed 0.5 mm in absolute value (|z0 sin θ| < 0.5 mm).
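The two impact-parameter criteria quoted above (together with the tighter transverse condition for muons described in the next subsection) can be expressed as a simple boolean check; the function name and units are illustrative.

```python
import math

# Illustrative impact-parameter selection: |d0|/sigma(d0) < 5 for electrons
# (3 for muons) and |z0 * sin(theta)| < 0.5 mm.
# Units: d0, sigma_d0 and z0 in mm; theta in radians.

def passes_impact_parameter(d0, sigma_d0, z0, theta, is_muon=False):
    max_significance = 3.0 if is_muon else 5.0
    return (abs(d0) < max_significance * sigma_d0
            and abs(z0 * math.sin(theta)) < 0.5)
```

For example, a track with d0 = 0.02 mm and σ(d0) = 0.005 mm passes the electron requirement (significance 4) but fails the tighter muon one.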

Muon reconstruction and identification
Muon candidates are reconstructed in the region |η| < 2.5 by combining MS tracks with matching inner-detector tracks. The muon reconstruction efficiency is approximately 98% per muon in simulated Z → μ+μ− events. After reconstruction, high-quality muon candidates used for physics analyses are selected by a set of requirements on the number of hits in the different inner subdetectors and different MS stations, on the track fit properties, and on variables that test the compatibility of the individual measurements in the two detector systems, as detailed in Ref. [3]. These criteria reduce the background from in-flight decays of light-flavour hadrons, which often result in kinked tracks. The most commonly used muon ID WPs (and their efficiencies measured in tt̄ MC events) are 'Medium' (98%) and 'HighPt' (80%), the latter optimised to offer the best momentum resolution for p_T > 100 GeV. The same impact-parameter requirements as defined for electrons are also often applied to muon candidates, with a tighter condition in the transverse plane: |d0| < 3σ(d0).

Lepton isolation
In addition to the ID criteria mentioned above, most analyses place requirements on the isolation of the lepton from other detector activity. This is especially helpful in reducing the contribution from leptons produced in heavy-flavour decays, or muons from π± or K± decays, within jets,5 as there are often other components of the jet near the lepton in these cases. In the context of the matrix and fake-factor methods, the baseline lepton selection usually does not require isolation, while the tight lepton selection usually does. However, many of the single-lepton triggers [14,15] used in ATLAS require isolation, so analyses that depend on such triggers cannot avoid applying isolation requirements to baseline leptons.
The calorimeter isolation [2,3] is calculated from the sum of transverse energies of calorimeter energy clusters within ΔR ≡ √((Δη)² + (Δφ)²) = XX/100 of the lepton candidate, not including the contribution expected from the candidate itself. Typical values of XX are 20, 30 or 40. Expected residual contributions to the isolation from the lepton candidate, as well as expected contributions from particles produced by additional proton-proton (pp) interactions, are subtracted [19], resulting in the variable ETconeXX.
The track-based isolation [2,3], denoted by pTconeXX, is based on tracks near the lepton candidate with either pT > 0.5 or 1 GeV that satisfy basic track-quality requirements and are spatially consistent with the primary vertex of the event. The scalar sum of the transverse momenta of such tracks, excluding tracks associated with the lepton candidate, is compared with the pT of the candidate to assess the isolation. For muon candidates, only the single track associated with the candidate is excluded; for electron candidates, additional tracks consistent with pair-production from a bremsstrahlung photon are also excluded. The track isolation can also be defined with a variable cone size (pTvarconeXX). In this case, the size of the cone around the lepton candidate within which tracks are considered is varied as a function of the pT of the candidate: ΔR = min(10 GeV / pT, ΔR_max), where ΔR_max is the maximum cone size (typically 0.2 or 0.3).
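The variable-cone track isolation above can be sketched as follows. This is an illustrative implementation, not the ATLAS software: the dictionary-based event model and field names (`pt`, `eta`, `phi`, `track`) are assumptions chosen for clarity, and track-quality and vertex-compatibility requirements are omitted.

```python
import math

def var_cone_size(lep_pt, dr_max=0.2):
    # Cone shrinks for high-pT leptons: dR = min(10 GeV / pT, dR_max)
    return min(10.0 / lep_pt, dr_max)

def delta_r(eta1, phi1, eta2, phi2):
    # Angular distance, with the phi difference wrapped into [-pi, pi]
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def ptvarcone(lepton, tracks, dr_max=0.2, min_track_pt=1.0):
    """Scalar pT sum of selected tracks inside the variable cone,
    excluding the track associated with the lepton candidate."""
    dr_cut = var_cone_size(lepton["pt"], dr_max)
    return sum(
        trk["pt"]
        for trk in tracks
        if trk is not lepton.get("track")
        and trk["pt"] > min_track_pt
        and delta_r(lepton["eta"], lepton["phi"], trk["eta"], trk["phi"]) < dr_cut
    )
```

For a 100 GeV lepton the cone size is 0.1 rather than the maximum 0.2, so only tracks very close to the candidate contribute to the sum.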
Combining selections on track-based and calorimeter-based isolation provides even better fake/non-prompt lepton background rejection, as the two isolation variables use complementary information. Track-based isolation was found to be less sensitive to detector noise and pile-up effects than calorimeter-based isolation, and the inner detector provides a better pT measurement than the calorimeters for individual soft hadrons.
On the other hand, calorimeter-based isolation includes neutral particles as well as some particles below the inner detector's track-pT threshold, which are ignored when computing track isolation. However, track and calorimeter isolation variables measure hadronic activity in a redundant manner, since charged particles are measured by both the calorimeters and the inner detector, and simple selection cuts applied independently to those two variables may not achieve optimal rejection power. To avoid this, an analysis can use a 'particle-flow' algorithm, which allows removal of overlapping contributions from the track-based and calorimeter-based isolation, decreasing the correlation between the two variables. For the time being, particle-flow-based isolation variables are defined only for muons, and are discussed in Ref. [3].
For analyses where the fake/non-prompt lepton background may be dominated by non-prompt electrons and muons from the decays of b- and c-hadrons [20,21], isolation WPs using a boosted decision tree (BDT) discriminant based on isolation and secondary vertex information, referred to as the non-prompt lepton BDT, are also proposed.
Several isolation WPs based on tracking, a combination of calorimetry and tracking, particle-flow, or a non-prompt lepton BDT are defined to allow for consistency across analyses that require different levels of lepton isolation. They are described in Refs. [2] and [3].

Removing overlaps between jets and leptons
In some cases the same object can result in multiple signatures in the detector. For example, an electron will deposit energy in the calorimeter, and that energy will generally also be clustered into a jet. In addition, sometimes objects will be spatially correlated due to the underlying physics, as when a muon is produced by heavy-flavour decay within a jet. To avoid double-counting, and to select only the isolated leptons that are typically of interest in physics analyses, an overlap removal (OR) procedure is applied to resolve these ambiguities. The procedure used for most physics analyses is as follows, where the lepton candidates are those that satisfy the baseline ID criteria for the analysis:
• All electron candidates that are within ΔR = 0.01 of a muon candidate (or share a track with a muon candidate) are removed.
• All jets that are within ΔR = 0.2 of any remaining electron candidate are removed.
• All electron candidates that are within ΔR = 0.4 of any remaining jet are removed.
• Cases where a remaining jet is within ΔR = 0.4 of a muon candidate are examined to determine the number of tracks associated with the jet. If there are more than two such tracks the muon candidate is removed; otherwise the jet is removed.
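The sequential nature of the procedure matters: each step operates on the objects surviving the previous one. A minimal sketch, assuming a simple dictionary-based event model (field names such as `ntracks` and `track` are illustrative, not the ATLAS software API):

```python
import math

def _dr(a, b):
    # Angular distance with the phi difference wrapped into [-pi, pi]
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)

def overlap_removal(electrons, muons, jets):
    """Apply the four OR steps sequentially; return the surviving objects."""
    # 1) Electrons close to (or sharing a track with) a muon are removed.
    electrons = [e for e in electrons
                 if not any(_dr(e, m) < 0.01
                            or (e.get("track") is not None
                                and e.get("track") is m.get("track"))
                            for m in muons)]
    # 2) Jets close to a surviving electron are removed.
    jets = [j for j in jets if not any(_dr(j, e) < 0.2 for e in electrons)]
    # 3) Electrons close to a surviving jet are removed.
    electrons = [e for e in electrons if not any(_dr(e, j) < 0.4 for j in jets)]
    # 4) Muon-jet ambiguity: a jet with more than two tracks wins over the muon;
    #    otherwise the jet is removed and the muon is kept.
    surviving_jets = []
    for j in jets:
        nearby = [m for m in muons if _dr(j, m) < 0.4]
        if nearby and j.get("ntracks", 0) <= 2:
            continue  # drop the jet, keep the muon
        if nearby:
            muons = [m for m in muons if m not in nearby]  # drop the muon(s)
        surviving_jets.append(j)
    return electrons, muons, surviving_jets
```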
Variations of this procedure are also supported, primarily for analyses that focus on heavy-flavour jets or that select boosted massive particles using large-jets.

Monte Carlo simulation samples
The studies of fake/non-prompt leptons that are presented in Sections 6 and 7 make use of large MC samples of simulated events. The production of tt̄ events at next-to-leading order (NLO) in the quantum chromodynamics (QCD) coupling constant αs is described in Ref. [22] and relies on the Powheg Box v2 event generator [23] interfaced with Pythia 8.230 [24] for parton showering and subsequent steps, with the A14 set of tuned parameters [25]. The parton distribution functions used for the matrix element calculation and parton showering are NNPDF3.0 [26] and NNPDF2.3 [27], respectively. Generated events were filtered such that at least one of the top quarks decays semileptonically. The EvtGen 1.2.0 program [28] was used to model heavy-flavour hadron decays.
The production of Drell-Yan Z/γ* → ℓ+ℓ− events (ℓ = e, μ, τ) at NLO in αs is described in Ref. [29] and relies on the Sherpa 2.2 event generator [30] with its dedicated set of tuned parameters and the NNPDF3.0 [26] parton distribution function. Events were generated according to a partition of the phase space described in Ref. [29], resulting in a set of orthogonal samples which were combined with weights corresponding to the NLO cross-section calculated by the generator. The ℓ+ℓ− invariant mass was required to be at least 40 GeV.
A full ATLAS detector simulation [31] based on Geant4 [32] was then used to faithfully reproduce particle interactions with the detector and its response. Additional pp interactions in the same or neighbouring bunch crossings were also simulated in order to reproduce the conditions of data-taking during the LHC Run 2 operation.
The source of a lepton candidate in simulation, discussed in Section 6, is identified by using preserved generator-level information to match each reconstructed lepton candidate to the closest particle produced by the event generator, based primarily on the angular separation between the latter and the reconstructed track of the former. If the matched particle is a charged lepton (or, in the case of electron candidates, a photon), its origin is checked so that the different sources of non-prompt leptons can be distinguished.
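The angular matching described here can be sketched as a nearest-neighbour search in ΔR. This is an illustrative simplification: the 0.1 matching threshold and the dictionary-based particle records are assumptions, and the real procedure uses the reconstructed track rather than the lepton four-momentum.

```python
import math

def delta_r(a, b):
    # Angular distance with the phi difference wrapped into [-pi, pi]
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)

def truth_match(reco_lepton, gen_particles, dr_max=0.1):
    """Return the closest generator-level particle in dR, or None if no
    particle lies within dr_max (an illustrative threshold)."""
    if not gen_particles:
        return None
    best = min(gen_particles, key=lambda p: delta_r(reco_lepton, p))
    return best if delta_r(reco_lepton, best) < dr_max else None
```

Once a match is found, its origin record (e.g. parent hadron or photon conversion) would be inspected to classify the candidate, as described in the text.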
In Section 9, the simulation of several other processes is involved, and is performed with a similar workflow; complete information about the MC generators employed, their configurations, and the cross-section calculations used to normalise the simulated samples, is available in the references provided in that section for those two analyses.

Sources of fake/non-prompt leptons
The relative contributions of fake/non-prompt leptons from different sources to the sample of selected lepton candidates depend on the energy and spatial location of the candidate in the detector, the identification, impact parameter and isolation criteria applied, the overlap removal procedure, and the nature of the selected final state (e.g. the presence or absence of bottom quarks).
Figures 5 and 6 present for illustration the relative contributions of the different sources of fake/non-prompt muons and electrons respectively, as a function of the transverse momentum of the candidate, as measured in MC simulated events. They are shown for two different processes: tt̄ production, leading to final states rich in heavy-flavour hadrons, and Drell-Yan production of e+e− or μ+μ− pairs; for tt̄ events at least one of the top quarks is required to decay leptonically. For these particular figures, the reconstruction/ID of electrons and muons and the general event selection follow those described in Ref. [33] for baseline lepton candidates, which correspond to rather loose criteria (in particular, no isolation or transverse impact parameter requirements are applied, and no overlap removal is done). For both processes, events are considered only if they contain a pair of reconstructed baseline leptons with identical charges, a signature for which fake/non-prompt leptons usually represent a non-negligible source of background.
The 'prompt γ-conversion' category only includes electron candidates where the photon is separated by ΔR > 0.1 from any generator-level high-pT electron from the hard-scatter interaction (photons emitted at smaller separation are generally reconstructed as part of the electron candidates).
Non-prompt leptons are those arising from electroweak decays of hadrons. Heavy-flavour b- and c-hadrons decay close to the interaction point, and the resulting leptons are distinguishable from real leptons mostly through isolation and impact parameter requirements.
Light-flavour hadrons can also be a major source of fake leptons via decay-in-flight in the tracker volume. This happens mainly in final states for which QCD multi-jet production is a significant contributor. A charged hadron stopping early in the calorimeter, and generating a narrower-than-average shower, can mimic the experimental signature of an electron. Electron ID criteria are particularly powerful in rejecting these candidates, more so than those from other sources, based notably on three-dimensional profiles of the shower; but since many orders of magnitude more hadrons than leptons are produced in pp collisions at the LHC, a substantial number of such fake electrons may be selected in a physics analysis. With regard to muons, the depth of the ATLAS calorimeter is sufficient to stop most pions and kaons before they can reach the muon spectrometer. Among hadrons faking electrons, one significant class is neutral pion decays into photons (π⁰ → γγ); the collimated photons create a single energy deposit in the EM calorimeter, while the associated track might be provided by the conversion of one of the photons in the upstream detector material. Due to the importance of this phenomenon, the electron ID criteria specifically discriminate against these candidates by attempting to identify a two-peak structure in the distribution of the cell energies matched to the electron cluster [2], which is not present for real electrons. Other contributions such as the Dalitz decay of pions [34] also create comparable experimental signatures.
The conversion of photons into electron-positron pairs represents the last important class of non-prompt electrons. These γ-conversions must typically occur early (e.g. in the beam pipe) and be largely asymmetric in the splitting of the momentum between the two electrons; otherwise, the conversion vertex can be reconstructed, or the candidate electron's track lacks hits in the first layers of the inner detector, both leading to the proper classification of the reconstructed object as a photon instead of an electron [2]. The origin of the photon itself influences the characterisation of the candidate as non-prompt or real: photons emitted close to a real electron, either due to bremsstrahlung or as higher-order quantum electrodynamic (QED) corrections to the production process, are typically considered part of the electron candidate (the calorimeter energy deposits tend to overlap to the extent that a single cluster is reconstructed); furthermore, from the perspective of quantum field theory, well-defined electrons must include extra radiation ('dressed leptons' [35]). The electron reconstruction procedure [2] accounts for bremsstrahlung, in particular by allowing kinks in the track consistent with bremsstrahlung emission in dense material regions. In contrast, photons from other origins, such as initial-state radiation, QED processes not involving leptons or where photons are sufficiently separated from leptons, or hadronic jet fragmentation, may be considered as sources of fake electrons.
Figure 7: Fraction of the fake/non-prompt lepton candidates shown in Figures 5 and 6 that also pass the tight selection criteria of Ref. [33]. Error bars represent statistical uncertainties of the simulation; for clarity, values with relative uncertainties greater than 30% are not shown.
It can be seen in Figure 5 that non-prompt muons constitute the only substantial contribution to the fake/non-prompt muon background, while for electrons Figure 6 shows more variety: in general, non-prompt electrons are particularly represented in the lower pT range, especially in processes involving the production of heavy-flavour hadrons, while hadron fakes and converted photons populate the higher pT range. The different sources of fake/non-prompt leptons have distinct probabilities to satisfy the tight lepton selection criteria described in Section 4. Figure 7 illustrates those differences for the particular example of the signal-lepton definitions used in Ref. [33] (including lepton-jet overlap removal) and simulated tt̄ events. Such variability between sources is unwelcome, as the fake efficiencies required for the application of the methods described in Section 2 then depend upon the relative contributions of each source to the regions of interest, which may not be easy to assess. It is therefore desirable to measure the efficiencies in regions similar in composition to the regions where the background estimate is needed, otherwise large extrapolation uncertainties may apply.
Precise simulation of these various sources of background, including their relative contributions, is indeed very challenging, as it relies heavily on the modelling of the soft-QCD regime by event generators (fragmentation and hadronisation, hadron decays, soft emissions), as well as on detector modelling. Another issue is that only a small fraction of fake or non-prompt lepton candidates survive the ID and isolation requirements, so the simulation of a very large number of events is needed to obtain a statistically accurate prediction. For inclusive processes with large cross-sections (e.g. multi-jet production), this is often impractical.
For these reasons, many of the fake/non-prompt lepton background predictions used in ATLAS publications are based on methods using the data, such as the ones described in this paper. These methods rely on common properties shared to some extent by the different sources of fake/non-prompt leptons that differentiate them from real leptons, such as a high likelihood of failing the combination of ID and isolation criteria.

Measurement of real and fake/non-prompt lepton efficiencies
The matrix and fake-factor methods both rely on knowledge of the efficiency for leptons that pass the baseline selection to also pass the tight selection. For the fake-factor method, only the efficiency for fake/non-prompt leptons is used explicitly in the calculation, while for the matrix method the efficiency for real leptons must also be measured. In many cases these efficiencies depend on the properties of the lepton (such as its pT or angular distance from a jet) or on the event in which it is found (such as the overall activity of the event, as measured for example by the number of reconstructed primary vertices).

Real-lepton efficiencies
The efficiencies of specific working points of lepton selections are calibrated precisely in ATLAS for general purposes, primarily with 'tag-and-probe' methods based e.g. on Z → ℓℓ events. By performing these measurements on both data and MC-simulated events, 'scale factors' (SFs) that account for differences between the efficiencies observed in data and simulation are derived. These SFs can then be applied to simulated events using the selection criteria that are relevant to a given analysis to determine the appropriate real-lepton efficiencies. This MC-based approach is valid as long as both the baseline and tight lepton selection criteria are taken from the set for which SFs have been measured; only in extraordinary cases would an analysis utilise different selection criteria. Since the efficiencies depend more strongly on the environment than the SFs do, the main advantage of this approach over purely data-based measurements is that efficiencies can be obtained directly in the desired environment (i.e. the region in which the fake/non-prompt background estimate is needed), rather than extrapolated from a more distant region, as would be needed for reliable measurements in data.
Details of the real-electron and real-muon efficiency measurements can be found in Refs. [2] and [3], respectively. The real efficiencies are often parameterised with respect to the pT and |η| of the leptons, and measured separately for electrons and muons.

Fake/non-prompt lepton efficiencies
The fake/non-prompt lepton efficiencies are specific to each analysis, primarily because there are several sources of such leptons (see Section 6), which contribute with different weights depending on the chosen selection criteria. In general, though, the first step in the efficiency measurement is to identify a region that has a large contribution from fake leptons. Two approaches are commonly used. In the first, events with a pair of leptons with the same electric charge are selected. Since such lepton pairs are only rarely produced at the LHC (via processes such as WZ + jets, ZZ + jets, and tt̄ + V (V = W/Z) production), it is likely that one of the two leptons is fake/non-prompt. By placing stringent quality criteria on one lepton in these events, the probability that the remaining lepton is fake/non-prompt is enhanced. The second approach is to use single-lepton events, where criteria are imposed to suppress the contribution from real leptons. Examples of such criteria are requiring the missing transverse momentum ETmiss, or the transverse mass mT, to be below specific thresholds, thereby reducing the contribution from W + jets or tt̄ events, or requiring via the track impact parameters that the lepton originate from a position inconsistent with the primary event vertex, thereby enhancing the contribution of leptons from heavy-flavour decay.
In either approach, there will be a residual contribution from events with only real leptons in the selected sample. This contribution is typically estimated using MC simulation and subtracted separately from both the tight and baseline samples before the ratio of these samples is taken to measure the efficiency.
As with the real-lepton efficiencies, the fake/non-prompt lepton efficiencies depend on properties of the lepton candidates or of the event in which they are found. Therefore, it is generally helpful to bin the efficiencies in the lepton pT and |η|, and possibly in terms of other quantities as well. The optimal binning to be used is chosen in the context of each physics analysis, considering the given numbers of events and potential changes in the relative contributions from the different sources of fake/non-prompt leptons; illustrative examples are provided in Section 9.
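The binned efficiency is simply the ratio of tight to baseline histograms. A minimal sketch with NumPy, assuming leptons are given as (pT, η) pairs (the bin edges are illustrative, not the ones used in any particular analysis):

```python
import numpy as np

def binned_efficiency(baseline, tight, pt_edges, abseta_edges):
    """Tight/baseline ratio in bins of lepton pT and |eta|.
    `baseline` and `tight` are lists of (pt, eta) pairs; bins with no
    baseline leptons are returned as NaN."""
    def hist(leptons):
        pts = [pt for pt, _ in leptons]
        etas = [abs(eta) for _, eta in leptons]
        counts, _, _ = np.histogram2d(pts, etas, bins=[pt_edges, abseta_edges])
        return counts
    num, den = hist(tight), hist(baseline)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)
```

In a real measurement, the real-lepton contamination would be subtracted from both the numerator and denominator histograms before the ratio is taken, as discussed in the previous subsection.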
Adopting a parameterisation of the efficiency with respect to variables other than pT and η can sometimes be beneficial. For example, when considering solely the probability that fake or non-prompt leptons satisfy the isolation criteria, another relevant quantity is the momentum of the parent jet: the fake efficiency corresponds in that case to the probability that most of the jet's visible momentum is carried by the lepton. A parameterisation as a function of lepton pT thus assumes that for a particular pT, the distribution of the parent jet's momentum is similar in the regions where the efficiencies are measured and the regions where the background estimates are needed. If this assumption does not hold, it can be useful to adopt instead a parameterisation as a function of the parent jet's momentum. Since this quantity is not easily accessed experimentally (unlike the lepton pT), proxy observables are used in practice. An example of successful application is the analysis in Ref. [37], which employed the sum of the lepton's pT and the transverse energy in a cone around the lepton as a proxy. The preceding discussion is also illustrated in Figure 8 for non-prompt muons produced in the decay of b-hadrons in simulated tt̄ events. The probabilities for such muons to satisfy a track-based isolation requirement, as defined in Section 4, are shown for two alternative parameterisations: one based on the muon pT, and the other on the scalar sum of the muon pT and the transverse energy deposited in calorimeter-cell clusters within a cone of size ΔR = 0.4 around the muon (referred to as ETcone40 in Section 4). This scalar sum serves as a proxy for the parent jet's transverse momentum.
In the second parameterisation, the fake efficiency is the fraction of jets with a non-prompt muon and visible momentum pT + ETcone40 in which the muon is mostly isolated, i.e. pT ≫ ETcone40. While one might consider jets with arbitrarily soft muons in the denominator of this fraction, for practical reasons Figure 8(b) only includes events where muons satisfy pT > 10 GeV. To study the dependence of these efficiencies on the momentum distribution of the underlying jet, different regions of jet momentum are emphasised by imposing different requirements on the global transverse energy HT of the event, defined for this purpose as the scalar pT sum of all jets with pT > 25 GeV and |η| < 2.8 that are a distance ΔR > 0.6 from the muon. This quantity is indeed partially correlated with the kinematics of the muon's parent jet, via the momentum of the top quarks producing all these jets. The reconstruction, calibration and selection of jets and muons for this figure are otherwise those detailed in Ref. [33].
It can be observed that for the case of a pT-dependent parameterisation, the efficiencies vary strongly with HT, although large differences occur mostly for pT > 40 GeV. Since most non-prompt muons are produced at low pT, the overall impact of this non-captured dependency might be small, unless regions of interest in the analysis specifically select high-pT leptons. In contrast, the parameterisation as a function of pT + ETcone40 is much less influenced by HT, making the measured efficiencies less dependent on the event topology. In practice, a compromise has to be found between this observation and the other elements evoked above justifying a pT-dependent parameterisation, especially for electrons, because ID criteria are usually employed in addition to isolation. The direct dependency of efficiencies on other variables that are also correlated with the event topology (e.g. ETmiss) may also be reduced by such a parameterisation.
As detailed in the next section, suitable uncertainties must be assigned to the use of the fake efficiencies in regions different from those in which they are measured, in particular to account for potential differences in the relative contributions of the different sources of fake/non-prompt leptons. To minimise these uncertainties, it has sometimes been found beneficial to use an approach closer to that of Section 7.1, in which efficiencies are evaluated in the simulation and supplemented by data-driven correction factors that depend on the source of the fake/non-prompt lepton. These correction factors are derived using dedicated control regions that are enriched in a particular source of fake/non-prompt leptons. The main assumptions are then the universality of the correction factors across different processes, and the ability of the simulation to adequately predict the relative contributions of each source in the regions of interest. Such an approach has for example been used in Ref. [38].

Systematic uncertainties
Systematic uncertainties in the fake/non-prompt lepton background estimates from the matrix method and the fake-factor method arise from uncertainties in the values of εr and εf. These uncertainties can be traced to statistical uncertainties from the samples used to measure the efficiencies, to potential biases that may cause the efficiencies in the signal region for a particular analysis to differ from the values obtained from control samples (such as differences in the origin of fake/non-prompt leptons between these regions), and to uncertainties in the modelling of contamination from real-lepton processes in the samples used to measure the fake efficiency. Details of these uncertainties and their estimation are provided below. The overall impact of variations in εr and εf depends on the characteristics of the analysis. For the matrix method, it follows from Eqs. (3) and (4) that the relative importance of the uncertainties in the real and fake efficiencies (Δεr and Δεf) depends on the typical values of εr and εf and on the numbers of tight real and fake leptons, N_t^r and N_t^f. For analyses where real leptons dominate the tight sample (N_t^r ≫ N_t^f), the uncertainty in εr will dominate, and vice versa.
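The sensitivity to εr and εf can be probed numerically with the standard two-category matrix method (tight vs anti-tight). This sketch is a generic textbook formulation, assumed to be consistent with the Eqs. (3) and (4) referenced above; the input counts and efficiencies are illustrative.

```python
def matrix_method_fake_yield(n_tight, n_antitight, eff_r, eff_f):
    """Solve N_T = er*n_r + ef*n_f and N_A = (1-er)*n_r + (1-ef)*n_f
    for n_f, then return the fake/non-prompt yield in the tight sample, ef*n_f."""
    det = eff_r - eff_f
    if det <= 0:
        raise ValueError("matrix method requires eff_r > eff_f")
    n_f = (eff_r * n_antitight - (1.0 - eff_r) * n_tight) / det
    return eff_f * n_f

def efficiency_impact(n_tight, n_antitight, eff_r, eff_f, d_eff_r, d_eff_f):
    """Numerically propagate efficiency variations to the fake-yield estimate."""
    nominal = matrix_method_fake_yield(n_tight, n_antitight, eff_r, eff_f)
    up_r = matrix_method_fake_yield(n_tight, n_antitight, eff_r + d_eff_r, eff_f)
    up_f = matrix_method_fake_yield(n_tight, n_antitight, eff_r, eff_f + d_eff_f)
    return nominal, up_r - nominal, up_f - nominal
```

A closure check: a toy sample with 100 real and 50 fake baseline leptons, εr = 0.9 and εf = 0.2 gives N_T = 100 and N_A = 50, and the method recovers the injected fake yield of εf × 50 = 10 events in the tight sample.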

Statistical uncertainties in the measured efficiencies
Statistical uncertainties in the real-lepton and fake-lepton efficiencies can be accounted for either by analytically propagating the uncertainties through to the estimated event yields, or by varying the efficiencies input to the nominal yield calculation by their statistical uncertainties and taking the resulting difference in the estimated fake/non-prompt background yield as the uncertainty. The real- and fake-efficiency uncertainties are generally uncorrelated, since the efficiencies are measured in statistically independent samples. In the usual case where the efficiencies are measured in bins of one or more quantities, the bins are uncorrelated, so the variations in each bin are applied separately. The total statistical uncertainty in the fake/non-prompt lepton background estimate is given by the sum in quadrature of the uncertainties from all variations. This source of systematic uncertainty is usually not dominant.
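The bin-by-bin variation procedure can be sketched generically. Here `yield_fn` stands for whatever function maps the set of per-bin efficiencies to the estimated fake/non-prompt yield (a hypothetical placeholder, not a specific ATLAS tool); bins are varied one at a time and the shifts combined in quadrature.

```python
import math

def stat_uncertainty(yield_fn, effs, eff_errs):
    """Vary each efficiency bin by its statistical uncertainty (bins treated
    as uncorrelated) and sum the resulting yield shifts in quadrature."""
    nominal = yield_fn(effs)
    total_sq = 0.0
    for i, err in enumerate(eff_errs):
        varied = list(effs)
        varied[i] += err
        total_sq += (yield_fn(varied) - nominal) ** 2
    return math.sqrt(total_sq)
```

For a toy linear yield model, each 1-sigma bin variation produces an independent shift, and the total is their quadrature sum.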

Systematic uncertainties in the measured efficiencies
Systematic uncertainties in εf are generally larger and more challenging to assess than those in εr:
1. Real-lepton efficiencies show only slight variations due to event-environment effects, since contributions to their measurements from underlying-event and jet activity are small for real leptons. This also means there is a wide variety of samples with which they can be calibrated in great detail.
2. There are several sources of fake/non-prompt leptons, and the efficiencies may differ between these sources. Therefore, any differences in the fake/non-prompt lepton composition between the sample used to measure the efficiencies and the signal region for an analysis may lead to a bias in the efficiencies.
Several methods are used to estimate systematic uncertainties in the fake-lepton efficiencies. One is simply to vary the selection criteria for events in the control region used to measure the fake-lepton efficiencies, since the composition of fake/non-prompt leptons in the standard and alternative control regions may differ. A more sophisticated approach is to use MC simulation to estimate the fake/non-prompt lepton compositions in both the control and analysis regions. This information, combined with the MC-estimated selection efficiencies for each source of fake/non-prompt leptons, can be used to provide an estimate of the uncertainty.

Uncertainties in the modelling of real-lepton processes
When measuring the fake-lepton efficiencies, a correction must be applied to account for contamination from processes with real leptons in the control sample used for the measurement:
εf = (N_t − N_t^r) / (N_l − N_l^r),
where N_t^r and N_l^r are the numbers of real leptons in the selected tight and loose samples, respectively. MC simulation of real-lepton processes is generally used to estimate N_t^r and N_l^r, with corrections applied to account for known differences in object selection efficiencies between simulation and data. Nonetheless, several sources of systematic uncertainty in the real-lepton contamination remain: uncertainties in the cross-sections of the real-lepton processes, uncertainties in the correction factors, and uncertainties in the parameters, e.g. parton distribution functions (PDFs) and factorisation/renormalisation scales (μf, μr), used in the simulation.
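The subtraction can be expressed directly in code. This is a minimal sketch of the formula above, with a guard against the pathological case where the MC-estimated contamination exceeds the observed counts (how such cases are handled in practice is analysis-specific):

```python
def fake_efficiency(n_tight, n_loose, n_tight_real_mc, n_loose_real_mc):
    """eps_f = (N_t - N_t^r) / (N_l - N_l^r): ratio of tight to loose counts
    after subtracting the MC-estimated real-lepton contamination."""
    num = n_tight - n_tight_real_mc
    den = n_loose - n_loose_real_mc
    if den <= 0 or num < 0:
        raise ValueError("real-lepton subtraction left a non-physical count")
    return num / den
```

For example, 120 tight and 1000 loose candidates with MC-estimated real contamination of 20 and 200 respectively give εf = 100/800 = 0.125.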

Uncertainties due to biases in the likelihood matrix method
The biases that occur in some situations for the likelihood matrix method (see Section 2.5) may also be considered as a source of systematic uncertainty, especially when the signal region contains many bins with few events in each. The magnitude of the bias can be estimated either by repeating the analysis with coarser bins, or by constraining the total fake/non-prompt lepton yield estimate to be the value returned by applying the likelihood matrix method to the entire unbinned sample. Any resulting differences in the binned estimates can be taken as a systematic uncertainty.

Examples of application in ATLAS analyses
In this section the application of the fake/non-prompt lepton background estimation methods is described using two example ATLAS analyses. The first is a measurement of the tt̄Z differential cross-section using events that contain three or four lepton candidates, and the second is a model-independent search for 'beyond the Standard Model' (BSM) phenomena in events with three or more lepton candidates. In both cases, the high lepton multiplicity suppresses the Standard Model (SM) backgrounds, which makes the relative contribution of the fake/non-prompt lepton background larger.

Measurement of the tt̄Z cross-section in final states with three or four leptons
This analysis measured the differential production cross-section of the tt̄Z process in final states with a total of three or four electron and muon candidates. The tools described above are used for the lepton efficiency measurements and the application of the likelihood matrix method (described in Section 2.1.2). In addition to the fake/non-prompt lepton estimation in the signal regions of the analysis, the results are checked in several validation regions enriched in fake/non-prompt leptons. The likelihood matrix method was chosen for this analysis since the number of fake/non-prompt leptons in the signal regions with three or four leptons is expected to be very low, and the likelihood matrix method provides more stable results for binned estimations, which are necessary for differential background predictions in these low-statistics signal regions. The fraction of events with more than one fake/non-prompt lepton in the signal or validation regions has been checked and found to be negligible. More details about the analysis and the fake/non-prompt lepton background estimation can be found in Ref. [42].

Real-lepton efficiencies
The first step in the application of the matrix method is to measure the efficiencies for real and fake/non-prompt leptons that satisfy the baseline criteria to also satisfy the tight criteria. In the tt̄Z cross-section measurement, baseline electrons are required to satisfy the 'LooseAndBLayer' ID WP, whereas tight electrons are required to satisfy the stricter 'Medium' ID criteria and to be isolated from nearby tracks and calorimeter energy deposits. Both the baseline and tight muons are required to satisfy the 'Medium' WP, and tight muons are in addition required to be isolated from nearby tracks.
Figure 9: The two-dimensional real-lepton efficiencies obtained for (a) electrons and (b) muons, in bins of pT and |η| of the leptons. The last pT bin is inclusive. The real-lepton efficiencies are obtained using MC simulation, but corrected to match the performance seen in data control samples.
As discussed in Section 7.1, the real-lepton efficiencies are obtained using MC simulation, corrected to match the performance seen in data control samples. Those efficiencies are shown in Figure 9, binned in lepton pT and |η|. To check for potential dependencies on the number of additional jets in the events used for the measurements, the efficiencies are derived for different jet multiplicities. No significant differences between the real-lepton efficiencies are observed.

Fake/non-prompt lepton efficiencies
The fake/non-prompt lepton efficiencies are measured with same-charge electron–muon (e±μ±) or muon–muon (μ±μ±) data events using a tag-and-probe method, where one 'tag lepton' with very stringent requirements on momentum and isolation (pT > 40 GeV, max(ETcone20, pTcone20)/pT < 0.01) is selected and the remaining 'probe lepton' is used for the efficiency measurement. Events with more than two leptons are not considered. Only the aforementioned baseline electron or muon requirements are used to select the probe lepton. The samples are dominated by events containing at least one fake/non-prompt lepton, and are orthogonal to the signal regions, which require a minimum of three tight lepton candidates. In addition, signal events constitute negligible fractions of these samples.

Table 1: Definition of the fake-lepton control regions used for the electron (e-fakes-CR) and muon (μ-fakes-CR) fake-efficiency measurements. Nℓ is the number of leptons (ℓ), while Njets (Nb-jets) is the number of jets (b-tagged jets, see text), respectively. For additional details of the definitions of physics objects, see Ref. [42].
The definition of the regions used for the fake-efficiency measurements is summarised in Table 1. For the electron fake efficiencies an e±μ± signature is used, with the muon serving as the tag lepton. For the muon fake efficiencies, the μ±μ± region is used, since the e±μ± region also contains unwanted events where an electron with an incorrectly measured charge is selected as the tag lepton and paired with a real muon.
Real-lepton background processes leading to the same-charge dilepton signature are estimated using MC-simulated events and are subtracted from data to obtain an unbiased efficiency measurement. The contribution from electrons with misassigned charge in the same-charge region is also subtracted using estimates from MC-simulated events. After this subtraction, the dominant source of fake/non-prompt leptons is found to be heavy-flavour hadron decays. The fake efficiencies, binned in lepton pT and |η|, are shown in Figure 10.
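The subtraction described above amounts to a simple per-bin ratio; a sketch with hypothetical yields (the function name and numbers are invented for illustration, not taken from the measurement):

```python
def fake_efficiency(n_base_data, n_tight_data, n_base_mc, n_tight_mc):
    """Fake/non-prompt efficiency in one (pT, |eta|) bin: the ratio of
    tight to baseline probe leptons, after subtracting the real-lepton
    (and charge-misassignment) contributions predicted by MC."""
    return (n_tight_data - n_tight_mc) / (n_base_data - n_base_mc)

# e.g. 1000 baseline probes in data, 150 of them tight, with MC predicting
# 200 baseline and 60 tight probes from real-lepton processes:
print(fake_efficiency(1000, 150, 200, 60))  # -> 0.1125
```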
It is assumed that since the loose and tight lepton selection criteria depend on quantities related to the lepton itself or to its immediate surroundings, the chosen parameterisation captures the main variations in the fake efficiencies, and residual dependencies on the event environment can be covered by systematic uncertainties (see Section 8). Unlike the real-lepton efficiencies shown in Figure 9, the fake/non-prompt lepton efficiencies depend much more on the specifications of the analysis. Indeed, in the simulation, the dependence of the fake efficiencies on the number of light-flavour jets or b-tagged jets is mild. The uncertainties are evaluated by comparing, in the simulation, the fake efficiencies in event selections corresponding to either the measurement regions in Table 1 or the regions of interest in the analysis for which the background estimates are needed. These differences, evaluated as a function of pT and |η|, are applied as a systematic uncertainty of the fake-efficiency measurement (as discussed in Section 8), and are of the order of 10%–20%, except for muons with pT > 50 GeV, for which they reach 40%. Furthermore, normalisation uncertainties are considered for the real-lepton background processes, which are subtracted in the fake-efficiency measurement. They are evaluated by scaling the real-lepton background processes upwards and downwards within their cross-section uncertainties before the subtraction and using the differences between the modified and nominal efficiencies as uncertainties, which are added in quadrature to the aforementioned uncertainties.
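The scaling procedure for the normalisation uncertainty can be sketched as follows (hypothetical yields; `norm_uncertainty` and `total_uncertainty` are invented helpers, not ATLAS tools):

```python
import math

def norm_uncertainty(n_base_data, n_tight_data, n_base_mc, n_tight_mc,
                     xsec_rel_unc):
    """Shift the subtracted real-lepton MC up/down by its cross-section
    uncertainty and return the larger resulting efficiency shift."""
    nominal = (n_tight_data - n_tight_mc) / (n_base_data - n_base_mc)
    shifts = []
    for scale in (1.0 + xsec_rel_unc, 1.0 - xsec_rel_unc):
        varied = ((n_tight_data - scale * n_tight_mc)
                  / (n_base_data - scale * n_base_mc))
        shifts.append(abs(varied - nominal))
    return max(shifts)

def total_uncertainty(extrapolation_unc, normalisation_unc):
    """Combine the extrapolation and normalisation components in quadrature."""
    return math.hypot(extrapolation_unc, normalisation_unc)
```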

Results in the fake/non-prompt lepton validation regions
To validate the performance of the method, predictions are obtained and compared with data in two dedicated validation regions, called 'VR-3ℓ-1L3' and 'VR-3ℓ-1L3-noZ', which have a larger proportion of fake/non-prompt leptons than is expected in the signal regions. The definitions of these two validation regions are summarised in Table 2. No charge requirements are placed on the reconstructed lepton candidates in these regions. The variable m_ℓℓ^SF refers to the invariant mass of the same-flavour opposite-charge (SFOC) lepton pair with the invariant mass closest to the Z boson mass. VR-3ℓ-1L3 is a region similar to the actual signal regions defined in Ref. [42], but without a requirement on the mass of the SFOC lepton candidate pair. Therefore, it contains a higher fraction of fake/non-prompt leptons after the selection. To further enhance the fake/non-prompt lepton contribution, the third-highest-pT lepton that satisfies the baseline selection criteria must not satisfy the tight criteria. An additional validation region, VR-3ℓ-1L3-noZ, is defined by requiring all three leptons to satisfy the tight selection criteria, but placing a veto on SFOC lepton pairs that have an invariant mass consistent with the Z boson, thereby enhancing the fake/non-prompt lepton fraction in this region. Both regions are orthogonal to the analysis signal regions and are intended to validate the predictions of the matrix method for different levels of fake/non-prompt lepton contamination.
Some example distributions are shown in Figure 11 for VR-3ℓ-1L3 and in Figure 12 for VR-3ℓ-1L3-noZ. The processes with three real leptons (modelled with MC simulations) plus the prediction from the matrix method can be compared with data in these regions.
The hatched bands in Figures 11 and 12 show only the statistical uncertainties of the MC prediction and the uncertainties associated with the fake/non-prompt lepton estimates (i.e. no theoretical or detector-related systematic uncertainties are included). The total uncertainty associated with the fake/non-prompt lepton estimate itself contains a systematic component, which is evaluated from variations of the input fake/real efficiencies (described in the previous section), and the statistical uncertainty of the data sample to which the likelihood matrix method is applied. There is generally good agreement between the data and the total background estimate, except that the background is overestimated at low ΔR(ℓ1, ℓ2) and low ETmiss in Figure 11. One contribution to that difference is that the two leading (two highest-pT) lepton candidates are likely to be real, yet when they are near each other they have a lower efficiency for satisfying the isolation criteria, and thus are misinterpreted as fake/non-prompt leptons by the matrix method. Analyses that are sensitive to such issues may benefit from imposing a minimum ΔR requirement between leptons. A discrepancy is also observed in the higher pT bins of Figure 11(d). Events in these bins have three leptons with pT above 50 GeV. As shown in Figures 9 and 10, only a single pT bin above 50 GeV is available for measuring εr and εf, due to the limited number of events in the control regions, so variations above 50 GeV may be missed. This point was not investigated thoroughly since only a small fraction of events in the analysis were impacted.

Figure 11: (a) Angular separation between the leading and second-leading (in pT) lepton candidates, ΔR(ℓ1, ℓ2), (b) angular separation between the second- and third-leading lepton candidates, ΔR(ℓ2, ℓ3), (c) missing transverse momentum in the event, ETmiss, and (d) the pT of the third-leading lepton in VR-3ℓ-1L3.
The processes with three real leptons are modelled with MC simulation, while the contribution from fake/non-prompt leptons (dark red) comes from the likelihood matrix method as described above. The hatched band shows the uncertainty from the MC statistics and the fake/non-prompt background estimate. The rightmost bins are inclusive and contain all events above the x-axis ranges. The lower panel shows the ratio of data to the total SM prediction (sum of the real-lepton background contributions estimated with MC samples and the fake/non-prompt lepton contribution estimated with the matrix method).

Figure 12: Comparisons of the predicted and observed yields in VR-3ℓ-1L3-noZ, with respect to the pT of the (a) leading and (b) subleading lepton candidates, (c) the scalar sum of the lepton and jet transverse momenta, HT, and (d) the missing transverse momentum in the event, ETmiss. The processes with three real leptons are modelled with MC simulation, while the contribution from fake/non-prompt leptons (dark red) comes from the likelihood matrix method as described above. The hatched band shows the uncertainty from MC statistics and the fake/non-prompt background estimate. The rightmost (leftmost) bins are inclusive and contain all events above (below) the x-axis ranges. The lower panel shows the ratio of data to the total SM prediction (sum of the real-lepton background contributions estimated with MC samples and the fake/non-prompt lepton contribution estimated with the matrix method). Further details are available in Ref. [42].

Model-independent search for new phenomena in multi-lepton final states
Many models of BSM physics predict final states with three or more leptons. The general multi-lepton search for new phenomena [44] agnostically considers such final states. Its aim is to be sensitive to BSM phenomena in often-overlooked corners of phase space. The background estimation for the multi-lepton search uses MC predictions to account for events that contain only real leptons, and the fake-factor method for events containing at least one fake/non-prompt lepton. The dominant sources of fake/non-prompt leptons are semileptonic heavy-flavour decays (primarily of b-hadrons), light-hadron decays, and misidentification of light hadrons as leptons. These mainly arise in Z + jets and tt̄ events.

Fake/non-prompt lepton selection
The fake factors are measured using events with a single lepton candidate. In order to prevent a bias in the fake factor due to trigger selection criteria [14,15], the selected events are required to have fired a loose single-lepton trigger in which isolation requirements are not imposed. However, due to the high rate of events that pass such triggers, a prescale factor is applied, which reduces the number of events available for measuring the fake factors. Baseline lepton candidates must pass a common object selection, as detailed in Ref. [44]. Electron candidates are required to pass either the 'Loose' ID WP with calorimeter- and track-based isolation requirements, or the 'Tight' ID WP with no isolation requirement. Baseline muon candidates are required to pass the 'Medium' ID WP ('HighPt' for pT > 300 GeV). There are statistical uncertainties due to the data and MC sample sizes in each fake-factor bin, which due to their small size are summed in quadrature into a single uncertainty.
In the regions enriched in fake/non-prompt leptons, MC predictions are used to subtract the real-lepton contribution from the data. The uncertainties in the MC contributions are propagated to the fake factors. The main contributions from real leptons in the single-lepton regions are from W + jets, Z + jets and tt̄ events. For the W/Z + jets processes, an uncertainty of 5% is applied to the cross-section. For the tt̄ process, the assumed uncertainties are +2.4%/−3.3% (μr, μf scale variations) and ±4.2% (PDF).
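Schematically, the fake-factor extraction and application in one measurement bin look as follows (all yields hypothetical; `fake_factor` is an invented helper, not an ATLAS tool):

```python
def fake_factor(n_tight_data, n_lnt_data, n_tight_mc, n_lnt_mc):
    """Ratio of tight to loose-not-tight lepton yields in the
    single-lepton sample, after subtracting the real-lepton
    contributions predicted by MC simulation."""
    return (n_tight_data - n_tight_mc) / (n_lnt_data - n_lnt_mc)

# e.g. 500 tight and 4000 loose-not-tight leptons in data, with MC
# predicting 380 and 400 of them, respectively, to be real:
F = fake_factor(500, 4000, 380, 400)

# The fake/non-prompt estimate in an analysis region is then F times
# its loose-not-tight yield (here 90 events):
fake_estimate = F * 90
```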
Extrapolating the fake factors from the single-lepton sample to the multi-lepton samples used in the analysis introduces an uncertainty, because these samples differ in the kinematic distributions of fake/non-prompt leptons, and possibly also in the fake/non-prompt lepton composition. Two uncertainties are included to address the bias caused by imposing an ETmiss upper bound in the fake/non-prompt lepton estimation sample, and by imposing a pT requirement on the tag jet in the fake/non-prompt muon estimation sample. These uncertainties are estimated by varying the requirements on these variables upwards and downwards by 10 GeV. Plots showing the impact of these systematic effects on the fake factors are given in Figures 13 and 14.
Finally, a direct assessment of the uncertainty in the composition of the fake/non-prompt lepton background in the multi-lepton sample is made. Since fake/non-prompt leptons can come from both light- and heavy-flavour sources, it is possible that the relative abundances from these sources vary between samples. This possibility is addressed through an additional uncertainty which leverages the different event signatures produced by light- and heavy-flavour hadrons, the latter consisting primarily of b-hadrons. To evaluate this uncertainty, an alternative set of fake factors is computed which, in addition to the binning in

Validation regions
Validation regions are defined using appropriate sub-selections of eeμ and μμe events. These are used to check that the computed fake factors extrapolate correctly from the regions where they are calculated to the regions in which they are applied. The 'on-Z' validation region requires an SFOC lepton pair with a dilepton mass within 10 GeV of the Z boson mass. The 'off-Z' validation region also requires an SFOC pair of leptons, but requires the dilepton mass to fall outside of the Z-mass window. Only mixed-flavour final states are selected for these validation regions, so that the choice of which lepton to treat as the third one, assumed to be the fake/non-prompt lepton, is unambiguous. The sources of fake/non-prompt leptons contribute in different ratios to the on-Z and off-Z validation regions: the on-Z validation region is more sensitive to Z + jets events than the off-Z region, while the inverse is true for tt̄ events, although in absolute terms, Z + jets events are more numerous than tt̄ events in both cases. Both validation regions target, through a transverse-mass requirement of mT(ℓ, ETmiss) < 40 GeV, a third lepton that is likely to be fake/non-prompt. The union of the on-Z and off-Z validation regions is called the 'fakes validation region'.
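The transverse-mass selection quoted above uses the standard definition mT = sqrt(2 pT ETmiss (1 − cos Δφ)); a minimal sketch (function names invented for illustration):

```python
import math

def transverse_mass(lep_pt, met, dphi):
    """mT(lepton, ETmiss) with dphi the azimuthal angle between the
    lepton and the missing transverse momentum (momenta in GeV)."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))

def targets_fake_lepton(lep_pt, met, dphi, cut=40.0):
    """True if the third lepton fails to look like a leptonic W decay,
    i.e. mT below the cut, favouring fake/non-prompt leptons."""
    return transverse_mass(lep_pt, met, dphi) < cut
```

A real lepton from a leptonic W decay tends to produce mT values peaking near the W mass, well above the 40 GeV requirement, which is why the cut enriches the third lepton in fakes.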
The variables of primary importance for this analysis are the invariant mass of all lepton candidates in the event (m_inv) and ETmiss. The signal regions are separated according to the values of these quantities, as discussed in Ref. [44]. The m_inv distributions in the two validation regions are shown in Figure 16, while the ETmiss distributions are shown in Figure 17. Lastly, the electron and muon candidate pT distributions in the fakes validation region are shown in Figure 18. The comparisons in Figure 16 were also presented in Ref. [44] (albeit on a logarithmic scale) and thus include fitted normalisation factors for the W + jets and Z + jets backgrounds from their respective control regions ('post-fit'). In contrast, the complementary distributions shown in Figures 17 and 18 were obtained independently of this statistical analysis, and thus employ the unconstrained SM background normalisations and uncertainties ('pre-fit' distributions). The background estimate is consistent with the data within the statistical and systematic uncertainties.

Conclusions
For physics analyses exploring signatures with one or more prompt leptons, background contributions due to fake/non-prompt leptons are often difficult to estimate in simulation. Therefore, data-driven methods are commonly used. Three related methods have been adopted by the ATLAS Collaboration as recommended tools: the asymptotic matrix method, the likelihood matrix method, and the fake-factor method. All three approaches depend on defining two categories of leptons, one of which ('tight') is subject to the same identification and selection criteria as are used in the analysis. The other category ('loose') comprises additional lepton candidates with less stringent selection requirements. The union of the two sets is called the 'baseline' sample. The criteria are typically defined such that the probability for a real baseline lepton to satisfy the tight criteria is substantially higher than the corresponding probability for a fake/non-prompt lepton. The relative numbers of loose and tight leptons in the analysis sample can then be used to estimate the contribution of fake/non-prompt leptons, either inclusively or differentially in any variables of interest.
Despite their similarities, the methods each have their own strengths and drawbacks. The asymptotic matrix method and the fake-factor method provide a fake/non-prompt lepton weight for each event, which is convenient for analyses. However, these methods are subject to large uncertainties if the efficiency for loose fake/non-prompt leptons to satisfy the tight criteria is large in parts of the analysis phase space. The likelihood matrix method returns a smaller uncertainty in such cases and avoids any possibility of producing a negative estimate for the event yield, but does not provide a per-event weight, introducing difficulties for, e.g., differential estimations. The fake-factor method uses simulation rather than data to incorporate the contribution from events where all leptons are tight. This method can therefore be employed while the signal region for an analysis is fully blinded, although it may introduce additional simulation-related uncertainties into the background estimate.
The systematic uncertainties for all three methods arise from similar sources, with the largest contributions related to the extrapolation of the efficiencies measured in the control samples to events in the analysis sample. Differences in the fake/non-prompt lepton composition in the samples must be accounted for and appropriate uncertainties must be assigned to this extrapolation.
The performance of the likelihood matrix method and the fake-factor method has been demonstrated in a differential tt̄Z cross-section measurement and in a model-independent search for BSM phenomena in multi-lepton final states, respectively. In both cases, the chosen method was shown to provide a reliable estimate of the fake/non-prompt lepton background, as measured in validation regions.