Performance of photon reconstruction and identification with the CMS detector in proton-proton collisions at √s = 8 TeV

A description is provided of the performance of the CMS detector for photon reconstruction and identification in proton-proton collisions at a centre-of-mass energy of 8 TeV at the CERN LHC. Details are given on the reconstruction of photons from energy deposits in the electromagnetic calorimeter (ECAL) and the extraction of photon energy estimates. The reconstruction of electron tracks from photons that convert to electrons in the CMS tracker is also described, as is the optimization of the photon energy reconstruction and its accurate modelling in simulation, in the analysis of the Higgs boson decay into two photons. In the barrel section of the ECAL, an energy resolution of about 1% is achieved for unconverted or late-converting photons from H→γγ decays. Different photon identification methods are discussed and their corresponding selection efficiencies in data are compared with those found in simulated events.


Introduction
This paper describes the reconstruction and identification of photons with the CMS detector [1] in data taken in proton-proton collisions at √s = 8 TeV during the 2012 CERN LHC running period. Particular emphasis is put on the use of photons in the observation and measurement of the diphoton decay of the Higgs boson [2]. For this decay mode, the energy resolution has significant impact on the sensitivity of the search and on the precision of measurements made in the analysis. The uncertainties related to the photon energy scale are the dominant contributions to the systematic uncertainty in the Higgs boson mass, m H = 124.70 ± 0.31 (stat) ± 0.15 (syst) GeV, measured in ref. [2]. The procedure employed to optimize the photon energy estimation and its accurate modelling in the simulation is described. This procedure relies on the large sample of recorded Z boson decays to dielectrons, whose showers are reconstructed as photons, and on simulation to model differences in detector response to electrons and photons.
The reconstruction of photons from the measured energy deposits in the electromagnetic calorimeter (ECAL) [3] and the extraction of a photon energy estimate is described, as well as the association of the electron tracks to clusters in the ECAL for photons that convert in the tracker. A large fraction of the energy deposited in the detector by all proton-proton interactions arises from photons originating in the decay of neutral mesons, and these electromagnetic showers provide a substantial background to signal photons. The use and interest of photons as signals or signatures in measurements and searches is therefore mainly focussed on those with high transverse momentum where this background is less severe. Photon selection methods used for the H → γγ channel and other analyses are described, together with measurements of the selection efficiency. The efficiency measured in data is compared with that found in simulated events.
The paper starts with brief descriptions of the CMS detector (section 2), paying particular attention to geometrical details of the electromagnetic calorimeter that are important for shower reconstruction, and of the data and simulated event samples used (section 3). Section 4 describes photon reconstruction in CMS: clustering of the shower energy deposited in the ECAL crystals, correction of the cluster energy and fine tuning of the calibration, photon energy resolution, and uncertainties in the photon energy scale. Section 5 describes the reconstruction of the electron tracks resulting from photons that undergo conversion before reaching the ECAL. Section 6 discusses the separation of prompt photons from energy deposits originating from the decay of neutral mesons, describing two identification algorithms, and giving results on their performance. The main results are summarized in section 7.

CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the superconducting solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass/scintillator hadron calorimeter (HCAL), each one composed of a barrel and two endcap sections. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. A more detailed description of the CMS detector can be found in ref. [1].
The pseudorapidity coordinates, η, of detector elements are measured with respect to the coordinate system origin at the centre of the detector, whereas the pseudorapidity of reconstructed particles and jets is measured with respect to the interaction vertex from which they originate. The transverse energy, denoted by E T , is defined as the product of energy and sin θ , with θ being measured with respect to the origin of the coordinate system.
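As a minimal illustration of these definitions, the sketch below computes E T = E sin θ from an energy and a pseudorapidity. It assumes the standard relation η = −ln tan(θ/2), which is not written out in the text, and the function names are illustrative only.

```python
import math

def theta_from_eta(eta: float) -> float:
    """Polar angle in radians from pseudorapidity, using eta = -ln(tan(theta/2))."""
    return 2.0 * math.atan(math.exp(-eta))

def transverse_energy(energy: float, eta: float) -> float:
    """E_T = E * sin(theta), with theta taken from the detector pseudorapidity."""
    return energy * math.sin(theta_from_eta(eta))
```

With this convention a shower at η = 0 has E T equal to its energy, and E T falls as 1/cosh η away from the centre of the detector.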
Charged-particle trajectories are measured by the silicon pixel and strip tracker, with full azimuthal coverage within |η| < 2.5. Consisting of 1 440 silicon pixel detector modules and 15 148 silicon strip detector modules, totalling about 10 million silicon strips and 60 million pixels, the silicon tracker provides an impact parameter resolution of ≈15 µm and a transverse momentum, p T , resolution of about 1.5% for charged particles with p T = 100 GeV [4].
The total amount of material between the interaction point and the ECAL, in terms of radiation lengths ( X 0 ), rises from 0.4 X 0 close to η = 0 to almost 2 X 0 near |η| = 1.4, before falling to about 1.3 X 0 around |η| = 2.5. The probability of photon conversion before reaching the ECAL is thus large and, since the resulting electrons (e + e − pairs) emit bremsstrahlung in the material, the electromagnetic shower of some photons starts to develop in the tracker. The electrons are deflected by the 3.8 T magnetic field, resulting in multiple electromagnetic showers in the ECAL.
The ECAL is a homogeneous and hermetic calorimeter made of lead tungstate, PbWO 4 , scintillating crystals. The high density (8.28 g cm −3 ), short radiation length (8.9 mm), and small Molière radius (23 mm) of the PbWO 4 crystals enabled the construction of a compact calorimeter with fine lateral granularity. The central barrel covers |η| < 1.48 with the inner surface located at a radius of 1290 mm. The endcaps cover 1.48 < |η| < 3.00 and are located at |z| > 3154 mm. A preshower detector consisting of two planes of silicon sensors interleaved with a total of 3 X 0 of lead is located in front of the endcaps and covers 1.65 < |η| < 2.60.
The ECAL barrel is made of 61 200 trapezoidal crystals with front face transverse sections of about 22×22 mm 2 , giving a granularity of 0.0174 in η and φ . The crystals have a length of 230 mm (25.8 X 0 ). Each half-barrel is formed by 18 barrel supermodules each covering 20° in φ and containing 85 × 20 = 1700 crystals. The crystals of a half-barrel may be viewed as positioned in a regular rectangular grid in (η, φ ) space (which wraps round on itself in φ ), and indexed by 85 × 360 integer pairs. The supermodules are composed of four modules. Within the modules there are submodules each containing two rows of five crystals. The void between adjacent crystals within the same submodule is 350 µm wide. The void between adjacent crystals in adjacent submodules is 550 µm wide. The voids between adjacent crystals separated by module and supermodule boundaries are about 6 mm wide. The module boundaries occur at |η| = 0, 0.435, 0.783, and 1.131, and the supermodule boundaries occur every 20° in φ . The geometry is quasi-projective, with almost all the crystal axes tilted by an angle of 3° with respect to the line from the coordinate origin in both the η and φ directions. Only the void at η = 0 points to the origin; the 3° tilt relative to the η direction is introduced progressively for the first five rings of crystals away from this boundary.
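The barrel crystal counts and granularity quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the text; the helper `phi_index_distance` is a hypothetical illustration of how the wrap-around in φ might be handled on the 85 × 360 index grid.

```python
import math

CRYSTALS_PER_SUPERMODULE = 85 * 20      # 85 in eta x 20 in phi
SUPERMODULES_PER_HALF_BARREL = 18
HALF_BARRELS = 2

# Total number of barrel crystals (expected: 61 200).
total = CRYSTALS_PER_SUPERMODULE * SUPERMODULES_PER_HALF_BARREL * HALF_BARRELS

# Number of crystals around the full ring in phi and the resulting granularity.
phi_crystals = SUPERMODULES_PER_HALF_BARREL * 20
phi_granularity = 2.0 * math.pi / phi_crystals

# Angular coverage of one supermodule in degrees.
supermodule_phi_deg = 360.0 / SUPERMODULES_PER_HALF_BARREL

def phi_index_distance(iphi_a: int, iphi_b: int, n_phi: int = 360) -> int:
    """Shortest separation, in crystals, between two phi indices on the wrapped grid."""
    d = abs(iphi_a - iphi_b) % n_phi
    return min(d, n_phi - d)
```

The computed granularity, 2π/360 ≈ 0.01745, agrees with the quoted value of 0.0174.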
The ECAL endcaps are made of 14 648 trapezoidal crystals (7324 each) with a front face transverse section of 28.6 × 28.6 mm 2 , and a length of 220 mm (24.7 X 0 ). The crystals are grouped in 5 × 5 crystal structural units, with the crystals in adjacent units being separated by a void of 2 mm. The voids between adjacent crystals within the 5 × 5 units are 350 µm wide. Each endcap is constructed as two half-disks. The crystals are installed within a quasi-projective geometry pointing 1300 mm beyond the centre of the detector, giving tilts of 2° to 8° relative to the direction of the coordinate origin.

Data and simulated event samples
The results presented here use data corresponding to an integrated luminosity of 19.7 fb −1 taken at a centre-of-mass energy of 8 TeV.
The Monte Carlo (MC) simulation of the response of the CMS detector employs a detailed description of it, and uses GEANT4 version 9.4 (patch 03) [5]. The simulated events include the presence of multiple pp interactions taking place in each bunch crossing weighted to reproduce the distribution of the number of such interactions in data. The presence of signals from multiple pp interactions in each recorded event is known as pileup. Interactions taking place in a preceding or a following bunch crossing, i.e. within a window of ±50 ns around the triggering bunch crossing,

Calibration of individual ECAL channels
The calorimeter signals in data must be calibrated and corrected for several detector effects [17]. The crystal transparency is continuously monitored during data taking by measuring the response to light from a laser system, and the observed changes are corrected for when the events are reconstructed. The relative calibration of the individual channels is achieved using the φ -symmetry of the energy deposited by pileup and the underlying event, the invariant mass measured in two photon decays of π 0 and η mesons, and the momentum measured by the tracker for isolated electrons from W and Z boson decays.

Clustering
Clustering of ECAL shower energy is performed on intercalibrated, reconstructed signal amplitudes. The clustering algorithms collect the energy from radiating electrons and converted photons that gets spread in the φ direction by the magnetic field. These algorithms are described in detail in ref. [18], and evolved from fixed matrices of 5 × 5 crystals, which provide the best reconstruction of unconverted photons, by allowing extension of the energy collection in the φ direction, to form "superclusters". Clusters are built starting from a "seed crystal": one containing a signal corresponding to a transverse energy greater than those of all its immediate neighbours and above a predefined threshold. In the barrel, where the crystals are arranged in an (η, φ ) grid, the clusters have a fixed width of five crystals centred on the seed crystal, in the η direction. In the φ direction, adjacent strips of five crystals are added if their summed energy is above another predefined threshold. Further clusters, aligned in η, may be seeded and added to the original, "seed", cluster if they lie within an extended φ window (seed crystal ±17 crystals), under the control of a further predefined threshold. Clustering in the endcaps uses fixed matrices of 5 × 5 crystals. After a seed cluster has been defined, further 5 × 5 matrices are added if their centroid lies within a small η window and within a φ distance roughly equivalent to the 17 crystals span used in the barrel. The 5 × 5 matrices are allowed to partially overlap one another. For unconverted photons, the superclusters resulting from both the barrel and endcap algorithms are usually simply 5 × 5 matrices.

Figure 1. Distributions of the R 9 variable for photons in the ECAL barrel that convert in the material of the tracker before a radius of 85 cm (solid filled histogram), and those that convert later, or do not convert at all before reaching the ECAL (outlined histogram).
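The barrel clustering described in this section can be sketched, in a much simplified form, as follows. This is not the algorithm of ref. [18]: it treats the barrel as a plain (η, φ) array, uses hypothetical thresholds supplied by the caller, stops the φ extension at the first strip below threshold, and omits the step in which further clusters inside the ±17 crystal window are seeded and added.

```python
import numpy as np

def find_seed(et, seed_threshold):
    """Return (ieta, iphi) of the highest-E_T crystal that is above threshold
    and exceeds all of its immediate neighbours, or None if there is none."""
    n_eta, n_phi = et.shape
    best = None
    for i in range(n_eta):
        for j in range(n_phi):
            if et[i, j] <= seed_threshold:
                continue
            neighbours = [et[i + di, (j + dj) % n_phi]   # phi wraps round
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0) and 0 <= i + di < n_eta]
            if all(et[i, j] > n for n in neighbours):
                if best is None or et[i, j] > et[best]:
                    best = (i, j)
    return best

def barrel_supercluster_energy(energy, seed, strip_threshold, phi_window=17):
    """Sum five-crystal eta strips centred on the seed, extending in phi on
    both sides while each strip's summed energy stays above strip_threshold."""
    n_eta, n_phi = energy.shape
    i0, j0 = seed
    lo, hi = max(0, i0 - 2), min(n_eta, i0 + 3)

    def strip(j):
        return float(energy[lo:hi, j % n_phi].sum())

    total = strip(j0)
    for direction in (+1, -1):
        for step in range(1, phi_window + 1):
            e = strip(j0 + direction * step)
            if e < strip_threshold:
                break   # simplification: stop at the first quiet strip
            total += e
    return total
```

In this sketch a narrow, unconverted shower yields a cluster only a few strips wide, while energy spread in φ by bremsstrahlung or conversion extends the collection up to the window limit.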

The R 9 variable is defined as the energy sum of the 3 × 3 crystals centred on the most energetic crystal in the supercluster divided by the energy of the supercluster. The showers of photons that convert before reaching the calorimeter have wider transverse profiles and lower values of R 9 than those of unconverted photons. Figure 1 shows the R 9 distribution for photons in the ECAL barrel that convert in the material of the tracker before a radius of 85 cm, and those that convert later, or do not convert at all before reaching the ECAL. The events are simulated Higgs boson diphoton decays, H → γγ, and the photons are required to satisfy p T > 25 GeV. Both histograms are normalized to unity. Despite being an imperfect indicator of whether a photon converts before reaching the ECAL, R 9 is strongly correlated with the photon energy resolution degradation due to the spreading of showers initiated in the tracker, induced by the magnetic field. Based on such information, the simplest energy estimation for photons is made by summing the energy in the supercluster for barrel (endcap) photons with R 9 < 0.94 (R 9 < 0.95), and summing the energy in a 5 × 5 crystal matrix for the remaining "unconverted" photons. Signals recorded in the preshower detector are included in the region |η| > 1.65.
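The R 9 definition and the simple energy assignment above can be written out as a short sketch. It assumes crystal energies stored on a regular grid that wraps in the second index (appropriate for the barrel only), takes the supercluster energy as an input, and ignores the preshower contribution; the function names are hypothetical.

```python
import numpy as np

def matrix_sum(energy, centre, half_width):
    """Energy in the (2*half_width+1)^2 crystal matrix centred on `centre`."""
    n_eta, n_phi = energy.shape
    i0, j0 = centre
    rows = list(range(max(0, i0 - half_width), min(n_eta, i0 + half_width + 1)))
    cols = [(j0 + dj) % n_phi for dj in range(-half_width, half_width + 1)]
    return float(energy[np.ix_(rows, cols)].sum())

def simple_photon_energy(energy, seed, supercluster_energy, in_barrel=True):
    """Return (energy estimate, R9) using the thresholds quoted in the text:
    supercluster energy below R9 = 0.94 (barrel) or 0.95 (endcap),
    5x5 matrix energy otherwise."""
    r9 = matrix_sum(energy, seed, 1) / supercluster_energy
    threshold = 0.94 if in_barrel else 0.95
    if r9 < threshold:
        return supercluster_energy, r9
    return matrix_sum(energy, seed, 2), r9
```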

Correction of cluster energy
Significant improvements in energy resolution are obtained by correcting the initial sum of energy deposits forming the supercluster for the variation of shower containment in the clustered crystals and for the shower losses of photons that convert before reaching the calorimeter. The main mechanisms resulting in systematic variation of the fraction of the initial energy contained in the clustered crystals, ranked in approximate order of increasing severity, are (i) variation of longitudinal depth at which the shower passes through the off-pointing intercrystal voids (causing variation of longitudinal containment), (ii) variation of shower location with respect to the lateral granularity (causing variation of lateral containment), (iii) variation in the amount of energy absorbed before reaching the ECAL for showers starting before the ECAL, (iv) variation in the extent to which the energy of showers starting before the ECAL is clustered, and (v) if the shower passes through an intermodule void, the variation of longitudinal depth at which the shower passes through it.
The direction of a shower crossing any of the voids between adjacent crystals (detailed in section 2) makes an angle of about 3° relative to the crystal sides. The result is a loss of crystal depth seen by the shower. For a 350 µm void the loss of depth is small: 0.35 mm / sin 3° ≈ 6.7 mm (about 0.75 X 0 ). For the 6 mm intermodule voids the loss of depth is equal to about half a crystal length. The effect of such a reduction of calorimeter thickness depends on the shower development at the depth at which the void is crossed.
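The depth losses quoted above follow from simple geometry, and can be reproduced with the short calculation below. The constants are the PbWO 4 radiation length and barrel crystal length given in section 2; the function name is illustrative.

```python
import math

RADIATION_LENGTH_MM = 8.9    # PbWO4 radiation length
CRYSTAL_LENGTH_MM = 230.0    # barrel crystal length

def depth_loss_mm(void_width_mm: float, tilt_deg: float = 3.0) -> float:
    """Crystal depth missed by a shower crossing a void at the given tilt angle."""
    return void_width_mm / math.sin(math.radians(tilt_deg))

intra_submodule = depth_loss_mm(0.35)   # about 6.7 mm, i.e. about 0.75 X0
intermodule = depth_loss_mm(6.0)        # about 115 mm, i.e. about half a crystal
```

The same function gives roughly 10.5 mm for the 550 µm voids between adjacent submodules.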
Corrections as a function of η, E T , R 9 , and the lateral extension of the cluster in φ , have been obtained from the observed losses in simulated events, and used in many data analyses [19][20][21][22][23][24]. Corrections have also been extracted from data, using photons from final state radiation in dimuon decays of Z bosons [19], although limits on precision start to be severe for p T > 30 GeV since the steeply falling p T spectrum of these photons limits the number available.
To obtain the best possible energy resolution for the H → γγ analysis [2] the energy measurement is obtained using a multivariate regression technique. The H → γγ analysis uses events containing pairs of photons with an invariant mass in the range 100 < m γγ < 180 GeV, with the threshold on the lowest p T photon set at m γγ /4. This corresponds to p T > 25 GeV for all photons used in the analysis, and p T ≳ 30 GeV for photons used in the estimation of the mass of the Higgs boson at 125 GeV. The photon energy response is parameterized by a function with a Gaussian core and two power law tails, an extended form of the Crystal Ball function [25]. The regression provides an estimate of the parameters of the function for a single photon, and consequently a prediction of the probability distribution of the ratio of true energy to uncorrected energy. The corrected photon energy is taken from the most probable value of this distribution. The input variables are the η coordinate of the supercluster, the φ coordinate of barrel superclusters, and a collection of shower shape variables: R 9 of the supercluster, the energy weighted η-width and φ -width of the supercluster, and the ratio of the energy in the HCAL behind the supercluster and the energy of the supercluster. In the endcap, the ratio of preshower energy to raw supercluster energy is also included.
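To make the shape of the response function concrete, the sketch below evaluates a function with a Gaussian core and a power law tail on each side. It uses one common "double-sided Crystal Ball" parameterization, which is an assumption here since the exact form used in the analysis is not given in the text. The function is left unnormalized, and all parameter names are illustrative.

```python
import numpy as np

def double_sided_crystal_ball(x, mu, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalized Gaussian core with power law tails on both sides.

    The core covers -alpha_l <= t <= alpha_r with t = (x - mu) / sigma; each
    tail is matched to the core in value and slope at its transition point.
    """
    t = (np.atleast_1d(np.asarray(x, dtype=float)) - mu) / sigma
    out = np.exp(-0.5 * t**2)
    left = t < -alpha_l
    right = t > alpha_r
    out[left] = np.exp(-0.5 * alpha_l**2) * (
        alpha_l / n_l * (n_l / alpha_l - alpha_l - t[left])) ** (-n_l)
    out[right] = np.exp(-0.5 * alpha_r**2) * (
        alpha_r / n_r * (n_r / alpha_r - alpha_r + t[right])) ** (-n_r)
    return out
```

For positive `alpha_l` and `alpha_r` the most probable value of this function is `mu`, so in this sketch the corrected energy would be `mu` times the uncorrected supercluster energy.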
Additional information is included for the seed cluster of the supercluster: the relative energy and position of the seed cluster, the local covariance matrix of the magnitude of the crystal energy signals, and a number of energy ratios of crystal matrices of different sizes defined with respect to the position of the seed crystal. These variables provide information on the likelihood and location of a photon conversion and the degree of showering in the material between the interaction vertex and the calorimeter, and together with their correlation with the η and φ position of the supercluster, drive the magnitude of containment correction predicted by the regression. In the barrel, the η and φ indices of the seed crystal, as well as the position of the seed cluster with respect to the seed crystal, are also included. These variables, together with the seed cluster energy ratios, provide information on the amount of energy that is likely to be contained in the cluster, or lost in the intermodule voids, and drive the corrections for local containment predicted by the regression. Although the variations of local containment and the losses due to showering that starts in the tracker material are different effects, the corrections are allowed to be correlated in the regression to account for the fact that a showering photon is not incident on the ECAL at a single point, and is consequently less affected by variations of local containment.
Finally, the number of primary vertices and the median transverse energy density ρ [26] in the event are included in order to allow for the correction of residual systematic effects due to the average amount of pileup in the event.
The semiparametric regression is trained to predict the true energy of the photon, E true , given the uncorrected supercluster energy. The uncorrected energy, E raw , is taken as the sum of individual crystal energies in a supercluster. After training, the regression predicts the full probability density function (pdf) for the inverse response, E true /E raw , for each individual photon. In figure 2 the sum of predicted distributions for photons with p T > 25 GeV in simulated H → γγ events is compared to the observed distribution of E true /E raw . The agreement is excellent, although there are deviations, e.g. in the barrel at E true /E raw ≈ 1.2, that are larger than can be explained by the statistical uncertainties. At E true /E raw ≈ 1.2 the probability is down by more than two orders of magnitude from the peak, but the deviation nevertheless points to the existence of systematic effects in the event-by-event estimate of the tails of the energy response. The prediction of the pdf for the inverse response is used in the H → γγ analysis to estimate the mass resolution of individual diphoton systems, which assists in the classification of diphoton events, and is shown here for information. The energy of photon superclusters is taken to be the most probable value of the pdf, and the performance of this specific assignment, which is probed by the assessment of the resolution in section 4.5, is therefore independent of the details of the pdf.

Fine tuning of calibration and simulated resolution
In the H → γγ analysis the final calibration of the energy measurement in data and the modelling of the energy resolution in simulation were fine-tuned. Electron showers from rather pure samples (the background contribution is <0.1%) of Z bosons decaying to electrons were reconstructed as photons, using only the information in the ECAL and without using any information from the tracker. The dielectron invariant mass was then calculated using the vertex position obtained from the electron tracks, and its distribution compared to that obtained in simulated events.
The corrections required are small. They comprise a correction to the energy scale for the data, and a correction to the energy resolution of the MC simulation (achieved by adding a Gaussian distributed random contribution to the energy reconstructed in simulated events). Before the fine-tuning the data have already been corrected for variations of crystal transparency, and the individual crystals have been intercalibrated. The simulation of the showers in the ECAL includes these uncertainties. The increase of the energy-equivalent noise during the data-taking period is also simulated. The noise variation is due to a gradual increase of the leakage current in the silicon avalanche photodiodes used in the ECAL barrel region, and due to response loss in the endcap, with the amount of variation depending on η.
Three explanations have been suggested for the need for an additional smearing of the energy estimate in simulated events to achieve complete agreement with the data. The slightly worse energy resolution may be explained by (i) the presence of more tracker material in the detector, between the interaction point and the ECAL, than in the simulation, (ii) underestimation of the uncertainty in the individual crystal calibration, although it would be difficult to reconcile a significant underestimation with the fact that the individual crystal calibration uncertainties have been obtained by detailed comparisons among different methods of intercalibration, (iii) residual differences between the actual ECAL geometry and the one implemented in the simulation so that the energy correction estimates, obtained by multivariate regression from simulated events, are suboptimal for data.
Measurements (discussed in section 4.6) show that there is, indeed, more tracker material present in the detector than is simulated, and this results in worse energy resolution for photons that convert in the tracker, and an increase in their number. This fact, however, does not account for all the observed resolution discrepancies, which include the need to worsen the simulated resolution of showers for which the R 9 variable has a high value (corresponding to photons that convert late or not at all). The other two factors listed above represent further contributions in addition to that from mismodelling of tracker material, although their relative magnitude is not known [17]. While additional intercalibration errors would increase the constant term in the fractional energy resolution, the contributions of the other effects have an energy dependence. As described below, the applied smearing is allowed to have an energy-dependent component.
The supercluster energy scale is tuned and corrected by varying the scale in the data to match that observed in simulated events. Two procedures have been used to obtain these corrections: the "fit method" and the "smearing method". The fit method uses an analytic fit to the Z boson invariant mass peak, with a convolution of a Breit-Wigner distribution with a Crystal Ball function. Distributions obtained from data and from simulated events are fitted separately and the results are compared to extract a scale offset. The Breit-Wigner width is fixed to that of the Z boson: Γ Z = 2.495 GeV [27]. The parameters of the Crystal Ball function, which gives a reasonable description of the calorimeter resolution effects and of bremsstrahlung losses in front of the calorimeter, are left free in the fit. The smearing method uses the simulated Z-boson invariant mass shape as a probability density function in a maximum likelihood fit. All the known detector effects, reconstruction inefficiencies, and the Z-boson kinematics are taken into account in the simulation. The residual discrepancy between data and simulation is described by an energy smearing function. A Gaussian smearing applied to the simulated response has been found to be adequate to describe the data in all the categories of events examined. A larger number of electron shower categories can be handled by the smearing method as compared to the fit method.

JINST 10 P08010
The procedure implemented to fine-tune the energy scale has three steps for the barrel, and two steps for the endcap calorimeters. In each step, the parameters defining the scale and the width are both allowed to float in the fit, and corrections to the scale are extracted. Only in the final step, the third step for the barrel and the second step for the endcaps, are energy smearing corrections extracted for application to simulated events. The first step corrects for possible time dependencies during data taking by extracting, with the fit method, the scale correction to be applied to the data for each data-taking epoch (51 epochs defined by ranges of run numbers), and for each region in absolute pseudorapidity (4 bins, two in the barrel and two in the endcaps). This step was originally introduced to account for possible imperfections in the transparency corrections. However, the transparency corrections obtained from the laser monitoring system during 8 TeV data taking are of such quality that there is very little variation to correct. This can be seen from figure 3, which shows the ratio of the energy measured by the ECAL over the momentum measured by the tracker, E/p, for electrons selected from W → eν decays, as a function of the date at which they were recorded. The magnitudes of the energy scale corrections extracted in the first step of the fine-tuning procedure are thus small, generally < 0.1% in the barrel and < 0.2% in the endcaps.
The second step derives corrections for effects mainly related to the material in front of the calorimeter, and uses the smearing method. Showers are classified in two R 9 bins in each of two barrel and two endcap pseudorapidity regions, yielding eight shower categories. Combining different pairs of shower categories, 36 Z → e + e − invariant mass distributions are constructed for both data and simulated events. The shower energies in simulated events are modified by applying a Gaussian multiplicative random factor with a mean value 1 + ∆P and a standard deviation ∆σ . The method maximizes the likelihood of the fit between the invariant mass distributions as a function of the 16 parameters (∆P and ∆σ for each shower category), for the full Z → e + e − data sample, including events where the two showers are in different categories. The energy scale discrepancies found in this step are shown in table 1 together with their uncertainties. The corrections that must be applied to the data are the reciprocals of these values.
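The category counting and the Gaussian multiplicative smearing described above can be illustrated as follows. This is a sketch of the bookkeeping only, not of the likelihood fit: the category labels are hypothetical, and the mass helper assumes the massless relation m² = 2 E 1 E 2 (1 − cos θ) at fixed opening angle, so that the mass scales with the square root of the product of the two energy factors.

```python
from itertools import combinations_with_replacement
import numpy as np

ETA_REGIONS = ("barrel_low_eta", "barrel_high_eta",
               "endcap_low_eta", "endcap_high_eta")   # hypothetical labels
R9_BINS = ("low_r9", "high_r9")

categories = [(eta, r9) for eta in ETA_REGIONS for r9 in R9_BINS]   # 8 categories
pairs = list(combinations_with_replacement(categories, 2))          # 36 mass distributions
n_parameters = 2 * len(categories)                                  # delta_p and delta_sigma each

def smear_energies(energies, delta_p, delta_sigma, rng):
    """Multiply each energy by a Gaussian factor of mean 1 + delta_p and width delta_sigma."""
    energies = np.asarray(energies, dtype=float)
    factors = rng.normal(loc=1.0 + delta_p, scale=delta_sigma, size=energies.shape)
    return energies * factors

def rescaled_mass(mass, factor_1, factor_2):
    """Dielectron mass after scaling the two shower energies at fixed opening angle."""
    return mass * np.sqrt(factor_1 * factor_2)
```

Counting unordered pairs of the eight categories, including pairs in the same category, gives 8 × 9 / 2 = 36, which matches the number of invariant mass distributions quoted in the text.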
The large Z → e + e − data sample provides sufficient statistical precision for the third step to be performed in the barrel. This step introduces E T -dependent corrections to the energy scale using 20 bins defined by ranges in |η|, R 9 , and E T using the smearing method as in the second step. In this step the smearing procedure is iterated because the value of the corrections applied can change the E T bin into which a photon falls. Convergence is achieved after three iterations. The residual discrepancies measured in this final step are shown, as a function of E T , in figure 4, and their reciprocals are applied as corrections, with the value for the highest E T bin being used for photons with E T > 100 GeV. It can be seen from the figure that the largest corrections obtained in the third and final step are for photons with R 9 < 0.94 and |η| > 1.

Figure 3. Ratio of the energy measured by the ECAL over the momentum measured by the tracker, E/p, for electrons selected from W → eν decays, as a function of the date at which they were recorded. The ratio is shown both before (red points), and after (green points), the application of transparency corrections obtained from the laser monitoring system, and for both the barrel (upper plot) and the endcaps (lower plot). Histograms of the values of the measured points, together with their mean and RMS values are shown beside the main plots.

Table 1. Energy scale discrepancies, and associated statistical uncertainties, found in the second step of the fine-tuning procedure. The corrections that must be applied to the data are the reciprocals of these values.
The energy scale corrections finally applied to the data are the product of the corrections extracted in the steps described above. The smearing to be applied to the simulated energy resolution, extracted in the second step for the endcaps and in the third step for the barrel, is modelled by an amplitude and a mixing angle specifying the sharing of this amplitude between a constant term and a 1/ √ E term, providing thereby an extra degree of freedom to the energy resolution uncertainty. The uncertainties and correlations from the fit contribute to the systematic uncertainty in the energy resolution. In the endcaps, it is not possible to determine the sharing between a constant and energy dependent term, and therefore the smearing is taken to be constant, not varying with energy. The corrections to the resolution of the simulated photons range from ≈ 0.7 (1)% to 1 (2)% in the barrel for high (low) R 9 , respectively, and from 1.6 to 2.0% in the endcaps. In the barrel, the uncertainties in these values are about 10% of the values themselves. In the endcaps the uncertainties are about 15% for the two most relevant photon categories, and up to 50% for the categories which contribute few events to the H → γγ analysis. The uncertainties are assessed by (i) examining the variation of the R 9 distribution as a function of η and comparing it to what is observed for photons, (ii) changing the R 9 value used for categorization, (iii) using an energy estimate for the electron showers based on an electron-trained regression rather than the photon regression, (iv) changing the p T threshold of the sample used, and (v) changing the identification criteria used to select the electrons. The effect of these systematic uncertainties on the Higgs boson mass determination is <10 MeV, and they have little impact (< 1%) on the significance of the signal.
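One way to picture the amplitude and mixing angle description of the smearing is sketched below. The convention used (constant term = amplitude × sin of the angle, 1/√E coefficient = amplitude × cos of the angle, combined in quadrature) is an assumption made for illustration only; the text does not specify the parameterization, and the function names are hypothetical. The second helper simply multiplies the per-step scale corrections, as stated at the start of the paragraph above.

```python
import math

def smearing_width(energy_gev, amplitude, mixing_angle):
    """Fractional Gaussian smearing at a given energy (assumed convention).

    mixing_angle = pi/2 puts the whole amplitude in the constant term, as done
    for the endcaps; mixing_angle = 0 puts it all in the 1/sqrt(E) term.
    """
    constant = amplitude * math.sin(mixing_angle)
    stochastic = amplitude * math.cos(mixing_angle)   # in units of sqrt(GeV)
    return math.hypot(constant, stochastic / math.sqrt(energy_gev))

def total_scale_correction(step_corrections):
    """Overall energy scale correction as the product of the per-step corrections."""
    return math.prod(step_corrections)
```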
Figure 5 shows the electron pair invariant mass reconstructed in Z → e + e − events in the 8 TeV data and simulated events where the electrons are reconstructed as photons, and the full set of photon corrections and smearings is applied. The resulting distributions are shown separately for the case where both showers are in the barrel, and for the case where at least one of the showers is in an endcap. The distributions of simulated events are normalized to match the distributions in data. In the panels beneath the main plots, the ratio of the number of events in data to the number of simulated events in each bin is shown, together with a band obtained by propagating the uncertainties in the simulated energy resolution, and the energy scale in data, to the dielectron masses obtained. There is excellent agreement between the simulation and data in the cores of the distributions. A slight discrepancy is present in the low-mass tail in the endcaps, where the Gaussian smearing cannot account for some noticeable non-Gaussian effects. Since the electron showers are reconstructed as photons, the mass peaks do not appear at the true Z-boson mass, in either data or simulation. This is because the fraction of the original particle energy contained in a supercluster is, on average, a little smaller for electrons than for photons, and consequently the photon energy regression imperfectly estimates the energy of electron showers. With respect to the uncorrected distributions, the corrections to the data shift the peak by about −0.5 GeV for the case where both the showers are in the barrel, and by about −1 GeV if either of the showers is in an endcap. In addition, the distributions obtained from data are slightly narrower after the corrections. The distributions for the simulated events after the correction procedure are wider, because of the applied smearing.

Photon energy resolution
The single-photon energy resolution in Z → e + e − events where the electron showers are reconstructed as photons has been measured in both data and simulated events using a method similar to, but independent of, that used to obtain the corrections and smearings. The data and simulated event samples are the same as those used to obtain the corrections and smearings. The fitting methodology allows the resolution and energy scale for single showers to be extracted in fine bins of chosen variables, but with the limitation that the energy resolution for each bin is parameterized as a Gaussian distribution. Figure 6 shows the resolution measured in small bins of η, taken as the position of the shower in the ECAL, for showers with R 9 ≥ 0.94 and R 9 < 0.94, for data and simulated events. The vertical dashed lines show the barrel module boundaries, where the resolution is somewhat degraded, and the grey band at |η| ≈ 1.5 marks the barrel-endcap transition region excluded from the photon fiducial region used in the H → γγ analysis. The simulated resolution matches the resolution observed in data as a function of η very well. There is a small systematic difference in the endcap, particularly for the photons with R 9 < 0.94, with the simulated photons showing worse energy resolution than the photons in data. This is understood as being a result of the methodology used to determine the resolution, which focuses on the Gaussian core of the distribution. In this region, the Gaussian smearing added to the simulation in the fine-tuning step is larger than elsewhere, and the smearing truly required here would have a non-Gaussian tail.  Figure 6 demonstrates the very good agreement between simulation and data achieved for the resolution of electron showers reconstructed as photons. This is an important achievement, but it does not provide a measurement of the energy resolution of photons. 
Electron showers tend to have worse energy resolution than photon showers of the same energy since all electrons radiate to some extent in the material of the tracker, even those with high values of R 9 . Furthermore, the fitting technique used to obtain the resolution shown in figure 6 parameterizes the resolution as a Gaussian distribution and thus tends to be more sensitive to the core of the resolution function and less sensitive to its non-Gaussian tail. Additionally, it is of particular interest to examine the energy resolution achieved for photons resulting from the decay of Higgs bosons, which are on average more energetic than the electrons resulting from the decay of Z bosons.
Since there is excellent agreement between data and simulation for electron showers, the energy resolution of photons in simulated events provides an accurate estimate of their resolution in data. Figure 7 shows the distribution of reconstructed energy divided by the true energy, E meas /E true , of photons in simulated H → γγ events that pass the selection requirements given in ref. [2], in a narrow η range in the barrel, 0.2 < |η| < 0.3. The distribution for photons with R 9 ≥ 0.94 is shown on the left, and that for photons with R 9 < 0.94 is shown on the right. The width of the distribution is parameterized in two ways: by the half-width of the narrowest interval containing 68.3% of the distribution, σ eff , and by the full-width-at-half-maximum of the distribution divided by 2.35, σ HM . These parameters are both equal to the standard deviation in the case of a purely Gaussian distribution. Since σ HM measures the width of the Gaussian core of the distribution, the values are smaller, particularly where non-Gaussian tails make a larger contribution: for example, for R 9 < 0.94 and at the intermodule boundaries. Figure 8 shows the fractional energy resolution, parameterized as σ eff /E, as a function of η, in simulated H → γγ events that pass the analysis selection requirements. A bin size of 0.1 in η has been used, with adjustments to allow a small bin of width 0.03 centred on the barrel module boundaries where it can be seen that the resolution is locally degraded.
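The two width estimators defined above can be computed from a list of E meas /E true values as in the following sketch. The function names, the histogram-based determination of the full width at half maximum, and the toy Gaussian sample are illustrative assumptions, not the implementation used for figures 7 and 8.

```python
import math
from statistics import NormalDist


def sigma_eff(values, fraction=0.683):
    """Half-width of the narrowest interval containing `fraction` of the values."""
    xs = sorted(values)
    n = len(xs)
    k = math.ceil(fraction * n)  # number of entries the interval must contain
    if k < 2:
        raise ValueError("too few values")
    width = min(xs[i + k - 1] - xs[i] for i in range(n - k + 1))
    return 0.5 * width


def sigma_hm(values, bin_width):
    """Full width at half maximum of a histogram of the values, divided by 2.35."""
    lo = min(values)
    counts = {}
    for v in values:
        b = int((v - lo) // bin_width)
        counts[b] = counts.get(b, 0) + 1
    half_max = 0.5 * max(counts.values())
    above = [b for b, c in counts.items() if c >= half_max]
    # width spanned by the outermost bins at or above half maximum
    return (max(above) - min(above) + 1) * bin_width / 2.35


# toy sample: evenly spaced quantiles of a Gaussian with mean 1 and sigma 1%
gauss = NormalDist(1.0, 0.01)
toy = [gauss.inv_cdf((i + 0.5) / 20000) for i in range(20000)]
print(sigma_eff(toy), sigma_hm(toy, bin_width=0.001))  # both close to 0.01
```

For a distribution with a non-Gaussian tail, `sigma_eff` grows with the tail while `sigma_hm` remains close to the width of the core, which is the behaviour described in the text.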

Energy scale uncertainty
The photon energy scale has been checked with photons in Z → µ + µ − γ events. After a selection of events ensuring a pure and unbiased sample of photons, there is agreement between the measured photon energy and that predicted from the known Z-boson mass and measured muon momenta. The overall energy scale difference between data and simulation found with the Z → µ + µ − γ events (using the fine-tuning corrections, obtained as described in section 4.4) is 0.25% ± 0.11% (stat) ± 0.17% (syst). The study is made for photons with p T > 20 GeV, and the mean p T of the photons selected is 28 GeV. When binned in p T (so as to probe possible nonlinearities), and in R 9 and η (according to the known dependencies of the ECAL), the agreement of the measurements with the defined energy scale remains good, although the uncertainties in individual bins are, at best, between 0.2 and 0.3%. Thus this check does not provide a very strong constraint on the uncertainty in the Higgs boson mass arising from the uncertainty in the photon energy scale. An additional limitation is that the check is for a range of photon energies that has only a limited overlap with that used in the Higgs boson analysis. For these reasons the uncertainty in the Higgs boson mass arising from the uncertainty in the photon energy scale has been analysed as described below.
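The prediction of the photon energy from the Z-boson mass and the muon momenta follows from four-momentum conservation for a massless photon: m Z ² = m µµ ² + 2 E γ (E µµ − p µµ · n), where n is the unit vector along the photon direction. The sketch below solves this relation for E γ . The helper names and the use of a fixed nominal Z-boson mass (neglecting its natural width) are assumptions for illustration; the procedure used in the analysis may differ in detail.

```python
import math

M_Z = 91.1876    # GeV, nominal Z boson mass
M_MU = 0.105658  # GeV, muon mass


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) in GeV."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)


def invariant_mass(*vectors):
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def predicted_photon_energy(mu1, mu2, photon_eta, photon_phi, m_z=M_Z):
    """Photon energy that brings the three-body mass to m_z, given its direction."""
    e, px, py, pz = (mu1[i] + mu2[i] for i in range(4))
    m2_mumu = e * e - px * px - py * py - pz * pz
    # unit vector along the photon direction
    nx = math.cos(photon_phi) / math.cosh(photon_eta)
    ny = math.sin(photon_phi) / math.cosh(photon_eta)
    nz = math.tanh(photon_eta)
    return (m_z ** 2 - m2_mumu) / (2.0 * (e - (px * nx + py * ny + pz * nz)))
```

The ratio of the measured ECAL energy to this prediction, compared between data and simulation, then gives an estimate of the relative energy scale.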
There are three main sources of systematic uncertainty in the energy scale that is defined by the fine-tuning described in section 4.4. These uncertainties are the main contributions to the systematic uncertainty in the measured mass of the Higgs boson in the diphoton decay channel [2]. The largest uncertainties are due to the possible imperfect simulation of (i) differences in detector response to electrons and photons, and (ii) energy scale nonlinearity. Finally there is an uncertainty resulting from the procedure and methodology described in section 4.4. These uncertainties are discussed in detail in ref. [2] and summarized below together with additional results and information.
Since the energy scale has been obtained using electron showers reconstructed as photons, an important source of uncertainty in the photon energy scale is the imperfect modelling of the difference between electrons and photons by the simulation. The most important cause of the imperfect modelling is an inexact description of the material between the interaction point and the ECAL. Figure 9 shows the thickness of the tracker material in terms of radiation lengths, as inferred from data, relative to what is inferred from simulated events, as a function of |η|. The two methods used to infer the material thickness employ the energy loss of electrons in Z → e + e − events and the energy loss of low transverse momentum, 0.9 < p T < 1.1 GeV, charged-hadron tracks, where the momentum loss is computed from the change in the track curvature between the beginning and end of the track. The measurement using low-p T charged hadrons is difficult to implement in the regions of the tracker at large η, and no values are available beyond |η| = 2, but for |η| < 1.6 the two methods give results that are in good agreement. In addition, there is no charged-hadron measurement for the bin centred at |η| = 0.95 where the transition between the tracker barrel and endcap results in few tracks with the number of hits required to make a good measurement.
The difference between data and simulation in the material thickness of the tracker is almost certainly due to mismodelling of specific structures and localized regions. This hypothesis is supported by studies of the location of low-p T (down to p T ≈ 1 GeV) photon conversion vertices, as shown in ref. [28]. The results shown in figure 9, however, assume a simple scaling of the overall thickness. The effect of changes in the amount of tracker material on the relative difference between the electron and photon energy scales has been studied with events simulated using tracker models where the amount of material is increased uniformly by 10, 20, and 30%. Mismodelling of localized structures may affect the measurements used to infer thickness in figure 9 somewhat differently from the way it affects the relative difference between the electron and photon energies. Therefore it is necessary to be rather conservative in the assignment of a systematic uncertainty. It is assumed that the effects on the energy scale are covered by a 10% uniform deficit of simulated material in the region |η| < 1.0 and a 20% uniform deficit for |η| > 1.0. The resulting uncertainty in the photon energy scale has been assessed using the simulated samples in which the tracker material is increased uniformly, and ranges from 0.03% in the central ECAL barrel up to 0.3% in the outer endcap.
Figure 9. Tracker material thickness (in terms of radiation lengths) inferred in the data, X data , relative to that inferred in simulated events, X MC , as a function of |η|, using electrons in Z → e + e − events (circles), and low-momentum charged hadrons (squares).

Since the longitudinal profiles of energy deposition of electrons and photons differ, a further difference in response between electrons and photons that would result from imperfect simulation is related to the modelling of the varying fraction of scintillation light reaching the photodetector as a function of the longitudinal depth in the crystal at which it was emitted. Ensuring adequate uniformity of light collection was a major accomplishment in the development of the crystal calorimeter and was achieved by depolishing one face of each barrel crystal. However, an uncertainty in the achieved degree of uniformity remains and, in addition, the uniformity is modified by the radiation-induced loss of transparency of the crystals. The uncertainty results in a difference in the energy scales between electrons and unconverted photons that is not present in the standard simulation. The effect of the uncertainty, including the effect of radiation-induced transparency loss, has been studied.
A scaling as a function of depth, measured from the front face of the crystal, is applied to the deposited energy. In the standard simulation this scaling is uniformly equal to unity, i.e. flat, for all except the rearmost 10 cm of the crystal. To simulate nonuniformity of light collection, an appropriate slope is introduced based on laboratory light-collection efficiency measurements made on the crystals, and measurements of its dependence on crystal transparency. The slope of the light collection efficiency as a function of depth, at the time when the ECAL was constructed, is taken to be −0.14 ± 0.08%/X 0 [29,30], for the front half of the crystal ("front non-uniformity"). The change of this slope, ∆F, is parametrized as a function of the absorption coefficient induced by irradiation measured in m −1 , ∆µ, and is given by ∆F = 0.4% × ∆µ/X 0 [31]. Finally, the induced absorption coefficient is related to the light-yield (LY) loss measured by the laser monitoring system, ∆(LY/LY 0 ), through ∆µ = k ×∆(LY/LY 0 ), where k = 0.02%/m (i.e. taking the average value of the measurements reported in refs. [32] and [33]).

JINST 10 P08010
The uncertainty in the slope is taken as the difference between the flat response used in the standard simulation and the average slope measured at the time of ECAL construction plus the slope change resulting from the maximum radiation-induced light loss in the barrel. The resulting magnitude of the uncertainty in the photon energy scale in the barrel is 0.04% for photons with R 9 > 0.94 and 0.06% for those with R 9 < 0.94, but the signs of the energy shifts are opposite since unconverted photons penetrate deeper into the crystal than electrons, whereas converted photons share their energy between two electrons, whose showers thus penetrate the crystal less than a single electron shower. In the endcaps, the magnitude of the uncertainty in the photon energy scale is taken to be the same as in the barrel, and the effect of the longitudinal uniformity has not been studied in detail, firstly because the uncertainty in the energy scale due to other effects is larger there, and secondly because these studies were done in the context of the H → γγ analysis where uncertainties in the endcap energy scale had very little impact on the overall mass scale uncertainty. For the diphoton mass in the H → γγ analysis the two anticorrelated uncertainties result in an uncertainty of about 0.015% in the mass scale. The effect of the tracker material uncertainty on this value, where a changed tracker material budget would change the number of photons that convert in the tracker material, is negligible.
In assessing the systematic uncertainties for the H → γγ mass measurement, differences between MC simulation and data in the extrapolation from shower energies typical of electrons from Z → e + e − decays to those typical of photons from H → γγ decays were also investigated. The linearity of the energy response was studied in two ways: by examining the dependence of the energy-momentum ratio, E/p, of isolated electrons from Z and W boson decays as a function of E T , and by looking at the invariant mass of dielectrons from Z boson decays as a function of the scalar sum of the transverse energies of the two electron showers, H T . In both cases, the energy or transverse energy of the electrons and the invariant mass of the dielectron are those obtained when the ECAL showers are reconstructed as photons. The showers are required to satisfy E T > 25 GeV and the photon identification requirements of the H → γγ analysis (with the electron veto removed). The E/p distributions, obtained from simulated events for a number of bins in E T , and the dielectron invariant mass distributions, obtained for a number of bins in H T , were fitted to the corresponding distributions obtained from events in data. A scale factor was extracted from each fit, whose difference from unity measures the residual discrepancy of the energy response in data relative to that in simulated events. As a cross-check, an iterated truncated-mean method was used to estimate the E/p or dielectron invariant mass peak positions and gave consistent results.
The results are shown in figure 10 for both the E/p and the dielectron invariant mass analyses. The points coming from the analysis of the dielectron mass are plotted as a function of H T /2. The four panels show results for different η and R 9 categories, with the dielectron analysis restricted to events where both electron showers fall in the same category. The η categories correspond to the barrel and endcap regions. The horizontal error bars indicate the uncertainty in the mean E T or H T /2 for the bin, but for most bins that uncertainty is negligible and hidden behind the plotted central value marker. In the endcaps for low R 9 the point corresponding to E T = 95.4 GeV for the E/p analysis has a value of 1.0146 which does not fit in the plot scale, although the lower vertical error bar, extending down below 1, can be seen. The differential nonlinearity is estimated from a linear fit through the points (shown by the lines). The uncertainties in the fit parameters of a linear response model, shown by the bands, are extracted after scaling the uncertainties such that the χ 2 per degree of freedom of the fits is equal to unity. The stability of the result has been checked by removing the points of the dielectron mass analysis that have very small statistical uncertainties (i.e. where H T /2 is about half the Z-boson mass).
Figure 10. Residual discrepancy of the energy response in data relative to that in simulated events as a function of transverse energy (for the E/p analysis, squares) and of H T /2 (for the dielectron mass analysis, circles) in four η and R 9 categories. The dielectron analysis is restricted to events where both the electron showers fall in the same η, R 9 category. The uncertainties in the fit parameters of a linear response model are shown by bands; further details are given in the text.
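The linear fit with rescaled uncertainties can be sketched as follows. This is a generic weighted least-squares fit of y = a + b x in which all point uncertainties are multiplied by √(χ 2 /ndf) before the parameter uncertainties are quoted; the function names and the returned dictionary layout are illustrative assumptions and do not reproduce the fitting code used for figure 10.

```python
import math


def linear_fit(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x with per-point uncertainties."""
    w = [1.0 / s ** 2 for s in sigmas]
    s0 = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = s0 * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / det
    b = (s0 * sxy - sx * sy) / det
    chi2 = sum(wi * (y - a - b * x) ** 2 for wi, x, y in zip(w, xs, ys))
    return {"a": a, "b": b, "err_a": math.sqrt(sxx / det),
            "err_b": math.sqrt(s0 / det), "chi2": chi2, "ndf": len(xs) - 2}


def fit_with_unit_chi2(xs, ys, sigmas):
    """Refit after scaling all uncertainties so that chi2/ndf equals one."""
    first = linear_fit(xs, ys, sigmas)
    if first["ndf"] <= 0 or first["chi2"] == 0.0:
        raise ValueError("cannot rescale: no degrees of freedom or perfect fit")
    scale = math.sqrt(first["chi2"] / first["ndf"])
    rescaled = linear_fit(xs, ys, [s * scale for s in sigmas])
    rescaled["scale"] = scale
    return rescaled
```

A common scale factor leaves the fitted intercept and slope unchanged and multiplies their uncertainties by the same factor, so the bands widen when the scatter of the points exceeds their quoted uncertainties.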
A value of 0.1% was assigned to the uncertainty in the effect of differential nonlinearity for a diphoton mass around 125 GeV in all events except those in the class in which the diphoton transverse momentum is particularly high, so that the highest transverse momentum photon in the event typically has p T > 100 GeV. For this event class the uncertainty is set at 0.2%.
The digitization of the ECAL signals uses 12-bit analogue-to-digital converters (ADCs) and, to increase the dynamic range, three different preamplifiers with different gains are used for each crystal, each with its own ADC, and the largest unsaturated digitization is recorded together with two bits coding the ADC number [1]. The possibility that imperfect matching between the different "gain ranges" introduces an uncertainty in the energy of the measured photons was investigated. The effect of switching preamplifiers for digitizing large signals, E ≳ 200 GeV in the barrel and E T ≳ 80 GeV in the endcaps, was found to be negligible for photons from Higgs boson decays. The fraction of photons for which the lower-gain preamplifiers are used is small (< 2%) and the lower-gain preamplifiers appear to be very well calibrated to the high-gain preamplifiers.
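The selection of the largest unsaturated digitization among three gain ranges can be illustrated with a short sketch. The gain factors (12, 6 and 1), the gain identifiers, the neglect of pedestals, and the function names are assumptions made for illustration; only the 12-bit ADC range, the three gains and the two-bit gain code are taken from the text.

```python
ADC_MAX = 4095  # largest code of a 12-bit ADC
# (gain id, gain factor), ordered from highest to lowest gain; illustrative values
GAINS = ((1, 12.0), (2, 6.0), (3, 1.0))


def digitize(amplitude):
    """Return (gain_id, adc_counts) for the highest-gain unsaturated range.

    `amplitude` is the pulse height in units of lowest-gain ADC counts.
    """
    for gain_id, gain in GAINS:
        counts = int(round(amplitude * gain))
        if counts < ADC_MAX:
            return gain_id, counts
    return GAINS[-1][0], ADC_MAX  # every range is saturated


def reconstruct(gain_id, counts):
    """Convert a recorded sample back to lowest-gain ADC counts."""
    return counts / dict(GAINS)[gain_id]
```

Any mismatch in the relative calibration of the gain factors would appear as a step in the reconstructed amplitude at the switching points, which is the effect investigated in the text.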
A further small uncertainty arises from imperfect electromagnetic shower simulation. A simulation made with a shower description using the Seltzer-Berger model for the bremsstrahlung energy spectrum [34], which represents an improvement over GEANT4 version 9.4.p03, changes the energy scale for both electrons and photons. The much smaller changes in the difference between the electron and photon energy scales, although mostly consistent with zero, are interpreted as a limitation on our knowledge of the correct simulation of the showers, leading to a further uncertainty of 0.05% in the mass of the Higgs boson.
The statistical uncertainties in the measurements used to set the energy scale are small, but the methodology, which is described in section 4.4, has a number of systematic uncertainties related to the imperfect agreement between data and MC simulation. The uncertainties range from 0.05% for unconverted photons in the ECAL central barrel to 0.1% for converted photons in the ECAL outer endcaps.
Accounting for all the contributions, the uncertainty in the photon energy scale at p T ≈ m Z /2, where m Z is the Z boson mass, is about 0.1% in the central barrel, 0.15% in the outer barrel, and 0.3% in the endcaps. These uncertainties are largely correlated. The exact values, their correlations in two R 9 times four η bins, together with the contribution from the residual nonlinearity and from the uncertainties on the energy and mass resolution have been propagated to the signal model of the H → γγ analysis. Together with similar, and not entirely correlated, uncertainties in the 7 TeV data they contribute 0.14 GeV to the systematic uncertainty of 0.15 GeV in the Higgs boson mass measurement [2].

Conversion track reconstruction
Photons traversing the CMS tracker have a sizeable probability of converting into electron-positron pairs. Although converted photons are fully clustered in the ECAL as described in section 4, and identified with good approximation by the R 9 shower-shape variable, additional useful information is gained by reconstructing the associated e + e − track pairs. According to simulation, the fraction of photon conversions occurring before the last three layers of the tracker (reconstruction of conversion tracks requires at least three layers) is as high as about 60% in the pseudorapidity regions with the largest amount of tracker material in front of the ECAL (figure 11). Fully reconstructed conversions are used in the particle-flow reconstruction algorithm [35,36]: the association of electron-track pairs with energy deposits in the ECAL avoids their being misidentified as charged hadrons, thus improving the determination of the photon isolation, as discussed in section 6. The direction of the electron-track pair is also exploited in assisting the determination of the longitudinal coordinate of the interaction vertex in the H → γγ analysis [2]. The aim of this section is to describe the methods used to reconstruct electron-track pairs and show the level of agreement between data and simulation in a very pure sample of photons.
Conversion reconstruction uses the full CMS tracking power [4]. Track reconstruction is based on an iterative tracking procedure. The first iteration aims at finding tracks originating from the interaction vertex while subsequent iterations aim at finding tracks from displaced (secondary) vertices at increasing distance from the primary vertex. In addition, tracks starting from clusters in the ECAL and propagated inward into the tracker volume are sought, so as to reconstruct late-occurring conversions [37]. All tracks associated to the main electron reconstruction [18], as well as the subsample of the standard tracks which can be associated to energy deposits in the ECAL, are possible electron candidates and are refitted with the Gaussian sum filter method [38]. Tracks reconstructed as electrons are selected with basic quality requirements on the minimum number of hits and goodness of the track fit. Tracks are then required to have a positive charge-signed transverse impact parameter (the primary vertex lies outside the trajectory helix). Track pairs of opposite charge are then filtered to remove tracks that might have resulted from conversions in the beam pipe, or could possibly consist of electrons originating from the primary vertex. Additional requirements on the track pair are meant to specifically identify the photon conversion topology. Photon conversion candidates can be distinguished from massive meson decays, nuclear interactions or vertices from misreconstructed tracks by exploiting the fact that the momenta of the conversion electrons are approximately parallel since the photon is massless. For this purpose, the angular separation of the track pair in the longitudinal plane, measured in terms of ∆ cot θ, is required to be less than 0.1. Also, the two-dimensional distance of minimum approach between the two tracks is required to be positive to remove intersecting helices. Finally, the point at which the two tracks are tangent is required to be well contained in the tracker volume.
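The pair requirements described above (opposite charge, small ∆ cot θ, positive distance of minimum approach) can be sketched as follows. The `Track` structure and function names are hypothetical, and the distance of minimum approach is assumed here to be the distance between the centres of the two track circles in the transverse plane minus the sum of their radii; the text does not give its definition.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    charge: int    # +1 or -1
    eta: float     # pseudorapidity of the track momentum
    cx: float      # x of the circle centre in the transverse plane (cm)
    cy: float      # y of the circle centre in the transverse plane (cm)
    radius: float  # radius of the transverse-plane circle (cm)


def delta_cot_theta(t1, t2):
    # cot(theta) = p_z / p_T = sinh(eta)
    return math.sinh(t1.eta) - math.sinh(t2.eta)


def min_approach_distance(t1, t2):
    # assumed definition: centre-to-centre distance minus the sum of the radii;
    # a negative value means that the two circles intersect
    return math.hypot(t1.cx - t2.cx, t1.cy - t2.cy) - (t1.radius + t2.radius)


def is_conversion_candidate(t1, t2, max_dcot=0.1):
    """Opposite charge, nearly parallel in the longitudinal plane, no crossing."""
    return (t1.charge * t2.charge < 0
            and abs(delta_cot_theta(t1, t2)) < max_dcot
            and min_approach_distance(t1, t2) > 0.0)
```

The sketch omits the track quality, impact parameter and tracker-containment requirements, which depend on detector-specific information.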
Track pairs surviving the selection are fitted to a common vertex with a 3D-constrained kinematic vertex fit. The 3D constraint requires the tracks to be parallel in both the transverse and longitudinal planes. The pair is retained if the vertex fit converges and the χ 2 probability is greater than a given threshold. The transverse momentum of the pair is finally refitted with the vertex constraint.
Reconstructed conversions are required to satisfy a minimum transverse momentum threshold, meant to reduce accidental or poorly reconstructed pairs. The threshold on the converted photon p T as measured by the tracks can vary depending on the application: in this paper, mainly focussing on medium to high transverse momentum, the threshold is chosen to be 10 GeV. More than one conversion track-pair candidate can be reconstructed for the same supercluster. When such a case occurs, the optimal conversion is chosen by finding the best directional match between the momentum direction of the track pair and the position of the supercluster. The matching criterion is expressed in terms of the ∆R = √(∆η 2 + ∆φ 2 ) distance between the supercluster direction and the conversion direction. The conversion candidate with minimum ∆R is retained if ∆R is less than 0.1. Both the conversion and supercluster directions are redefined with respect to the fitted conversion vertex position. A sample of Z → µ + µ − γ events with a photon resulting from final-state radiation (FSR) is selected from dimuon-triggered data, together with a corresponding sample of simulated events. A very high photon purity (98%) is achieved in the selection, which is not reachable in any other sample. Events from Z → µ + µ − γ decays are selected by requiring the presence of two high-quality muon tracks reconstructed with both the muon detector and the tracker within |η| < 2.4, originating from the interaction vertex, and each having p T > 10 GeV. Each muon track is also required to be associated to small energy deposits in the hadron calorimeter. The dimuon invariant mass is required to be above 35 GeV.
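The directional matching between conversion candidates and a supercluster, described earlier in this section, can be sketched as below. Candidates are represented simply as (η, φ) pairs and the function names are illustrative; the redefinition of both directions with respect to the fitted conversion vertex is assumed to have been done beforehand.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def best_conversion(sc_eta, sc_phi, candidates, max_dr=0.1):
    """Return the (eta, phi) candidate closest to the supercluster, or None.

    A candidate is accepted only if its Delta R is below `max_dr`.
    """
    best, best_dr = None, max_dr
    for cand in candidates:
        dr = delta_r(sc_eta, sc_phi, cand[0], cand[1])
        if dr < best_dr:
            best, best_dr = cand, dr
    return best
```

Wrapping ∆φ matters for superclusters near φ = ±π, where a naive difference would reject a well-matched candidate.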
Photon candidates are selected with loose identification criteria and with transverse momentum above 10 GeV, within |η| < 2.5 (excluding the ECAL barrel-endcap transition region) and added to the dimuon system. The distance of the photon from the closest muon is required to satisfy ∆R < 0.8, while the muon furthest from the photon must satisfy p T > 20 GeV. It is required that the track of the muon closest to the photon is not reconstructed also as an electron. Finally the three-body invariant mass, m µ µγ , is required to satisfy 60 < m µ µγ < 120 GeV. Figure 12 shows the µ µγ invariant mass for events in which a conversion track pair, matched to the photon, has also been reconstructed. The invariant mass is calculated using the photon energy measured in the ECAL and taking the dimuon vertex. The distributions are normalized to the number of candidates in data and show good agreement between data and simulation.
An estimator of the quality of the conversion reconstruction is the matching between the energy measured in the ECAL and the momentum measured from the track pair after refitting with the conversion vertex constraint. If the track pair is correctly reconstructed and associated to the right cluster in the calorimeter the ratio E/p must be close to one. As for single electrons [18], the distribution of E/p shows tails around unity, because both electrons from the conversion emit bremsstrahlung along their trajectories through the tracker, so that the total track-pair momentum does not account for the total energy collected in the calorimeter. The distributions are shown in figure 13 for barrel and endcap separately, where the shape of the E/p distribution in data is compared to that in simulation. The distributions are normalized to the number of entries in data. Converted photons from the decay of neutral mesons in jets or accidental track pairs do not exhibit an E/p peak at unity.
Figure 13. Distribution of the E/p ratio, where E is the supercluster energy measured in the ECAL and p is the total momentum measured from the track pair refitted with the conversion vertex constraint, for photons in Z → µ + µ − γ events in data (points with error bars) and simulation (histograms), separately for (left) barrel and (right) endcap. The simulated distributions are normalized to the number of entries in data.
The distributions of photon supercluster pseudorapidity and of photon conversion radius are shown in figure 14. The empty bin in the left plot, centred on |η| = 1.5, corresponds to the ECAL barrel-endcap transition region in which photons are excluded from the analysis. The radial position of the conversion vertices for |η| < 1.4 in the right plot reveals the tracker structure, as shown in ref. [28] using low-p T conversions in minimum bias events. Data and simulation are in fair agreement. The number of photons from Z → µ + µ − γ events in data is however insufficient to probe the local differences between data and simulation shown in figure 9.

Photon identification
In physics analyses using photon signals, a large and reducible background comes from photon candidates that arise from neutral mesons produced in jets. In the transverse momentum range of interest, the photons from the decay of neutral pions are collimated and are reconstructed as a single photon: in the barrel, the minimum separation of the two photons from the decay of a π 0 with p T = 15 GeV is about the same as the crystal size. The background tends to be dominated by π 0 's that take a substantial fraction of the total jet p T and are thus relatively isolated from jet activity in the detector. Nevertheless, rejection of this background must rely heavily on isolation, particularly since the high probability of conversion in the tracker material, followed by the separation of the e + e − pair in the 3.8 T magnetic field, means that the lateral shower-shape patterns in φ have little power to discriminate prompt or single photons from background, leaving only the η coordinate for lateral shape discrimination. A further consequence of the high probability of conversion in the tracker material is that the R 9 distributions of signal and background differ for two independent reasons: firstly, the showers from π 0 's tend to have lower R 9 values because of the two separated decay photons; and secondly, there is a higher chance that at least one of two photons from a π 0 converts. Two photon identification algorithms are used in CMS to select against candidate photons originating in jets: an approach using selection requirements applied to a set of individual variables, and a multivariate technique. Both methods include a criterion intended to reject electrons misidentified as photons.

Electron rejection
The photon identification prescriptions discussed in this paper use the "conversion-safe electron veto" to reject electrons. This veto requires that there be no charged-particle track with a hit in the inner layer of the pixel detector not matched to a reconstructed conversion vertex, pointing to the photon cluster in the ECAL. The "hit in the inner layer" is computed as a hit in the first layer where a hit is possible, accounting for the small number of inoperative sensors, and for geometrical configurations where a track can pass between the first layer of sensors without leaving a hit. The photon inefficiency is thus reduced, almost entirely, to that resulting from photons converting in the beam pipe.
The conversion-safe electron veto is appropriate where electrons do not constitute a significant background, as for example in the H → γγ analysis, both because the invariant mass range of interest is sufficiently far from the Z boson mass, the largest source of prompt electron pairs, and because there are two photons to which the requirement can be applied, providing a powerful rejection of an electron pair being identified as a photon pair. A more severe rejection of electrons can be achieved by rejecting any photon for which a "pixel track seed" consisting of at least two hits in the pixel detectors suggests a charged-particle trajectory that would arrive at the ECAL within some window defined around the photon supercluster position. The efficiencies for photons or electrons to pass either of these requirements, as measured in 8 TeV data, are shown in table 2 separately for the barrel and the endcap. The efficiencies are obtained from photons in Z → µ + µ − γ events and from electrons in Z → e + e − events, for photons or electrons that have passed all criteria in the loose photon identification based on sequential requirements (section 6.3) except the electron veto.

Photon identification variables
Photon identification is based on two main categories of observables: shower-shape and isolation variables, and a description is given here of those most commonly used. The lateral extension of the shower, σ ηη , is measured in terms of the energy weighted spread within the 5 × 5 crystal matrix centred on the crystal with the largest energy deposit in the supercluster [18]. This variable, like the variable q ηφ mentioned below, is obtained by measuring position by counting crystals. This has the advantage that the differences in the size of the voids between the crystals, particularly at the module boundaries, are ignored, which better matches the lateral behaviour of showers. The separation of signal from background by this variable is illustrated in figure 15 where the signal candidates are FSR photons in Z → µ + µ − γ events. Photon candidates are required to satisfy p T > 20 GeV, f h < 0.05, where f h is the hadronic fraction defined in more detail below, and the conversion-safe electron veto is applied. The Z → µ + µ − γ events are selected as in section 5. Photons in data are compared with those in a simulated sample. There is imperfect matching between data and simulation, particularly in the barrel, which has to be taken into account when using the σ ηη variable. The background-dominated photon candidates are taken from a sample of dimuon triggered events in data. The simulated distributions are normalized to the number of signal photons in data, and the barrel and endcaps are shown separately.
Figure 15. Distribution of the shower-shape variable, σ_ηη, for FSR photons in Z → µ⁺µ⁻γ events in data (solid circles) and simulation (histogram), and for background-dominated photon candidates in dimuon triggered events (open circles). The barrel and endcaps are shown separately. The simulated signal and background distributions are normalized to the number of signal photons in the data. The ratios between the photon signal distributions in data and simulation are shown in the bottom panels.

The variable σ_ηη is often used in conjunction with q_ηφ, the diagonal component of the covariance matrix constructed from the energy-weighted crystal positions within the 5 × 5 crystal array centred on the crystal containing the largest energy. As previously discussed in section 4.2, the R_9 variable measures the overall transverse spread of the shower. Additional information on the shower shape is provided by the ratio E_2×2/E_5×5, where E_2×2 is the maximum energy sum collected in a 2 × 2 crystal array that includes the largest energy crystal in the supercluster, and E_5×5 is the energy collected in a 5 × 5 crystal matrix centred around the same crystal. The energy-weighted spreads along η (σ_η) and φ (σ_φ), calculated using all crystals in the supercluster, give further measures of the lateral spread of the shower. In the endcap, where CMS is equipped with a preshower, the variable σ_RR = √(σ_xx² + σ_yy²) is considered, where σ_xx and σ_yy measure the lateral spread in the two orthogonal sensor planes of the detector. The hadronic leakage of the shower, f_h, is defined as the ratio between the energy collected by the HCAL towers behind the supercluster and the energy of the supercluster.
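The shower-shape definitions above can be illustrated with a minimal sketch that computes a crystal-index-based lateral spread and the ratio of the largest 2 × 2 energy sum to the 5 × 5 energy sum. The function names, the matrix layout, and the use of linear energy weights are assumptions made for illustration only; the exact weighting used for σ_ηη is defined in ref. [18] and is not reproduced here.

```python
import math

def eta_spread(e5x5):
    """Energy-weighted RMS spread along eta, in units of crystals.

    e5x5[i][j] is the energy of the crystal at eta index i and phi index j
    in the 5x5 matrix centred on the highest-energy crystal (index 2, 2).
    Linear energy weights are an assumption of this sketch.
    """
    row_sums = [sum(row) for row in e5x5]
    total = sum(row_sums)
    if total <= 0:
        raise ValueError("5x5 energy sum must be positive")
    # Positions are crystal counts, so gaps between crystals are ignored.
    mean = sum(i * e for i, e in enumerate(row_sums)) / total
    var = sum((i - mean) ** 2 * e for i, e in enumerate(row_sums)) / total
    return math.sqrt(var)

def e2x2_over_e5x5(e5x5):
    """Largest 2x2 sum containing the central crystal, divided by the 5x5 sum."""
    total = sum(sum(row) for row in e5x5)
    best = 0.0
    for i0 in (1, 2):        # top-left corners of the four 2x2 blocks
        for j0 in (1, 2):    # that contain the central crystal (2, 2)
            block = sum(e5x5[i][j] for i in (i0, i0 + 1) for j in (j0, j0 + 1))
            best = max(best, block)
    return best / total
```

A narrow shower, with most of its energy in the central crystal, gives a small spread and a ratio close to one, whereas a broad deposit of the kind expected from overlapping photons in a jet gives larger spreads and smaller ratios.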
Photon isolation is measured exploiting the information provided by the particle-flow event reconstruction [35,36]. The particle-flow algorithm combines information from the tracker, the calorimeters, and the muon detectors, and aims to reconstruct the four-momenta of all particles in the event, classifying them as charged and neutral hadrons, photons, electrons and muons. The photon isolation variables are obtained by summing the transverse momenta of charged hadrons, I_π, photons, I_γ, and neutral hadrons, I_n, inside an isolation region of radius ∆R in the (η, φ) plane around the photon direction. Since the reconstruction of the signal photons and the particle-flow objects is not (yet) optimally synchronized, energy from the signal photon must be removed from the isolation sums by imposing geometrical requirements. When calculating I_γ, particle-flow photons falling in a pseudorapidity slice of size ∆η = 0.015 are excluded from the sum. Similarly, when constructing I_π by summing the transverse momenta of charged hadrons, a region of ∆R = 0.02 is excluded. Charged hadrons are reliably associated with reconstructed primary vertices and thus I_π is potentially independent of pileup. However, the association of photons with a primary vertex is often less than certain, and an incorrect choice of the vertex used will give a random isolation sum consistent with an isolated photon. For this reason, two variables are defined: I_π, where the charged hadrons in the sum are those associated with the primary vertex chosen for the photon, and I_π^max, where the isolation sum is the largest among those calculated for all reconstructed primary vertices.
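The construction of the isolation sums can be sketched as follows. This is an illustration under stated assumptions, not the CMS implementation: the dictionary layout of the candidates, the function names, and the reading of the ∆η = 0.015 slice as the condition |∆η| < 0.015 are all hypothetical choices made for this example.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi]."""
    d = phi1 - phi2
    while d > math.pi:
        d -= 2 * math.pi
    while d < -math.pi:
        d += 2 * math.pi
    return d

def isolation_sum(photon, candidates, cone=0.3, veto_dr=0.0, veto_deta=0.0):
    """Sum candidate pT inside the cone, skipping the signal-photon footprint.

    photon and candidates are dicts with keys "pt", "eta", "phi"
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for cand in candidates:
        deta = cand["eta"] - photon["eta"]
        dphi = delta_phi(cand["phi"], photon["phi"])
        dr = math.hypot(deta, dphi)
        if dr >= cone:
            continue  # outside the isolation region
        if dr < veto_dr or abs(deta) < veto_deta:
            continue  # inside the excluded region around the photon
        total += cand["pt"]
    return total

def max_charged_isolation(photon, charged_by_vertex, cone=0.3):
    """Largest charged-hadron sum over all reconstructed primary vertices."""
    return max(
        isolation_sum(photon, cands, cone, veto_dr=0.02)
        for cands in charged_by_vertex.values()
    )
```

In this sketch the photon sum would be obtained with `veto_deta=0.015` and the charged-hadron sum with `veto_dr=0.02`, following the exclusion regions quoted in the text. Taking the maximum over vertices protects against a wrongly chosen vertex making a non-isolated candidate appear isolated.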
When the charged-hadron component of the isolation is calculated from candidates compatible with the chosen primary vertex, it is independent of the number of pileup events, as shown in the left plot of figure 16, where the number of reconstructed primary vertices in the event is used as a measure of the number of pileup events. This illustrative figure is made using photons in γ + jet events and requiring them to satisfy p_T > 50 GeV, which, by ensuring 50 GeV of recoil in the event, results in a high probability that the primary vertex of the hard interaction, and hence of the photon, is correctly identified. The variables constructed by summing photons and neutral hadrons inside an isolation region need to be corrected to remove the contribution from pileup. The extra contribution in the isolation region is estimated as ρ A_eff, where ρ is the median of the transverse energy density per unit area in the event [26] and A_eff is the area of the isolation region weighted by a factor that takes into account the dependence of the pileup transverse energy density on pseudorapidity. The effective areas have been determined in γ + jet events. When the extra contribution due to pileup, calculated using ρ, is subtracted from the photon and neutral hadron sums, their dependence on the number of vertices is removed (figure 16, right). Figure 17 illustrates how the three isolation variables defined above behave for signal and background, as well as the good agreement between data and simulation for a region with radius ∆R = 0.3. The figure shows the distribution of the variables for photons in the ECAL barrel. Similar results are found in the endcaps. The signal photons shown have high purity and are from Z → µ⁺µ⁻γ events, and the background-dominated candidates are obtained from data, as in figure 15. A value of zero is plotted for the isolation variables in those cases when the pileup subtraction results in a negative value. For the distributions of the variables for signal photons, the ratio of values found in data and simulation is shown.

Table 3. Photon identification requirements for three working points (loose, medium, tight) corresponding to selections of different stringency.
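The pileup subtraction applied to the photon and neutral-hadron isolation sums can be written as a short sketch. The effective-area values and the |η| binning below are hypothetical placeholders, not the values determined in γ + jet events, and the function names are chosen for this example only.

```python
# Hypothetical effective areas: (upper edge of the |eta| range, A_eff).
# Placeholder numbers for illustration, NOT the measured effective areas.
EFFECTIVE_AREAS = [(1.0, 0.10), (1.479, 0.12), (2.0, 0.08), (2.5, 0.15)]

def effective_area(abs_eta, table=EFFECTIVE_AREAS):
    """Return the effective area of the first |eta| range containing abs_eta."""
    for upper_edge, area in table:
        if abs_eta < upper_edge:
            return area
    raise ValueError("|eta| outside the tabulated range")

def pileup_corrected_isolation(raw_sum, rho, abs_eta):
    """Subtract the estimated pileup contribution rho * A_eff.

    Negative results are set to zero, matching the convention used
    when plotting the isolation variables.
    """
    corrected = raw_sum - rho * effective_area(abs_eta)
    return max(0.0, corrected)
```

For example, with ρ = 10 GeV and the placeholder area of 0.10, a raw sum of 3 GeV becomes 2 GeV, while a raw sum of 0.5 GeV is set to zero.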

Photon identification based on sequential requirements
This section describes the identification of photons by sequential application of requirements. Various versions have been used in different data analyses, although the basic principles remain the same. After applying the electron veto, requirements are made on σ_ηη, f_h, and the isolation sums. In most cases, the isolation thresholds are expressed as a constant term added to a term proportional to the candidate photon transverse momentum, p_T^γ. A summary of the standard photon identification requirements, where different combinations of requirements and thresholds are used for the barrel and the endcap, is given in table 3 for three different working points. The working points correspond to selections of different stringency, and the corresponding efficiency curves are shown in figure 18, for photon candidates with p_T > 15 GeV in a sample of simulated γ + jet events.
Figure 17. Distributions of the isolation variables: (top) I_γ, (bottom left) I_π, and (bottom right) I_n, constructed from particle-flow objects. The distributions are shown for FSR photons from Z → µ⁺µ⁻γ events in data (solid circles) and simulation (histogram) and for background-dominated photon candidates in dimuon triggered events (open circles). The simulated signal and background distributions are normalized to the number of signal photons in data. The ratios between the photon signal distributions in data and simulation are shown in the bottom panels.

JINST 10 P08010

Photon identification efficiencies are measured with the "tag-and-probe" method, as described in ref. [39], using samples of Z → e⁺e⁻ events. The results of these measurements can be used to correct the simulation for any mismodelling by evaluating the ratio of efficiencies in data and simulation. For the results shown here, refinements to the simulation were implemented to reproduce the changes of the magnitude of the energy-equivalent electronic noise during the data-taking period (most relevant for the barrel), and to better simulate the effects of out-of-time pileup (more relevant for the endcaps). These refinements have been described in section 3. Electrons resulting from Z-boson decays, in a data sample passing the 27 GeV single-electron trigger, are used for the measurement. The "tag" candidates are required to have p_T > 30 GeV, satisfy tight electron identification [18], and be matched to a triggering electron. The dielectron invariant mass is required to be in the range 60 < m_ee < 120 GeV. The "probe" candidates are electron showers reconstructed as photons and matched to the non-tag electron. They are required to have p_T > 15 GeV and are tested for passing (or not) the photon identification criteria, with the exception of the electron veto. Invariant mass distributions are then made separately for the cases in which the probe photons satisfy or fail the identification requirements, hereafter referred to as "passing" and "failing" distributions. Simultaneous fits to the passing and failing distributions are performed to extract the identification efficiency. The Z-boson invariant mass distribution is modelled with a template extracted from simulation and convolved with a Gaussian function. The background is modelled with an exponential times an error function. Figure 19 shows an example of fits to the Z → e⁺e⁻ mass peak for the central barrel region. The transverse momentum of the probe photon is in the range 20 < p_T < 30 GeV and the identification criteria correspond to the medium working point quoted in table 3. The number of events in data is such that the statistical uncertainties in the data points, shown by error bars, are not visible in the figure. The fitted numbers of signal events in the two plots give a measured efficiency of 74% with negligible statistical uncertainty. The hump on the left side of the failing-probe distribution is due to radiating electrons for which a fraction of the energy is not collected. Figure 20 shows the comparison of the selection efficiency in data and simulation, as a function of the photon transverse momentum, for barrel and endcap separately. The values are obtained using electrons from Z → e⁺e⁻ decays with a tag-and-probe technique, with the probe electron reconstructed as a photon, and the electron veto removed from the identification criteria. The data-to-simulation ratio, which shows a good level of agreement for p_T > 20 GeV, is shown in the panels beneath the main plot.
The shaded bands represent the systematic uncertainties, which have been evaluated by replacing, in the fits to the invariant mass distribution, the background modelling with simple exponential and polynomial functions. The statistical uncertainties in the data measurements are too small to be visible. Since the measurement is made for an electron sample, the electron veto is not applied, and its efficiency (table 2) and the agreement between data and simulation are measured separately. The different level of efficiency in the barrel and the endcap seen in figure 20, but not in figure 18, is explained by the requirement (or not) of the electron veto.
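Once the signal yields in the passing and failing distributions have been fitted, the efficiency and the data-to-simulation ratio follow from simple arithmetic, sketched below. The function names and the toy yields are hypothetical, and the binomial uncertainty is a simplifying assumption that ignores the uncertainties returned by the fits themselves.

```python
import math

def tag_and_probe_efficiency(n_pass, n_fail):
    """Efficiency from fitted signal yields in the passing and failing samples.

    The uncertainty is a simple binomial estimate (an assumption of this
    sketch); a full treatment would propagate the fit uncertainties.
    """
    total = n_pass + n_fail
    if total <= 0:
        raise ValueError("total signal yield must be positive")
    eff = n_pass / total
    err = math.sqrt(eff * (1.0 - eff) / total)
    return eff, err

def data_to_simulation_ratio(eff_data, eff_sim):
    """Correction factor applied to simulation for any mismodelling."""
    if eff_sim <= 0:
        raise ValueError("simulated efficiency must be positive")
    return eff_data / eff_sim

# Toy yields chosen only to give a round efficiency; not fitted CMS yields.
eff, err = tag_and_probe_efficiency(74000.0, 26000.0)
```

With yields of this size the binomial uncertainty is at the per-mille level, which illustrates why the statistical uncertainties can be negligible compared with the systematic uncertainty from the background modelling.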
The background rejection, defined as the reciprocal of the efficiency for background photons to pass the selection requirements, and the signal efficiency have been determined for the three working points of the photon identification based on sequential requirements in a simulated γ + jet sample. The signal corresponds to reconstructed photons matched to simulated prompt photons and the background corresponds to reconstructed photons matched to a jet. The signal transverse momentum distribution is reweighted to follow the background spectrum, and photon candidates are required to satisfy 25 < p_T < 200 GeV. In section 6.4 the values found for background rejection and signal efficiency when using photon identification based on sequential requirements are compared with the background rejection as a function of signal efficiency obtained with the multivariate photon identification.
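The two ingredients of this comparison, the background rejection and the reweighting of the signal spectrum, can be sketched as follows. The binned-ratio reweighting is one simple way to make the signal shape follow the background shape and is an assumption of this example; the function names and toy histograms are hypothetical.

```python
def background_rejection(n_background_pass, n_background_total):
    """Reciprocal of the background efficiency."""
    if n_background_pass <= 0:
        return float("inf")
    return n_background_total / n_background_pass

def pt_weights(signal_counts, background_counts):
    """Per-bin weights making the signal pT shape follow the background shape.

    Both arguments are histogram contents in the same pT bins.
    Bins with no signal entries receive a weight of zero.
    """
    s_total = sum(signal_counts)
    b_total = sum(background_counts)
    weights = []
    for s, b in zip(signal_counts, background_counts):
        if s > 0:
            weights.append((b / b_total) / (s / s_total))
        else:
            weights.append(0.0)
    return weights

# Toy histograms in two pT bins; not taken from the simulated samples.
weights = pt_weights([10.0, 30.0], [30.0, 10.0])
reweighted = [s * w for s, w in zip([10.0, 30.0], weights)]
```

After reweighting, the fraction of signal in each bin equals the corresponding background fraction, so that differences in efficiency between signal and background are not driven by their different p_T spectra.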

Multivariate photon identification
A more sophisticated photon identification technique is based on a multivariate analysis, employing a boosted decision tree (BDT) implemented in the TMVA framework [40]. The technique allows the definition of a single discriminating variable characterizing each photon (the BDT score) resulting from the combination of many variables discriminating prompt photons from background candidates. The list of variables used as the input to the BDT includes all shower-shape and isolation variables described earlier, plus three quantities that strengthen the discrimination of signal and background by accounting for the dependencies in the shower-shape and isolation variables on the pileup present in the event, and the η and E_T of the candidate photon: the median energy per unit area, ρ, and the η and uncorrected energy of the supercluster corresponding to the candidate photon.

Figure 19. Example of fits to the Z → e⁺e⁻ invariant mass distribution for (left) passing and (right) failing probes, in the transverse momentum range 20 < p_T < 30 GeV and |η| < 0.8.

Figure 20. Comparison of the selection efficiency as a function of photon transverse momentum in data (circles) and simulation (triangles) for the identification based on sequential requirements for (left) |η| < 0.8 and (right) 1.6 < |η| < 2. Statistical and systematic uncertainties are respectively shown by the error bars and shaded bands. The horizontal error bars mark the full width of the p_T bins in which the measurements are made, and the data points are plotted at the centre of each bin. The ratios of efficiencies in data and simulation are shown in the bottom panels.

Table 4. Preselection requirements.

R_9      barrel     endcap     barrel     endcap
≤ 0.9    < 0.075    < 0.075    < 0.014    < 0.034    < 4 GeV     < 4 GeV     < 4 GeV
> 0.9    < 0.082    < 0.075    < 0.014    < 0.034    < 50 GeV    < 50 GeV    < 4 GeV
The multivariate photon identification was developed in the context of the H → γγ analysis, which uses a diphoton trigger employing a loose photon selection. To ensure the independence of the analysis from the online requirements imposed with the trigger, a preselection is applied to photon candidates. The preselection requirements are similar to, but somewhat more severe than, those made online by the trigger. Simulated events are required to satisfy the same preselection requirements, which are listed in table 4. Besides the variables already described in section 6, two further isolation variables are used, I_HCAL and I_Trk, which are the sums of transverse energy in the HCAL towers, and of charged-particle tracks with p_T > 1 GeV, respectively, in regions of ∆R < 0.3 about the photon candidate. The HCAL sum is uncorrected for pileup, and the charged-particle track sum uses the tracks associated with the vertex with the highest Σp_T² of associated tracks, as is done in the trigger. There are different requirements for photon candidates depending on whether they have a high or low value of the R_9 variable. The value used to define this categorization, R_9 = 0.9, reflects the one used in the trigger.
The BDT is trained on a sample of simulated γ + jet events, where the photon candidates matching the prompt photon are used as signal, and photon candidates not matching the prompt photon are used as background. The photon candidates are required to have p_T > 20 GeV and to satisfy the preselection. The photon transverse momentum and pseudorapidity distributions in signal are reweighted to match the corresponding distributions of background non-prompt photons, so that the input signal-to-background ratio does not depend on p_T. Since the training of the BDT is performed with simulated samples, it is important to verify the quality of the modelling of all input variables. The input variables are studied in Z → e⁺e⁻ events where the electrons are reconstructed as photons and in Z → µ⁺µ⁻γ events. Examples of the comparison of the distributions of input variables in data and simulated events are shown for signal photons in figures 15 and 17. Figure 21 shows the distribution of the BDT score for Z → e⁺e⁻ events, where the electrons are reconstructed as photons. The distributions of the BDT score in data and simulation agree well. A shift of ±0.01 of the score is shown as a band in the plot. This shift comfortably covers the small differences between the distributions in data and simulation, and is taken as the uncertainty in the value of the photon identification BDT score predicted by simulation. The same comparison can be made for photon candidates in Z → µ⁺µ⁻γ events, and figure 22 shows the distributions of the BDT score for photons in data and in simulated events. The agreement is again good for photons in both the barrel and the endcaps. The separation of signal and background can be seen in figure 23.
The figure shows the photon identification BDT score of the lower-scoring photon in diphoton pairs with an invariant mass in the range 100 < m_γγ < 180 GeV for diphoton events passing the preselection in the 8 TeV dataset and for simulated background events (histogram with shaded error bands showing the statistical uncertainty). The relative fractions of diphoton pairs arising from γ-γ, γ-jet, and jet-jet processes in the MC sample are the result of using the cross sections and K-factors described in section 3. The tall histogram corresponds to simulated Higgs boson events (m_H = 125 GeV). The distribution of the photon identification BDT score of the lower-scoring photon for simulated diphoton background events also agrees well with the distribution seen in the data. The bump that can be seen in both distributions at a BDT score of about 0.13 corresponds to events where both photons are prompt and, therefore, signal-like.
If a simple requirement is made on the BDT score of photon candidates, defining a working point with a signal efficiency of about 80%, the signal and background efficiencies are found to be flat as a function of the photon transverse momentum and the number of vertices in the event, for both ECAL barrel and endcaps. The identification efficiency obtained by making such a requirement on the photon identification BDT score has also been measured in data with the tag-and-probe technique in Z → e⁺e⁻ events (reconstructing the electrons from the Z boson as photons). The tag photon is required to have p_T > 35 GeV and a BDT score > 0.15. Figure 24 shows the data-to-simulation comparison of the efficiencies as a function of the probe photon transverse momentum for |η| < 1 and 1.5 < |η| < 2 separately. The ±0.01 systematic uncertainty assigned to the BDT score in simulation covers, together with the systematic uncertainty in the tag-and-probe efficiency measurements, the residual difference observed between the efficiencies measured in data and simulation. The dependence of background rejection on signal efficiency, as the requirement on the photon identification BDT score is varied, is shown in figure 25. The signal consists of reconstructed photons matched to prompt photons in simulated γ + jet events. The background consists of photons reconstructed in simulated dijet events. The loose preselection defined in table 4 is applied to all photon candidates, so the background rejection and signal efficiency shown are relative to this preselection. The signal transverse momentum distribution is reweighted to follow the background spectrum, and photon candidates are required to satisfy 25 < p_T < 200 GeV. The figure also shows the background rejection and signal efficiency of the three working points of the photon identification using sequential requirements.
The multivariate selection can be seen to have better performance, arising from the use of additional information, including the correlation among variables.
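The curve of background rejection against signal efficiency, and the effect of shifting the BDT score by ±0.01, can be illustrated with the following sketch. The function names and the toy scores are hypothetical, and the example does not use TMVA; it only shows the bookkeeping applied to a list of scores.

```python
def efficiency_above(scores, cut):
    """Fraction of candidates with a score above the cut."""
    return sum(1 for s in scores if s > cut) / len(scores)

def rejection_vs_efficiency(signal_scores, background_scores, cuts):
    """Return (cut, signal efficiency, background rejection) for each cut."""
    curve = []
    for cut in cuts:
        eff_s = efficiency_above(signal_scores, cut)
        eff_b = efficiency_above(background_scores, cut)
        rejection = 1.0 / eff_b if eff_b > 0 else float("inf")
        curve.append((cut, eff_s, rejection))
    return curve

def efficiency_with_score_shift(scores, cut, shift=0.01):
    """Efficiency for scores shifted down, nominal scores, and scores shifted up."""
    down = efficiency_above([s - shift for s in scores], cut)
    nominal = efficiency_above(scores, cut)
    up = efficiency_above([s + shift for s in scores], cut)
    return down, nominal, up

# Toy scores for illustration; not taken from simulated events.
toy_signal = [0.1, 0.2, 0.3, 0.4]
toy_background = [-0.3, -0.1, 0.05, 0.25]
toy_curve = rejection_vs_efficiency(toy_signal, toy_background, [0.0, 0.22])
```

Tightening the cut trades signal efficiency for background rejection, and the spread between the shifted and nominal efficiencies gives the size of the variation induced by a ±0.01 uncertainty in the score.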

Summary
A description has been provided of the performance of the CMS detector for photon reconstruction and identification in proton-proton collisions at a centre-of-mass energy of 8 TeV at the CERN LHC. Details are given of the reconstruction of photons from energy deposits in the ECAL and of the extraction of photon energy estimates. The reconstruction of electron tracks from photons that convert to e⁺e⁻ pairs in the CMS tracker is also described, as is the optimization of the photon energy reconstruction and its accurate modelling in simulation for the analysis of the Higgs boson decay into two photons. The excellent agreement between data and simulation, demonstrated for electron showers, enables the extraction of an accurate estimate of the energy resolution of photons from H → γγ decays in data. In the barrel section of the ECAL, an energy resolution of about 1% is achieved for unconverted or late-converting photons arising from the H → γγ decay. The remaining barrel photons have a resolution of about 1.3% up to |η| = 1, rising to about 2.5% at |η| = 1.4. In the endcaps, the resolution of unconverted or late-converting photons from the same sample is about 2.5%, while the remaining endcap photons have a resolution somewhat worse than 3%.
The photon energy scale uncertainty and its impact on the Higgs boson mass measurement are discussed in depth. Since the scale is set using the showers of electrons from Z → e⁺e⁻ decays reconstructed as photons, the largest uncertainties are due to the possible imperfect simulation of (i) differences in detector response to electrons and photons, and (ii) energy scale nonlinearity between the energies typical of electrons from the Z boson decay and photons from the Higgs boson decay. Results of measurements of the material thickness of the tracker are shown, together with a comparison between data and simulated events of the energy response as a function of E_T. The uncertainty in the photon energy scale at p_T ≈ m_Z/2 is about 0.1% in the central barrel, 0.15% in the outer barrel, and 0.3% in the endcaps.
Different photon identification methods are discussed, and their corresponding selection efficiencies in data are compared with those found in simulated events. For the two photon identification methods considered, the agreement between data and simulation for the efficiency as a function of photon p_T is found to be good. Comparing the background rejection as a function of signal efficiency, the multivariate selection has somewhat better performance, resulting from the use of additional information including the correlation among variables.
Individuals have received support from the Marie-Curie programme and the European Research Council and EPLANET (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Council of Science and Industrial Research, India; the HOMING PLUS programme of Foundation for Polish Science, cofinanced from European Union, Regional Development Fund; the Compagnia di San Paolo (Torino); the Consorzio per la Fisica (Trieste); MIUR project 20108T4XTM (Italy); the Thalis and Aristeia programmes cofinanced by EU-ESF and the Greek NSRF; and the National Priorities Research Program by Qatar National Research Fund.
[15] J. Alwall et al., The automated computation of tree-level and next-to-leading order differential cross sections and their matching to parton shower simulations, JHEP 07 (2014)