Improved Measurement of the Reactor Antineutrino Flux and Spectrum at Daya Bay

A new measurement of the reactor antineutrino flux and energy spectrum by the Daya Bay reactor neutrino experiment is reported. The antineutrinos were generated by six 2.9~GW$_{\mathrm{th}}$ nuclear reactors and detected by eight antineutrino detectors deployed in two near (560~m and 600~m flux-weighted baselines) and one far (1640~m flux-weighted baseline) underground experimental halls. With 621 days of data, more than 1.2 million inverse beta decay (IBD) candidates were detected. The IBD yield in the eight detectors was measured, and the ratio of measured to predicted flux was found to be $0.946\pm0.020$ ($0.992\pm0.021$) for the Huber+Mueller (ILL+Vogel) model. A 2.9~$\sigma$ deviation was found in the measured IBD positron energy spectrum compared to the predictions. In particular, an excess of events in the region of 4-6~MeV was found in the measured spectrum, with a local significance of 4.4~$\sigma$. A reactor antineutrino spectrum weighted by the IBD cross section is extracted for model-independent predictions.


Introduction
Since the discovery of the neutrino in 1956 at the Savannah River reactor power plant by Cowan, Reines and collaborators [1], reactor antineutrinos have played a crucial role in the development of the standard model of particle physics [2], and in the exploration of neutrino oscillation. Near the beginning of this century, the CHOOZ and Palo Verde experiments attempted to measure the neutrino mixing angle θ 13 using reactor an-010201-2 tineutrinos at ∼1 km baselines and obtained upper limits [3][4][5]. In 2003, the KamLAND experiment observed terrestrial neutrino oscillations with a flux-average baseline of 180 km [6], confirming the large mixing angle (LMA) solution to the solar neutrino problem. In 2012, the Daya Bay experiment reported the first observation of a non-zero θ 13 [7] with more than 5 σ significance, consistent with the results from T2K [8], MINOS [9], Double CHOOZ [10] and RENO [11] experiments. The discovery of a non-zero θ 13 opened the way to determining the neutrino mass hierarchy and searching for CP violation in neutrino oscillation experiments. In the future, reactor neutrino experiments at ∼km baselines will continue to improve the precision of θ 13 measurements, while reactor neutrino experiments at baselines of ∼50 km [12,13] are aiming to determine the neutrino mass hierarchy and precisely measure the neutrino mixing angle θ 12 and the mass-squared splittings ∆m 2 21 and ∆m 2 32 . In addition, reactor neutrino experiments at baselines of ∼10 m will probe physics beyond the three-neutrino framework through the search for shortbaseline neutrino oscillation [14][15][16][17]. A recent review of reactor neutrino oscillation experiments can be found in Ref. [18].
Reactors are a pure source of electron antineutrinos, ν e . Inside a reactor core, fission processes are maintained by neutrons produced through the fission of 235 U nuclei. A portion of the neutrons are captured by 238 U nuclei and subsequent beta decays and neutron captures lead to the production of fissile isotopes 239 Pu and 241 Pu. The beta-decay chains of the fission products of these four isotopes are the main source ofν e . On average, about six antineutrinos are released per fission. Before 2011, the prediction of antineutrino flux and spectrum was based on the beta spectra measured at ILL Grenoble for the thermal-neutron induced fission of 235 U, 239 Pu, and 241 Pu [19][20][21] and the theoretical calculation of Vogel for 238 U [22], which was shown to be in good agreement with available data [23]. In 2011, re-evaluation of the reactor antineutrino flux and spectrum [24,25] with improved theoretical treatments was carried out, and the new predicted reactor antineutrino flux was shown to be higher than the experimental data. This discrepancy is commonly referred to as the "Reactor Antineutrino Anomaly" [26]. One possible explanation of the reactor antineutrino anomaly is through neutrino oscillation with a frequency corresponding to a mass squared difference at the eV-scale, by introducing at least one additional sterile neutrino. Meanwhile, it was pointed out in Ref. [27] that the uncertainty due to the spectral shape of numerous first forbidden beta decays may be larger, which could largely reduce the significance of the anomaly. In addition to the anomaly of the integrated reactor antineutrino flux, recent results from the current generation of θ 13 experiments have also highlighted the presence of a spectral anomaly consisting of an excess of detected events with respect to predictions in the region of 4-6 MeV of the reconstructed prompt energy [28][29][30]. This feature is unlikely to be the result of active-sterile neutrino oscillations, and raises further questions on the accuracy of some existing reactor antineutrino flux and spectrum predictions.
To shed light on these issues and probe the nuclear physics underlying current reactor antineutrino flux models, it is crucial to compare model predictions with precision measurements of reactor antineutrino flux and spectrum. While the modeling of the reactor antineutrino spectrum is less critical for oscillation experiments employing relative measurements between multiple detectors, an accurate determination of the reactor antineutrino spectrum is critical to realize the full potential of the next-generation single-detector medium-baseline reactor antineutrino oscillation experiments [31].
This article will present Daya Bay's reactor antineutrino flux and spectral analyses utilizing the dataset from its most recent spectral oscillation analysis [32]. The dataset is comprised of more than 1.2 million antineutrino candidates collected in eight antineutrino detectors (ADs) in two near experimental halls (with fluxweighted baselines of 560 m and 600 m) and one far hall (flux-weighted baseline 1640 m), providing a factor of 3.6 times more statistics over the results presented in Ref. [29]. This paper also aims to provide detailed description of key inputs to these analyses not described in previous Daya Bay publications, such as the method of predicting the flux and spectrum from each Daya Bay core, as well as the method of determining the IBD detection efficiencies of the Daya Bay ADs. Finally, a more detailed description will be provided regarding how the flux and spectrum analyses were carried out, and how the observed prompt spectra are unfolded into a reactor antineutrino spectrum, which is a useful input for future reactor antineutrino experiments. This paper is organized as follows: Sec. 2 summarizes in detail the treatment of the reactor antineutrino flux and spectrum prediction in Daya Bay's neutrino oscillation analysis with the full eight-detector configuration [32]. Sec. 3 overviews the standard IBD selections used by Daya Bay, while Sec. 4 provides an in-depth explanation of the analysis performed to determine the detection efficiency of the Daya Bay detectors. The updated measurements of the reactor antineutrino flux and the positron prompt energy spectrum are presented in detail in Sec. 5 and Sec. 6. Based on the measured prompt energy spectrum, an extracted reactor antineutrino spectrum weighted by the IBD cross section is presented in Sec. 7. Finally, a summary is given in Sec. 8.

Reactor Description
The Daya Bay nuclear power complex is situated at Daya Bay in southern China, approximately 55 kilometers northeast of Hong Kong. As shown in Fig. 1, the nuclear power complex consists of three nuclear power plants (NPPs): the Daya Bay NPP, the Ling Ao NPP, and the Ling Ao II NPP. Each of them has a pair of reactor cores generating 2.9 GW thermal power each, during normal operation. The distance between the two cores in each NPP is about 88 m. The Ling Ao II cores reached full power in July 2011 while the other cores were running in commercial operation. The uncertainty of the baseline measurement is estimated to be 18 mm. Details of the baseline measurement are described in [33]. The Daya Bay and Ling Ao NPPs use the French Framatome Advanced Nuclear Power 990 MW e (electric power) three cooling loop design, and Ling Ao II NPP uses an updated Chinese version (CPR 1000) of 1080-MW e . Each cooling system consists of a primary loop and a secondary loop connected with a steam generator. Figure 2 shows a schematic diagram of one cooling system. Inside each reactor core, 157 fuel elements are bonded to socket plates in the water-filled reactor pressure vessel. The water absorbs the heat generated by fissions in the fuel and then circulates through inverted U-shape tubes of the steam generators, which are immersed in water of the secondary loops. The heat is then transferred to the water in the secondary loop and the water is vaporized into saturated steam, which flows to the turbine-alternator unit. The cooled water in the primary loop is then pumped back to the vessel and goes to the next cycle. The water is slightly doped with boric acid, which acts as the thermal neutron absorber. Boron concentration, controlled by the NPPs, decreases during the refueling cycle to compensate for the power loss caused by the depletion of fuel, helping to keep the total power of the reactor stable at a nominal level. Schematic diagram of the reactor cooling system. At Daya Bay, each reactor core is connected with 3 cooling systems in parallel.

Reactor Power Measurements and Monitoring Systems
Three different systems, RPN (Nuclear Instrumentation System) [34], KME (Test Instrumentation System) [35] [36], and KIT/KDO (Centralized Data Processing System/Test Data Acquisition System) [34,36], were deployed to monitor the power of the reactor cores in Daya Bay. Table 1 is a summary of the three power monitoring systems. The RPN system is used for reactor monitoring and protection by measuring the neutron flux with four neutron detectors placed around the reactor core. The reactor power is supposed to be proportional to the neutron flux. However, as the nuclear fuel burns, the power as measured by RPN gradually differs by an increasing amount from the actual power due to the change of the isotope content in the core. To guarantee accuracy, the RPN system's measured powers are compared with the more accurate KIT/KDO system every day. Once the difference exceeds 1.5% of full power, the RPN system is re-calibrated.
The KME and KIT/KDO systems are based on the heat balance method. The KME is the secondary loop power measurement system, and has the best accuracy among all three systems. This system measures the parameters such as water flow rate, temperature and pressure in the secondary loop, and calculates the enthalpy increase when the water passes through the steam generator. Other heat sources such as pumps in the secondary loop are also considered. By considering the power of all three steam generators and heat from other sources, the reactor core thermal power can be calculated as where W R is the reactor core thermal power, W SG i is the thermal power of the i-th steam generator, and W ∆P r is the heat input and power loss from the pump systems and other heat sources. Daya Bay and Ling Ao reactors are all based on French Pressurized Water Reactors (PWRs). For French PWRs, the measurement of nominal thermal power follows a procedure known as BIL100, which is performed on the secondary loop [37]. The predominant term in the calculation of uncertainty for BIL100 is the uncertainty related to mass flow rate of the feed water, which accounts for up to 80% of the uncertainty related to thermal power [37]. To minimize this source of uncertainty, orifice plates were installed in the secondary loop to precisely measure water flow. The uncertainty of the orifice water flow measurement is typically 0.72% (90% C.L.), and could be improved to 0.4% (90% C.L.) according to lab tests [38]. For Daya Bay's KME system, four benchmark tests were made to compare the core power result between the KME system and an EDF (Electricite de France)-developed high precision SAPEC system (EDF's standardized system for enhanced safety and performance periodic tests on the PWR fleet), which has its own sensors, databases and data processing systems [39]. The tests showed the relative difference between the two systems was 0.031% to 0.065%. The power measurement uncertainty of the KME system is estimated to be less than 0.25%. This is comparable to the uncertainty estimated for the SAPEC system, which is <0.26% [39]. Although the KME system has the best precision, it is an offline system. The power plant usually does the KME measurements weekly or monthly, but this frequency does not meet the experimental requirement. The KIT/KDO system is an online system for monitoring the core power, based on a primary loop heat balance method. The system measures the temperature, pressure, and mass flow rate of the feed water in the primary loop to calculate the thermal power. Installation of orifice plates in the primary loop is not allowed, thus the KIT/KDO system uses another flow meter to measure the water flow, which is less precise than the KME system. However, the KIT/KDO system is calibrated monthly to the KME system by adjusting the feed water flow rate in the primary loop in the KIT/KDO system once the difference between the powers measured by the two systems exceeds 0.1% of full power. Conservatively, considering that the uncertainty between the steam generators is fully correlated in the KME system, and accounting for the difference between the KIT/KDO system and the KME system, the uncertainty of the KIT/KDO system is estimated to be 0.5%. In the Daya Bay Experiment the KIT/KDO measured thermal power is provided hourly to calculate the reactor antineutrino flux.

Reactor Core and Refueling
The reactor core consists of 157 fuel elements, and each element contains 264 fuel assemblies of uranium dioxide with a 235 U enrichment of 4%. The height of the elements is 3.7 m and the diameter of the core is 3 m. The six reactors shut down alternately for refueling and overhaul. The refueling cycle period for the Daya Bay NPP is about 18 months, with 1/3 of all fuel elements replaced with fresh fuel during the refueling period. For the Ling Ao NPPs, the refueling cycle period is 12 months, and 1/4 of the fuel elements are replaced. Refueling usually takes one month. At the beginning of each burning cycle, the positions of the fuel elements in the core are rearranged. Fresh fuel is placed in the core center, while the old fuel is moved outward. This scheme has the advantages of reducing neutron leakage, enhancing activity, and increasing fuel burn-up. On the other hand, the scheme results in a non-uniform power distribution in the core and increases the power peaking factor. To reduce this effect, burnable gadolinium fuel "poison" rods are installed in some elements to absorb neutrons. Figure 3 shows an example of the reactor core map of the fuel elements with different burn-up at the end of a refueling cycle.
When refueling for a new cycle, the fuel elements are configured in the reactor core around the center as symmetrically as possible. Because of this, the reactor core can be considered as a point source of antineutrinos with a center of gravity stable at the core center, as will be discussed later.

Fuel Evolution and Core Simulation
Burn-up describes the energy extracted from the fuel element per ton of initial uranium mass since its place-ment into the reactor core, defined as where W is the average power of the fuel element, D is the days since the fuel element begins to burn in the core, and M U in is the initial uranium mass of the fuel element. The unit of burn-up is MW · day · ton U −1 . A similar quantity, cycle burn-up, is used to describe the aging of the whole reactor core in a refueling cycle. Cycle burn-up can also be calculated using Eq. 2, where W , D, and M U in in this case represent the total nuclear power of the reactor core, the days since the beginning of the refueling cycle, and the initial uranium mass of all the fuel elements in the reactor core. In reactors, electron antineutrinos are emitted primarily from the fissions of four isotopes: 235 U, 238 U, 239 Pu, and 241 Pu. Fissions of other isotopes contribute less than 0.3%. Fissions of 238 U are only induced by fast neutrons, while fissions of the other three isotopes are mainly induced by thermal neutrons. Fresh fuel elements contain only uranium isotopes. The plutonium isotopes are gradually generated through neutron captures on 238 U and subsequent neutron captures and beta decays of its successor isotopes.
Fuel evolution is a dynamic process related to many factors such as power, neutron flux, fuel composition, type and position of fuel elements, and boron concentration. For safe operation of the reactors, NPPs do calculations and simulations of the fuel evolution in every refueling cycle by considering all of the factors above. These detailed simulations are performed by validated and licensed commercial software. The simulation package used by the Daya Bay NPP is SCIENCE, which was developed by CEA, France. It uses the APOLLO2 code [40] as the core component. The simulation results are provided to the Daya Bay collaboration in a table which uses cycle burn-up as the index. The fission fractions are provided by the simulation in the form of f i (β), where f i is the fission fraction of isotope i, and β is the cycle burn-up. Figure 4 shows an example of the fission fraction evolution as a function of cycle burn-up within a refueling cycle [41]. The APOLLO2 code is widely used for cross section generation and neutron transport calculations in commercial reactor cores. It adopts rigorous methodology for its validation, including comparison with the reference calculation using the same nuclear data libraries, and with the experimental measurements [40]. Measurements of spent fuel isotopic content were made and compared with the results calculated using the APOLLO2 code [42]. The comparison shows that the measurementmodel deviations are less than 5%. Therefore, the uncertainty of the calculated fission fraction is conservatively estimated to be 5% for each isotope.
The NPPs also provide 3D core simulation results for different burn-up stages, which enable an investigation of the spatial distribution of the antineutrino production inside the core. The reactor can be considered as a point source ofν e for the Daya Bay experiment because the fuel elements are symmetrically arranged in the reactor core as shown in Fig 3. The relative difference between treating the reactor as a point source and as a finite source is negligible and the variation of the effective fission center in the reactor is estimated to be 2 cm horizontally. The impact on the baselines of the vertical variation of the fission center is negligible. Combined with the 18 mm uncertainty in the baseline measurements, the total uncertainty of the baselines is conservatively estimated to be 27 mm.
The open source simulation code DRAGON [43]was also used to calculate the fission fractions, and to estimate their uncertainty. The impact of many reactor parameters was taken into account, including power, neutron flux, fuel composition, type and position of fuel elements, and boron content. DRAGON was originally developed for CANDU (CANada Deuterium Uranium) reactors, but also yields reliable predictions for PWRs [44,45]. The fission fraction uncertainty of each isotope was found to be less than 5%, consistent with the results of APOLLO2 validation. The fission fractions of four isotopes are correlated with each other because 239 Pu and 241 Pu are gradually produced while 235 U is continuously consumed and the sum of the fission fractions is normalized to be 100%. DRAGON was used to calculate correlations among fission fractions using the fission fraction data from several cycles of the NPPs. The results are given in Table 2. The correlations were used as an input when propagating the fission fraction uncertainties to the reactor antineutrino flux uncertainty.

Expected Unoscillated Spectrum
Electron antineutrinos are generated in the reactors from the beta decays of the fission fragments produced by the four isotopes. Each fission isotope produces a uniqueν e spectrum through its fission and subsequent decay chains. In principle, using cumulative fission yields and beta decay information for each fission production, it is possible to compute the antineutrino spectrum ab initio. However, this requires reliable beta decay information on more than 1000 isotopes [46], many of which have never been observed. The lack of decay information combined with nuclear structure-related uncertainties and the uncertainties of the fission yields, results in an overall 10-20% energy dependent uncertainty in the predicted antineutrino spectrum.
To improve on the purely ab initio method described above, several direct measurements were done at ILL [19][20][21] in the 1980s to determine the electron energy spectra from the individual fission isotopes 235 U, 239 Pu, and 241 Pu. In these measurements, foils of isotope samples were placed inside the reactor and exposed to thermal neutron fluxes for 1-2 days. A high-precision electron spectrometer measured the electrons emitted by the samples. The observed electron spectrum was then converted into an antineutrino spectrum by fitting with a set of hypothetical β-decay branches and adding up the antineutrino spectrum from each fitted branch. The uncertainty of the antineutrino spectrum by this conversion process was estimated to be 2.7%. These experiments did not perform similar measurements for 238 U, which only fissions with fast neutrons. Theoretical antineutrino flux calculations for 238 U were carried out by Vogel [22], with overall uncertainties < 10%. Since 238 U only contributes to ∼8% of the total reactor antineutrino flux, the error introduced to the total flux is less than 1%. These calculations of antineutrino spectra are referred to as the ILL+Vogel model.
The prediction of antineutrino spectra from 235 U, 239 Pu, and 241 Pu was recently improved [24,25], where the ILL electron spectra were reanalyzed by taking into account several higher-order corrections to the β-decay spectra. The ab initio calculation of the 238 U antineutrino spectrum was updated by Mueller et al. [24]. These new calculations are referred to as the Huber+Mueller model. The claimed uncertainty of the predicted total flux from the Huber+Mueller model is 2.4%. Both the ILL+Vogel model and the Huber+Mueller model are used to calculate the expected antineutrino spectrum from a single reactor core. A measurement of the 238 U beta spectrum was performed and the corresponding antineutrino spectrum was determined in Ref. [47]. Replacing the Mueller 238 U antineutrino spectrum with this measurement only changes the total integrated flux by 0.2% since 238 U only contributes 8% of the total integrated flux.
The total antineutrino spectrum is calculated once the time evolution of reactor power and fission fractions are provided by the Daya Bay NPP, where i is the index of individual fission isotope in the reactor fuel, that is 235 U, 238 U, 239 Pu, or 241 Pu. dφ i (E ν )/dE ν is the antineutrino spectrum of the i-th isotope per fission, and F i is the total fission rate of the i-th isotope. The total fission rate is directly related to the total thermal power of the reactor core, and can be calculated as follows: where W th is the total thermal power of the reactor core, e i is the energy released per fission of the i-th isotope, and f i is the fission fraction of the i-th isotope. The term i f i ·e i represents the average energy released per fission from the four isotopes. The energy released per fission (e i ) is defined as the amount of energy from a fission event that transforms into heat over a finite time interval [48], which has a slight dependence on the reactor burning history. They were calculated by considering the neutron captures in the reactor and decays of long-lived fission daughters, using typical PWR reactor parameters [48]. The improved calculation of the energy released per fission [49] used in this analysis includes using updated nuclear databases, considering the production yields of fission fragments from both thermal and fast incident neutrons, and an updated calculation of the average energy taken away by antineutrinos. This new calculation gives slightly larger values of e i with smaller uncertainties than in [48], resulting in a 0.32% decrease of the calculated antineutrino flux. The values of e i and their uncertainties are listed in Table 3. In the Daya Bay experiment, the electron antineutrinos are detected via the inverse beta decay (IBD) reaction:ν e +p → e + +n. The expected antineutrino spectrum weighted by the IBD cross section in the detector d from reactor r is calculated by where L dr is the distance from reactor r to detector d, d is the IBD selection efficiency, N d p is the number of target protons, and σ(E ν ) is the inverse beta decay cross section calculated using the formalism in [50], with the updated neutron lifetime of 880.3±1.1 s taken from PDG 2014 [2]. The uncertainty of the cross section is dominated by the uncertainty of neutron lifetime. The total reactor antineutrino spectra for a detector d is the sum of antineutrino spectra from all reactors: As an example, the expected total antineutrino spectrum at the near site ADs is shown in Fig. 5.

Non-equilibrium Effect and Spent Nuclear Fuel Correction
In the ILL measurements, fissile samples were exposed to the thermal neutron flux for only 1-2 days. The rate of beta decays from some long-lived fission fragments did not reach equilibrium with their production rates. When using converted antineutrino spectra from the ILL measurements, this non-equilibrium effect needs to be corrected, since the long-lived fission fragments accumulate in the reactor core and their beta decays contribute to the total antineutrino flux.
After burning in the core, the nuclear fuel is removed from the reactor and stored as spent nuclear fuel (SNF) in a cooling pool near the reactor core. The long-lived isotopes in the SNF will decay and act as another source of antineutrinos.
The total neutrino spectrum is then modified: where S ILL is the expected antineutrino spectrum with ILL measurement-based models, S neq is the contribution from the non-equilibrium effect and S SN F is the contribution from the spent fuel. The non-equilibrium correction is a function of antineutrino energy, the burn-up and irradiation history of nuclear fuel [24]. Taking into account the information of the refueling history of reactors provided by the China General Nuclear Power Corporation, the cumulative contribution of the non-equilibrium effect at Daya Bay and Ling Ao reactors was calculated. On average, the effect contributed ∼0.6% additional IBD events, which is illustrated in Fig. 6. The uncertainty of the non-equilibrium effect is taken to be 30% from the estimation in Ref. [24].
The contribution of SNF can be evaluated by using the cumulative yields and spectra of the known long-lived fission fragments. The candidate isotopes were selected from the fission products with the condition that they have a half-life longer than 10 hours and either the isotope or its daughter nuclei undergoes beta decay with end point energy larger than the IBD reaction threshold (1.8 MeV). The antineutrino spectra of these candidate isotopes were calculated based on their beta decay process. The cumulative yields of the SNF were calculated with the input from the refueling history and SNF inventory information provided by the China General Nuclear Power Corporation. The calculated SNF antineutrino spectrum is illustrated in Fig. 6. The contribution to the total number of IBD events is ∼ 0.3%, which is consistent with previous calculations [51,52]. The uncertainty is conservatively estimated to be 100% after the investigation on the uncertainty of the SNF inventory history information. We neglect an additional low energy correction [53] which has a smaller effect than SNF.  . The ratio of calculated antineutrino spectrum the non-equilibrium effect (red) and spent nuclear fuel (blue) to that from the four fissile isotopes in reactor core. The drop at 3 MeV is due to the end point energy of 144 Pr beta decay, which contributes the most with its mother nuclide 144 Ce to SNF antineutrinos.

Systematic Uncertainties of the Predicted Reactor Antineutrino Spectrum
The systematic uncertainties of the predicted reactor antineutrino spectrum can be categorized as either correlated or uncorrelated among different reactor cores. The list of systematic uncertainties, and their values for the integrated reactor antineutrino flux, are shown in Table 4. The combined correlated uncertainty is taken to be 2.7% from the ILL+Vogel model (or 2.4% from the Huber+Mueller model). The correlated uncertainties are common for all reactor cores, therefore they are irrelevant in the neutrino oscillation analysis where only the relative rate and spectrum between the near and the far detectors are compared. The combined uncorrelated uncertainty is 0.9%, as a square root of the quadratic sum of the uncorrelated items, including power, energy/fission, fission fraction, spent fuel, and non-equilibrium in Table 4.  Some uncertainties are dependent on antineutrino energy, and can induce fluctuations in the energy spectrum, while the others only impact the integrated antineutrino flux. The contribution from each energy-dependent component is broken down and shown in Fig. 7. The energydependent uncertainties can be further categorized as correlated or uncorrelated between energy bins. The isotope antineutrino spectra of 235 U, 239 Pu, and 241 Pu are converted from the respectively measured beta decay spectra. The uncertainties of these spectra have both bin-to-bin correlated and uncorrelated components. The bin-to-bin correlated uncertainty is induced by the electron to antineutrino spectrum conversion models. The bin-to-bin uncorrelated uncertainty is induced by the statistical uncertainty of the measured beta decay spectra. The antineutrino spectrum of 238 U is based on theoretical calculation, and its uncertainty is bin-to-bin correlated.
The size of the total uncertainty is shown as the error bars on the predicted antineutrino spectrum in Fig. 5.

Inverse Beta Decay Event Selection
After production in the six Daya Bay reactor cores as described above,ν e are detected in identically designed Daya Bay antineutrino detectors (ADs). Each AD consists of three nested cylindrical vessels. The inner acrylic vessel (IAV) with a thickness of 11 mm is filled with 0.1% gadolinium-doped liquid scintillator (GdLS), which constitutes the primary antineutrino target. The outer acrylic vessel surrounding the target is filled with undoped LS, increasing the efficiency of detecting gamma rays produced in the target. The outermost stainless steel tank is filled with mineral oil. A total of 192 8-inch photomultiplier tubes (PMTs) are radially positioned in the mineral-oil region of each AD. Specular reflectors are deployed directly above and below the outer acrylic vessel. Three automated calibration units (ACUs) capable of deploying radioactive sources into the AD along three vertical z-axes are located on the top of each AD's outer tank [54]. At each site, ADs are submerged in two-zone water Cherenkov muon detection systems, composed of inner and outer water shields (IWS and OWS), in three experimental halls, as shown in Fig. 1. A more detailed description of all detector systems can be found in [33,55].
For the first seven months of Daya Bay data-taking from December 2011 until July 2012, six ADs were deployed and utilized for data analysis, two at the Daya Bay near site, one at the Ling Ao near site, and three at the Far Site. For the additional 13 months of the data to be used in this publication, from October 2012 to November 2013, the full eight-AD detector deployment was utilized, with two ADs at each near site and four ADs at the Far Site. During a special calibration period in Summer 2012, one ACU was temporarily removed to facilitate deployment of a Manual Calibration System, which was capable of deploying an articulating acrylic arm down the AD's center axis, allowing for fullvolume calibration of the GdLS volume at a variety of vertical Z-positions and radial R-positions with a PuC neutron/gamma source.
A series of cuts are applied to the data to select high purity time-coincident trigger pairs in the AD that match the characteristics of IBD signals: a prompt energy deposition from ionization and annihilation of the IBD positron, followed by an energy deposition from Gd-capture of the IBD neutron 30 µs later on average. The selection process and various cuts have been described in detail in a previous Daya Bay publication [41], and have remained unchanged for this analysis. We briefly list the sequence of IBD selection cuts below.
• Flasher Cut: Spurious single triggers caused by PMT light emission are efficiently removed using light collection topology cuts described in [41].
• Capture Time Cut: Candidate trigger pairs are selected by requiring time-coincident triggers be separated by 1-200 µs.
• Prompt Energy Cut: The prompt trigger in the time-coincident pair must have an energy of 0.7-12 MeV.
• Delayed Energy Cut: The delayed trigger in the time-coincident pair must have an energy of 6-12 MeV.
• Muon Veto Cut: Candidate pairs are rejected if their delayed signals occur (i) within a (-2 µs, 600 µs) time window with respect to a water shield muon trigger with a PMT multiplicity >12 either in the inner or outer water shield, or (ii) within a (0, 1000 µs) time window with respect to triggers in the same AD with an energy ranging from 20 MeV to 2.5 GeV, or (iii) within a (0, 1 s) time window with respect to triggers in the same AD with an energy above 2.5 GeV.
• Multiplicity Cut: To remove ambiguities in the IBD pair selection when multiple triggers are in timecoincidence, candidate pairs are removed if there is an additional candidate with E > 0.7 MeV in the interval 200 µs before the prompt-like signal, 200 µs after the delay-like signal, or between the prompt-like and delayed-like signals.
Total IBD candidate event rates after applying these cuts are listed in Table 5. Due to the near-identical response of the Daya Bay ADs, the efficiencies of most IBD selection cuts are the same for all detectors. Muon veto efficiency ( µ ) and multiplicity cut efficiencies ( m ) are dependent on muon fluxes and intrinsic background levels, which vary among different sites and ADs.
Backgrounds from accidental coincidences, fast neutrons, cosmogenic 8 He/ 9 Li production, AD-intrinsic alpha radioactivity, and AmC neutron calibration sources remain in the sample of IBD candidates and have been estimated using a variety of techniques described in detail in previous publications [32,41]. Background rate estimates remain unchanged for this analysis. Table 5. Summary of signal and backgrounds. Rates are corrected for the muon veto and multiplicity cut efficiencies εµ · εm. Rate differences between detectors at the same site result from differences in fluxes between detector locations.

Event Selection Efficiencies
In order to estimate the total number of inverse beta decay interactions in each AD, the efficiencies of all signal selection cuts must be estimated. All cut efficiencies have been estimated in previous Daya Bay publications [32,41]. Many of these efficiencies remain unchanged in this analysis, and are only briefly described here. A few key efficiencies common to all detectors have been recalculated with respect to those reported in [41] utilizing new comparisons between data and Monte Carlo (MC) simulation. The improved data-constrained detection efficiencies and systematics will be described below in detail. The recalculation and application of these key efficiencies and systematics result in a robust measurement of the overall reactor ν e flux from Daya Bay. Since these key systematics for detector efficiencies are largely correlated among all Daya Bay detectors, this reanalysis does not affect the previous measurement of oscillation parameters reported by Daya Bay.
To produce improved efficiency determinations, a variety of new MC samples were generated utilizing an updated version of Daya Bay's simulation framework NuWa, which is based on the Geant4 simulation package [56] and the Gaudi framework [57]. A few key MC improvements with respect to the version utilized to produce previous efficiency estimates in [41] are briefly highlighted. Models used to generate the spectrum of gammas released by neutron capture on Gd were altered based on new Daya Bay and bench-top datasets. These alterations, which affect the efficiency in detecting neutron captures on Gd, will be described in further detail below. Adjustment was also made to the model describing the thermalization and scattering of neutrons at all energies. This adjustment will also be described in further detail below, as it has a small impact on the capture time cut and on the position distribution of IBD events.

Flasher Cut Efficiency
Spontaneous light emission from the Daya Bay PMT bases can mimic particle interactions of various energies. Flasher triggers can be rejected using charge topology cuts, as described in detail in [41]. The IBD signal efficiency of these cuts is estimated to be 99.98%.

Capture Time Cut Efficiency
To be selected as an IBD signal the time separation between the trigger pair must be within a (1µs, 200µs) range. As described in [41], the vast majority of signal events meet this criterion, with 98.70% passing this cut in the most recent Daya Bay MC simulations. An uncertainty of 0.12% is assigned to this cut by noting small differences in trigger coincidence time distributions between AmC, AmBe, and PuC fast neutron source deployments and MC.

Muon Veto Cut Efficiency
Cuts are applied to reject coincident triggers correlated in time with muons traversing the water pools or ADs. The characteristics and performance of these cuts are described in [41]. Total signal efficiencies for these cuts depend on the muon flux at each site and are around 82%, 86%, and 98% at EH1, EH2, and EH3, respectively, as shown in Table 5. Muon veto cut efficiencies are calculated based on the actual number of muon vetos enforced in the dataset, and thus have negligible uncertainties.

Multiplicity Cut Efficiency
Some trigger coincidences containing more than two triggers are also rejected to avoid ambiguities in identifying the true IBD prompt-delayed pair. The definitions of these multiplicity cuts are described in [41], and have an efficiency of 97.5% for all ADs, within 0.1%. Multiplicity cut efficiencies are calculated on-the-fly and have negligible associated uncertainty.

Prompt Energy Cut Efficiency
While the 0.7 MeV prompt energy cut is significantly below the 1 MeV annihilation gamma energy, a small proportion of events (0.2%) deposit most of their energy in the non-scintillating inner acrylic vessel and fall below the threshold. The efficiency is 99.81% ± 0.10% determined by the most recent Daya Bay MC data which is consistent with that cited in [41].

Delayed Energy Cut: Gd Capture Fraction
Inefficiency in detection of neutrons from IBD interactions in the target GdLS region is the result of three primary physical processes: • Capture on hydrogen in the target, producing a single 2.2 MeV gamma well below the applied 6 MeV threshold.
• Capture on hydrogen outside the target where no Gd is present, producing the same 2.2 MeV gamma (spill-out effect).
• Deposition of significant neutron-Gd (nGd) capture gamma energy outside the scintillating detector region, producing a detected delayed energy below the applied 6 MeV threshold.
We choose to describe and quantify each of these contributions to the delayed energy cut efficiency separately in this analysis to produce robust and transparent efficiency and uncertainty estimates fully constrained by data. We begin by describing our estimates of inefficiency from the first two of these processes, collectively described as the Gd capture fraction. 4.6.1 Gd Concentration and AD-Center Gd Capture Fraction The keV-range kinetic energy neutrons created in IBD interactions in the GdLS thermalize in the detector and capture principally on either H or Gd nuclei. Because of their low capture energy (2.2 MeV), neutronhydrogen (nH) captures are completely excluded from the IBD signal by the 6 MeV cut used in this analysis. Determining the Gd capture fraction is vital in determining the predicted reactor antineutrino flux. This Gd capture fraction is physically determined by the Gd concentration in the GdLS, which is ∼0.1% by weight. The Gd capture fraction resulting from the Gd concentration can be measured largely independently of spill-out effects by looking at AD-center Gd capture events from various non-IBD (i.e., from calibration)datasets.
The AD-center Gd capture fraction was first measured utilizing muon spallation neutrons. This dataset was obtained by selecting all AD non-flasher triggers within a time window of 20-300 µs after traversal of the AD by a muon, which is identified by an AD trigger with more than 3000 photoelectrons (∼20 MeV). Triggers from events other than neutron captures are then removed from the sample by subtracting a similar dataset occurring 520-800 µs after a muon traversal. AD center events are then selected by removing all events having reconstructed positions R > 0.8 m or |Z| > 0.8 m, where the position reconstruction follows the second method described in [55]. The background-subtracted spallation neutron capture spectrum for all four near ADs is shown in Fig. 8, along with the background spectrum. The Gd capture fraction for this dataset can be calculated using the following definition: . (8) Low-energy cuts ∼30% below the nH and nGd peak values were chosen to exclude roughly similar proportions of the nH and nGd low-energy tails from the calculation of this metric. By this definition, the Gd capture fraction for a specific dataset is determined independently of any MC inputs. For this dataset, we obtain a Gd capture fraction of 85.4%, as seen in Table 6, with a statistical uncertainty of <0.1%. We estimate the systematic uncertainty in this ratio by looking for the variation in F Gd with variation in the selection parameters. To probe possible uncertainties arising from unequal inclusion of nGd and nH low-energy tails, F Gd low-energy cut values were independently adjusted to values between 10% and 50% below each peak's energy. For all variations, F Gd was found to be consistent within 0.2%. When signal and background subtraction time windows are altered in absolute length (180 to 280 µs), relative length (few-µs difference in signal and background window length), or in start time (from 20µs to 40µs for signal, for example), F Gd is altered by < 0.1%. As AD-center position cuts are varied from the nominal 0.8 m to either 0.5 m or 1.0 m, F Gd is altered by 0.3%. We also note that fractional contributions of target spallation neutron capture on other isotopes, such as carbon, are below 0.1%, negligible in the scope of the efficiency analysis. Adding the uncertainties quadratically, we obtain a Gd capture fraction of 85.4% ± 0.4% from spallation neutrons.
The Gd capture fraction has also been measured by deploying AmC, AmBe, and PuC neutron calibration sources at the centers of the two ADs at the Daya Bay Near Site (EH1) during a period of special calibration runs coincident with installation of the final two Daya Bay detectors at the other experimental halls [54,58]. These neutron sources produce time-correlated triggers, with proton recoils and excitation gammas forming the prompt signal, and the subsequent neutron capture forming the delayed signal. The neutron kinetic energy ranges and excitation gamma energies for various prompt en-ergy ranges for these sources are listed in Table 6. Some excited states with low neutron energies closer to that of ∼keV-scale IBD neutrons, such as the second excited state of 16 O produced by the PuC (α,n) reaction, are easily separable from other calibration source decays exhibiting higher neutron kinetic energies. This is because minimally quenched de-excitation gammas from these excited states produce a much higher prompt energy than the highly-quenched prompt proton recoils generated by energetic neutrons produced in the ground state. Meanwhile, other excited states produce either a variety of neutron kinetic energies (AmBe), or have prompt energies indistinguishable from the ground state. Daya Bay's standard gamma-less AmC sources produce no transitions to excited states, since alphas in these sources are moderated with thin gold foils [59]. For all sources, removal of uncorrelated triggers was accomplished by subtracting a set of accidental coincidences formed by randomly ordering in time that calibration run's single triggers according to the calculated singles rate for that run. As the sources were deployed at the detector center, cuts on reconstructed position were not utilized.

Data Set
For the calibration source data, an alternate procedure utilizing MC input for determining the Gd capture fraction was used to cross-check the spallation results utilizing the F Gd metric. First, the total fraction of background-subtracted time-coincident triggers passing the 6 MeV delayed energy cut was determined: Delayed energies below 1.7 MeV are excluded from the dataset as the statistical uncertainties from the accidental background subtraction for some datasets in this region are too high. Next, MC simulations includ-ing sources and the radioactive sources, source enclosures, deployment weights and suspension lines analogous to the actual AD-center source deployments were used to calculate the total number of coincidences in the <1.7 MeV delayed energy region (∼0.3%), as well as the number of nGd captures below 6 MeV reconstructed energy (∼1.5%). These numbers were used to correct F Gd,all to provide a semi-independent measure of the total ratio of nGd to other capture types, similar to F Gd . This method of estimating the Gd capture fraction avoids the uncertainty from defining nH and nGd energy windows, but has added uncertainty because of 0.1%level disagreements in low-energy contributions between 010201-13 the source deployment in data and MC. Resultant F Gd values from extended runs of these three sources are shown in Table 6, with delayed spectra from each source shown in Figure 9. While neutron capture tail shapes from the different sources deviate slightly from one another, likely due to differing source packaging material and optical properties, values of F Gd from all sources agree to within 0.3%. These source F Gd values are also consistent within 0.4% between data and MC for all source types and neutron energy ranges. Similar variations of energy and timing cuts applied to the spallation neutron dataset above produce <0.1% changes in F Gd values. These differences provide a conservative estimate of systematic uncertainty on the Gd capture fraction similar to that reported from spallation neutrons. Background-subtracted calibration neutron capture spectra from three different neutron sources deployed in the EH1 detector centers. The AmC data is binned more coarsely to reduce error bar sizes in the low-statistics tails.
After the completion of these studies, the Gd capture fraction for AD-center inverse beta decay interactions was studied in the MC simulation and were determined to be 85.5%. This agreement with a wide variety of studies indicates the initial values of Gd concentrations of the Daya Bay scintillator were properly measured and implemented in simulation. 4.6.2 Spill-out Effects and Full-Volume Gd Capture Fraction The previous section concerned itself with finding the Gd capture fraction at the detector center and matching this value between data and MC. In order to determine the Gd capture fraction for the entire target volume, which is the relevant number for the total detection efficiency, one must take into account the proportion of IBD neutrons created in the GdLS that escape the target and capture outside the GdLS, where all captures are non-Gd. This process, termed as the "spill-out" ef-fect, is naturally dependent on the proximity of the IBD interaction point to the boundary of the GdLS volume.
The MC is used to provide the full-volume Gdcapture fraction for the detection efficiency analysis. This value is calculated with MC to be 84.17%. The accuracy and systematic uncertainty of this total Gd capture fraction were then estimated by comparing total H/Gd ratios for existing non-IBD datasets between data and MC. We note that since the sizes of the nH and nGd low-energy tails in the neutron energy spectra increase with increasing R and |Z| due to increased gamma energy leakage, it is difficult to fully disentangle Gd detection inefficiencies from spill-out effects. As the position dependences of the nH and nGd tails are correlated, the previously-defined metric F Gd in Eq. 8 is relatively insensitive to these gamma energy leakage effects. For this reason, we utilize the F Gd metric described above for a data-MC comparison of full-volume Gd capture fractions. An comparison of PuC calibration source data with MC data using an alternate metric is described in Sec. 4.8 as a cross-check.
The Daya Bay MCS [58] deployed a PuC neutron source on an articulating arm at a wide variety of positions throughout the GdLS volume with an accuracy of 2.5 cm in r and 1.2 cm in z; this dataset can be used to calculate a full-volume Gd capture fraction. Due to the geometry of the source and articulating arm, deployments were limited to source positions with -1.45 m < Z < 1.25 m and R < 1.35 m. PuC deployments at similar positions were then simulated, including the attendant MCS articulating arm infrastructure. For each source placement position, F Gd was calculated in data and MC utilizing a process identical to that described in the previous section, except that backgrounds were subtracted utilizing an off-window method as was done in the previously described spallation neutron study. This was necessary to remove coincidences formed by closelyspaced neutrons from the intense (∼ 1kHz) PuC source. Figure 10 demonstrates the change in PuC neutron F Gd separately as a function of R and Z for data and MC. One can see good agreement at most positions. By fitting the distributions in R along the detector's Z-center and in Z along the detector axis, one can integrate over the full target volume to obtain a full-volume Gd capture fraction. A variety of fit methods are utilized to account for the lack of data near the GdLS top and bottom (|Z|∼1.5 m). This process yields a full-volume Gd capture fraction of 84.1% for data and 83.5% for MC. The Gd capture fraction for the PuC source in the MCS differs from that of IBDs due to the higher kinetic energy of neutrons and the MCS deployment arm; hence, we use the MCS data to benchmark the full-volume Gd captures fractions between data and MC. Therefore, these results should be utilized not as an indicator of the true IBD Gd capture fraction, but as a benchmark of the agreement between full-volume Gd capture fractions between MC and data. An additional check on this analysis using interpolated values between all available PuC MCS deployment positions yields differences of up to 0.7% between data and MC for all PuC neutron kinetic energies.
After performing these benchmark comparisons between MC and data, the precision of the MC-reported full-volume Gd capture fraction is estimated as the maximum difference between these reported MC and data values above, 0.7%. Adding quadratically the approximate 0.4% uncertainty in the AD-center Gd capture fraction, we obtain a predicted Gd capture fraction of 84.17 ± 0.80%. The difference from early Daya Bay publications (83.8 ± 0.8%) is caused by the improved Geant4 neutron thermalization models on which this analysis is based, which produce a lower rate of IBD neutron spillout.

Delayed Energy Cut: Gd Capture Detection Efficiency
Of the 84.17% of target IBD neutrons capturing on Gd, a small percentage will have delayed reconstructed energy below the 6 MeV delayed energy cut. This inefficiency arises as a portion of gammas from some Gd captures exit the scintillating region of the detector before depositing their energy. In order to properly estimate the predicted reactor antineutrino flux, this Gd capture detection efficiency must be properly estimated. As with the full-volume Gd capture fraction, the Gd capture detection efficiency is determined using MC, since the full tail of the IBD delayed energy signal is obscured in data by nH captures and accidental backgrounds.
The shape of the Gd capture tail, and therefore the Gd capture detection efficiency, is dependent on the model used to describe the gamma energies released by a nGd capture. The excited states of 158 Gd and 156 Gd, the products of neutron capture on 157 Gd and 155 Gd, are numerous, making a first-principles determination and modelling of de-excitation pathways impractical. Instead, the Daya Bay MC produces nGd capture gammas by performing an energy-conserving sampling of previously-measured Gd-capture gamma spectra. The algorithm that performs this sampling is tuned to ensure that the energy conservation requirement does not bias aggregate sampled gamma spectra relative to the input spectrum. In previous publications [7,41], Daya Bay utilized nGd gamma spectrum models based on early spectroscopic measurements [60], shown in top panel of Fig. 11, which do not sufficiently reproduce the IBD extended nGd tail shapes now visible in Daya Bay's highstatistics datasets, pictured in middle panel of Fig. 11. This gamma model is referred to in this paper as the "M13A,Old" model We investigated additional nGd gamma models to obtain a better description of the data. It was found that nGd gamma spectra included in Geant4 libraries [56], shown in bottom panel of Fig. 11, produced reasonable agreement with observed data once energy conservation in gamma emission, not present in Geant4 by default, was implemented. This model is referred to as the "M14A,Geant" model in this paper. Another wellmatching model, called "M14A,Caltech", was generated through direct measurement of nGd gamma production in a small cell of Daya Bay GdLS using a benchtop HPGe detector setup at Caltech. In both new "M14A,Geant" and "M14A,Caltech" models, the total contribution of high-energy gammas is lower than in early spectroscopic measurements. Figure 11 shows the combined IBD nGd capture spectra from all Daya Bay detectors and from the various tested MC models. The nGd tail is clearly visible with high statistical precision above 3.0 MeV, and provides a 010201-15 direct constraint on the delayed energy cut inefficiency above this energy. The two MC models provide a bounding envelope around the observed spectrum when approaching the low-energy region where the nH peak obscures the true nGd tail shape. The delayed energy cut inefficiency from this low energy region is estimated by the relative contribution from these new MC nGd capture models. The data-constrained portion of the tail from 3-6 MeV provides a 6.6% inefficiency, while the low-energy MC-constrained portion below 3.0 MeV contributes 0.4% and 0.9% for the different models. The total estimated nGd detection efficiency using the "M14A,Geant" model is 92.71%. A conservative 100% uncertainty is assigned to the total contribution below 3 MeV due to the lack of direct data constraints. The 0.5% difference in the low-energy contribution from the data-enveloping MC models provides good motivation for this choice. Further uncertainty contributions from statistical and other systematics, such as the MCdata difference in energy scale near the GdLS-LS boundaries, are negligible in comparison.
This Gd capture detection efficiency estimate, 92.71%, differs from previous estimates, 90.9%, in [7] by 1.8%, a ∼3 σ change with respect to previous systematic uncertainty estimates. As previously mentioned, this difference stems from improved modelling of the nGd gamma spectrum in the updated Daya Bay MC simulations. Due to the limited available statistics, previous uncertainty estimates were made using comparisons between the previous MC model and data only in a narrow higher-energy window (6-7 MeV) bordering the nGd tail region. In contrast, the updated efficiency estimate is directly constrained by data with <0.1% statistical uncertainty for the bulk of the nGd tail, with 100% uncertainty assumed in regions where no direct data constraint exists. This results in a robust and conservative estimate of the delayed energy cut efficiency. Nevertheless, the change of the Gd capture detection efficiency does not affect the measurement of the oscillation parameters reported by Daya Bay using relative comparison between Near and Far detectors.

Combined Delayed Energy Cut Efficiency
Cross-Check In addition to estimating them separately as we have done above, we can cross-check the accuracy of the MC in modelling combined effects of the Gd capture fraction and nGd detection efficiency by comparing the previously-defined F Gd,all metric between data and MC for a representative non-IBD dataset. This F Gd,all metric effectively achieves both of these efficiencies above the applied 1.7 MeV analysis threshold. The MCS data set was re-analyzed to determine F Gd,all for points spaced evenly throughout the target volume. The positionweighted average value for all data points using the fullvolume fits described above were 80.3%, compared to a MC value of 79.5%. This difference is well within the uncertainties of 0.8% and 0.9% defined for the full-volume Gd capture fraction and nGd detection efficiency, providing further confidence that MC modelling of these two sub-efficiencies is accurate within the estimated systematics.

Spill-in Effects
When calculating the total number of expected Gd capture detections, one must take into account IBD neutrons generated outside the GdLS that are captured in the GdLS. This process, termed as the spill-in effect, effectively increases the size of the target volume. As with the spill-out effect, the size of the spill-in effect and the net increase in effective target volume is calculated using MC simulation of IBD neutrons in the detector. The calculated value of the effective target size in the default Daya Bay MC due to spill-in is 104.9%.
The spill-in correction obtained by the MC is dependent on the choice of neutron scattering models. Daya Bay's default MC for neutron scattering includes inelastic scattering of thermal neutrons below 4 eV where molecular effects due to hydrogen bonds and their energy transfer with neutrons must be considered. This is in contrast to "free-gas" models of neutron scattering which forego this detailed modelling at low energies. In its G4NDL3.13 physics library, Geant4 has inherited several neutron thermal scattering models and parameters from the Evaluated Nuclear Data Files (ENDF/B-VI) database [61] for a variety of moderators such as water and polyethylene. As database entries are unavailable for the primary Daya Bay target materials GdLS, acrylic, and mineral oil, ENDF models for water and polyethylene were used to describe each of these target materials in the MC. Variations between these different models individually for each Daya Bay target material produced <1.0% changes in MC-reported total IBD rates. A freegas neutron scattering model produced effective target masses roughly 2% larger than the default MC. Due to a variety of mismatches between data and the free-gas model MC to be described below, the free-gas model was ruled out as a viable description of the physics in the detector and not considered when calculating systematics envelopes.
A wide variety of data-MC comparisons of calibration and IBD data have been implemented to determine the uncertainty in this MC-produced spill-in estimate. Spillin estimates from the default MC can be directly benchmarked to data by comparing extended deployments of a combined AmC/Ge source at a single position at the detector Z-center in the LS volume 22 cm radially outward from the GdLS edge. Given the low spill-in rate from this position, a main background in this calibration dataset is IBDs from reactor antineutrinos, which were reduced by choosing a low prompt energy window (0.9-1.3 MeV) and applying a cut on the reconstructed distance from the source (<1 m), with the remaining IBDs statistically subtracted utilizing time-adjacent non-calibration runs. The ratio of nGd to nH captures in this calibration dataset was found to be 4.1%, in agreement with the default MC within 1.0%, even after including wide vari-ations in background subtraction methods, energy cut windows, and distance cuts.
While this calibration-based result appears quite robust, the difference in kinetic energy between AmC source neutrons and keV-scale IBD neutrons necessitates further studies to reliably estimate spill-in effects for IBD interactions. In the absence of a low-energy neutron source, indirect determination of MC spill-in accuracy was also accomplished by comparing spill-in-correlated IBD time and position observables between data and MC.
The comparison between data and MC reconstructed prompt position distributions is shown in Fig. 12. Care has been taken in this comparison to correct for cmlevel relative differences in reconstructed IBD positions between MC and data for the MCS measurement, which can also cause event rate differences at large R p,rec . With these biases corrected, the magnitude of the MC-data differences near the detector boundary ( R 2 p,rec > 1.5 m 2 ) sum to ∼0.5% of the total number of IBD events shown in bottom panel of Fig. 12. With variation of neutron scattering models, the sums of MC-data differences for all cases are < 0.6%, which translates to a 1.0% difference of spill-in. The IBD coincidence time distributions have been compared between data and MC. Since spill-in events originate in the LS, they tend to have longer coincidence times and contribute heavily to the tail of the IBD coincidence time distribution. This relation was determined with a MC IBD event dataset by calculating both the spill-in fraction and the fraction of signal events with greater than 50 µs coincidence time for subsets of events in common reconstructed position bins, and the result is shown in top of Fig. 13. The coincidence time peak-totail ratios in each R p bin were computed for both data and MC with a requirement of > 3.5 MeV on the prompt event energy, shown in bottom of Fig. 13. The difference between data and MC at the boundary of GdLS region reflects the spill-in difference. According to the relation between capture time distortion (peak-to-tail ratio) and the fraction of spill-in event, the spill-in fraction is evaluated for the data. The relative contribution of spillin events was found to agree between data and MC to within 0.5% of the total event sample for a wide variety of systematic variations including coincidence time tail definitions, and assumed position reconstruction biases. Top: Points show the relation between true spill-in percentage and coincidence time peak-to-tail ratio in Monte Carlo simulation for event groupings at common Rp,rec. The peakto-tail ratio compares IBD events in the (1,50) and (50,200) µs capture time regions. A curve is fitted to infer the spill-in fraction for the IBD candidate dataset. Bottom: The coincidence time peak-to-tail ratio in different bins along the radius. The spill-in percentage in each bin of data is predicted with the relation of spill-in and peakto-tail ratio obtain from MC.

Spill-in
To provide a conservative estimate of the inaccuracy of the IBD spill-in percentage reported by MC, the maximum MC-data difference observed in any of these studies, 1.0%, is used as the uncertainty in the spill-in contribution to the efficiency estimate.

Target protons
The uncertainty in the number of target protons is included with the detection efficiency uncertainties as the IBD rate is proportional to the number of target protons. The number of target protons in the GdLS is calculated as where M is the mass of GdLS in the target, F H is the mass fraction of hydrogen in GdLS, N A is the Avogadro constant, I1 H is the isotope abundance of 1 H in natrual hydrogen, and m H is the atomic mass of hydrogen. The target mass was precisely measured during the detector filling and monitored during the data taking. The uncertainty of the target mass is 3.0 kg [33], corresponding to 0.015% of the 20 ton target mass. The hydrogen mass fraction of F H = 12.02 ± 0.11% was obtained from the combination of two sets of independent combustion measurements, one of which is tabulated in Ref. [33]. The combined fractional uncertainty in N p is 0.92%. The previous reported uncertainty in N p of 0.47% [29] was incorrect.

Efficiency Summary
Calculated detection efficiencies and their related uncertainties are listed in Table 7. The detection efficiency common to all detectors is =80.6%. Including the efficiencies that vary among detectors, as given in Table 5, total detection efficiencies range from 64.6% to 77.2%. The total systematic uncertainty of detection efficiencies is δ / =1.93%.

Measurement of Reactor Antineutrino Flux
Naively, the reactor antineutrino flux can be measured directly using the Daya Bay near-site data. However, due to the relatively large size of θ 13 , even at the near sites (360-500 m baselines) there is an approximately 1-2% deficit of the antineutrino flux caused by neutrino oscillations. Therefore, far-site data are required in order to extract the value of θ 13 independent of other experiments. In this section, we describe two methods to measure the reactor antineutrino flux from the Daya Bay experiment. In the first method, the data from all ADs are fit based on neutrino oscillation theory and a reference reactor antineutrino flux model. The value of sin 2 2θ 13 and the flux normalization R are simultaneously obtained from the fit, the latter being the measured reactor antineutrino flux. In the second method, we use the measured value of sin 2 2θ 13 and the near-site data only. The measured reactor antineutrino flux is then expressed in a model-independent way in terms of σ f (cm 2 /fission) and Y (cm 2 /day/GW th ). Finally, we combine our measurement with the past short-baseline experiments to obtain a global average value, and compare it with different model predictions.

Measurement of sin 2 2θ 13 and Flux Normalization R
The IBD event candidates are selected as described in Sec. 3. Figure 14 shows the daily averaged rates of IBD candidate events per AD in the three experimental halls as a function of time. The expected backgrounds are subtracted and the detection efficiencies are corrected in the figure. The measured IBD rates are highly correlated with the reactor operations.  Figure 15 shows the integrated rate of the detected ν e signals at each AD, divided by the no-oscillation predictions. A signal deficit of about 6% at the far hall relative to the near halls is observed, indicating the size of the oscillation driven by θ 13 . A normalization factor R was defined to scale the signal predicted by a reactor model. The value of R, together with the value of sin 2 2θ 13 , was simultaneously determined with a χ 2 constructed similarly as in Ref. [7] using only the integrated rate information, where M d is the number of measured IBD events in the dth detector with backgrounds subtracted, B d is the corresponding number of background events, T d is the number of IBD events predicted by a reactor model with neutrino oscillations, and ω d r is the fractional IBD contribution from the r-th reactor to the d-th detector determined by baselines and reactor antineutrino fluxes. σ r (0.9%) is the uncorrelated reactor uncertainty, σ d (0.2%) is the uncorrelated detection uncertainty, σ B,d is the background uncertainty listed in Ref. [32], and σ D (1.93%) is the correlated detection uncertainty, i.e. the uncertainty of detection efficiency in Table 7. Their corresponding nuisance parameters are α r , d , η d , and D , respectively.
We use the rate-only fit in this analysis in order to fix the reference reactor model to its nominal value. Thus the obtained normalization R can be directly compared with other experiments. Fixing the reactor model does not affect the oscillation result due to the relative measurement between far and near detectors. If we add the spectral information, we would need to include and inflate the model uncertainty in the fit in order not to bias the oscillation result. Consequently, even though we would obtain a more precise value of sin 2 2θ 13 , the best-fit flux of the reference reactor model would deviate from its nominal value, making the comparison with other experiments impractical.
The minimization of the rate-only χ 2 defined in Eq. 11 yields χ 2 /NDF = 5.7/6. The best-fit value of sin 2 2θ 13 = 0.085 ± 0.006 is insensitive to the choice of reactor models. The uncertainty in sin 2 2θ 13 is statistically dominated. The 0.9% reactor related uncertainty, treated as uncorrelated in the oscillation analysis in order to avoid a bias of the sin 2 2θ 13 fit, is conservatively added quadratically to the uncertainty of R, effectively treating it as correlated among reactors in the rate measurement. The best-fit result of R is 0.946 ± 0.020 (0.992 ± 0.021) when compared with the Huber+Mueller (ILL+Vogel) model. Replacing the Mueller 238 U spectrum with the measured spectrum in Ref. [47] yields an R increased slightly by 0.002. The contributions to the uncertainty in R are summarized in Table 8. The uncertainty is dominated by the detection uncertainty σ D .  Fig. 15. Ratio of the detected to expected nonoscillationνe signals at the 8 ADs located in three experimental halls as a function of the effective baseline, which is determined for each detector by equating the multicore oscillated flux to an effective oscillated flux from a single baseline. A 6% signal deficit at the far hall relative to the near halls is observed, indicating the size of the θ13driven oscillation. The oscillation survival probability at the best-fit value is given by the red curve. In addition, there is a 5% normalization deficit when compared with the Huber+Mueller model prediction. The uncertainty of the model prediction is shown as the gray band around unity. Two far hall points are displaced by 50 m for visual clarity. The best-fit oscillation curve is shown in Fig. 15. Disregarding the normalization, the measurement is consistent within the three-neutrino paradigm. On the other hand, the normalization is inconsistent with the Huber+Mueller model prediction within the model uncertainties. We will further discuss the implication in Sec. 5.3.

Measurement of IBD Yield
In this subsection, we express the measurement in two model-independent ways: the IBD yield per nuclear fission (σ f ), and the IBD yield per GW th per day (Y ).
σ f for each AD is determined by solving the following equation: where N f r is the predicted number of fissions from the rth reactor core, which is calculated based on W r (average thermal power of rth core), f iso r (average fission fraction of rth core for each isotope) and E iso (mean energy release per fission for each isotope), integrated over the live time of the detector: L dr is the distance between the dth detector and the rth reactor core. N T d is the total number of target protons in the GdLS of each AD. The total detection efficiency, D d , is different for each AD because of different effects of muon veto and multiplicity cuts on each AD. P dr sur is the survival probability given an AD-core pair, calculated using the best-fit value of sin 2 2θ 13 from the rateonly analysis described in the previous subsection. Due to the relatively large size of θ 13 , even at the near sites there are on average about 1.5% rate deficits, as shown in Fig. 15. The values of σ d f for all ADs, from Eq. 12, are summarized in Table 9. Similar to the normalization R, the uncertainty in σ d f (summarized in Table 9 as σ exp ) is dominated by the correlated detection uncertainty σ D .
Theoretically, σ f represents the IBD cross section convolved with the reactor antineutrino spectra from all fission isotopes, and integrated over energy: Given a reactor model that predicts the antineutrino spectrum S iso (E ν ) for each of the four main fission isotopes 235 U, 238 U, 239 Pu and 241 Pu, and the fission fractions f iso determined by NPP operations and simulations, σ f can be theoretically calculated and compared with the model-independent measurement. The ratios of the measurement versus the Huber+Mueller model prediction (R H+M ), and versus the ILL+Vogel model prediction (R I+V ) for each AD are summarized in Table 9. Alternatively, we can define Y d ≡ σ d f N f r /W r as the IBD yield per GW thermal power per day. The above expression approaches a common value Y after averaging multiple fuel burnup cycles, since all the reactor cores 010201-20 have the same average fuel composition. During the 6-AD data taking period, none of the reactor cores had completed a burnup cycle. The differences in fuel composition cause about 2% variations in measured IBD yield (top panel of Fig. 16). These core-to-core variations can be corrected using known values of the fission fractions given by Table 9. On the other hand, all the reactor cores had roughly one full cycle during the 6-AD and 8-AD data taking period. Therefore measurements from eight detectors give the same value (within statistical fluctuation), and core-to-core variations are negligible (bottom panel of Fig. 16). Table 9. Tabulated Table 9 further summarizes a few characteristic parameters calculated for each AD, including the average fission fraction f iso d , flux-weighted baseline L d and average survival probability P d sur . These parameters can be trivially obtained in the case of a single reactor core, but require clear definitions in the multi-core case of Daya Bay. The average fission fraction f iso d is defined as follows: where β dr is the flux-weighting factor calculated from N f r and L dr (see Eq. 12 for definition). We note that the average fission fractions for the two newly installed ADs (EH2-AD2 and EH3-AD4) are slightly different from the ADs at the same site, because they are seeing different reactor core histories with respect to other detectors. The flux-weighted baseline L d is defined as Finally, the average survival probability P d sur is calculated as follows: (17) where N dr is the predicted number of IBD events at the dth AD from the rth reactor core without oscillation, and P dr sur is the average survival probability given an AD-core pair as defined in Eq. 12.    Table 10.

Comparison with Past Reactor Experiments
Recently, there was great interest in the so-called "reactor antineutrino anomaly", which arises from reevaluations of the reactorν e flux that resulted in an increase of the predictedν e flux in the Huber+Mueller model [24,25]. Combining the new predictions with the re-analysis of the past experimental data at baselines 10-100 m suggests a ∼4-6% deficit between the measured and the predicted reactorν e flux [26,62]. In this subsec-tion, our measurement is compared with the past reactor neutrino experiments.
A global fit was performed for the past reactor neutrino experiments.
Nineteen short-baseline (<100 m) measurements were included using the data from Ref. [26]. The measurements from CHOOZ [63] and Palo Verde [64] were also included after correcting for the standard three neutrino oscillations using the best-fit value of sin 2 2θ 13 from the rate-only analysis described in a previous subsection. The results of all 21 experiments are summarized in Table 11. In the "ratio" column, each measured flux was compared to the Huber+Mueller flux prediction (in Ref. [26], the "ratio" column is calculated with respect to the Mueller [24] model). The σ err column summarizes the total uncertainty reported by each measurement, and the σ cor column summarizes the correlated uncertainty among experiments in the same group. Both σ err and σ cor include the theoretical uncertainty of the model prediction σ model . At the time of those measurements, all experiments reported results using a common σ model = 2.7% from the ILL+Vogel model. Since this σ model is correlated among all measurements, it is the minimum value of σ cor . Both σ err and σ cor were taken from Ref. [26] except for SRP-I, SRP-II, ROVNO88-1I, and ROVNO88-2I. We adopted the uncertainty treatment of Ref. [62] for those four experiments. To calculate the global average independent of the model uncertainty used by the past measurements, we follow the method described in Ref. [62] by first removing σ model from both uncertainties, and define: σ exp err and σ exp cor now represent experimental uncertainties only. We then build a covariance matrix V exp such that where R obs i is the "ratio" column in Table 11 corrected by the "P sur " column for the θ 13 -oscillation effect. R obs i represents the observed rate from each measurement.
We then calculate the best-fit average ratio R past g by minimizing the χ 2 function defined as: where V −1 is the inverse of the covariance matrix V . This procedure yields the best-fit result R past g = 0.942 ± 0.009, where the error is experimental only.
Since we now use the Huber+Mueller model as the reference model, we re-evaluate the model uncertainty using the correlated and uncorrelated uncertainty components given by Ref. [24,25].  Fig. 17.
The consistency between Daya Bay's measurement and past experiments suggests that the origin of the "reactor antineutrino anomaly" is from the theoretical side. Either the uncertainties of the theoretical models that predict the reactor antineutrino flux are underestimated or more intriguingly, there exists an additional neutrino oscillation that suppresses the reactor antineutrino flux within a few meters from the reactor. Such an oscillation would imply the existence of one or more eV-mass-scale sterile neutrinos. To investigate this tantalizing possibility, future short baseline (10 m) experiments are required to observe the L/E dependence of such an oscillation.

Measurement of Reactor Antineutrino Spectrum
In this section, we extend the study from reactor antineutrino flux to its energy spectrum. The measured prompt energy spectra from the four near-site ADs were summed and compared with the predictions. The detector response of the Daya Bay ADs was studied and used to convert the predicted antineutrino spectrum to the prompt energy spectrum for comparison. A discrepancy was found in the energy range between 4 and 6 MeV with a maximum local significance of 4.4 σ. The discrepancy and possible reasons for it were investigated.

Detector Response
The predicted antineutrino flux and spectrum were calculated via the procedure described in Sec. 2. At each AD, the reactor antineutrino survival probability was taken into account with the best fit oscillation parameters, sin 2 2θ 13 = 0.084 and |∆m 2 ee | = 2.42×10 −3 eV 2 , based on the oscillation analysis of the same dataset [32]. The relation of the antineutrino spectrum S(Eν e ) and the reconstructed prompt energy spectrum S(E p ) can be expressed as, where R(Eν e , E p ) is the detector energy response and can be thought of as a response matrix, which maps each antineutrino energy to a spectrum of reconstructed prompt energies. The energy response includes four main effects: the IBD prompt energy shift, IAV effect, non-linearity, and energy resolution, which are studied in the following.

IBD Prompt Energy Shift
The antineutrino energy is transferred to a positron and a neutron via the IBD reaction,ν e + + p → e + + n. The positron kinetic energy is where Eν e is energy of the antineutrino, M n , M p and M e are the neutron, proton and electron masses, and T n is the kinetic energy of the neutron. The visible prompt energy is related to the antineutrino energy as The positron annihilation produces two gammas with total energy 1.022 MeV. The shift is approximately 0.78 MeV, with a small correction from T n , which has an average value of ∼10 keV for reactor antineutrinos. The kinetic energies of the positron and the neutron are calculated based on the formula in [50] at the first order in 1/M , where M is the nucleon mass.

IAV Effect
In the Daya Bay ADs, the inner and outer acrylic vessels, as well as the supporting acrylic ribs are nonscintillating material. In particular, when IBD reactions occur around or in the acrylic, the generated positrons and the annihilation γ-rays are likely to lose energy in the acrylic without producing scintillation light. This effect will reduce the visible energy and distort the prompt energy spectrum. This effect is called the IAV effect as most of the events that lose energy in acrylic cluster around the inner acrylic vessel. To study the IAV effect, simulated IBD reactions were uniformly generated based on the density of the target protons in the detector materials and determined the corresponding deposited energy. From the MC truth information, 13% of IBD events lose more than 50 keV in the acrylic vessel, which yields E vis < T e + + 1.022 MeV. Figure 18 shows the deposited energy versus (T e + + 1.022 MeV) to illustrate the effect of the IAV on the IBD prompt events. Some positrons lose all of their kinetic energy in the acrylic vessel but the two annihilation γ-rays escape to the scintillator. In this case, a deposited energy of about 1.022 MeV will be detected, which enhances the deposited energy spectrum at around 1 MeV as shown in Fig. 18. The uncertainty of the IAV effect is studied by comparing the simulation results with the IAV thickness varying within a range of 0.4 mm. The induced uncertainty on the prompt energy spectrum was estimated to be 4% below 1.25 MeV, dropping rapidly to 0.1% at higher energies.

Non-linearity
The energy response of the antineutrino detector is not linear due to the effects originating from the scintillator and the electronics. These two effects, both at a level of 10%, are parameterized with two functions, f scint and f elec . The scintillator non-linearity is related to the ionization quenching, which is modeled by Birks' formula, and Cherenkov light emission. The electronics non-linearity is introduced by the loss of the slow scintillation light in a limited charge collection time-window. It is modeled using an exponential as a function of total visible energy based on the scintillation light timing profile and a charge collection study [65].  Estimated energy response of the detectors to positrons, including both kinetic and annihilation gamma energy (red solid curve). Gamma rays from both deployed and intrinsic sources as well as spallation 12 B β decay determined the model, and provided an envelope of curves consistent with the data within a 68.3% C.L. (grey band). An independent estimate using the beta+gamma energy spectra from 212 Bi, 214 Bi, 208 Tl, as well as the Michel electron spectrum produced a consistent result (blue dashed line).

010201-24
The non-linearity model includes five parameters: detector energy scale, Birks' constant, relative contribution from Cherenkov light, and the amplitude and decay constant of the electronics model. The parameters are determined by a combined χ 2 fit to the mono-energetic γ lines of calibration sources and continuous β spectrum of 12 B produced by the muon spallation inside the AD. The Geant4 simulation is used to build the relation of non-linearity response of different particle species, such as gamma, e + and e − . The IBD positron non-linearity response derived from the best fit parameters is shown in Fig 19. The uncertainty band is constructed by considering calibration and model uncertainties. The positron non-linearity response was validated using the Michel electron spectrum from muon decay at rest and the continuous β + γ spectra from internal radioactive β decays of 212 Bi, 214 Bi and 208 Tl (see Ref. [32] for detailed nonlinearity treatment). The non-linearity uncertainty has a negligible effect on the measured oscillation parameters because it is treated as correlated for all ADs.

Energy Resolution
The detector energy resolution was studied by a variety of calibration sources deployed at the detector center, IBD and spallation neutrons, and alpha sources from radioactivity. For each source, the reconstructed energy is measured and the width and the energy of the peak are obtained from fits with Gaussian function to the peak of the energy distribution. The results from both MC and experimental data are shown in Fig. 20. The relative energy resolution of an antineutrino detector as a function of energy is parameterized by where σ E is the uncertainty of the reconstructed energy distribution, E is the peak of the distribution,and a, b and c are three parameters that quantify the contribution from spatial resolution of reconstructed energy, photon statistics, and PMT dark noise, respectively [66]. The parameters in Eq. 25 were studied by fitting the energy resolution of the calibration sources as well as IBD and spallation neutrons, uniformly distributed in GdLS. The internal radioactive alpha sources were used to crosscheck the result. Naked gamma sources are also simulated for comparison, and they have better energy resolution than the calibration data because they do not include the source shielding and calibration source deployment apparatus. The best fit parameters are a = 0.016, b = 0.081 MeV 1 2 and c = 0.026 MeV when the energy is given in the units of MeV. A variation of the parameters within the uncertainties has negligible effects on the prompt spectrum when it is smeared, therefore the uncertainty of energy resolution is neglected in the analysis.

Energy Response Matrix
After taking into account the above effects, the detector response matrix (Eq. 22) can be constructed to map the reconstructed energy to the antineutrino energy. Two methods were used to evaluate the energy response matrix. The first method estimates the IAV effect, non-linearity, and energy resolution step-by-step using analytical methods as described above. The second method constructs the response matrix using a fulldetector simulation based on Geant4 [56]. The detector geometry and material properties used in simulation are precisely determined by the various surveys and standalone measurements. As an example, the thickness of the inner acrylic is measured with a precision of 0.4 mm, which allows for a small uncertainty of the IAV effect. The Birks' constant of ionization quenching is tuned by benchmark data using a small sample of Daya Bay GdLS, and by comparing non-linearity from calibration sources in the ADs between data and MC. The energy calibration and reconstruction process of MC data follows the same procedure as applied to the measured data. Figure 21 shows the detector response matrix which is constructed using the map of reconstructed energy and the input antineutrino energy in MC.
Both methods produced consistent response matrices for the prompt energy above 1.25 MeV. The uncertainty below 1.25 MeV was inflated to cover the difference of 10% between the two methods.

Spectral Comparison
To quantify the discrepancy between the measured and predicted spectra, the uncertainties in both spectra were estimated. Besides the statistical uncertainty, the systematic uncertainties include reactor-related uncertainty, detector-related uncertainty and backgroundrelated uncertainty. The reactor related uncertainty presented in Sec. 2 is propagated to the prompt energy spectrum when converting the antineutrino energy spectrum to the prompt energy spectrum. The uncertainty of the detection efficiency is assumed to be independent of energy, and therefore does not impact the spectral shape. The uncertainty of the IAV effect on the prompt energy spectrum is 4% below 1.25 MeV and rapidly drops to 0.1% above 1.25 MeV. The uncertainty of non-linearity shown as the error band in Fig. 19 is propagated to the prompt energy spectrum when applying the non-linearity effect to generate the predicted spectrum. Five major sources of background are identified in the Daya Bay detectors. They are the accidental background, cosmogenic 9 Li and 8 He beta-decays, fast neutrons, Am-C neutron sources, and 13 C(α, n) 16 O reactions. The background uncertainty is incorporated when subtracting the background from the measured spectrum.
To incorporate statistical, reactor-related, detectorrelated and background-related uncertainties, a covariance matrix V was constructed as where V stat is the statistical component, and V sys is the shape-only systematic component. The statistical component has only diagonal terms and is calculated analytically. Large samples of prompt spectra were generated to include the fluctuation due to various systematic uncertainties from the reactor, detector energy response, and background uncertainties. The elements in the covariance matrix of the systematic component were calculated as (27) where N expts is the number of toy MC samples, N ran(nom) i is the random (nominal) predicted number of events at the prompt energy bin i. Total number of events in the random predicted spectra are normalized to the nominal prediction. Finally, the total covariance matrix was calculated by summing these two components, V = V stat + V sys . Figure 22 shows the elements of the correlation matrix, V ij / V ii V jj , and the fractional size of the diagonal elements of the covariance matrix, V ii /N pred i , for each component. A χ 2 was defined to test the compatibility of the observed prompt energy spectrum with the predictions, where N obs(pred) i is the observed(predicted) number of events at the i-th prompt energy bin and V is the covariance matrix that includes all the statistical and systematic uncertainties. Figure 23A shows a comparison of the observed near-site prompt energy spectrum with the prediction. The predicted spectrum was normalized to the measurement. A clear discrepancy between the data and the prediction near 5 MeV is observed, while the agreement is reasonable in other energy regions. A comparison to the Huber+Mueller model yields a χ 2 /dof of 46.6/24 in the full energy range from 0.7 to 12 MeV, corresponding to a 2.9 σ discrepancy. The ILL+Vogel model shows a similar level of discrepancy from the data.
Another compatibility test was performed with a modified fitting algorithm. In this method, N (=number of prompt energy bins) free-floating nuisance parameters are introduced to the oscillation parameter fit to adjust the normalization for each bin, as described in [65]. The compatibility was tested by evaluating ∆χ 2 = χ 2 (standard) − χ 2 (N extra parameters) (29) for N degrees of freedom. We obtained ∆χ 2 /N = 50.1/25, which is consistent with the results obtained by the first method using Eq. 28.

Quantification of the Local Deviation
The ratio of the measured to predicted energy spectra is shown in Fig. 23B. The spectral discrepancy around 5 MeV prompt energy is clearly visible. Two approaches are adopted to evaluate the significance of this discrepancy. The first method evaluates the χ 2 contribution of each energy bin, By definition, i χ 2 i is equal to the value of χ 2 defined in Eq. 28. As shown in Fig. 23C, an enhanced contribution is visible around 5 MeV.
In the second approach, the significance of the deviation is evaluated based on the modified oscillation analysis similar to Eq. 29. Instead of allowing all the N nuisance parameters to be free floating, only parameters within a selected energy window are varied in the fit. The difference between minimum χ 2 s before and after introducing these nuisance parameters within the selected energy window was used to evaluate the p-value of the local variation from the predictions. The p-values with 1 MeV sliding energy window are shown in Fig. 23C. The local significance for a discrepancy is greater than 4 σ at the highest point around 5 MeV. In addition, the local significance for the 2 MeV window between 4 and 6 MeV were evaluated. We obtained a ∆χ 2 /N value of 37.4/8, which corresponds to the p-value of 9.7 × 10 −6 (4.4 σ).
Comparing with the ILL+Vogel model shows a similar level of local discrepancy between 4 and 6 MeV.
The excess between 4 and 6 MeV was ∼1.5% of the total observed IBD candidates. An excess of events in a same energy range was not observed in the spallation 12 B beta decay spectrum, ruling out detector effects as an explanation. Adding a simple beta-decay branch or a mono-energetic peak cannot reproduce the observed excess, indicating that it cannot be explained by a simple background contribution. Contributions from other interaction channels (e.g.ν e + 13 C) were investigated and were found to be too small to account for the excess. The events in the energy region around 5 MeV are carefully examined: the neutron capture time, the delayed energy spectrum, and the distance distribution for the delayed neutron capture signal were found to match IBD event characteristics. The vertex distribution of the prompt signal was found to be uniform and consistent with IBD events. Figure 24 shows the event rate versus time in the energy window of 4.5-5.5 MeV and other windows. The strong correlation indicates that the excess around 5 MeV is proportional to the reactor antineutrino flux. Therefore, it strongly suggests that the deviation is due to the imperfect modelling of the reactor antineutrino spectrum. A recent ab initio calculation of the antineutrino spectrum showed a similar deviation from previous predictions in the 4-6 MeV energy region [46], and identified prominent fission daughter isotopes as a potential explanation. Similar discussions can be found in Ref. [67]. Furthermore, a recent evaluation of uncertainties in forbidden decays suggests an additional ∼5% uncertainty in both the rate and spectral shape of reactor antineutrino flux models using beta-to-antineutrino conversions [27], which may be another source of the discrepancy.

Generic Antineutrino Spectrum of IBD Reactions
Since the predicted and measured prompt energy spectra have some discrepancies, it can be useful to extract an antineutrino spectrum weighted by the IBD cross section using the measured prompt energy spectrum at Daya Bay. The unfolded antineutrino spectrum can be used as a model-independent input for reactor antineutrino flux and spectra prediction for future reactor antineutrino experiments.

Solutions of Linear Inverse Problems
A measured quantity (such as the energy spectrum) usually includes various detector effects, such as finite energy resolution and the limited acceptance of the detector. Therefore, a correction or a transformation of the measured spectrum is necessary to deduce the true energy spectrum without the specific detector effects. The result allows for a direct comparison with other experiments and theoretical predictions. The transformation from measured to true spectrum is called unfolding and belongs to the class of linear inverse problems. It requires a thorough understanding of the detector physics response. Due to the properties of the detector response, e.g. finite energy resolution, the inverse problem is illposed: a small fluctuation of the measured spectrum can result in a large change of the unfolded result. Nonproper solution by simple inversion yields unstable results. The singular value decomposition (SVD) regularization method [68], i.e. the standard unfolding method for the linear inverse problem, discussed below, is used for obtaining the antineutrino spectrum at Daya Bay. Alternative methods are also discussed and the results are cross-checked.

SVD and Generalized Inverse of Response Matrix
The solution of a linear inverse problem of the type Ax = y (where A is the detector response matrix, x is the true distribution vector of n dimensions, y is the measured distribution vector of m dimensions) requires the construction of the generalized inverse matrix, A # (A −1 in the case of square matrix). The case of m = n can be solved simply by matrix inversion. In practice, y has statistical fluctuations, and a simple matrix inversion results in large fluctuations in x and negative correlations between bins of x. One method, along the idea of a least squares fit, is to minimize where V y is the covariance matrix for the measurement of y. A larger dimension for the measurement, i.e. m > n, would lead to a more precise solution. The SVD method is an orthogonalization method applied to the m-by-n matrix A. As a prerequisite, a scaling process is applied to the equations Ax = y: the rows of A and y are both divided by the error vector of y. (The scaled matrix and vector are still written as A and y for convenience.) The SVD of the m-by-n matrix A with m > n is expressed: where U and V are m×m and n×n orthogonal matrices (U T U = I, V T V = I), u i and v i are the corresponding vectors, while Σ is an m × n diagonal matrix with nonnegative diagonal elements. The singular values σ i of Σ are ordered and positive. Using the SVD matrices U , Σ and V , a generalized inverse A # of the matrix A can be defined by with Multiplying the equation Ax = y by the generalized inverse A # , the vector x is obtained by with coefficient c j = y T u j . The covariance matrix for the estimate of x is given by For the case of an n-by-n matrix A, the orthogonalization is a simple diagonalization: where Λ is a diagonal matrix with eigenvalues λ j . The final solution after transformation is with coefficient c j = 1/ λ j (b T u j ). The corresponding expression for the covariance matrix V x is given by However, due to the properties of the response matrix A, the singular values (or eigenvalues) typically span many orders of magnitude. The larger singular values represent the dominant components of the detector response matrix, however the small singular values would dominate the result if included in the solution. The standard technique to reduce or suppress the contribution from the smaller eigenvalues is the regularization method, which does not introduce a bias if the regularization parameter is well defined and the response matrix A is known.
For the Daya Bay experiment, the inputs of the linear inverse problem are the measured prompt energy spectra and the detector response matrix. The measurement vector y is the sum of the prompt energy spectra of the four near-site ADs weighted by their target mass relative to average target mass (M n ) of all ADs: where E prompt is the bin center of the prompt energy spectra at each bin. The covariance matrix V y is composed of the statistical, systematic and background uncertainties described in Sec. 6. The response matrix A is constructed by either of the two methods as described in Sec. 6.
With the SVD method, the linear inverse problem is solved by a linear transformation of the measured prompt energy spectra. This transformation is realized by the generalized inverse A # of the response matrix A. The construction of this generalized inverse allows the use of the standard method for propagating uncertainties. The resulting covariance matrix of the unfolded result necessarily describes the correlations between bins of the unfolded spectrum.

Regularization Method
The exact solution of the linear system is equivalent to the minimization of A simple method of regularization is the truncation of the diagonalized matrix to exclude huge 1/σ i components in solution x, which is equivalent to ignoring the insignificant components of the detector response matrix. Though simple truncation is better than keeping all j, this introduces biases which are difficult to control. One of the usual choices, used in high energy physics, is requiring that the regularized solution be smooth. Technically, this requirement is introduced into the χ 2 minimization condition by adding an extra term [68,69]: The parameter τ plays the role of the Lagrange multiplier in the new conditional minimization problem. A small value of τ has a weak regularization effect, the correlations between bins remain mainly negative, and the result is still dominated by a large statistical fluctuation. A very large value of τ reduces the statistical fluctuation but introduces positive correlations between bins of the solution.
For the regularization method, it is important to choose a proper regularization parameter τ . Usually Monte Carlo samples with different statistical and systematic uncertainties are analyzed to obtain the optimal regularization parameter. For unfolding the measured prompt energy spectrum with the same statistics as our data set (∼ 1 million IBD events), an unfolding package based on RooUnfold [70], and TSVD in ROOT is developed and used. In RooUnfold, a positive integer, k, acts as the regularization parameter. To determine a proper k value, different toy MC samples of prompt energy spectra were generated by folding different true antineutrino energy spectra with one detector response matrix. Two nominal true antineutrino energy spectra were constructed: one with the Huber+Mueller model and the other with the 'model' of a recent ab initio calculations [46]. The true antineutrino spectrum samples were generated by varying the nominal spectrum according to its systematic uncertainties. Each sampled true spectrum was folded with the detector response matrix to become a true prompt energy spectrum. Then, each bin was varied with its statistical uncertainty. The energy range and number of bins of the sampled prompt energy spectra were the same as the measured spectra. A least squares method was defined to determine the regularization parameter k: where x i true is the bin content of the true antineutrino spectrum, x i unf old is the bin content of the unfolded spectrum, which was generated from the corresponding x true , and V ii x is the i-th diagonal element of the covariance matrix given by the unfolding. A k value scan for x unf old was carried out during unfolding to find the minimum χ 2 . The best k value was found to be 15, for the ∼ 1 million IBD candidates, and when the nominal true antineutrino distribution was constructed with the Hu-ber+Mueller model. The best k value is not sensitive to the choice of the reactor antineutrino spectrum model, but becomes smaller as the size of the sample increases.
Once the regularization parameter k was determined with the input of the measured prompt spectrum of Eq. 40, its full covariance matrix, and the detector response matrix from Geant4 MC simulation, the true antineutrino energy spectrum S combined (E) and its covariance matrix V combined (E) were obtained by unfolding with the SVD regularization method, shown in   Fig. 25. The unfolded antineutrino energy spectrum using the combined prompt energy spectrum in the four near site ADs, and its covariance matrix (inset). The content of last bin is the integral up to 12 MeV.

Bias Estimation
An important requirement of unfolding is to minimize the bias. The bias of each bin is defined as the average difference between the true and unfolded distribution sample pairs, written as < |x t -x u | >. True antineutrino spectrum samples were generated and used for the bias estimation, including different reactor antineutrino models, different statistics, different bin numbers, and different sample sizes. Bias estimation was processed with the same bin width and statistics of the experimental data. The bias of each bin is illustrated in Fig. 26. Between 2.75 and 6.5 MeV, the bias is 0.5%, which is comparable to the statistical uncertainty. The bias increases outside this region due to the lower statistics. To have zero bias during unfolding, the condition is that the detector response matrix is exactly known. As mentioned in Sec. 6, there are two methods to construct the response matrix: analytical and full Geant4 MC simulation. With one antineutrino spectrum as input, the two matrices generate two prompt spectra with 0.5% bin-by-bin differences. We used both matrices during bias estimation to obtain conservative results; i.e., using one matrix for the folding process to generate the true prompt energy spectrum samples, and the other matrix for the unfolding process to obtain the corresponding antineutrino energy spectra. If the folding and unfolding processes use the same matrix, the bias is reduced when the statistics of the samples increase, and is negligible for 1 million events.

Iterative Methods
Another common way to unfold is using iterative methods. Iterative methods have advantages for obtaining the true distributions of multiple dimensions, but require a starting value. One typical example is the Bayesian iterative method [71]. An initial guess of the antineutrino spectrum can be set as the starting value, which is updated iteratively by a calculation that takes into account the response matrix and the observed prompt spectrum. The iteration is stopped when the change in the antineutrino spectrum is small enough. This method has an implicit regularization property, i.e. the number of iterations is similar to the regularization parameter in the SVD regularization method. The summed prompt spectrum of the four ADs was also unfolded by the Bayesian iterative method, with the response matrix obtained by the Geant4 MC simulation method. Figure 27 compares the unfolded antineutrino spectrum obtained with the Bayesian method and the SVD regularization method. The two methods yield consistent results and the difference below 8 MeV is negligible compared with the spectrum uncertainty.

Antineutrino Spectrum Weighted by the IBD Cross Section
7.3.1 Normalization of Antineutrino Spectrum Weighted by the IBD Cross Section A generic reactor antineutrino spectrum for the IBD reaction was extracted from the measurement to provide a model-independent input for predicting reactor antineutrino flux and spectra. The detector response effects were removed by unfolding the combined prompt spectrum S combined (E prompt ) to an antineutrino spectrum S combined (E) for IBD reactions (Fig. 25). Oscillation effects were also removed and each bin of the antineutrino spectrum was normalized to cm 2 /fission/MeV using reactor information, which can be directly compared with the isotope spectra weighted by the IBD cross section. The generic antineutrino spectrum is expressed as S generic (E) = S combined (E) P sur (E, L)N P F total , where P sur (E, L) is the average survival probability of theν e calculated with the fluxes from the six reactors to the four detectors, N P is the number of target protons in the average target mass M n , and F total is a normalization factor based on the baseline-weighted total number of fissions. The average survival probability P sur (E, L) is obtained by weighting the antineutrino contributions B dr from different reactors r to each detector d, where S dr (E ν ) and S d (E ν ) are the expected antineutrino spectrum with contributions from each reactor and from all reactors, calculated using Eq. 5 and Eq. 6. The average survival probability P sur (E, L) is then where L dr is the baseline between the r-th reactor to d-th detector; P dr sur (E, L dr ) is the survival probability of antineutrinos after travelling from the r-th reactor to the d-th detector. The oscillation probability is calculated using the oscillation parameters in the oscillation analysis of the same data set [32].
The normalization factor F total for all four ADs contributed from six reactors is calculated as where W t r is the average power of the t-th week; α t ir is the weekly fission fraction of i-th isotope; e i is the fission energy; d is the detection efficiency of each AD, which is the multiplication of the detection efficiency of d-th detector, which is the product of the IBD detection efficiency of all ADs, 0 = 80.6%, the weekly multiplicity cut and muon veto efficiencies ( m and µ ), and the weekly live time. The uncertainty of N P F total is not dependent on the antineutrino energy and only contributes to the rate uncertainty of the generic spectrum. The rate uncertainty is 2.0%, which is contributed from the uncertainties of the efficiencies (1.93%) listed in Table 7, the fission energy (0.2%), and the reactor power and fission fractions (0.5%).
From Eqs. 44-47, the normalized reactor antineutrino spectrum measured at the two near sites is obtained. The obtained generic antineutrino spectrum is shown in the top panel of Fig. 28. The values of the spectrum and the covariance matrix are shown in Tables 12 and 13 in the appendix. The middle panel of Fig. 28 is the ratio of the generic reactor antineutrino spectrum to the prediction using the isotope spectra of the Huber+Mueller model and the effective fission fractions listed in Table 10. The bottom panel of Fig. 28 shows the ratio of the spectrum obtained in the 6+8 AD period to that in the 6 AD period [29]. The deviation of the ratio from one is due to the difference of fission fractions in the two data period and the statistic fluctuation. The average deficit is equal to the overall flux deficit reported in Sec. 5. The bump in the 5-7 MeV antineutrino energy corresponds to that in the 4-6 MeV prompt energy in Fig. 23. The correlation matrix of the generic spectrum is obtained from its covariance matrix, which is calculated by both toy MC sampling method, and standard error propagation with matrices. Figure 29 shows the correlation matrix of the generic spectrum and its components for the energydependent uncertainties.

Possible
Application of Generic Antineutrino Spectrum The generic antineutrino spectrum has been weighted by the IBD cross sections. Other reactor neutrino experiments not utilizing the IBD reaction can remove the IBD weighting factor to obtain the antineutrino spectrum from the reactor. IBD reaction experiments could directly use the generic spectrum to predict the antineutrino spectrum with IBD cross section S A in their experiment. A simplified example is: where S dyb is the generic spectrum from the Daya Bay, i.e. S generic (E), f dyb and f A are the effective fission fractions of the Daya Bay experiment and the reactor antineutrino experiment A; and S mod are the isotope antineutrino spectra from models, such as ILL+Vogel, Huber+Mueller, etc. S A could then replace the isotope spectra related part i S mod i e i in the calculation of the spectrum prediction presented in Sec. 2. The idea of this application depends on the condition that the effective fission fractions of different reactor antineutrino experiments, i.e. f dyb and f A are small; therefore, corrections from i (f A i − f dyb i )S mod i would be relatively small, and S A will be dominated by the measurement result S dyb rather than reactor models.

Summary
After the final two detectors were installed in the Daya Bay experiment, an additional 404 days of data had been taken. Including the previous 217 days of data taken by six ADs, more than 1.2 million IBDs were detected by the Daya Bay experiment. The inverse beta decay (IBD) selection efficiency was found to be 80.6% with a relative uncertainty of 1.93% based on a detailed study of the detector performance. The measured IBD yield is (1.53±0.03)×10 −18 cm 2 /GW/day or (5.91 ± 0.12) × 10 −43 cm 2 /fission. The ratio of measured flux to the predictions is 0.946±0.020 (0.992±0.021) for the Huber+Mueller (ILL+Vogel) model, which is consistent with the global average of previous short baseline experiments. In addition, the predicted and measured spectra were compared, and a deviation of 2.9 σ was found. Particularly, an excess of events was found in the region of 4-6 MeV with a local significance of 4.4 σ.
Further investigation on the excess of events reveals possible problems in the reactor antineutrino flux predictions. A reactor antineutrino spectrum weighted by the IBD cross section was extracted from the measurement at Daya Bay, providing a model-independent input for future reactor antineutrino experiments. Table 13. Covariance matrix of antineutrino spectrum. Unit: [cm 2 /f ission/M eV ] 2 × 10 −92 .
The square-roots of the diagonal elements are plotted in Panel A as error bars in Fig. 28 .