The Pantheon+ Analysis: The Full Dataset and Light-Curve Release

Here we present 1701 light curves of 1550 spectroscopically confirmed Type Ia supernovae (SNe Ia) that will be used to infer cosmological parameters as part of the Pantheon+ SN analysis and the SH0ES (Supernovae and H0 for the Equation of State of dark energy) distance-ladder analysis. This effort is one part of a series of works that perform an extensive review of redshifts, peculiar velocities, photometric calibration, and intrinsic-scatter models of SNe Ia. The total number of light curves, which are compiled across 18 different surveys, is a significant increase from the first Pantheon analysis (1048 SNe), particularly at low redshift ($z$). Furthermore, unlike in the Pantheon analysis, we include light curves for SNe with $z<0.01$ such that SN systematic covariance can be included in a joint measurement of the Hubble constant (H$_0$) and the dark energy equation-of-state parameter ($w$). We use the large sample to compare properties of 151 SNe Ia observed by multiple surveys and 12 pairs/triplets of "SN siblings" (SNe found in the same host galaxy). Distance measurements, application of bias corrections, and inference of cosmological parameters are discussed in the companion paper by Brout et al. (2022b), and the determination of H$_0$ is discussed by Riess et al. (2022). These analyses will measure $w$ with $\sim3\%$ precision and H$_0$ with 1 km/s/Mpc precision.


INTRODUCTION
Measurements of Type Ia supernovae (SNe Ia) were essential to the discovery of the accelerating expansion of the universe (Riess et al. 1998; Perlmutter et al. 1999). Since then, the continually growing sample size of these special "standardizable candles" has strengthened a key pillar of our understanding of the standard model of cosmology in which the universe is dominated by dark energy and dark matter. While modern transient surveys are now discovering as many SNe Ia in 5 yr as had been discovered in the last 40 yr (e.g., Smith et al. 2020; Dhawan et al. 2021; Jones et al. 2021), progress in using these data for constraining cosmological parameters has been made by the compilation of multiple samples (e.g., Betoule et al. 2014; Scolnic et al. 2018; Brout et al. 2019a; Jones et al. 2019). The reason for this is that different surveys are optimized to discover and measure SNe in different redshift ranges, and the constraints on cosmological parameters benefit from leveraging measurements at different redshifts. In this paper, we present the latest compilation of spectroscopically confirmed SNe Ia, which we call Pantheon+; this sample is a direct successor of the Pantheon analysis, which itself succeeded the Joint Light-curve Analysis (JLA; Betoule et al. 2014).
In the past, measurements of the equation-of-state parameter of dark energy ($w$) and the expansion rate of the universe (H$_0$) have been done separately (e.g., Riess et al. 2016; Scolnic et al. 2018), even though both rely on many of the same SNe Ia. One reason for this split is that the determination of these two parameters is based on comparing SNe Ia in different redshift ranges. For H$_0$, SNe Ia in very nearby galaxies with z ≲ 0.01 that have calibrated distance measurements are compared to those in the "Hubble flow" at 0.023 < z < 0.15, ignoring higher redshifts. For $w$, measurements typically utilize SNe Ia up to z ≈ 2, but exclude those at z < 0.01. Thus, only SNe Ia within one of the three ranges, those at 0.023 < z < 0.15, are common to both analyses.
Here we perform a single analysis of discovered SNe Ia measured over the entire redshift range, from z = 0 to z = 2.3. This work spawns a number of analyses, which include the $w$ measurement presented by Brout et al. (2022b, in prep., hereafter B22b) as well as the H$_0$ measurement of Riess et al. (2022, in prep., hereafter R22). R22 additionally depend on Cepheid and geometric distance measurements, which make up what is called the "first rung" of the distance ladder, whereas Cepheid measurements and z < 0.01 SN measurements make up the "second rung," and SN measurements along with their redshifts make up the "third rung." Both Cepheids and SNe are used in two of the three rungs. Furthermore, the SNe discussed here can be used to measure growth of structure, as indicated by the model comparisons by Peterson et al. (2021), and to measure anisotropy, as discussed by B22b. A review of many potential cosmological measurements possible with large SN Ia samples is given by Scolnic et al. (2019).
Measurements of SN Ia light curves by different surveys can be accumulated to improve their constraining power on cosmological inferences because (1) the SNe can be uniformly standardized using their light-curve shapes and colors, and any dependence of the standardization properties with redshift can be measured; and (2) properties of the photometric systems and observations of tertiary standards are typically given so that current analyses can recalibrate the systems (e.g., Scolnic et al. 2015; Currie et al. 2020) and refit light curves. This latter point, when used with an analysis of SN surveys in aggregate, yields the ability to quantify and reduce survey-to-survey calibration errors. This is explored by Brout et al. (2022a, in prep., hereafter B22a), who present a new cross-calibration of the photometric systems used in this analysis and the resulting recalibration of the SALT2 light-curve model. Brownsberger et al. (2021) show that while measurements of H$_0$ are particularly robust to calibration errors of SNe Ia, this is not the case for measurements of $w$. In this paper, we analyze measurements of the same SNe from different surveys as an alternate test on the accuracy of our calibration.
The large size of this sample also allows us to compare "sibling SNe," that is, SNe belonging to the same host galaxy. As shown in various studies (Burns et al. 2020; Biswas et al. 2021), sibling SNe provide powerful tests of our understanding of the relationships between SN properties and their host galaxies. With this large compilation, we can increase the statistics of sibling pairs (and triples). Our findings on the consistency of the distance modulus values determined for sibling SNe, as well as the consistency of distance measurements of SNe from different samples, can be used to improve the construction of the distance-covariance matrix between SNe. This matrix is described by B22b, and relates the covariance between distance measurements of SNe due to various systematic uncertainties.
Lastly, this paper documents the data release of standardized SNe Ia for the Pantheon+ sample. A companion paper by Carr et al. (2021, hereafter C22) performs a comprehensive review of all the redshifts used and also corrects a small number of SNe with incorrect metaproperties (e.g., location, host association, naming), all included here. We note that this compilation includes light curves that have not been published elsewhere and light curves that have been provided individually as the focus of a single paper, as well as the larger samples from specific surveys. The compilation presented here attempts to homogenize the presentation and documentation of these light curves.
The structure of this paper is as follows. In Section 2, we describe the light-curve samples released as part of the Pantheon+ compilation. Section 3 presents the light-curve fits, the selection requirements (data quality cuts), and the properties of the host galaxies. We discuss in Section 4 trends of the fitted and host-galaxy parameters, as well as new studies of SN siblings and duplicate SNe. Section 5 presents our discussions and conclusions. Importantly, in the Appendix, we describe the format of the data release itself.

DATA
The Pantheon+ sample comprises 18 different samples, where a sample is loosely defined as the dataset produced by a single SN survey over a discrete period of time. The samples and their references, as well as their redshift ranges, are given in Table 1. In the Appendix, we give an overview of each sample where we detail the original data-release paper, the location of the data, and the photometric system of the SNe. This table should be combined with the tables in Appendix A in B22a that have the information for the photometric systems and information on stellar catalogs used for cross-calibration.
Here we review the main changes since the first Pantheon release. We have added 6 large samples: the Foundation Supernova Survey (Foundation; Foley et al. 2018), the Swift Optical/Ultraviolet Supernova Archive (SOUSA), the first sample from the Lick Observatory Supernova Search (LOSS1; Ganeshalingam et al. 2010), the second sample from LOSS (LOSS2; Stahl et al. 2019), and the Dark Energy Survey (DES; Brout et al. 2019b). All but DES are low-z surveys, which is why in Figure 1 the largest improvement in SN numbers is at low redshift. Additionally, there was a new data release for the Carnegie Supernova Project (CSP; Krisciunas et al. 2017b), which remeasured previous photometry for CSP-I and added more SNe.
Figure 1. (Top:) The redshift distribution of the Pantheon+ sample that passes all the light-curve requirements, as well as the same for the JLA and Pantheon samples. The largest increase in the number of SNe for the Pantheon+ sample is at low redshift owing to the addition of the Foundation, LOSS1, LOSS2, SOUSA, and CNIa0.2 samples. The largest increase at higher redshift is due to the inclusion of the DES 3-year sample. We do not use SNe from SNLS at z > 0.8 due to sensitivity to the U band in model training, so the Pantheon+ statistics between 0.8 < z < 1.0 are lower than those of Pantheon and JLA. (Bottom:) The Pantheon+ redshift diagram shown cumulatively by survey.
Additionally, there are light curves that have not yet been published, but are included in the respective Pantheon+ sample. These are SN 2021pit from SOUSA and SN 2021hpr from LOSS2, which follow the processing and photometric systems of the larger samples. There are also three light curves obtained by Foundation after their release (SN 2017erp, SN 2018gv, and SN 2019np). SN 2018gv and SN 2019np were processed with the same pipeline described in Foley et al. (2018). SN 2017erp was outside of the PS1 footprint, so SkyMapper catalogs (Onken et al. 2019) were used to set the photometric zeropoints following the process outlined in Scolnic et al. (2015).
We have made a special effort to calibrate and include surveys that contain observations of SNe Ia in galaxies near enough (≲ 40 Mpc) that Cepheid observations with the Hubble Space Telescope (HST) have been obtained, because such objects are rare (approximately one per year) and their numbers limit the precision of the determination of H$_0$ (see R22). As shown by Brownsberger et al. (2021), the sensitivity of measurements of H$_0$ to the photometric calibration of SN light curves depends on whether the relative number of second-rung SNe observed by a survey is similar to the relative number of third-rung SNe observed by that survey. Brownsberger et al. (2021) demonstrate that our current compilation has sufficiently similar numbers so that the impact of potential cross-survey systematics from calibration is < 0.2% in H$_0$.
For each of the samples, the photometric systems are recalibrated by B22a. Two surveys previously in Pantheon have changed in response to an improved understanding of their photometry. (1) For SDSS, the reported photometry was thought in Pantheon to be in the AB system but was actually in the natural system, so offsets to the photometry of [−0.06, 0.02, 0.01, 0.01, 0.01] mag in ugriz were not applied in Pantheon (the u-band usage in SALT2 is minimal, as most SNe discovered by SDSS are at z > 0.1, outside the usable redshift range for the u-band filter).
(2) For CfA3K and CfA3S, the photometry of the SNe was assumed in Pantheon to be in the natural system but was actually in the standard system; this makes the B band ∼ 0.01 mag fainter relative to the other bands.
We release the light curves with the photometry as given by the original sources (though all put in a standard syntax) here: https://pantheonplussh0es.github.io/. The calibration of the samples and derived offsets to the photometric zeropoints given in B22a will be included at the same GitHub page. Furthermore, we include files to quickly apply calibration definitions and offsets (e.g., the CALSPEC zeropoints needed to define the photometric systems) to fit the light curves.

Light-Curve Fits
In order to obtain distance moduli (µ) from SN Ia light curves, we fit the light curves with the SALT2 model (Guy et al. 2007) using the trained model parameters from B22a over a spectral energy distribution (SED) wavelength range of 200-900 nm. We select passbands whose central wavelength (λ) satisfies 300 nm < λ/(1 + z) < 700 nm, and we select epochs from −15 to +45 rest-frame days with respect to the epoch of peak brightness. We use the SNANA software package (Kessler et al. 2009) to fit the SALT2 model to the data, and we use SNANA's MINOS computational algorithm to determine the parameters and their uncertainties.
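The passband and epoch selection just described can be written as two simple tests. The following is a minimal sketch, not the SNANA implementation; the function names and the example filter wavelengths are hypothetical, and the inclusive treatment of the epoch boundaries is an assumption.

```python
def rest_frame_wavelength_nm(central_wavelength_nm, z):
    """Central wavelength of an observer-frame passband, shifted to the SN rest frame."""
    return central_wavelength_nm / (1.0 + z)


def passband_is_used(central_wavelength_nm, z, lo_nm=300.0, hi_nm=700.0):
    """True if 300 nm < lambda / (1 + z) < 700 nm (limits quoted in the text)."""
    return lo_nm < rest_frame_wavelength_nm(central_wavelength_nm, z) < hi_nm


def rest_frame_phase_days(mjd, peak_mjd, z):
    """Rest-frame days relative to peak brightness (time dilation removed)."""
    return (mjd - peak_mjd) / (1.0 + z)


def epoch_is_used(mjd, peak_mjd, z, lo_days=-15.0, hi_days=45.0):
    """True if the epoch lies between -15 and +45 rest-frame days of peak.

    Whether the boundaries are inclusive is an assumption of this sketch.
    """
    return lo_days <= rest_frame_phase_days(mjd, peak_mjd, z) <= hi_days


if __name__ == "__main__":
    # Hypothetical filter with a central wavelength of 800 nm.
    for z in (0.0, 0.2):
        print(z, passband_is_used(800.0, z))
```

A filter centered at 800 nm is therefore excluded for a SN at z = 0 but enters the fit at z = 0.2, where its rest-frame central wavelength is about 667 nm.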
Each light-curve fit determines the parameters color ($c$), stretch ($x_1$), and overall amplitude ($x_0$), with $m_B \equiv -2.5\log_{10}(x_0)$, as well as the time of peak brightness ($t_0$) in the rest-frame B-band wavelength range. To convert the light-curve fit parameters into a distance modulus, we follow the modified Tripp (1998) relation as given by Brout et al. (2019a):
$$\mu = m_B + \alpha x_1 - \beta c - M - \delta_{\mu-{\rm bias}}, \qquad (1)$$
where α and β are correlation coefficients, $M$ is the fiducial absolute magnitude of a SN Ia for our specific standardization algorithm, and $\delta_{\mu-{\rm bias}}$ is the bias correction derived from simulations needed to account for selection effects and other issues in distance recovery.
For the nominal analysis of B22b, the canonical "mass-step correction" $\delta_{\mu-{\rm host}}$ is included in the bias correction $\delta_{\mu-{\rm bias}}$ following  and Popovic et al. (2021). The α and β used for the nominal fit are 0.148 and 3.112, respectively, and the full set of distance modulus values and uncertainties are presented by B22b.
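As an illustration of how the fitted parameters combine into a distance modulus, the sketch below evaluates a Tripp-style relation with the nominal α and β quoted above. It assumes the usual sign convention (stretch term added; color, absolute magnitude, and bias terms subtracted) and uses $m_B = -2.5\log_{10}(x_0)$ exactly as defined in the text, without any additional zeropoint offset. The absolute magnitude used in the example is a placeholder, not a value from this analysis.

```python
import math

ALPHA = 0.148  # nominal stretch coefficient quoted in the text
BETA = 3.112   # nominal color coefficient quoted in the text


def peak_mag_from_x0(x0):
    """m_B = -2.5 log10(x0), following the definition given in the text."""
    return -2.5 * math.log10(x0)


def distance_modulus(m_b, x1, c, abs_mag, alpha=ALPHA, beta=BETA, delta_bias=0.0):
    """Tripp-style distance modulus; the sign of delta_bias is an assumption."""
    return m_b + alpha * x1 - beta * c - abs_mag - delta_bias


if __name__ == "__main__":
    # abs_mag = -19.25 is a hypothetical placeholder for illustration only.
    print(distance_modulus(m_b=15.0, x1=1.0, c=0.1, abs_mag=-19.25))
```

With these inputs, a SN that is one unit broader in stretch is treated as 0.148 mag more luminous, while one that is 0.1 redder in color is treated as about 0.31 mag fainter, before any bias correction.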
In addition, we compute a light-curve fit probability ($P_{\rm fit}$), which is the probability of finding a light-curve data-model χ² as large or larger assuming Gaussian-distributed flux uncertainties. In Figure 2, the light curves of the 42 SNe Ia used for the determination of H$_0$ in the second-rung distance ladder of R22 are shown with overlaid light-curve fits using the SALT2 model. All light-curve fit parameters for the sample will be made available in machine-readable format as described in Appendix B and shown in Fig. 7. The parameters from the fits are given for the set of light curves before the majority of the selection cuts in Table 2 are applied; these cuts are discussed in the following section.
Note (Table 1): The different samples included in the Pantheon+ compilation, the number of SNe that are in the cosmology sample and the number from the full sample, the redshift range, and the reference. We provide fitted light-curve parameters for all the light curves with a converged SALT2 fit as part of the data release, but the cosmological analysis is done only with the SNe that pass all the cuts listed in Table 2.
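The fit probability defined above is the upper-tail (survival) probability of a χ² distribution. The sketch below evaluates it with the standard library only, using the series expansion of the regularized lower incomplete gamma function; it is illustrative rather than a description of how SNANA computes the quantity, and the function name is hypothetical.

```python
import math


def chi2_survival(chi2_value, dof, max_terms=100000):
    """P(X >= chi2_value) for a chi-square variable with `dof` degrees of freedom."""
    if chi2_value <= 0.0:
        return 1.0
    a = 0.5 * dof
    x = 0.5 * chi2_value
    # Series: gamma(a, x) = x^a e^-x * sum_n x^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if abs(term) < 1e-16 * abs(total):
            break
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    lower = total * math.exp(log_prefactor)  # regularized P(a, x)
    return min(1.0, max(0.0, 1.0 - lower))


if __name__ == "__main__":
    # A fit with chi2 = 60 for 40 degrees of freedom.
    print(chi2_survival(60.0, 40))
```

Because the result is formed as one minus the lower tail, this sketch loses precision for extremely small probabilities; that is not a concern at the 0.001 to 0.01 thresholds used for the cuts in this analysis.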
Finally, in the discussion about the results on siblings and duplicates below, we refer to the distance-covariance matrix. For this, we follow Conley et al. (2010), which defines a covariance matrix C with
$$C_{z_i,z_j} = \sum_k \frac{\partial \Delta\mu_{z_i}}{\partial S_k}\,\frac{\partial \Delta\mu_{z_j}}{\partial S_k}\,\sigma_k^2, \qquad (2)$$
where the summation is over the systematics $S_k$ (indexed by k), $\Delta\mu_{z_i}$ are the residuals in distance for the SNe fitted between different systematics, and $\sigma_k$ gives the magnitude of the systematic uncertainty. Any additional covariance between the ith and jth SNe that is not due to systematics can be included in that element of the covariance matrix.
Figure 2. Light curves of all SNe Ia used for the SN Ia-Cepheid calibration (second rung of the distance ladder). When a SN has been observed by multiple surveys, multiple light curves are shown for each filter. The SALT2 fit from each light curve is overplotted. Certain filters (e.g., I and sometimes R) are not included in the fit when the observed-frame filter is outside the used SALT2 wavelength range of 300-700 nm.
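A systematic covariance matrix of this kind can be assembled as a sum of outer products. The sketch below assumes the commonly used form $C_{ij} = \sum_k (\partial\Delta\mu_i/\partial S_k)(\partial\Delta\mu_j/\partial S_k)\,\sigma_k^2$, in which each systematic contributes the distance shift per unit of that systematic for every SN; the function name, input layout, and numbers are hypothetical and are not taken from B22b.

```python
def systematic_covariance(shifts_by_systematic, sigmas):
    """Sum of outer products of per-SN distance shifts, weighted by sigma_k^2.

    shifts_by_systematic[k][i] is the change in distance modulus of SN i per
    unit of systematic k; sigmas[k] is the size of systematic k.
    """
    n_sn = len(shifts_by_systematic[0])
    cov = [[0.0] * n_sn for _ in range(n_sn)]
    for shifts, sigma in zip(shifts_by_systematic, sigmas):
        for i in range(n_sn):
            for j in range(n_sn):
                cov[i][j] += shifts[i] * shifts[j] * sigma ** 2
    return cov


if __name__ == "__main__":
    # Two hypothetical systematics acting on two SNe.
    shifts = [[0.01, 0.02], [0.0, -0.01]]
    sigmas = [1.0, 2.0]
    for row in systematic_covariance(shifts, sigmas):
        print(row)
```

By construction the matrix is symmetric, and a systematic that shifts two SNe in the same direction adds positive covariance between them.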

Selection Requirements
For this compilation, we require all SNe Ia to have adequate light-curve coverage in order to reliably constrain light-curve-fit parameters. We also limit ourselves to include SNe Ia with properties in a range well represented by the training sample in order to limit systematic biases in the measured distance modulus. The sequential loss of SNe Ia from the sample owing to cuts is shown in Table 2. We define $T_{\rm rest}$ as the number of days since the date of peak brightness $t_0$ in the rest frame of the SN. Following Scolnic et al. (2018), we require an observation before 5 days after peak brightness ($T_{\rm rest} < 5$). As with Betoule et al. (2014), we also require the uncertainty in the fitted peak date of the light curve (PKMJD) to be < 2 observed-frame days to ensure precision in the fit. We require $-3 < x_1 < 3$ and $-0.3 < c < 0.3$, the ranges over which the light-curve model has been trained. Furthermore, we require that the uncertainty in $x_1$ is < 1.5 to help avoid pathological fits or inversion issues for systematic uncertainty covariance matrices.
For all samples (though only applicable at low z) we require limited Milky Way extinction following Betoule et al. (2014) and Scolnic et al. (2015), $E(B-V)_{\rm MW} < 0.2$. We follow past analyses of specific samples in order to employ a minimum $P_{\rm fit}$ cut: this is done for DES, PS1, and SDSS with levels of 0.01, 0.001, and 0.001, respectively. These different levels are determined from comparisons of distributions of $P_{\rm fit}$ from data and simulations, and depend on the accuracy of the SALT2 model and on the precision of the photometric errors given for SN light-curve measurements. SNLS is the only large, high-z sample in which a $P_{\rm fit}$ cut is not applied, and this is because Betoule et al. (2014) found no difference in the accuracy of the fitted light curves with low $P_{\rm fit}$. We see similar insignificant differences in Hubble residuals or fit parameters between SNe with high and low $P_{\rm fit}$ as Betoule et al. (2014) do, but retain the usage of $P_{\rm fit}$ to be consistent with how SNLS was previously used. Finally, we remove all SNLS and DES SNe from the sample for z > 0.8, as B22a find large (∼ 0.2) differences in µ for these SNe depending on the inclusion of the U band at low redshift in the SALT2 training samples, and we are unable to calibrate U through cross-calibration. In total, 59 SNe are removed owing to this cut.
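The selection requirements listed in this section can be collected into a single predicate. The following is a minimal sketch: the thresholds are those quoted in the text, but the dictionary keys and function name are hypothetical, and the bias-correction, systematic-perturbation, and outlier cuts described afterward are not included.

```python
# Minimum fit-probability thresholds quoted in the text; surveys not listed
# here have no P_fit requirement in this sketch.
PFIT_MIN = {"DES": 0.01, "PS1": 0.001, "SDSS": 0.001}


def passes_selection(sn):
    """Apply the light-curve quality cuts to one fitted light curve (a dict)."""
    checks = [
        sn["min_trest"] < 5.0,    # an observation before 5 rest-frame days after peak
        sn["pkmjd_err"] < 2.0,    # peak-date uncertainty, observed-frame days
        -3.0 < sn["x1"] < 3.0,    # stretch within the trained range
        -0.3 < sn["c"] < 0.3,     # color within the trained range
        sn["x1_err"] < 1.5,       # stretch uncertainty
        sn["mwebv"] < 0.2,        # Milky Way E(B-V)
    ]
    pfit_min = PFIT_MIN.get(sn["survey"])
    if pfit_min is not None:
        checks.append(sn["pfit"] > pfit_min)
    if sn["survey"] in ("SNLS", "DES"):
        checks.append(sn["z"] <= 0.8)  # high-z SNLS and DES SNe are removed
    return all(checks)


if __name__ == "__main__":
    example = {"survey": "DES", "z": 0.5, "min_trest": -3.0, "pkmjd_err": 0.5,
               "x1": 0.2, "x1_err": 0.3, "c": 0.05, "mwebv": 0.03, "pfit": 0.5}
    print(passes_selection(example))
```

Keeping the per-survey thresholds in a lookup table makes it explicit that the same light curve can pass under one survey's $P_{\rm fit}$ requirement and fail under another's.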
In the penultimate row of Table 2 ("Valid BiasCor"), 10 light curves are lost owing to their light-curve properties falling within a region of parameter space that is too sparsely populated in the simulation to yield a meaningful bias prediction. Bias corrections are discussed in detail by B22b. Additionally, there are 60 more light curves that are lost owing to the requirement that they pass all the cuts discussed above for the 40 systematic perturbations discussed by B22b in order to create the covariance matrix in Equation 2. For example, varying the SALT2 model will change the recovered $c$ or $x_1$ values, which could then be outside the allowed ranges. Additionally, B22b place a cut on SN distance modulus values in the Hubble diagram based on Chauvenet's criterion. We label the number cut here in Table 2, and this is discussed in detail by B22b. In total, 1701 light curves pass all the cuts, though as discussed below, a significant fraction of these are duplicate SNe.
Note (Table 2): Impact of various cuts used for cosmology analysis. Both the number removed from each cut, and the number remaining after each cut, are shown. The "SALT2 converged" criterion is the starting point for this assessment and includes all light curves for which the fitting procedure converged. Of the 1701 light curves that pass all cuts, 151 are "Duplicate" SNe.

Host-galaxy Properties
In order to allow the use of host-galaxy information that may improve light-curve standardization (e.g., Sullivan et al. 2010; Kelly et al. 2010; Lampeitl et al. 2010; Popovic et al. 2021), we rederived host properties for all SNe Ia with z < 0.15 so that they can be measured consistently. For z > 0.15 and higher-z surveys, we use the masses provided from the respective analyses: Betoule et al. (2014) for SNLS, Sako et al. (2018) for SDSS, Scolnic et al. (2018) for PS1, and Smith et al. (2020) for DES. We discuss consistency across these different samples below. For the HST surveys as listed in Table 1, masses were not originally derived for the majority of the host galaxies, so we followed a similar procedure as below but using photometry directly from the publicly available images acquired as part of the surveys given in Table 1.
There are three steps we follow to determine the masses of the host galaxies:
1. Identify the host galaxy.
2. Measure photometry of the host galaxy.
3. Fit a galaxy SED model to the data.
For host-galaxy identification in the low-z sample, we followed the work of C22 and used the directional-light-radius method described by Sullivan et al. (2006) and Gupta et al. (2016) to associate a host galaxy with each SN Ia. All host-galaxy identifications were visually inspected for quality control. We then retrieved images from GALEX (Martin et al. 2005), PS1 (Chambers et al. 2017), SDSS (Ahumada et al. 2020), and 2MASS (Skrutskie et al. 2006). We measure aperture photometry on the images, and use the PS1 r band to measure the size of the host galaxy "ellipse." We then use that ellipse size to measure consistent elliptical aperture photometry for every image of the source. We use ugriz SDSS photometry rather than griz PS1 photometry when both are available, as PS1 has some background-subtraction defects for bright hosts (Jones et al. 2019).
In order to determine host-galaxy properties from the photometry of the galaxies, we used the LePHARE SED-fitting method (Ilbert et al. 2006). The galaxy templates use the Chabrier (2003) initial mass function and were taken from the Bruzual & Charlot (2003) library. The values of the extinction E(B − V) varied from 0 to 0.4 mag. For galaxies for which LePHARE was not able to determine a host mass, we first confirm that the hosts are faint and have not been misidentified, and then we assign them to the low-mass bin.
A plot of the trend of host-galaxy masses for our largest samples (CSP, Foundation, CfA3, DES, SDSS, SNLS, PS1) is shown in Figure 3. When we compare different estimates of host-galaxy mass from varying the photometry or mass-fitting technique, we find typical differences on the level of 0.2 dex (see, e.g., Sako et al. 2018; Smith et al. 2020), which would make up some of the differences between the median masses of different samples. Another way to quantify this is to measure the difference in the relative ratio of high-mass to low-mass hosts (where the separator is $10^{10}\,M_\odot$) between different surveys. Doing so, we find that typical differences in the same bin between surveys on the order of 15% would cause ∼ 0.01 mag biases for a mass step of 0.06 mag if they were systematic and not random. As there is no evidence of systematic biases beyond the 0.2 dex scale, this number is used to account for systematics in B22b. Furthermore, we find relatively good agreement with past estimates compiled in Pantheon, with the typical differences between median masses in the same bin on the level of 0.5 dex.
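The ∼ 0.01 mag figure quoted above follows from a one-line estimate: if the fraction of SNe assigned to the high-mass bin were systematically off by some amount, the mean standardized brightness of the sample would shift by that fraction times the mass step. The sketch below reproduces this arithmetic; the function name is hypothetical and the linear scaling is a simplifying assumption.

```python
def mass_step_bias(high_mass_fraction_diff, mass_step_mag):
    """Mean distance bias (mag) from a systematic shift in the high-mass fraction."""
    return high_mass_fraction_diff * mass_step_mag


if __name__ == "__main__":
    # 15% difference in the high-mass fraction, 0.06 mag mass step (from the text).
    print(round(mass_step_bias(0.15, 0.06), 4))
```

The result, 0.009 mag, is consistent with the ∼ 0.01 mag bias stated in the text.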

Trends of SN Parameters and Comparison to Previous Analyses
We show the evolution of the light-curve fit parameters with redshift in Figure 4. As seen in previous analyses, we do find nonzero evolution of these parameters with redshift. These are modeled by Popovic et al. (2021), who describe a separate mass distribution for low-z (e.g., CfA1-4, CSP) and high-z (SDSS, SNLS, PS1, DES) samples.
In total, there are 1701 light curves of 1550 SNe, significantly more than the number of SNe in Pantheon (1048) or JLA (742). All but 14 of the SNe in Pantheon are in Pantheon+, and all but 10 of the SNe in JLA are in Pantheon+. In B22a, we show the differences between the µ values found in Pantheon+ and those found in Pantheon and JLA. The largest differences are due to the calibration of the SALT2 model, which is revised by B22a. We note that the issue of revising the CfA3K and CfA3S system definition mentioned previously does cause a ∼ 0.025 mag change (toward fainter distance-modulus values) relative to Pantheon. Additionally, the sample source of the SN is given in the last column, and we include measurements of the same SN from multiple samples where available.

Sibling Supernovae
As part of this analysis and that of C22, we have determined the host galaxy for each SN in our sample. We can then query for galaxies that have hosted more than one SN that make it to the Hubble diagram. Note that owing to our strict quality cuts, this number is fewer than the total number of SN siblings. We find 12 galaxies that have hosted SN siblings, as listed in Table 3. We include the measurements from different samples if a SN has been observed by multiple telescopes. Two of the galaxies hosted three SNe, and we consider all pair-wise combinations of the triplets.
Comparing the properties of the SNe, we find the standard deviation of the differences in $c$ of 0.10, in $x_1$ of 1.04, and in µ of 0.32 mag. We can compare these values to those taking random pairs of SNe at low z by bootstrapping: $c$ of 0.12, $x_1$ of 1.6, and ∆µ of 0.22, where ∆µ subtracts off the best-fit cosmology to account for two SNe having two different redshifts. A median 0.22 mag difference is consistent with expectations for SNe with a dispersion of ∼ 0.16 mag, which is the RMS on the Hubble diagram found in B22b. We find that the uncertainties in the standard deviation are 0.023 in $c$, 0.33 in $x_1$, and 0.043 in ∆µ. Therefore, we find that the $x_1$ values for the siblings are ∼ 2σ closer than two random SNe, the $c$ values are < 1σ closer, but the µ values are 2.4σ further apart in the siblings than any random pair of SNe. The relatively high agreement in $x_1$ but low agreement in ∆µ is consistent with the findings of  for 8 pairs of siblings found in the DES sample: there are indications that $x_1$ is correlated for SNe in the same hosts, but no significant evidence that the ∆µ values are correlated. This insight is important for creating the systematic covariance matrix of B22b: no covariance should be given for measurements of SN distances in the same galaxy.
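The comparison between siblings and random pairs can be illustrated with a simple bootstrap. The sketch below is one plausible way to carry out such a test, not necessarily the procedure used here: it repeatedly draws the same number of random pairs from a parent sample, records the standard deviation of the pair differences, and expresses the sibling value as a number of σ from the random-pair expectation. The function names, the number of bootstrap draws, and the toy parent sample are all assumptions.

```python
import random
import statistics


def random_pair_dispersion(values, n_pairs, n_boot=1000, seed=0):
    """Mean and scatter of the std of differences for n_pairs random pairs."""
    rng = random.Random(seed)
    stds = []
    for _ in range(n_boot):
        diffs = []
        for _ in range(n_pairs):
            a, b = rng.sample(values, 2)  # two distinct SNe
            diffs.append(a - b)
        stds.append(statistics.pstdev(diffs))
    return statistics.mean(stds), statistics.stdev(stds)


def significance(sibling_std, random_std, random_std_err):
    """Positive if siblings agree better than random pairs, in units of sigma."""
    return (random_std - sibling_std) / random_std_err


if __name__ == "__main__":
    # Toy parent sample of stretch values (hypothetical numbers).
    toy = [0.1 * i - 2.5 for i in range(50)]
    print(random_pair_dispersion(toy, n_pairs=12, n_boot=200))
    # Values quoted in the text: x1, c, and distance modulus.
    print(round(significance(1.04, 1.6, 0.33), 2))
    print(round(significance(0.10, 0.12, 0.023), 2))
    print(round(significance(0.32, 0.22, 0.043), 2))
```

Applied to the rounded numbers quoted above, this gives about 1.7σ for $x_1$, about 0.9σ for $c$, and about −2.3σ for the distance modulus (negative meaning the siblings are further apart), in line with the ∼ 2σ, < 1σ, and 2.4σ statements once rounding of the inputs is allowed for.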

Duplicate Supernovae
We denote SNe that have been observed by multiple surveys as "Duplicate SNe." As discussed by R22 and B22b, unlike in previous analyses, we do not choose between specific versions of the SNe and instead propagate each fit from each survey, and then include a covariance term between the duplicate SNe in our final covariance matrix used for cosmology. Not all duplicate SNe have the same given name, and we therefore search on RA, DEC, and PKMJD for duplicate SNe. In total, there are 151 SNe which have been observed by more than one survey, with all but one duplicate SN having z < 0.1.
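A positional and temporal cross-match of this kind can be sketched as follows. The matching tolerances (5 arcsec and 10 days) and the record keys are hypothetical choices for illustration; the values actually used in the analysis are not stated in the text.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def find_duplicates(light_curves, max_sep_arcsec=5.0, max_dpeak_days=10.0):
    """Pairs of light curves from different surveys that look like the same SN."""
    pairs = []
    for i, a in enumerate(light_curves):
        for b in light_curves[i + 1:]:
            if a["survey"] == b["survey"]:
                continue
            close_on_sky = angular_sep_arcsec(a["ra"], a["dec"], b["ra"], b["dec"]) < max_sep_arcsec
            close_in_time = abs(a["pkmjd"] - b["pkmjd"]) < max_dpeak_days
            if close_on_sky and close_in_time:
                pairs.append((a["name"], b["name"]))
    return pairs


if __name__ == "__main__":
    # Hypothetical records: the first two are the same SN under different names.
    sample = [
        {"name": "SN A", "survey": "S1", "ra": 150.0, "dec": 2.0, "pkmjd": 58000.0},
        {"name": "obj-17", "survey": "S2", "ra": 150.0002, "dec": 2.0001, "pkmjd": 58000.8},
        {"name": "SN B", "survey": "S2", "ra": 151.0, "dec": 2.0, "pkmjd": 58001.0},
    ]
    print(find_duplicates(sample))
```

Matching on position and peak date rather than on name is what allows objects reported under different designations to be linked.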
We find a standard deviation of the differences in the pairs of 0.102 mag. Following a similar bootstrapping procedure as above, and only using low-z SNe, we calculate a typical dispersion for 151 pairs of random SNe (correcting for redshift differences) of 0.218 mag with 0.011 uncertainty. Therefore, the distances of the same SN measured by two separate surveys agree by > 10σ better than two random SNe. This insight is again important for creating the systematic covariance matrix in B22b: the intrinsic scatter of a SN Ia should be shared for measurements of the same SN by different surveys; from Equation 2, $C_{z_i,z_j} = \sigma_{\rm int}^2$, where the i-th and j-th light curves are of the same SN from different samples, and $\sigma_{\rm int}$ is the intrinsic scatter of the sample.
In Figure 5, we present a comparison of the distance modulus of the SN duplicates between surveys. We do not find any deviations from the mean beyond 2σ. The largest deviation is from LOSS1 (Ganeshalingam et al. 2010) at 2.0σ. B22a show the mean distance modulus residuals for each subsample for all surveys and do not find any magnitude deviations greater than 0.05 mag with the exception of CfA1. Our results here generally support the agreement found by B22a.
Furthermore, in Table 4, we give the fraction of the sample each survey contributes to the 2nd and 3rd rungs of the distance ladder described in R22, where the 3rd rung has the limit of z < 0.15 and those in the 2nd rung are SNe found in nearby galaxies with associated Cepheid measurements. (We note that the baseline determination of H$_0$ further limits the 3rd rung sample to z > 0.0233 and to late-type hosts.) Assuming (gray) survey errors, an estimate of the error in H$_0$ from survey miscalibration results from the difference in these fractions multiplied by the mean residual of each survey from the full compilation. We give the fractional difference between these two rungs by sample and the survey residual calculated by B22a (see Figure 6) in Table 4. If one multiplies the fractional difference between rungs by the Hubble residual offsets, this describes the sensitivity of H$_0$ (in magnitudes, not km/s/Mpc) to possible discrepancies of sample offsets. We find that the largest fractional difference is due to Foundation at ∼ 23%, and the majority of the fractional differences are between 2% and 15%. After multiplying these differences by the Hubble residual offsets, we find the products are all below 4 mmag. This would imply a sensitivity in H$_0$ on the level of 0.2%. This also illustrates the benefit of using a similar mix of surveys for both samples. Because we cannot avoid using a mix of surveys for the 2nd rung (these objects are rare), the use of a single sample for the 3rd rung would propagate an error in H$_0$ at the level of ∼ 1% as shown in Brownsberger et al. (2021).
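The conversion from a magnitude offset to a fractional change in H$_0$ used in this argument can be made explicit. Because the distance modulus is $5\log_{10}$ of the distance and H$_0$ scales inversely with distance, a small offset δµ corresponds to a fractional change of size $(\ln 10/5)\,\deltaµ \approx 0.46\,\deltaµ$. The sketch below applies this to the numbers in the text; the function names are hypothetical, and the 0.01 mag survey offset in the example is a placeholder rather than a value from Table 4.

```python
import math


def h0_sensitivity_mag(frac_third_rung, frac_second_rung, survey_offset_mag):
    """Product of the rung-fraction difference and the survey's mean offset (mag)."""
    return (frac_third_rung - frac_second_rung) * survey_offset_mag


def mag_to_fractional_h0(delta_mag):
    """Size of the fractional H0 change for a small distance-modulus offset."""
    return math.log(10.0) / 5.0 * abs(delta_mag)


if __name__ == "__main__":
    # Upper bound on the products quoted in the text: 4 mmag.
    print(round(100 * mag_to_fractional_h0(0.004), 2), "percent")
    # A 23% fraction difference with a hypothetical 0.01 mag survey offset.
    print(round(1000 * h0_sensitivity_mag(0.33, 0.10, 0.01), 2), "mmag")
```

A 4 mmag offset corresponds to roughly 0.18% in H$_0$, which matches the 0.2% level stated above.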

DISCUSSION AND CONCLUSIONS
In this paper, we presented the new "Pantheon+" sample that is used in a series of analyses for cosmological parameter measurements. The challenge of a compilation analysis like this one is documentation, and unlike previous analyses, we attempt here to document key properties about the samples (photometric system, data location, references) to improve reproducibility in the future.
The Pantheon+ analysis improves on the Pantheon analysis in nearly every facet. Not only do we increase the sample size, but we do a comprehensive review of the redshifts (C22) and peculiar velocities (Peterson et al. 2021), a new calibration and model retraining for the sample (B22a), and new cosmological analyses by R22 and B22b. In Section 2, we detail data that have been added to the previous Pantheon compilation, as well as changes to the data that were previously used. As these samples date back 40 yr, we have made a significant effort to check assumptions about how data have been passed from analysis to analysis, rather than assuming previous analyses have understood each facet correctly.
The size of a sample like this will soon be surpassed by other samples from newer and upcoming surveys like the Zwicky Transient Facility (ZTF; Dhawan et al. 2021), the Young Supernova Experiment (YSE; Jones et al. 2021), the Dark Energy Survey (DES; Smith et al. 2020), the Legacy Survey of Space and Time (LSST; Ivezić et al. 2019), and the Nancy Grace Roman Space Telescope (Roman; Hounsell et al. 2018). These surveys may find a similar number of SNe to this compilation in only a matter of days. However, the usefulness of the Pantheon+ sample, particularly at low redshift, is unlikely to be surpassed for some time owing to its utility for constraining the Hubble constant. For this measurement, we are statistically limited by the number of SNe in nearby galaxies in which Cepheids can be found, which is typically one SN discovered per year (R22).
Two of the findings from this paper will be used to create the systematic covariance matrix of B22b. The first is that we find excellent agreement when different surveys measure the same SNe, and the second is that we find relatively poor agreement when surveys measure distances of two SNe in the same galaxy. The latter of these findings will be best tested with LSST, which can find over 800 siblings (The LSST Dark Energy Science Collaboration et al. 2018). Finally, we show that because of our effort to include samples that cover the second and third rungs of the distance ladder, the accuracy of the H$_0$ measurement will not be limited by possible discrepancies in measurements of the SN distances by sample.
Note (Table 4): The relative fractions of SN samples by survey (accounting for duplicates) for the 3rd rung of distance ladder (z < 0.15) SN sample, the 2nd rung of distance ladder Cepheid-hosted SN sample in R22, the difference between the two, the mean offset by survey given in B22a in Fig. 6, and the product of the survey offset with the fractional difference. The product indicates the size of the sensitivity of H$_0$ (in mag, not km/s/Mpc; divide by ∼ 2 for % units in H$_0$) to survey miscalibration or other issues. See Brownsberger et al. (2021) for more information about this sensitivity.
Figure 7. Display of what a .FITRES file from SNANA's SALT2 light-curve fitting looks like; it has all the information from the light-curve fit, as well as ancillary information. A value of −9 is given where information is unavailable. The full file will be included at pantheonplussh0es.github.io.
B. SN DATA INFORMATION