The Pantheon+ Analysis: Cosmological Constraints

We present constraints on cosmological parameters from the Pantheon+ analysis of 1701 light curves of 1550 distinct Type Ia supernovae (SNe Ia) ranging in redshift from $z=0.001$ to 2.26. This work features an increased sample size, increased redshift span, and improved treatment of systematic uncertainties in comparison to the original Pantheon analysis and results in a factor of two improvement in cosmological constraining power. For a Flat$\Lambda$CDM model, we find $\Omega_M=0.334\pm0.018$ from SNe Ia alone. For a Flat$w_0$CDM model, we measure $w_0=-0.90\pm0.14$ from SNe Ia alone, H$_0=73.5\pm1.1$ km s$^{-1}$ Mpc$^{-1}$ when including the Cepheid host distances and covariance (SH0ES), and $w_0=-0.978^{+0.024}_{-0.031}$ when combining the SN likelihood with constraints from the cosmic microwave background (CMB) and baryon acoustic oscillations (BAO); both $w_0$ values are consistent with a cosmological constant. We also present the most precise measurements to date on the evolution of dark energy in a Flat$w_0w_a$CDM universe, and measure $w_a=-0.1^{+0.9}_{-2.0}$ from Pantheon+ alone, H$_0=73.3\pm1.1$ km s$^{-1}$ Mpc$^{-1}$ when including SH0ES, and $w_a=-0.65^{+0.28}_{-0.32}$ when combining Pantheon+ with CMB and BAO data. Finally, we find that systematic uncertainties in the use of SNe Ia along the distance ladder comprise less than one third of the total uncertainty in the measurement of H$_0$ and cannot explain the present"Hubble tension"between local measurements and early-Universe predictions from the cosmological model.


INTRODUCTION
Type Ia supernovae (SNe Ia) anchor the standard model of cosmology with their unmatched ability to map the past 10 billion years of expansion history. SNe Ia provided the first evidence of the accelerating expansion of the Universe (Riess et al. 1998;Perlmutter et al. 1999), and they remain invaluable because they are (1) bright enough to be seen at large cosmic distances, (2) common enough to be found in large numbers, and (3) can be standardized to ∼ 0.1 mag precision in brightness or ∼ 5% in distance per object.
Statistical leverage from large samples of SNe Ia has grown rapidly over the last 3 decades, and wellcalibrated and standardized compilations of these samples have facilitated measurements of the relative expansion history across the redshift range 0 < z < 1 characterized by the equation-of-state parameter of dark energy (w = P/(ρc 2 )), and the measurement of the Hubble constant H 0 , the current expansion rate determined from absolute distances. Measurements of w are constrained from the comparison of standardized SN Ia magnitudes over a wide range of redshifts obtained from different surveys with different observing-depth strategies. Measurements of H 0 require very nearby (< 50 Mpc, ∼ 1 discovered per year) SNe Ia found by multiple surveys in galaxies that host calibrated primary distance indicators [e.g., Cepheids, tip of the red giant branch (TRGB)] which are then compared to SNe in the Hubble flow, often from the same surveys.
However, simply combining several subsamples into a large sample of SNe Ia does not provide meaningful gains without rigorous cross-calibration, self-consistent analysis of their light curves and redshifts, and characterization of their numerous sources of related uncertainties or covariance. As samples and compilations grow, ever greater attention must be paid to the control of systematic uncertainties which would otherwise dominate sample uncertainties.
This analysis, Pantheon+, is the successor to the original Pantheon analysis ) and builds on the analysis framework of the original Pantheon to combine an even larger number of SN Ia samples and include those that are in galaxies with measured Cepheid distances in order to be able to simultaneously constrain parameters describing the full expansion history (e.g., Ω M , w 0 , w a ) with the local expansion rate (H 0 ). The original Pantheon compilation of 1048 SNe Ia was used to measure a value (from SNe Ia alone) of w = −1.090 ± 0.220. Riess et al. (2016), in their measurement of the local expansion rate H 0 , used a prerelease version of Pantheon based on Scolnic et al. (2015) and further augmented the sample as Pantheon did not extend to reach the low redshifts of the primary distance indicators at z < 0.01.
Although there was significant overlap in data and analysis between the Pantheon measurement of w and the H 0 measurement of Riess et al. (2016), Riess et al. (2016) included several Cepheid calibrator SNe Ia that were not included in Pantheon and the fitting for H 0 and parameters describing the expansion history were done independently rather than simultaneously. Dhawan et al. (2020) later established a framework for consid-ering the covariance between SNe in primary distance indicator hosts and SNe in the Hubble flow. We build on that framework, which was developed originally for a redshift-binned Hubble diagram, and in this paper we create the first unbinned sample with covariance extending down to z = 0.001 that can be used to propagate correlated systematics for simultaneous measurements of H 0 , Ω M , w 0 , and w a . We analyze the largest set of cosmologically viable SN Ia light curves to date, include low-redshift samples to extend the lower bound in redshift to 0.001 which contains the primary distance indicators (SNe in SH0ES Cepheid host galaxies), propagate systematic uncertainties for both primary distance indicators and higher-redshift SNe simultaneously, and leverage the large strides made in the field of SN Ia cosmology since the original Pantheon.
This paper is the culmination of a series of papers that comprise the Pantheon+ analysis. A graphic of an overview of the numerous Pantheon+ supporting analyses, on which this paper heavily relies, is shown in Fig. 1. Details of each paper pertinent to this analysis are described in Section 3. In short, these papers include (Scolnic et al. 2021, hereafter S22), which describes the sample of 1701 cosmologically viable SN Ia light curves of 1550 distinct SNe, which we will refer to as "the Pantheon+ sample." The redshifts and peculiar velocities of the SNe used here are given by Carr et al. (2021) and a comprehensive analysis of peculiar velocities is presented by Peterson et al. (2021). The crosscalibration of the different photometric systems used in this analysis can be found in (Brout et al. 2021, hereafter Fragilistic), and calibration-related systematic uncertainty limits are determined by Brownsberger et al. (2021). The underlying SN Ia populations describing the dataset are given by Popovic et al. (2021b). The model for intrinsic brightness variations was developed by  and then improved and evaluated by Popovic et al. (2021a). The novel systematic framework for simultaneous measurement of H 0 and cosmology is developed by Dhawan et al. (2020), and improved methodology for systematic uncertainties is described by .
In this work we discuss briefly the aforementioned papers in the context of their use in this analysis, evaluate several additional systematic uncertainties not addressed in these works, measure cosmological parameters, examine additional signals in the Hubble diagram, and compile systematic uncertainty budgets on cosmological parameters. A companion paper by the SH0ES Team (Riess et al. 2021, hereafter R22) combines from this work 277 Hubble flow (0.023< z <0.15) SNe Ia and 42 SNe Ia in Cepheid-calibrator hosts, their relative dis-tances, and their covariance, with the absolute distances of primary distance anchors (Cepheids and TRGB) from R22 in order to measure H 0 under the assumption of FlatΛCDM. Similarly, in this work we utilize the full Pantheon+ sample of 1550 SNe Ia in combination with the R22 Cepheid host distances to show the impact of cosmological models with more freedom than those used in R22 as well as the impact of SN-related systematic uncertainties on inferences of H 0 .
An important aspect of this work is the public release of the data and simulations used here that allow for the reproduction of multiple different stages of this analysis. In Appendix C, we present the numerous products that will be made available, including SN distances, redshifts, uncertainties, covariance, and extensive SNANA simulations (Kessler et al. 2009) of the data that model astrophysical effects, cosmological effects, and the observation/telescope effects of each survey down to the level of cadence, weather history, etc. We encourage the community to validate alternate analyses of the publicly released Pantheon+ sample on these simulations.
The structure of the paper is as follows. In Section 2, we describe the methodology from fitting SN light curves to constraining cosmological parameters. Section 3 summarizes all of the inputs to the analysis including the data sample, calibration, and redshifts. In Section 4, we describe the cosmological results. Sections 5 and 6 are our discussions and conclusions, respectively.

Measuring Distances to SNe Ia
To standardize the SN Ia brightnesses we fit light curves using SNANA with the SALT2 model as originally developed by Guy et al. (2010) and updated in , hereafter SALT2-B22. For each SN, the SALT2 light-curve fit returns four parameters: the lightcurve amplitude x 0 where m B ≡ −2.5log 10 (x 0 ); x 1 , the stretch parameter corresponding to light-curve width; c, the light-curve color that includes contributions from both intrinsic color and dust; and t 0 , the time of peak brightness. Extinction due to Milky Way dust is accounted for in the SALT2 light-curve fitting. From the parameters m B and x 1 , c, we standardize the SN brightnesses and infer distance modului (µ), used in the Hubble diagram, with a modified version of the Tripp (1998) distance estimator. Following (Kessler & Scolnic 2017, hereafter BBC), the distance modulus is defined as where α and β are global nuisance parameters relating stretch and color (respectively) to luminosity. M is the fiducial magnitude of an SN Ia, which can be calibrated by setting an absolute distance scale with primary distance anchors such as Cepheids. δ bias is a correction term 1 to account for selection biases that is determined from simulations following Popovic et al. (2021b), described in detail in Appendix A. δ host is the luminosity correction (step) for residual correlations between the standardized brightness of an SN Ia and the host-galaxy mass, where γ is the magnitude of the SN Ia luminosity differences between SNe in high (M > 10 10 M ) and low (M < 10 10 M ) stellar mass galaxies and where 'hostless' SNe have been assumed to reside in galaxies with low stellar mass. M is the inferred stellar mass measured in units of solar mass (M ) from spectral energy distribution (SED) fitting to the photometry of each host galaxy, S is the step location (nominal analysis assumes S = 10 10 M ), and τ M describes the width of the step. The total distance modulus error, σ µ , for SN i is described as where σ meas is the measurement uncertainty of SALT2 light-curve fit parameters and their associated covariances (see Eq. 3 of Kessler & Scolnic 2017) resulting from photometric uncertainties. The measurement uncertainty is scaled by f (z i , c i , M ,i ) specific to each survey in order to account for selection effects that can reduce the observed scatter at the limits of each sample. The uncertainty contribution from gravitational lensing as given by Jönsson et al. (2010) is σ lens = 0.055z. We note that, as discussed by Kessler et al. (2019a), the correct lensing distribution is utilized in simulations. The nominal distance modulus uncertainty contribution due to the combination of redshift measurement uncertainty (σ z ) and peculiar velocity uncertainty (σ vpec ) have both been converted to distance modulus uncertainty under the assumption of a cosmological model. Chen et al. (2022) note that the optimal way to characterize redshift measurement uncertainty at high redshifts (e.g., the DES sample, z > 0.3) is to float the redshift and use the uncertainty in redshift as a prior in the lightcurve fit. However, following previous analyses we fix the redshift and include the associated distance modulus uncertainty σ z in Eq. 3, which is a correct estimate at low redshifts (z < 0.1). Lastly, σ floor represents the floor in standardizability owing to intrinsic unmodeled variations in SNe Ia such that where σ 2 scat (z i , c i , M ,i ) is determined from a model that describes intrinsic brightness fluctuations and σ 2 gray is a single number representing a gray (color independent) floor in standardizability for all SNe Ia; σ 2 gray determined after the BBC fitting process in order to bring the Hubble diagram reduced χ 2 to unity. The details of σ 2 scat (z i , c i , M ,i ), its model dependence, and its contribution to systematic uncertainties are discussed in further detail in Section 3.3.2 and Appendix A.
To determine the distance modulus values of all the SNe, we follow the BBC fitting process with updates to increase the dimensionality of bias corrections in Popovic et al. (2021b). The likelihood (as given in Eq. 6 of Kessler & Scolnic 2017) results in a cosmologyindependent minimization of the free parameters (α, β, γ, σ gray ) that minimize the scatter in the Hubble diagram. While the BBC process was designed for utility for photometric cosmology analyses and uses SN Ia classification probabilities, the data analyzed here are a spectroscopically confirmed SN Ia sample, and therefore we set the non-Ia SN probabilities to 0 for the whole sample.

The Covariance Matrix
Following Conley et al. (2011), we compute covariance matrices C stat & C syst to account for statistical and systematic uncertainties and expected correlations between the SN Ia light curves in the sample when analyzing cosmological models. BBC produce both a redshift-binned and an unbinned Hubble diagram, enabling both binned and unbinned covariance matrices. For the original Pantheon , JLA (Betoule et al. 2014), and DES3YR , C stat and C syst were redshift-binned matrices (or smoothed as a function of redshift) citing computational limitations. Following , in this work we utilize the unbinned Hubble diagrams to create unbinned covariance matrices. The Pantheon+ sample ) also includes "duplicate SNe Ia," SNe Ia that have been observed simultaneously by numerous different surveys, so that statistical covariance C stat is computed as where each row of the matrix corresponds to an SN light curve, the diagonal of C stat is the full distance error (σ 2 µ ) of the i th light curve, and where measurement noise from components other than the light curve itself are included as off-diagonal covariance between entries corresponding to light curves of the same SN (SN i = SN j ) observed by two different surveys.
Systematic uncertainties can manifest in three key places in the analysis: (1) from changing aspects affecting the light-curve fitting (e.g., survey photometry, calibration, SALT2 model), (2) from changing redshifts that propagate to changes in distance modului relative to a cosmological model, and (3) from changes in the astrophysical or survey-dependent assumptions in the simulations used for bias corrections. For each of these categories we examine all of the known significant sources of systematic uncertainty (ψ) with sizes S ψ which result in residuals in the Hubble diagram relative to our baseline analysis (µ BASE ). In order to compute the effect of systematics, we first define where µ i ψ is the set of distances for systematic ψ. For systematics that affect redshift, we have included new methodology in Eq. 6 that utilizes a reference cosmological model distance µ ref (z) corresponding to FlatΛCDM (Ω M = 0.3, Ω Λ = 0.7). The µ mod (z ψ ) and µ mod (z BASE ) are the cosmological model distances corresponding to redshifts z ψ and z BASE . In order to propagate redshift effects into a distance×distance covariance matrix, the additional component µ mod (z ψ )−µ mod (z BASE ) accounts for the difference in inferred model distance.
Assuming linearity between ∆µ ψ and ψ, we compute the derivative for each ψ in order to build a 1701×1701 systematic covariance matrix as, which denotes the covariance between the i th and j th light-curve fit summed over the different sources of sys-tematic uncertainty (ψ) with uncertainty σ ψ (see Section 3 for details). As shown by , the σ ψ serve as priors on the known size of systematic uncertainties, but the data itself can constrain the impact of each systematic under the condition that information has not been collapsed by binning/smoothing (as was done for the original Pantheon, JLA, and DES3YR).
Fluctuations of the sample of light curves that pass the sample quality cuts (Table 2 of S22) for different systematics result in an ill-defined covariance matrix. To have a well-defined unbinned covariance matrix requires a subtle treatment in order to ensure that the sample is consistent in both the light-curve fitting and BBC stages across all systematics in the analysis. Quality cuts at the light-curve stage are only applied to the set of SNe based on their values found in the baseline analysis, and this SN sample is used for all systematic tests. We perform the BBC process twice -the first iteration to identify the subset of < 1% of SNe for which bias corrections are unable to be computed, and a second iteration using only the common set of SNe that have valid bias corrections in all systematic variants. The final cosmology sample of 1701 light curves that satisfy all criteria is described in detail in S22 (see the "Systematics" row in Table 2 of S22).
Finally, the statistical and systematic covariance matrices are combined and used to constrain cosmological models:

Cosmology
Constraining cosmological models with SN data using χ 2 has been used in previous SN Ia cosmology analyses (e.g., Riess et al. 1998;Astier et al. 2006) and first included systematic covariance in Conley et al. (2011). Here we follow closely the formalism of Conley et al. (2011) where cosmological parameters are constrained by minimizing a χ 2 likelihood: where D is the vector of 1701 SN distance modulus residuals computed as and each SN distance (µ i ) is compared to the predicted model distance given the measured SN/host redshift (µ model (z i )). The model distances are defined as where d L is the model-based luminosity distance that includes the parameters describing the expansion history H(z). For a flat cosmology (Ω k = 0) the luminosity distance is described by where d L (z) is calculated at each step of the cosmological fitting process, and the parameterization of the expansion history (used in Eq. 12 and therefore in the likelihood Eq. 9) in this work is defined as See Hogg (1999) for the forms of the expansion history H(z) used in the case that the assumption of flatness is relaxed.
The parameters M (Eq. 1) and H 0 (Eq. 13) are degenerate when analyzing SNe alone. However, we also present constraints that include the recently released SH0ES Cepheid host distance anchors (R22) in the likelihood which facilitates constraints on both M and H 0 .
When utilizing SH0ES Cepheid host distances, the SN distance residuals are modified to the following: where µ Cepheid i is the Cepheid calibrated host-galaxy distance provided by SH0ES and where µ i −µ Cepheid i is sensitive to the parameters M and H 0 and is largely insensitive to Ω M or w. We also include the SH0ES Cepheid host-distance covariance matrix (C Cepheid stat+syst ) presented by R22 such that the likelihood becomes  (Zuntz et al. 2015) using 250 live points, 30 repeats, and an evidence tolerance requirement of 0.1. This resulted in converged chains containing 1000-3000 independent samples. We verified the SN-only results with CosmoMC (Lewis & Bridle 2002) and with the fast cosmology grid-search program in SNANA. The likelihood for Pantheon+ and R22 Cepheid host distance samples will be made available in the public version of CosmoSIS. In this work we also utilize the additional public likelihoods in CosmoSIS in order to combine with and assess agreement with external cosmological probes: Planck (Collaboration et al. 2018) and baryon acoustic oscillations (BAO, likelihoods discussed in Section 4).
We blind our analysis in two ways simultaneously. First, we blind the binned distance residuals output by the BBC fit as cosmological parameters could be inferred visually from simply looking at the Hubble diagram. Secondly, in order to prevent accidental viewing of the cosmological parameters themselves, the Cosmo-SIS chains were shifted by unknown values following the formalism of Hinton (2016).

DATA AND ANALYSIS INPUTS
Here we review each component of the dataset and analysis. We discuss the fundamental purpose, the baseline treatment in this analysis, and the systematic uncertainties associated with each aspect (if applicable). The impact of systematics in both distance and cosmological inference is shown in Section 4. We provide a brief overview of this section here. Purpose: The flux-calibrated light-curve photometry is fit to determine the SALT2 parameters used in standardization (Eq. 1). Baseline: The light-curve data is described in detail by S22 and references therein. The full set of spectroscopically classified photometric light curves is compiled from 18 different publicly available and privately released samples. In total, 2077 SN light-curve fits converged using SALT2; after quality cuts are applied (Table 2 of S22), this results in 1701 SN light curves of 1550 unique SNe Ia usable for cosmological constraints. The sample includes a 3.5σ Hubble residual outlier cut to remove 5 potential contaminants that are likely non-normal Type Ia or misidentified redshifts. The sample of cosmologically viable light curves includes 81 light curves of 42 SNe used to calibrate Cepheid brightnesses as utilized by R22. The survey SN photometry compiled in Scolnic et al. (2021) and analyzed here is from DES 1 (Brout et al. 2019b;Smith et al. 2020a), Foundation 1 , PS1 , SNLS (Betoule et al. 2014), SDSS (Sako et al. 2011), HST (Gilliland et al. 1999;Riess et al. 2001;Suzuki et al. 2012;Riess et al. 2018Riess et al. , 2004Riess et al. , 2007 Riess et al. 1999;CfA2, Jha et al. 2006;CfA3, Hicken et al. 2009;CfA4, Hicken et al. 2012, and numerous smaller low-redshift samples 1 of 1-2 SNe given by Burns et al. 2018, Burns et al. 2020, Milne et al. 2010, Krisciunas et al. 2017a, Stritzinger et al. 2010a, Gall et al. 2018, Zhang et al. 2010, Tsvetkov & Elenin 2010, and Kawabata et al. 2020 Systematics: See Calibration Section 3.2.1.

Redshifts
Purpose: The peculiar-velocity corrected CMB-frame redshift of each SN/host is required to compare the inferred distance to a distance predicted by a cosmological model, as given in Eq. 10. Additionally, heliocentric redshifts are required in the SALT2 light-curve fits in order to shift the model spectrum to match the data. Baseline: The redshifts for all of the SNe (and their host galaxies, depending on what is available) are provided by Carr et al. (2021), who performed a comprehensive review of redshifts for the Pantheon+ samples and made numerous corrections. Carr et al. (2021) report the heliocentric redshifts for each SN and convert the redshift into the CMB frame. The redshifts of the Pantheon+ sample cover a range of 0.001 < z < 2.3. While redshifts of the 42 Cepheid host calibrator SNe are included, they are not used in the comparison of SN Ia magnitudes to the Cepheid distance scale and are only provided for reference and for SALT2 fitting. Systematics: Following Carr et al. (2021), we apply a coherent shift to each redshift of +4 × 10 −5 . This was conservatively stated by Calcino & Davis (2017) for the potential size of a local void bias and by Davis et al. (2019) as a potential measurement bias.

Peculiar Velocities
Purpose: Peculiar motions of galaxies arise from coherent flows, motion of halos, inflow into clusters or superclusters, and intragroup motion. Corrections are applied to the observed redshifts (after light-curve fitting) based on peculiar-velocity maps derived from independent large spectroscopic galaxy surveys. Baseline: The nominal peculiar velocities used for this analysis were determined by Peterson et al. (2021) from a comparison of multiple treatments of peculiar-velocity maps and group catalogs. Corrections were applied by Carr et al. (2021) for the Pantheon+ sample. The baseline corrections are based on 2M++ (Carrick et al. 2015) with global parameters found in Said et al. (2020) and combined with group velocities estimated from Tully (2015) group assignments. The σ vpec in Eq. 3 is found using 240 km s −1 after accounting for uncertainties propagated into the covariance matrix described below. This σ vpec floor is in agreement with what was used in Peterson et al. (2021) and for the SNe between 0.001 < z < 0.02 it is likely a conservative estimate as Kenworthy et al. (2022) found for the most nearby SN calibrators a floor of 155 ± 25 km s −1 . This apparent reduction at the lowest redshifts may be due to the peculiar velocity maps having higher fidelity at these redshifts and because Pantheon+ has relatively better virial-group information at these redshifts.
Systematics: Peterson et al. (2021) discuss multiple viable alternatives for the treatment of peculiar velocities. The first approach is to use the 2M++ corrections (Carrick et al. 2015) integrating over the line-of-sight relation (iLOS) between distance and the measured redshift. We take this variation as the first systematic with σ 2 ψ = 0.5. The second approach is to use the 2MRS (Lilow & Nusser 2021) peculiar-velocity map; however, differences between 2MRS and 2M++ at very low redshift (z < 0.01) cause numerical stability issues for offdiagonal C syst elements. We incorporate only the diagonal differences between 2MRS and 2M++ into C syst with σ 2 ψ = 0.5. As a numerically stable estimate of the off-diagonal terms, we use the 2M++ velocities transformed by the slope and offset difference between the 2M++ and 2MRS maps found in Peterson et al. (2021). The two approaches added in quadrature result in an effective σ 2 ψ = 1.0.

Host-Galaxy Properties
Purpose: The observed host-galaxy mass versus SN luminosity relation is used to standardize the SN Ia brightnesses in two ways. First, simulations of the dataset include correlations between SN color and SN stretch and host properties such as dust as a function of host mass following Popovic et al. (2021a). Second, a further residual correction is applied in the Tripp Eq. 1 where the "mass step" γ is fit in the BBC stage. Baseline: The host-galaxy stellar masses are presented by S22 and references therein. Masses are determined for all host galaxies, and star-formation rates and morphologies are also included the low-z sample. In the baseline analysis we apply the mass step at 10 10 M following Pantheon and DES3YR. Systematics: Several independent analyses Kelsey et al. 2020) have suggested that the optimal location of the mass step could range between 10 9.8 M and 10 10.2 M . We therefore include a systematic uncertainty where the mass step occurs at 10 10.2 M .

Calibration
Purpose: Photometric calibration of each passband in each survey is needed to fit light curves and facilitate comparison of the brightnesses of SNe across different telescopes/instruments/filters. Photometric calibration is also important to homogenize spectrophotometric datasets used in the SALT2 model training.
Baseline: The calibration of all 25 photometric systems used in this work is discussed in Fragilistic . The outputs of Fragilistic are a best-fit calibration solution for each of the 105 passbands and a joint 105 × 105 covariance matrix that describes the covariance between the zeropoint calibrations of all passbands that arise from using a single common stellar catalog to tie all surveys together (PS1). Systematics: The systematics due to calibration and their impact are discussed in detail in Fragilistic. We estimate the impact of the correlated filter zero-point and central wavelength uncertainties by refitting SALT2 light curves (with retrained SALT2 models; see next SALT2 Model) using 9 realizations of the 105 zeropoints. For each of the 9 realizations a value of σ 2 ψ = 1/9 is adopted such that they add in quadrature to ∼ 1. The uncertainty in modeling the spectrum of the HST primary standard star C26202 has been tripled to account for the recent update in Bohlin et al. (2020); it is now set to 15 mmag over 7000Å (σ ψ = 3 for a systematic of 5 mmag over 7000Å). Lastly, an additional conservative systematic is included only for the CSP SNe to account for the 2% recalibration in CSP tertiary stellar magnitudes from Stritzinger et al. (2010b) to Krisciunas et al. (2017b) (σ ψ = 1).

SALT2 Model
Purpose: The trained SALT2 model is required to fit light curves and determine the light-curve parameters (m b , c, x 1 ) for each SN used in Eq. 1. Baseline: We use the Fragilistic calibration solution and newly trained SALT2-B22 model 3 which was developed following the formalism of Guy et al. (2010) and Taylor et al. (2021). The SALT2 model includes a component of training statistical uncertainty which is incorporated in the fitted light-curve parameters Systematics: For each of the 9 correlated realizations of Fragilistic filter zero-points and central wavelengths discussed above (for Calibration) we simultaneously retrain the SALT2 model. Additionally, to conservatively account for a possible systematic from the redevelopment of the SALT2 model training process itself, we adopt an additional systematic by fitting the dataset with the SALT2 model trained by Betoule et al. (2014) and applying a scaling of σ ψ = 1/3 (See Section 5 and Fig. 15 for impact).

Milky Way Extinction
Purpose: Values of the Milky Way (MW) Galactic dust extinction, E(B − V ) MW , are applied to the SALT2 model spectra during both the model training process and during the data light-curve fitting process. The "extinction curve" describes the relation between the amount of reddening and extinction as a function of wavelength. Baseline: We account for MW extinction using maps from Schlegel et al. (1998), with a scale of 0.86 following Schlafly et al. (2010). We assume the MW extinction curve from Fitzpatrick (1999) with R V = 3.1. Systematics: Similarly to Pantheon, we adopt a global 5% uncertainty scaling of E(B − V ) MW based on the fact that Schlafly & Finkbeiner (2011a), in a reanalysis of Schlafly et al. (2010), derive smaller values of reddening by 4%, despite using a very similar SDSS footprint (σ ψ = 1). While Schlafly & Finkbeiner (2011b) found that their results prefer the Fitzpatrick (1999) extinction curve, we conservatively include an additional systematic uncertainty in the MW extinction curve and analyze the data (training and light-curve fit) using the Cardelli et al. (1989) and apply a systematic scaling of σ ψ = 1/3 3 released publicly at pantheonplussh0es.github.io  Table 1. References for inputs to SNANA simulations used for this analysis. We give references for the "Cadence," which describes the observing history; the "DETEFF," which describes the detection efficiency based on the signal-to-noise ratio (SNR); and the "SPECEFF," which describes the spectroscopic selection efficiency as a function of SN magnitude.
as this reflects the preference of Fitzpatrick (1999) over Cardelli et al. (1989).

Survey Modeling
Purpose: We utilize catalog-level simulations of large samples of SN Ia (> 1, 000, 000 per survey) light curves. SNANA simulations specific to each survey in our analysis are prescribed by each aspect of acquiring an SN Ia sample. As detailed in Figure 1 of Kessler et al. (2019a), the simulations require three main sets of inputs: A Source Model for generating SNe with realistic astrophysical properties and applying cosmological effects such as redshifting, dimming, lensing, peculiar velocities, and MW extinction. A Noise Model, unique to each survey, for applying instrumental and atmospheric noise to determine a detection efficiency ("DETEFF").
A Trigger Model, unique to each survey, that includes the observing cadence and describes an efficiency as a function of B-band peak magnitude for detecting SNe and obtaining a spectroscopic confirmation ("SPEC-EFF").. These simulations for each survey are combined and used to forward model the underlying populations of the SN properties (Popovic et al. 2021a,b) and to determine the expected biases in measured SN distances that follow from the known selection effects. These biases are corrected in the δ bias term of Eq. 1. Baseline: Depicted in Fig. 2 are the distributions of the key observables (z, x 1 , c) for both data and simulations of each survey used in this analysis. We find good agreement between the data and simulations, as described in detail by Popovic et al. (2021a) and Popovic et al. (2021b). We note that the agreement in the redshift dimension is achieved despite not explicitly tuning the redshift distribution of surveys. We simulate SNe in LOW-z and Foundation down to z = 0.001. Novel for this work specifically are the simulations of primary distance indicator hosts of SNe in the range 0.001 < z < 0.01 which are assumed to have the same color and stretch populations as those of their respective surveys (LOW-z and FOUND), and specifically over this redshift range they are assumed to be complete with flat spectroscopic selection efficiency. These simulations facilitate bias corrections to the Cepheid calibrator SNe and thus the propagation of modeling systematics to the SNe used in the companion SH0ES analysis (R22).
The simulation inputs for survey cadence, DETEFF, and SPECEFF functions have been evaluated in many analyses over the past decade. Table 1 shows a summary of where we obtain these inputs for each survey. Survey metadata is used to model the cadence and instrumental properties, if available, such as for FOUND, SDSS, PS1, DES, and SNLS. LOW-z data do not provide such metadata, and thus the cadence and noise properties are extracted from the data as described in Section 6 of Kessler et al. (2019a) following the procedure developed by Scolnic et al. (2018b), which assumes that the LOW-z subset of SNe is magnitude-limited. These are simulations of the CfA and CSP samples, but not of the newer samples included in this work (LOSS, SOUSA, CNIa0.02), thereby implicitly assuming that the CfA and CSP samples have similar selection effects and therefore distance biases as the newer additions. To simulate SN-host correlations, a catalog of host-galaxy properties and specifically their stellar-mass distributions are taken from Popovic et al. (2021b). The simulations used for bias corrections for all surveys are performed in ΛCDM (w = −1.0, Ω M = 0.3, Ω Λ = 0.7) with the SALT2-B22 model. Systematics: We increase the SNR of each simulation by 20%, resulting in all survey simulated distributions changing by more than 1σ, as a single conservative systematic in the determination of the selection biases. Kessler & Scolnic (2017) showed that the sensitivity of the bias corrections to the input cosmology is relatively weak; this was confirmed by Brout et al. (2019a) and found to be a negligible contribution to SN Ia uncertainty budgets. We therefore do not include this as an additional systematic.

Intrinsic Scatter Models
Purpose: A model of the intrinsic SN brightness variations, called "intrinsic scatter," is needed to account for the observed residual variation in SN Ia standardized luminosities that exceeds expectations from measurement uncertainties alone. In addition, models of the true ("parent") populations of SN Ia SALT2 parameters c and x 1 are required for the Source Model in SNANA. The intrinsic scatter model is utilized in the bias-correction simulations. Baseline: We utilize the Brout & Scolnic 2021, hereafter BS21 model that prescribes SN Ia scatter into two color-dependent components: a standard cosmological color law specific to SNe Ia and additional dust-based color laws and dust extinctions that vary with each galaxy/SN. This approach is preferred because of its novel replication of the observed relationships between SN color and residual Hubble diagram scatter as well as its ability to replicate the "mass step" as a function of SN Ia color. We use the scatter model parameters from BS21 with improvements from Popovic et al. (2021a) in our baseline bias-correction simulations; because the BS21 model includes within it the parent c population, we also utilize the separate parent population for x 1 derived by Popovic et al. (2021b). Improving upon Scolnic & Kessler (2016), Popovic et al. (2021b) fit for parent populations in bins of mass to account for host-SN Ia relationships. Popovic et al. (2021b) split their populations into high-and low-redshift groups, and notably for low-redshift surveys the x 1 populations are fitted with a two-Gaussian model to recreate the observed double peak in the x 1 distribution. Systematics: We include two categories of systematics for the intrinsic scatter model and parent populations: (1) different models of intrinsic scatter, and (2) determination of parameters for the BS21 model. For the former, we use two additional scatter models from Kessler et al. (2013) that have been used in previous cosmology analyses (JLA, Pantheon, DES3YR). These are (1) the "G10" model based on Guy et al. (2010) which describes ∼ 70% of the excess Hubble scatter from "gray" variations and the remaining scatter from wavelengthdependent variations, and (2) the "C11" model based on Chotard (2011) which describes ∼ 30% of the excess Hubble scatter from coherent variations, and the remaining scatter from wavelength-dependent variations. For the G10 and C11 scatter models, bias corrections are performed in 7-D as given by Popovic et al. (2021b). For the systematic uncertainty in the determination of the BS21 model parameters we adopt three different viable sets of dust and intrinsic SN populations from Popovic et al. (2021a). These populations are the best-fit (maximum likelihood) parameters (hereafter P21), the mean posterior set of parameters, and a set that represent a 1σ fluctuation in the uncertainty. Lastly, while the BS21 and P21 models impact the simulated bias corrections, the SALT2 training and light-curve fitting has not been altered.
The choice of scatter model is propagated through the simulations used for the bias corrections applied in Eq. 1 and for the uncertainty modeling in σ scat of Eq. 4.

Distance-Modulus Uncertainty Modeling
Purpose: To match the reported SN distance-modulus uncertainties (Eq. 3) to the scatter in distance that is observed in the data. Baseline: The BS21 model parameters have been fit to the observed scatter in the dataset. We can utilize large BS21 simulations to determine σ scat (z, c, M ) after accounting for selection effects. The efficacy of this method is shown in Fig. 3, which demonstrates good agreement between the observed RMS of the Hubble residuals and the uncertainties of the distance-modulus values. Systematics: To conservatively account for how SN cosmology was done in the past (JLA, Pantheon), in Eq. 3 we set σ scat (z, c, M ) = 0 and allow only a single σ gray Figure 3.
Pantheon+ distance-modulus uncertainties (shown as dashed lines with mean σµ and split on host mass) in comparison to the observed root-mean square (RMS) of the distance-modulus residuals (shown as solid lines as RMS µ split on host mass), as a function of color. This shows that the distance errors are adequately modeled (Eq. 4) as a function of SN color and host stellar mass. In previous analyses, the uncertainties were roughly flat as a function of color.
parameter to replicate the methodology used with historic intrinsic scatter models (G10 and C11). However, we note that for G10 and C11, the trends in RMS seen for the data in Fig. 3 do not match the reported uncertainties.

Validation
Purpose: To verify that our analysis can recover input values in data-sized simulated samples and does not produce biases. Such tests are sensitive to the light-curve fitting and BBC technique (as well as implementation and coding errors); however, they are not sensitive to certain aspects of the analysis such as the assumption of the SALT2 model or photometric calibration. Baseline: We perform an end-to-end test of our baseline analysis pipeline from survey photometry catalog-level simulations. We create 20 realizations of each survey in an arbitrary cosmological model (w = −1): 10 with the BS21 scatter model and 10 with the G10 scatter model. We perform light-curve fitting, apply bias corrections, compile into 10 Hubble diagrams, and maximize the cosmological likelihoods (Eq. 9) using a fast cosmology grid-search program in SNANA (Kessler et al. 2009), with approximate priors from CMB measurements (Planck Collaboration et al. 2018) to obtain best-fit cosmological parameters and uncertainties. For the BS21 model simulations we recover a mean best-fit w = −1.012 ± 0.011 and for the G10 model simulations we recover a mean best-fit w = −0.983 ± 0.015; both are within ∼ 1σ of the input cosmology. The 20 realizations are made available publicly 4 along with bias-correction simulations.

Standardization Parameters
The standardization nuisance parameters α, β, γ, and σ gray defined in Eq. 1 and 3 are shown in Table 2 for each of the scatter models used in this work. The best-fit α are similar across scatter models to within ∼ 1σ. The best-fit β values differ across models owing to different treatments of SN Ia color; however, the values for the baseline dust model (BS21) and the P21 dust model are self-consistent.
As shown in Table 2, the additional σ gray term for the BS21 and P21 models is found to be zero. As discussed in Sec. 2, this is consistent with the expectation that if the simulations correctly model the intrinsic scatter and noise of the data, the σ scat (z, c, M ) term of Eq. 3 is sufficient to describe the distance-modulus uncertainties with σ gray = 0. As discussed in Appendix A, for our G10 and C11 systematic treatment, σ scat (z, c, M ) is set to 0, and therefore σ gray ≈ 0.10 approximates the scatter, though it does not account for the observed color dependence.
Table 2 also shows that the best-fit host stellar mass corrections (γ) are consistent with zero for BS21 and P21. This is in agreement with the findings of Popovic et al. (2021a), that modeling the intrinsic scatter in bias-correction simulations with correlations that match those in the observed data removes the need for ad hoc corrections in intrinsic brightness (i.e., γ = 0). This can also be seen in Fig. 5. For the bias correction based on the G10 and C11 models that do not include any mass dependence, the resulting γ is ∼ 0.05 found at 7σ confidence.

The Hubble Diagram
The Pantheon+ Hubble diagram of 1701 SN Ia light curves compiled from 18 different surveys and ranging in redshift from 0.001 to 2.26 is shown in the top panel of Fig. 4. In the bottom panel of Fig. 4 are the residuals to the best-fit cosmology (Eq. 10). Best-fit cosmological parameters will be presented in the following subsections.
Shown in Table 2 is the total observed scatter (RMS) in the Hubble diagram residuals to the best-fit model (bottom of Fig 4) for different scatter models. The BS21 model results in the lowest Hubble diagram RMS and χ 2 , a > 5σ improvement determined from the difference in likelihoods relative to the G10 and C11 scatter models. The observed scatter of ∼ 0.17 mag is larger than seen in the original Pantheon because Pantheon+ extends to lower redshifts and thus is more impacted by scatter induced by peculiar velocities. If we set the minimum redshift to 0.01, the total scatter is reduced to 0.15 mag, matching that of Pantheon. Finally, compared to the original BS21 analysis, P21 uses a more rigorous fitting process that is optimized to better characterize SN Ia colors and intrinsic scatter in addition to Hubble residuals. For this reason, the improvements of P21 are not solely described by the cosmological model likelihood L of Table 2. We therefore have included the use of P21 population parameters as a systematic uncertainty.

The Very Nearby Hubble Diagram
We note from Fig. 4 that in the very nearby Universe, z < 0.008 (v < 2400 km s −1 ), the mean of the Hubble diagram residuals is positive by ∼ 5% at ∼ 2σ significance. This is seen after the use of peculiar-velocity maps from either 2M++ or 2mrs. A similar signal is also seen in The distance-modulus residuals relative to a best-fit cosmological model with binned data for reference (black points). Both the data errors and the binned data errors include only statistical uncertainties. At z < 0.01, the sensitivity of peculiar velocities is very large, and the uncertainties shown reflect this uncertainty. Dashed line is the predicted Hubble residual bias stemming from biased redshifts due to volumetric effects in the very nearby universe (assuming 250 km s −1 uncorrected velocity scatter). Distances (µ) follow Eq. 1 and include α, β, δ bias , and δ host corrections. Binned data are shown for reference (black). No significant residual correlations are seen. the Hubble residuals of the Cepheid distances (Kenworthy et al. 2022). A bias of roughly this size and direction is expected in the presence of measurement errors and unmodeled peculiar velocities which scatter more objects down from higher redshifts and greater volume than from the reverse. This effect is significant only for the most nearby galaxies (z < 0.008). In Fig. 4, we include the prediction (dashed line) for this bias assuming 250 km s −1 uncorrected velocity scatter (not a fit). In the the 3-rung distance ladder utilized to measure H 0 by the SH0ES Team (R22) and in Eq. 14 in this work, the nearby (z <∼ 0.01) Hubble diagram is not used. Rather, only the distance moduli from such nearby SNe are used in the SN-Cepheid absolute distance calibration in the 2 nd rung. Furthermore, in the R22 measurement of the Hubble flow, only SNe with redshifts z > 0.023 are used in the 3 rd rung to limit sensitivity to peculiar velocities. This approach is insensitive to the volumetric redshift scatter effects and there is no resulting impact on the R22 H 0 . However, more local measurements of H 0 from, for example, a 2-rung distance ladder using primary distance indicators like Cepheids and TRGB and their host redshifts (mostly at z ≤ 0.01) are more sensitive to peculiar velocities and the volumetric bias they induce, and are likely to be biased low at the few percent level if not appropriately accounting for this expected bias (Kenworthy et al. 2022). For measurements of other cosmological parameters (e.g., w or Ω M ) with Pantheon+ described in the following subsections, the mean Hubble residual bias of the Low-z and Foundation sample is ∼ 2 mmag and ∼ 1 mmag (respectively), and is considered to be negligible.

The Distance Covariance Matrix
Built following Eq. 7, the 1701×1701 systematic distance covariance matrix is shown in Fig. 6. The sample is sorted by survey and redshift to help visualize the co-variances. The Hubble diagram residuals (Eq. 10) that are used to build the covariance matrix are shown in Fig. 7 for several example sources of systematic uncertainty. As discussed in Appendix C, the information used to create the Hubble diagram as well as the covariance matrix is publicly available 5 and tools to read in this information are in CosmoSIS. The SDSS subsample contributions to the covariance matrix (Fig. 6) stand out visually due to their strong spectroscopic selection function.

Constraints on Cosmological Parameters From Pantheon+ and SH0ES
Parameter constraints from the Pantheon+ SNe Ia and SH0ES Cepheid host absolute distances are shown in Table 3 for FlatΛCDM, ΛCDM, FlatwCDM, and Flatw 0 w a CDM. Unless otherwise stated, constraints on cosmological parameters include both statistical and systematic uncertainties. From the Pantheon+ SNe Ia, for a FlatΛCDM model we find Ω M = 0.334 ± 0.018. We note that SH0ES (R22) utilizes Pantheon+ SNe at z < 0.8 to constrain the deceleration parameter and find q 0 = −0.51 ± 0.024. In a flat universe q 0 = 3Ω M 2 − 1, which gives Ω M = 0.326 ± 0.016, consistent with the result for Ω M reported in this work. Results for H 0 from the inclusion of the SH0ES Cepheid host distances are discussed below.
The constraints on Ω M and Ω Λ for a ΛCDM model are shown in Fig. 8. We find Ω M = 0.306 ± 0.057 and Ω Λ = 0.625 ± 0.084; a flat universe is within the 68% confidence region and Ω M = 0 and Ω Λ = 0 are together rejected at 4.4σ using only the SNe. For a FlatwCDM model, from the SNe Ia alone (not including SH0ES Cepheid calibration) we find Ω M = 0.309 +0.063 −0.069 and w = −0.90±0.14 as shown in the third row of Table 3 and in the blue contour of Fig. 9. This result is consistent within 1σ of the cosmological constant (w = −1).
For a Flatw 0 w a CDM model, from the SNe Ia alone (not including SH0ES Cepheid calibration) we find w 0 = −0.93 ± 0.15 and w a = −0.1 +0.9 −2.0 as shown in the fourth row of Table 3 and in Fig. 10. These results are again consistent with a cosmological constant.
Using distances and a stat+syst covariance matrix that extends to the Cepheid calibrators (Eq. 15) and combining the Pantheon+ SNe with the SH0ES Cepheid host distance calibration, we are able to robustly and simultaneously constrain H 0 and other cosmological parameters describing the expansion history. While we use SH0ES Cepheid data and covariance in this 5 pantheonplussh0es.github.io Figure 8. Confidence contours at the 68% and 95% level for the ΩM and ΩΛ cosmological parameters for the ΛCDM from the Pantheon+ dataset, as well as from the Planck and combined BAO datasets. The constraints from including both the statistical and systematic uncertainties (shaded red) are shown as well as when only statistical uncertainties are propagated (unfilled dashed). We include two lines for reference: one for a flat universe, where ΩM + ΩΛ = 1 and the other that indicates an accelerating universe.
work, likewise Pantheon+ distances and covariance are used in Section 5.2 of R22 in order to fit H 0 and q 0 in FlatΛCDM. As shown in the top Pantheon+ & SH0ES section of Table 3, for ΛCDM, FlatwCDM, and Flatw 0 w a CDM we find H 0 = 73.4 ± 1.1, 73.5 ± 1.1, and 73.3 ± 1.1 km s −1 Mpc −1 , respectively. We note that more complex models do not result in decreased H 0 constraining power from the SNe Ia + Cepheids, while this is not necessarily true for other cosmological probes (Sec. 4.4).

Constraints on Cosmological Parameters From Multiple Probes
In this work we combine the Pantheon+ SNe with external cosmological probes: CMB from Planck   Notes: Summary of marginalized parameter constraints for Pantheon+ and other external probes. The mean and 68% confidence limit are provided for each cosmological parameter. A blank value indicates a parameter not used in the cosmological fit.   mentioned BAO constraints are denoted "allBAO"; we also provide constraints from the combination of spectroscopic redshift galaxy-only subset of BAO probes denoted "galaxyBAO." We report constraints in Table 3 for combinations of datasets that are deemed compatible and discussed below. For a FlatwCDM model when combining Pan-theon+ and Planck we find w = −0.982 +0.022 −0.038 and Ω M = 0.325 +0.010 −0.008 , and when further including allBAO we find w = −0.978 +0.024 −0.031 and Ω M = 0.316 +0.005 −0.008 , both of which are consistent with the cosmological constant at ∼ 3% (Fig. 11). As can be seen in Fig. 9, we do not include SH0ES in combinations with Planck because these measurements are incompatible (R22).
For a Flatw 0 w a CDM model when combining Pan-theon+ and Planck we find w 0 = −0.851 +0.092 −0.099 and w a = −0.70 +0.49 −0.51 , and when combining Pantheon+, Planck, and BAO we find w 0 = −0.841 +0.066 −0.061 and w a = −0.65 +0.28 −0.32 , which is moderately consistent (2σ) with a cosmological constant (Fig. 12). We note that this result is not driven by any single probe. In Fig. 10 we show constraints for Planck alone and for the combination of Planck & Pantheon+. While the broader model freedom of the Flatw 0 w a CDM allows the Planck alone H 0 to be consistent with 73 km s −1 Mpc −1 owing to degeneracy between H 0 and w a (see Fig. 10), after combining Planck with Pantheon+, the H 0 /w a degeneracy is broken (H 0 = 67.4 +1.1 −1.2 km s −1 Mpc −1 ). Therefore, the inclusion of SH0ES with Planck & Pantheon+ results in a Bayesian evidence ratio of −9, and we deem this set of probes incompatible and do not include them in Fig. 10 nor Table 3.

Impact of Systematics on Cosmological Parameter Fits
To understand the impact of systematic uncertainties, in Table 4 we group the systematics investigated in this work into four main categories: Calibration/SALT2, Redshifts, Astrophysics, and Modeling. The baseline, systematic treatments (S ψ ) and scaling priors (σ ψ ) (as described in detail in Section 3) are summarized for each source. The final three columns of Table 4 relate to fits of the sample when combined with Planck Collaboration et al. (2018) in a FlatwCDM model when isolating that systematic. We define both the change in best fit (∆w sys ) and the systematic uncertainty contribution to w (σ sys w ) as follows: where w sys and σ wtot are the cosmological constraints when utilizing C stat+sys and where w stat and σ wstat are the statistical-only constraints when utilizing C stat . We find that the final systematic uncertainty in w (σ sys = 0.019) is comparable yet smaller (∼ 80%) than the statistical uncertainty, suggesting that the measurement is not systematics dominated. The largest contribution to the systematic error budget (0.011) is due to the potential for redshift-measurement bias. This is followed by the uncertainties in the Fragilistic calibration offsets and the resulting propagation to SALT2 modeltraining uncertainties and light-curve fitting uncertainties (0.009). Additionally important is the conservative uncertainty that was applied owing to the usage of the new SALT2 training methodology (0.008) as well as the uncertainty in the MW extinction maps (0.008).
Interestingly, numerous systematic uncertainties are found to be negligible (e.g., BS21 Parameters, G10 versus C11) in the cosmological parameter budget. While certain systematics cause redshift-dependent trends as shown in Fig. 7, they also change the relative scatter of the Hubble residuals. This can most easily be seen for the cosmological likelihood values (L) for the distances with different intrinsic scatter models shown in Table 2. If the baseline analysis is significantly preferred (larger L) by the data over one of the analysis variants, the impact of that systematic on cosmological constraints will be reduced, as is the case for intrinsic scatter.
As we have built a covariance matrix that includes the Cepheid calibrators, we can measure H 0 with and without systematic uncertainties. For FlatΛCDM, we find H 0 = 73.6 ± 1.1 km s −1 Mpc −1 , and when considering only statistical uncertainties from the SNe alone (excluding Cepheid and physical distance calibration uncertainties) σ stat+syst H0 = 0.7 km s −1 Mpc −1 , and σ syst H0 = 0.29 km s −1 Mpc −1 . This suggests that SN systematic uncertainties are not dominating the constraint on H 0 and cannot explain the ∼ 7 km s −1 Mpc −1 difference between Planck and SH0ES.
In Figure 13 we show deviations to the best-fit H 0 for each individual source of systematic uncertainty relative to the baseline analysis and assuming ΛCDM. For reference we also show the full SN contribution to the H 0 error bar (dashed). The deviations from the baseline (∆H 0 ) are small and add in quadrature to 0.32 km s −1 Mpc −1 . We note that when assessing redshift-specific systematics, because model redshifts are not used for the SN-Cepheid calibration in Eq. 14, they mainly impact the Hubble-flow SNe (third rung of the distance ladder).
Finally, to help visualize the impact of systematic uncertainties, we show in Fig. 8 the constraints when in- Figure 13. The impact on recovery of H0, as explained in Sec. 2, of the systematic uncertainties described in Table 4. The units of these measurements are km s −1 Mpc −1 . The dashed lines are given at ∆H0 of 0.7, which is the entire contribution of the uncertainty in R22 from SN measurements.
cluding either statistical-only uncertainties or the combined statistical and systematic uncertainties. Error budgets for different cosmological parameterizations can be generated with the delineated files for systematics provided as part of this release.

Local Structure in the SN Ia Hubble Diagram
Large compilations of SN distances have provided impetus for searches of local structure, over/underdensities, and proper motion (e.g., Mathews et al. 2016;Soltis et al. 2019;Hu et al. 2020). As an initial study, we create sky maps of the SN Hubble diagram residuals (see Fig. 14) and examine two specific areas on the sky that have been documented in the literature and have sufficient SN statistics in the Pantheon+ sample for study.

The CMB Kinematic Dipole
The motion of the Milky Way and Solar System relative to the CMB rest frame (v = 369.82 km s −1 ) is corrected for following Carr et al. (2021) and Peterson et al. (2021). The effect of the CMB dipole motion can be seen in the z HEL sky map (middle right of Fig. 14), where z HEL is the heliocentric redshifts. The z CMB skymap (middle left of Fig. 14) has the CMB dipole-causing peculiar redshift removed following Eq. 7 of Peterson et al. (2021). The direction of the CMB dipole, l = 264 • and b = 48 • (red o in Fig. 14), is shown for reference as well as its antipole (red x). Notes: A summary of the systematic uncertainties and the baseline component of the analysis as described in Sec. 3, the size of the systematic SΨ used to determine the impact of that systematic, the scaling of the systematic σ ψ as constrained in this analysis, and the contribution to the total uncertainty in wCDM (can be compared to statistical uncertainty of 0.03), and the shift when allowing the uncertainty on the best-fit cosmological parameter. The last column shows the simplistic change in best-fit cosmology if a perturbation of size σ ψ is applied with statistical-only uncertainties. The amount shown is different than seen for the combined shift for best-fit and increase of uncertainty given in the previous columns due to the self-calibration as explained by .
As discussed in Section 3.1.3, we examine different velocity reconstructions due to local structure that include estimates of the bulk flow; these are the 2M++ (Carrick et al. 2015) and 2MRS (Lilow & Nusser 2021) corrections and are shown in the top row of Fig. 14. These corrections also include the CMB dipole correction. Peterson et al. (2021) show that the peculiar velocity corrections overall reduce the Hubble residual scatter by ∼ 10%, and this is qualitatively confirmed in our maps. The heliocentric map shows a strong dipole as expected; the z CMB map shows the dipole somewhat removed but with an overcorrection (as expected at low-z because local galaxies share some of our motion); and both z HD maps show that the peculiar velocity corrections have removed most of the overcorrection. However, both reconstructions produce a small signal that can be seen in the maps in the direction opposite the motion causing the CMB dipole. This signal is found to be local, at z < 0.02, and grows with decreasing redshift until z ≈ 0.01 (bottom left of Fig. 14). A possible reason that there is a residual signal in the negative dipole direction in both the z CMB and peculiar velocity corrected redshifts is that the MW motion is coupled with the motion of nearby galaxies in a way that is not yet sufficiently modelled. It is also likely that this is due to low-number statistics (this is only a 1σ deviation) and the uneven sky coverage (the SNe in this region are mostly clustered in Stripe-82). Lastly we note that the positive residuals are driven by SNe at z < 0.02, and thus are not included in the SH0ES  sample and inference of H 0 .

The CMB Cold Spot
The "CMB cold spot," a 5 • region of −70µK centered at (l ∼ 209 • , b ∼ −57 • ), was first detected in data from the Wilkinson Microwave Anisotropy Probe (Vielva et al. 2004;Cruz et al. 2006), and subsequently in Planck data (Gurzadyan et al. 2014). Evidence for an underdensity aligned with the CMB cold spot was presented by Rudnick et al. (2007). Szapudi et al. (2015) and Kovács et al. (2021) subsequently found the Eridanus supervoid in the direction of the cold spot at z ≈ 0.15. However, it is not clear if the alignment of Eridanus and the CMB cold spot is causal or coincidental.
We find a signal in the Pantheon+ Hubble diagram when examining SNe within a 20 • radius of the location of the CMB cold spot ( blue circle region in top-left Fig. 14). The difference in Hubble diagram residuals as a function of redshift is shown in the bottom-right panel of Fig. 14. There are 9 SNe in this region of the sky with redshifts on the near side (0.12 < z < 0.15) and there are 14 SNe on the far side (0.15 < z < 0.20) of the proposed void at z = 0.15, and there is a Hubble residual difference of −0.15 ± 0.06 mag between these two sets of SNe. For an estimate of the significance, we examine 1000 randomly selected 20 • apertures across the sky with at least 8 SNe in each of the near and far redshift ranges split on redshifts between 0.08 and 0.20, and find that deviations with a similar significance occur only 0.2% of the time. We note however, that there are not many independent regions that satisfy the selection criteria and the vast majority of the SNe in the cold-spot selection come from the small deep-field patch within that region. Taking 100 random samples of 10 degree radius from the largest densely sampled region in Pantheon+ (Stripe-82 region) we find no other patch has a significance that exceeds 1.6σ, making the Eridanus patch the most significant step at that redshift in our data.

DISCUSSION
This analysis is the latest in a series of papers that attempt to both grow the compilation of measured SN Ia light curves and improve on the systematic floor. The two most recent compilations and analyses are those of JLA and Pantheon, which respectively included ∼ 40% and ∼ 60% of the SN light curves analyzed here. As seen in Fig. 1 of Scolnic et al. (2021), the majority of the statistical increase for Pantheon+ is in the addition of numerous low-redshift samples extending down to z = 0.001. However, the largest differences in the Hubble diagram are not solely the result of statistical increase, but rather due to improvements in our methodology.
We show in Fig. 15 the difference in inferred distancemodulus values (marginalized over M ) for the Pan-theon+ sample relative to the assumptions used in the JLA analysis, for the three most significant improvements presented in this work. First is the update in the flux cross calibration to the Fragilistic solution, which impacts both the training of the SALT2 model and the zero-points used in light-curve fitting. Second is the impact from updating the MW extinction curve used in JLA (Cardelli et al. 1989) to the Fitzpatrick (1999) relation that is used here. Third is the change resulting from improved modeling of the SN Ia intrinsic scatter; while in this work we adopt the BS21 model, we include the models developed for JLA (G10 and C11) as systematics. Each of these changes has been motivated externally by previous works (e.g., Schlafly & Finkbeiner 2011b;; however, they nonetheless cause shifts in dµ/dz of ∼ 0.05, or ∼ 0.04 in w. Finally, because all of three of these changes have the same sign of dµ/dz slope, rather than canceling each other, when combined in this work they result in a ∼ 0.1 difference in the constraint on w relative to JLA (after combining with CMB).
As discussed by Scolnic et al. (2019), the constraining power of large samples of SNe Ia extends beyond inferences of H 0 and w/Ω M . Large compilations of low-z SNe Ia enable precision measurements of the local growth-of-structure, typically parameterized by f σ 8 (e.g., Huterer et al. 2017;Stahl et al. 2021). Work is ongoing for this measurement using the Pantheon+ sample (Boruah et al., in prep.), which will include validation with simulations as well as propagation of the covariance matrix, which previously would have limited effect on σ 8 calculations owing to smoothing/binning over redshift. While in Sec. 4 we show a Healpix map of Hubble residuals across the sky, there are additional and related tests of anisotropy that can be performed with these data. Previous analyses of the first Pantheon sample (e.g., Colin et al. 2019;Soltis et al. 2019;Andrade et al. 2018;Brownsberger et al. 2019) typically search for radial or hemispherical residuals across the sky. The addition of statistics in the low-redshift sample and improved accounting in Pantheon+ would particularly strengthen these types of studies. A search for matter over/underdensities was performed by Colgáin (2019), which varied the minimum and maximum redshift in the original Pantheon sample and redetermined cosmological constraints. Colgáin (2019) found for Pantheon that Ω M could be < 0 for a low maximum z of ∼ 0.15, though with only ∼ 2σ difference compared to the value of Ω M from the full sample. We show a similar test in Fig. 16 and find relatively stable values of Ω M with no signs of the underdensity seen by Colgáin (2019). The main goal of this work, constraints from SNe Ia alone for a FlatwCDM model, results in stat+syst uncertainties of +0.058 −0.063 and 0.13 for Ω M and w, respectively. This represents a factor of 2 improvement in figure of merit over the original Pantheon (stat+syst uncertainties 0.072 and 0.22 for Ω M and w). This cannot be explained solely by statistical improvements, but rather is also due to a leap in systematics methodology over the original Pantheon and JLA. As shown by , cosmology uncertainty budgets are improved by a factor of ∼ 1.5 when not binning or smoothing data and covariance. In Appendix B we discuss and show a binned error budget for comparison and find a similar factor of 1.5 improvement from this choice alone. In examining the unbinned error budget in Table 4, it can be seen that several systematics are no longer impacting SN Ia cosmology analyses as strongly as had previously been thought. One such example is the negligible size of the parent population systematic despite including three additional sources of scatter model uncertainty, as was also seen by Popovic et al. (2021a). This, as well as the reduction of a number of other systematics in comparison to their size in binned analyses (also shown in Appendix Table 6), is due to the power of the large datasets themselves to self-constrain the size of systematic uncertainties when the systematic itself is not solely degenerate with the cosmological model parameterization. This is especially important because it brings this work from potentially being dominated by systematics to rather being dominated by statistical uncertainties. Furthermore, as shown by , as datasets grow in size, many systematics will continue to shrink without any additional effort. Lastly, it is important to note that approaches such as the Approximate Bayesian Computation method given by Jennings et al. (2016) will not be able to make use of this self-constraining benefit unless additional parameters are included to allow the data themselves to scale the input sizes of the systematic uncertainties (S sys in . While the SN Ia mass step has received much attention in the last decade, we find here that its contribution to the error budget is exceedingly small. Unlike previous analyses, the mass-step treatment in this work is based on a SN color and dust-dependent model (BS21). We find that this more physical model results in smaller scatter in the Hubble diagram (Table 2) and better χ 2 relative to cosmological models which then results in smaller systematic uncertainties. We note that properties of SN Ia host galaxies other than stellar mass have been seen to correlate with SN Ia Hubble diagram residuals. Star-formation rate, specific star-formation rate (sSFR), stellar-population age, and metallicity have all been shown to correlate to varying degrees with the distance-modulus residuals after standardization Lampeitl et al. 2010;Rose et al. 2019;Rigault et al. 2013). For this reason, using sSFR values presented by S22, we also examined the size of a sSFR step in the subset of the Pantheon+ sample for which we have obtained sSFR measurements (z < 0.2). Without applying any bias corrections, we find a significant step in sSFR (across the median sSFR) of 0.031 ± 0.011. However, after applying the nominal set of dust and mass-based bias corrections (BS21) used in this analysis, we find a step in sSFR of 0.008 ± 0.011, consistent with zero. This is likely due to galaxy properties (i.e., stellar mass) being linked to dust properties, and that applying a dust-mass correction is accounting for most, if not all, of the correlations with sSFR and is also tracing the dust distribution.
Going forward, statistical constraints on w and Ω M from SNe will improve significantly owing to upcoming datasets from SN programs of the Dark Energy Survey ( (Hounsell et al. 2018), etc. It is likely that these future datasets will improve the statistical precision by a factor of 100 .
The size of systematic errors on cosmological parameter estimates matched the statistical errors for JLA and the original Pantheon. Systematic uncertainties in this work have been reduced in comparison to Pantheon, and while their impact is still significant, it is no longer the dominant component of the total uncertainty. With the coming surveys, systematics will also likely improve alongside the increase in statistics, as has been the case for previous analyses over the last two decades , and as expected from the impact of systematic 'self-calibration' described in .
As shown in the systematics error budget Table 4, the dominant sources of systematic uncertainty are now from 1) the combination of SALT2 training and calibration of surveys, 2) potential redshift measurement biases, and 3) Milky Way dust systematics. Fortunately there are paths forward for each of these. For survey flux calibration, dedicated programs are needed and there are currently multiple paths underway to improve the fundamental calibration of SN Ia samples and how they are tied to various other samples (e.g., Regnault et al. 2015;Narayan et al. 2019;Stubbs & Brown 2015). There is also ongoing work (Taylor et al. in prep) to train the SALT2 model with more photometric systems which has already shown promising improvements to systematic uncertainties and the ability to constrain rest frame U-band. The systematic from the redshift measurement floor has the potential to be reduced using improved cosmology fitting methodology, although the extent to which the data itself can constrain the size of this floor remains unproven. Alternatively, future large surveys can use multiple spectroscopic instruments and redshifting codes to mitigate potential sources of redshift measurement bias. The Pantheon+ sample is especially sensitive to Milky Way dust systematics because of the differences in the samples used for low and high redshift. At low redshift, to obtain sufficient statistics in a volume limited sample, we have used SNe across the sky and with up to 0.2 in MWEBV, whereas the high redshift surveys have been carried out in low extinction regions of the sky (MWEBV<0.05). Future surveys of larger volumes will be able to mitigate this with a plethora of both low and high redshift in low MW extinction regions on the sky.
Throughout this work, there are a number of upstream components of this analysis that impact downstream analysis steps: i.e. new calibration (or MWEBV Maps/Color law) motivates new SALT2 training, which motivates new fitting of the SN parent populations, which motivates new bias corrections. The Pippin framework (Hinton & Brout 2020), used extensively in this work, was intentionally developed to automate and asynchronize this multistep type of analysis; however, it has yet to incorporate aspects such as the SALT2 retraining  or population fitting (Popovic et al. 2021a). Likely, this framework will need to expand for future analyses.
There is an alternate approach to obtaining cosmology constraints from SNe that has been gaining traction over the last decade. Bayesian Hierarchical Models (BHM) have been developed that utilize bias-corrected observables (Shariff et al. 2016) and that incorporate selection effects directly into the model (Rubin et al. 2015) or likelihood . However, unlike BBC in combination with CosmoSIS, these methods have not been validated with large realistic simulations. As noted in Appendix C, we release as part of this analysis 10 realistic simulations of the Pantheon+ dataset for such validations.
While constraints on w should easily improve with upcoming large SN samples, the road to improving constraints on H 0 is more challenging. There are a limited number of SNe Ia that will explode in the near future within a ∼ 40 Mpc radius, a constraint due to HST discovery limits of Cepheids. At roughly one SN Ia per year, it will take several decades to double the current sample of 42 SNe calibrated by SH0ES Cepheid hosts. Fortunately, we find that the systematics in the measurement of H 0 from the SNe are at the scale of 0.3 km s −1 Mpc −1 as shown in Fig. 13. This is consistent with the general finding of Brownsberger et al. (2021), who showed how robust H 0 is to systematic uncertainties in comparison to the relatively calibration-sensitive constraints of w 0 or Ω M . Lastly, there is ongoing work that combines the progress used here by Peterson et al. (2021) and applies it to a "two-rung" distance-ladder analysis, in which SNe are excluded from the distance ladder (Kenworthy et al. 2022).

CONCLUSION
This work is the culmination of a number of supporting analyses as part of the Pantheon+ effort. In this work, we summarize the various inputs and analyses required to combine the supporting works and ultimately measure distances and cosmological parameters. For the first time we are able to measure the cosmic expansion history and the local distance ladder H 0 simultaneously. We combine our results with additional external probes. Importantly, we release a number of data and analysis products to facilitate reproducing our work by the community. This includes a joint covariance of SNe used for measurements of H 0 and w.
For our main results, we find Ω M = 0.334 ± 0.018 in FlatΛCDM from SNe Ia alone. For a flatw 0 CDM model, we measure w 0 = −0.90±0.14 from SNe Ia alone and w 0 = −0.978 +0.024 −0.031 when combining SNe with constraints on the CMB and allBAO; both are consistent with a cosmological-constant model of dark energy. We also present the most precise measurements to date on the evolution of dark energy in a Flatw 0 w a CDM universe, and measure w a = −0.1 +0.9 −2.0 from Pantheon+ alone and w a = −0.65 +0.28 −0.32 when combining with CMB and BAO data. Finally, while nominal constraints on H 0 are presented in a companion paper by the SH0ES team (R22), we perform joint constraints of H 0 with expansion history and find H 0 = 73.5 ± 1.1 in FlatwCDM, and we show how systematic uncertainties in measurements of the SN component of the distance ladder cannot account for the current level of the "Hubble tension." Table 5. Distance Bias (and Uncertainty) Estimation for Scatter Models G10/C11 BS21/P21 Dimensionality 7d (z, x1, c, M , γ, α, β) 4d (z, x1, c, M ) Mass-step correction γ a fitted parameter γ corrected for within δ bias (γ and δ host consistent with zero) Intrinsic Scatter Floor σ 2 floor = σ 2 gray σ 2 floor = σ 2 scat (zi, ci, M ) + σ 2 gray , applied when f (z, c, M ) > 1 Selection Effects f (z, c) f (z, c, M ) ≤ 1, applied when σ 2 scat (zi, ci, M ) = 0 Notes: Formalism for 4d and 7d bias corrections are described by Popovic et al. (2021b) that depend on the intrinsic scatter model assumed -either G10/C11 or BS21/P21. The statistical and intrinsic scatter uncertainties from Eq. 3 are shown here; the other uncertainty components from Eq. 3 are independent of the scatter model. unity, the necessary σ scat (z, c, M ) is applied and f is set to 1. The resulting f (z, c, M ) and σ scat (z, c, M ) found from simulations are applied to the Pantheon+ data. The method and dimensionality for the application of bias corrections is dependent on the adopted scatter model. Table 5 summarizes the differences between the two main methods used in this work, the first of which is applied when assuming the BS21/P21 scatter model, and the other when assuming the G10 or C11 scatter model. The main difference between these groups of scatter models, as discussed in Sec. 3, is whether the intrinsic scatter is driven by diversity in the reddening ratios R V of the light curves, which affects the application of bias corrections. For both analysis paths, we follow the methodology introduced by Popovic et al. (2021b).

B. BINNED SYSTEMATIC ERROR BUDGET
In Table 6 we show a systematic error budget that is nearly identical to what was performed in Table 4, except that the dataset (∆D) and covariance matrix (C stat+syst ) are binned in 20 redshift bins. This error budget is similar to the methodology performed in the most recent SN cosmology analyses where binned covariance matrices were used (e.g., Pantheon, DES3YR Brout et al. 2019a) and where smoothed data vectors and matrices (which were shown to be equivalent to binned) were used (JLA). The total systematic error when binning is a factor of 1.5 larger (0.029) than when not binning the dataset (0.019).
Systematics that improve the most with unbinned matrices are those with smaller σw unbinned sys /σw binned sys . Binned analyses collapse valuable information in the Hubble diagram down to a single dimension, redshift. We find that as expected, the redshift bias systematic does not improve much at all. This is because systematics that only exhibit redshift dependence are degenerate with cosmological model parameters and cannot be self-constrained by the data as easily. Systematics that exhibit dependence in other parameters (such as SN color) can be drastically reduced in SN Ia cosmological parameter error budgets when not performing binned analyses.

C. PRODUCTS
The following data products that are provided in part by the full suite of Pantheon+ supporting papers are released publicly in machine readable format 6 at pantheonplussh0es.github.io and as part of SNANA and CosmoSIS (where noted). • SN Distance Modulii and Redshifts; this work, Carr et al. (2021), see Table 7 Table 7 is the error on standardized magnitude from the diagonal of the covariance matrix. It is for plotting purposes only and not to be used for cosmological fits. c Due to implementation methodology of this systematic, it has not been performed in the binned case. d The increase in the "HST" systematic is likely due to noise as the values are very small for both binned and unbinned.