Measurements of the Hubble Constant with a Two Rung Distance Ladder: Two Out of Three Ain't Bad

The three rung distance ladder, which calibrates Type Ia supernovae through stellar distances linked to geometric measurements, provides the highest precision direct measurement of the Hubble constant. In light of the Hubble tension, it is important to test the individual components of the distance ladder. For this purpose, we report a measurement of the Hubble constant from 35 extragalactic Cepheid hosts measured by the SH0ES team, using their distances and redshifts at cz<3300 km /s , instead of any, more distant Type Ia supernovae, to measure the Hubble flow. The Cepheid distances are calibrated geometrically in the Milky Way, NGC 4258, and the Large Magellanic Cloud. Peculiar velocities are a significant source of systematic uncertainty at z $\sim$ 0.01, and we present a formalism for both mitigating and quantifying their effects, making use of external reconstructions of the density and velocity fields in the nearby universe. We identify a significant source of uncertainty originating from different assumptions about the selection criteria of this sample, whether distance or redshift limited, as it was assembled over three decades. Modeling these assumptions yields central values ranging from H0 = 71.8 to 77.0 km/s/Mpc. Combining the four best fitting selection models yields H0 = 73.1 (+2.6/-2.3) km/s/Mpc as a fiducial result, at $2.6\sigma$ tension with Planck. While Type Ia supernovae are essential for a precise measurement of H0, unknown systematics in these supernovae are unlikely to be the source of the Hubble tension


INTRODUCTION
The current rate of cosmic expansion, identified as the Hubble constant H 0 , is a fundamental cosmological parameter which sets the age and scale of the universe. In recent times, the unexplained discrepancy between local measures of H 0 based on local distance indicators (e.g., Riess et al. 2021b) and predictions based on measurements of the early universe (e.g., Planck Collaboration et al. 2018) has attracted attention as a "Hubble tension" (Di Valentino et al. 2021). Predictions based on the early universe appear robust, producing the same value at high precision when assuming the ΛCDM model with or without the use of CMB anisotropy data (Addison et al. 2018). Direct measurements of the Hubble constant at low redshift have been conducted using a variety of distance indicators, including water megamasers (Pesce et al. 2020), type II supernovae (de Jaeger et al. 2022), Surface-Brightness Fluctuations (Blakeslee et al. 2021), and Tully-Fisher relations (Schombert et al. 2020;Kourkchi et al. 2020), and modelling of gravitationally lensed quasars (Wong et al. 2020;Birrer et al. 2020). However, the most precise measurements in the literature have come from the use of a three-rung distance ladder using Type Ia supernovae (SNe Ia) (Riess et al. 2021b;Freedman 2021).
Since ∼ 2005, the SH0ES team (Riess et al. 2005(Riess et al. , 2021b has been engaged in a project to refurbish the distance ladder measurement of H 0 using the WFC3 and ACS instruments on the Hubble Space Telescope. These measurements have been based on a three rung distance ladder formed by 1) geometric measurements of the distance to Cepheid variable stars, 2) Cepheid distance measurements of SN Ia host galaxies, and 3) standardized SNe Ia magnitudes found in the Hubble flow between 0.023 < z < 0.15. This redshift range was chosen to reduce the dependence of the result on peculiar motions at low redshifts and cosmological effects at higher redshifts, while retaining sufficient statistics to constrain the Hubble flow. Hallmarks of the SH0ES project include the use of the same instrument to measure all Cepheids across the ladder to nullify systematic uncertainties, the use of the NIR and three colors to mitigate dust and accounting for the dependence of Cepheid luminosity on locally measured abundances. This program also extended the range of Cepheid parallax measurements to a few kpc using spatial scanning of HST (Riess et al. 2014;Casertano et al. 2016). The SH0ES team has expanded the sample of cosmological SNe Ia calibrated by Cepheids from a few in the Hubble Key Project (Freedman et al. 2001) with high quality data to 42 in 37 host galaxies in the most recent result (Riess et al. 2021b;hereafter R21).
In this work we present a two-rung distance ladder based on geometrically calibrated Cepheid distances at z < 0.01 as a measure of the Hubble flow. In the conventional three-rung distance ladder, extragalactic distances measured by a geometrically-calibrated distance indicator (e.g. Cepheid or Mira variables, the tip of the red giant branch) co-located with one or more SNe Ia calibrate the SN Ia absolute luminosity. However the redshifts of these galaxies are not involved in the determination of H 0 . Instead, the Hubble flow is constrained by a separate sample of SNe Ia, because their bright absolute magnitude (M B ∼ −19) allows them to be detected well into the Hubble flow. By using objects at redshifts cz 7000 km s −1 , the fractional effect of galactic peculiar velocities (∼ 300 km s −1 ) will be smaller than SN Ia uncertainties. Wu & Huterer (2017) found that, given the Riess et al. (2016) SN Ia sample, (uncorrected) cosmic variance would affect the SH0ES H 0 measurement by only ∼ 0.5%. As SNe in calibrator and Hubble flow samples are largely observed by the same telescopes and discovered by similar surveys, many calibration and demographic biases are expected to largely cancel out from the differential measurement and are further reduced by recalibration using wide angle surveys (Scolnic et al. 2015;Brownsberger et al. 2021). Semi-independent crosschecks of the third rung of the distance-ladder include Burns et al. (2018), Jones et al. (2022), and Dhawan et al. (2022). However the astrophysics of Type Ia supernovae are debated, including the roles of progenitor age Rigault et al. 2020;Rose et al. 2021) and host galaxy dust González-Gaitán et al. 2021; Thorp et al. 2021;Popovic et al. 2021) in the diversity of observed populations. SN-independent measures are thus a crucial check of possible systematic biases in the SH0ES result.
In this work we remove the third rung, measuring H 0 by directly comparing Cepheid distance measurements to their host galaxy redshifts. Doing so will give us a measurement of H 0 independent of any potential Type Ia supernova systematics. However by removing the third rung, we are forced to reduce the median redshift of the Hubble flow sample by an order of magnitude from z ∼ 0.04 to z ∼ 0.006 and the maximum redshift from z = 0.15 to z = 0.011. At such low redshifts, peculiar velocities (PVs) are expected to have a magnitude ∼ 20% of the observed redshifts. Further, the volume probed by these Cepheid hosts is not large compared to cosmological structures on the scale of baryon acoustic oscillations (∼ 140 Mpc), and as a result the effects of large-scale structure or "cosmic variance" on the measurement must be addressed. The reduction in median redshift by a factor of six can be naively expected to increase the effect of PVs on the measurement by at least the same amount, corresponding to an expected uncertainty in H 0 greater than 3%.
Cosmological measurements using Type Ia supernovae as standard candles have made use of peculiar velocity corrections based on models of the density field in the nearby universe to mitigate systematic PV effects from structure on scales greater than ∼ 10 Mpc h −1 , where h = H 0 /100 km s −1 Mpc −1 . Redshifts used in the cosmological analysis of Brout et al. (2022) and Riess et al. (2021b) also made use of "group" corrections (the replacement of an SN Ia host redshift with a mean redshift of associated galaxies) which attempt to remove noise on small scales from virial motions of galaxies within groups or clusters. Peterson et al. (2021) conclusively demonstrated at high significance that the dispersion of the SN Ia Hubble diagram is reduced with the application of both kinds of PV corrections. In particular two density field reconstructions (Carrick et al. 2015;Lilow & Nusser 2021) performed excellently on these tests, reducing the scatter of the Hubble diagram by ∼ 10% in a sample of 584 SNe at z < 0.08 1 . Both reconstructions are based on galaxy counts and luminosities from redshift surveys of the nearby universe, covering a cubic volume centered on the Milky Way with 400 Mpc h −1 sides.
In order to make a precise constraint on H 0 from the Cepheid distances and Pantheon+ redshifts, it will be essential to make use of these reconstructions to extract as much information about the structure of the nearby universe as is possible. To ensure our constraint is accurate we must also quantify any remaining systematic uncertainties in the corrected redshifts, including the covariance between host galaxies. Finally, as the velocity noise relative to the distance noise increases at low redshifts, selection effects become increasingly important; the well known volumetric or Malmquist bias increases as the square of the relative error. At distances where PV effects make up ∼ 20% of the signal of the Hubble flow, it will be necessary to evaluate the selection of the SH0ES Cepheid sample as fair markers of the Hubble flow and ensure that any associated uncertainties are included in our error estimates.
In this work, we make use of distances to Cepheid-hosting galaxies measured by SH0ES and Pantheon+ redshifts, in conjunction with two density field reconstructions of the nearby universe to measure the Hubble constant. In Section 2 we discuss the data we make use of, including Cepheid data, redshifts, and PV reconstructions. In Section 3 we present our analysis methodology, which uses a hierarchical Bayesian model to address uncertainties and biases associated with a low-redshift measurement. This section includes estimates of the correlations in predicted redshifts based on modelling and the limitations of the reconstructions, as well as modelling of the SH0ES host selection. We discuss the implementation of this model in Section 4 and provide validations of the methodology on the SH0ES data. We present estimates of H 0 from the two-rung distance ladder in Section 5, and discuss our findings in Section 6.

Cepheid Distances
We use the distance moduli to Cepheid host galaxies measured by the SH0ES team. These are based on HST observations collected over 20 years using the WFC3 and ACS instruments in three optical bands (F555W, F814W, and F350LP) along with near-infrared (NIR) measurements in the F160W band . Sources for the data are given in Table 1 and Figure 2 of Riess et al. (2021b). These data have been reanalyzed using an automated pipeline for Cepheid detection, photometry, and data reduction described in Yuan et al. (2022b, in prep). We make use of dereddened Wesenheit magnitudes, and standardize using luminosity-period and -metallicity relations described in Sections 2 and 3 of Riess et al. (2021b) . The absolute luminosities for the Cepheid-derived distances is constrained by three independent geometric distance measures: 1. Milky Way Cepheid parallaxes from Gaia EDR3 (Gaia Collaboration et al. 2016, 2021Riess et al. 2021a) and HST spatial scanning ).
2. An angular distance measurement to the water megamaser system in NGC 4258. (Reid et al. 2019) 3. A detached eclipsing binary distance to the Large Magellanic Cloud. (Pietrzyński et al. 2019) These three anchors yield independent and consistent estimates of the fiducial Cepheid absolute luminosity (see Figures 19 and 21 of R21), and the constraints are combined in the final analysis. Our final dataset includes measurements of the distances to 35 unique galaxies within the Hubble flow which form a coherently constructed and relatively complete sample (targeting all suitable SN Ia-Cepheid hosts at z ≤ 0.01 with SN Ia seen 1980-2021) whose selection can be modelled; we have excluded from the analysis two galaxies which were targeted by a different HST program (HST-GO Proposal #16269) because it used a very different selection method, targeting more luminous galaxies at greater redshifts of 0.01 < z < 0.02. These data will be presented and analyzed in upcoming work.
We use the reduced data products of these efforts, specifically the set of distance moduli and associated covariances (which we label µ Ceph , Σ (SH0ES) µ ) produced by the SH0ES analysis of R21. This covariance matrix encapsulates the statistical error in each distance as well as the correlated effects of crowding, metallicity/period standardization, and the three geometric anchor distances. As discussed in Appendix A, we include additional methodological uncertainties based on the analysis variants of R21 to compute a final covariance matrix Σ µ .
While there are other extragalactic Cepheid measurements in the literature, we restrict ourselves to the SH0ES sample in order to avoid additional systematics relating to their selection functions, calibration uncertainties, photometry methodology, crowding corrections, and other methodological differences.

Galaxy Redshifts & Peculiar Velocities
We use the measured heliocentric redshifts of the 35 Cepheid galaxies taken from the Pantheon+ sample . These are corrected for small-scale galactic motions by "group corrections", as presented in Peterson et al. (2021), based on the galaxy-group assignments of Tully (2015). These redshifts are then converted into the CMB reference frame. With additional corrections for large-scale structure, these redshifts were used for the cosmology analysis of Brout et al. (2022), which found that they were consistent with ∼ 240 km s −1 of uncorrected peculiar velocity scatter. This value was found in a mixed sample of galaxies with or without group corrections. We denote these corrected values as z CMB . The sky coordinates for the host galaxies are also taken from Carr et al. (2021). We convert these sky coordinates into unit vectors in galactic Cartesian coordinates which we labeln.
To correct these redshifts for large-scale structure, we use two reconstructions of the density and peculiar velocity fields in the nearby universe based on data from redshift surveys. These attempt to estimate the underlying fields that produce observed redshift distributions within the observed volume. Peterson et al. (2021) found that the two published reconstructions that most accurately match the SN Ia data are presented by Carrick et al. (2015, hereafter C15) and Lilow & Nusser (2021, hereafter LN21). In both cases the published versions are smoothed with a Gaussian kernel (at a scale of 5-10 Mpc h −1 ) to remove difficult-to-model nonlinear behavior and to ameliorate the effects of shot noise. For convenience of comparison, we apply additional smoothing to the C15 products such that both are smoothed at scale of 7 Mpc h −1 . We will model redshift uncertainties due to limitations of these reconstructions in the following section.
The reconstruction of LN21 is based on redshifts compiled by the 2MASS Redshift Survey (2MRS) (Huchra et al. 2012; additional data released in Macri et al. 2019), using a bias-corrected Weiner filter estimate of the density field. That of C15 is based on the 2M++ redshift compilation (Lavaux & Hudson 2011), and used a method in which PVs and densities were iteratively recalculated until a best fit solution was found. As the 2M++ compilation included the 2MRS sample released Huchra et al. (2012), there is substantial data shared between the two efforts.
Other reconstructions of the local matter and velocity fields have been described in published work, including Graziani et al. (2019) and Jasche & Lavaux (2019). We have chosen to make use of C15 and LN21 primarily due to their performance on the SN Ia data and the public accessibility of their data and data products.

METHODOLOGY
Because uncorrected PVs represent a dominant source of uncertainty, and redshift or distance limits imposed by the sample selection may produce a bias, we proceed with a more extensive effort to model the data than is typically used for a three-rung distance ladder where similar effects are negligible.
We construct a hierarchical Bayesian model for the observed distance moduli of the Cepheid galaxies given the reconstructed density/velocity fields, observed redshifts, and observed positions on the sky. We model our data as determined by the Hubble constant and a set of nuisance parameters which includes the true comoving distance to each observed galaxy as a latent parameter. This formulation of the problem allows us to easily account for aspects of the analysis such as selection, while budgeting for many of the systematic uncertainties of our analysis. In Section 3.1 we present our account of the observed data in terms of the model. Our treatment of the C15 and LN21 reconstructions are presented in Sections 3.1.1 and 3.1.2. We discuss the priors we use in Section 3.2, and the modeling for systematic effects of selection in Section 3.2.1. We summarize our model in Section 3.3.

Data Likelihood
We first define the relation between the modeled parameters and the observed data. To track the relationships between distances, redshifts, and peculiar velocities, we define the parameter vector of latent comoving distances in Mpc h −1 units for the 35 objects of our Cepheid galaxy sample χ. In a universe described by the Friedman-Lemaître-Robertson-Walker metric, the cosmological redshift for the ith object is and concurrently the distance modulus is (2) with the two kinematic parameters fixed to q 0 = −0.55 and j 0 = 1, and H 0 a free parameter. Our result is insensitive to changes in the kinematic quantities, as the effect of these corrections to the Hubble law is ∼ 0.5% for the most distant objects in our sample, and much less for the majority. The likelihood of the observed distance moduli µ Ceph with uncertainties Σ µ is simply defined by the multivariate normal distribution This term of the model is the component of the likelihood in which H 0 appears. The observed likelihood of the redshifts is more complicated. The redshift addition formula gives the observed redshift corrected to the CMB frame as z CMB + 1 = (1 + z Pec. ) × (1 + z Cosm. ). From this the likelihood of the observed vector of CMB frame, group-corrected redshifts z CMB are given by the multivariate normal distribution where the division on the left hand side is conducted elementwise. v Pred. (χ, θ) is the predicted peculiar velocity from the reconstruction used discussed in Section 3.1.1, and Σ Pec. (χ; θ) is our estimated uncertainty in these predictions as presented in Section 3.1.2. The parameters θ that define the velocity predictions and uncertainties will be introduced in the next two sections. These two likelihood definitions make up the constraints on the model parameters from the data. H 0 is measured as the relative scale between the Cepheid distances and the redshifts.

Peculiar Velocity Predictions
The two peculiar velocity reconstructions are publicly distributed as data cubes which give the values of the density contrast δ( r) = (ρ( r) −ρ)/ρ (where ρ is the density of matter) and peculiar velocities v Pec. ( r) at points on a densely sampled grid. These grids are given at a resolution higher than the smoothing scale. These can easily be interpolated to give functions for the density contrast and velocity as a function of position in comoving galactic coordinates δ Rec. ( r), v Rec. ( r) respectively.
The two reconstructions also have both made measurements of the parameters that relate the density to observed velocities. We quantify the effect of the stated uncertainties by marginalizing over the values given in the original work. The relevant parameters are: 1. The scale parameter β, the ratio of the logarithmic growth rate f (Ω M ) of density perturbations to the galaxy bias b. This relates predicted peculiar velocity to galaxy overdensities.
2. The external bulk flow V Ext , a velocity vector added to every point within the reconstructed volume. This models any contribution to PVs sourced by perturbations on scales larger than the volume probed by the data.
The external bulk flow is parameterized by V Ext , the magnitude in units of km s −1 , and (l, b) the galactic longitude and latitude of its direction. Note that this dipole is distinct from the ∼ 600 km s −1 motion of the Milky Way with respect to the cosmic microwave background, most of which is due to gravitational effects from the local volume (within 200 Mpc h −1 ). We remove the fiducial values of the quantities {β Fid. , V Fid. Ext. , l Fid. , b Fid. } from the calculated peculiar velocities to define the unscaled local velocities wheren(l, b) is the unit position vector as a function of angular coordinate. We can then compute the predicted peculiar velocity for the ith object with unit position vectorn

Covariance of PV reconstructions
Previous work including Riess et al. (2016), Scolnic et al. (2018), Boruah et al. (2020) and others have analyzed SN Ia distances while accounting for PV uncertainties using an assumption of diagonal scatter in corrected redshifts, corresponding to an assumption that all correlations in the data have been removed. However, the smoothing of the density field on scales ∼ 10 Mpc h −1 removes any information at smaller distances. Any two galaxies closer together than this scale will have similar predictions for their density/PV, while their true velocities may include unmodelled nonlinear or virial noise. Further, they make use of a finite number of tracers (galaxies) of the underlying density. Lastly, the two reconstructions show some differences in predicted densities and velocities at the scale of ∼ 60 km s −1 , with some differences apparent across the entire volume. It is thus apparent that use of these reconstructions to remove predicted velocities will ameliorate but not totally remove correlations in the data, and a trustworthy estimate of the Hubble constant from data with median distance ∼ 20 Mpc h −1 will require an estimate of these correlations. Our goal then is to construct an appropriate covariance matrix Σ Pec. for the predicted peculiar velocities. These correlated uncertainties will ultimately contribute ∼ 2% of the final error budget (see Table 1).
To understand the uncertainties in the reconstructed density fields, we first need to understand the scales and amplitudes of the underlying cosmological structures which these attempt to model. If the density field under ΛCDM is approximated as Gaussian, the statistics of the density contrast δ are fully specified by the power spectrum P (k). The power spectrum is defined by the relation where δ 3 is a Dirac delta function andδ(k) is the Fourier transform of the density contrast. Given the linear approximation that the divergence of the velocity field is proportional to the density and that the velocity is vorticity free, the covariance of the peculiar velocities along the line of sight at comoving distances χ i and χ j separated by angle θ is given by where τ is conformal time, and D is the linear, scale-independent "growth factor" that describes the amplitude of density perturbations as a function of cosmic time (Davis et al. 2011;Huterer et al. 2015). The term we write as F (u, v, cos(θ)) can be evaluated either as a series over spherical Bessell functions or constructed from elementary and trigonometric functions. We evaluate the power spectrum using the CLASS code (Blas et al. 2011) with nonlinear corrections from halofit (Takahashi et al. 2012) using cosmological parameters from Planck (Planck Collaboration et al. 2018). As F (u, u, 1) = 1/3, the predicted diagonal dispersion < v 2 i > at z = 0 is proportional to the integral dkP (k). Evaluating < v 2 i >, we find a predicted scatter of (380 km s −1 ) 2 of peculiar velocity scatter in the local universe. This quantity represents the uncertainty in uncorrected CMB-frame redshifts as a distance-indicator.
To budget for PV uncertainties in redshifts after correction using a reconstruction, we derive two independent sources of remaining noise, which we label P Unmodelled (k), P Structure (k). These quantities are derived and defined in Appendix B. P Unmodelled (k) represents uncertainties resulting from the Gaussian smoothing applied in the reconstruction procedures, and is predicted to contribute 264 km s −1 of diagonal scatter. The size of this prediction is uncertain, as it is dependent on theory/simulation-based accounts of nonlinear gravitational effects. P Structure (k) is empirically estimated from the differences between the results of C15 and LN21, and represents our estimate of the systematic uncertainty in their modeling of large-scale structure. This component contributes 66 km s −1 of diagonal scatter, which is present across larger distances on the sky. The sum of these two components, which we labelP ∆ (k), is the total power remaining in the density (and thus velocity) field after correction. We compareP ∆ (k) to the ΛCDM prediction in Figure 1; the difference between the two represents the uncertainty removed from our measurement by the use of the galaxy catalog information. Accordingly, our fiducial error budget predicts use of the corrections will reduce the diagonal scatter of the sample from ∼ (380 km s −1 ) 2 to ∼ (270 km s −1 ) 2 .
To account for uncertainties in the size of each of the two effects discussed, we introduce two additional parameters σ Structure which scale the amplitude of the two components of the budgeted uncertainties:  where each component of the power spectrum has been normalized by the fiducial scatter < v 2 i > so the value of each parameter is equal to the scatter contributed by each component. A value of σ (PV) Unmodelled greater than the fiducial value of 264 km s −1 would indicate large virial or nonlinear motions on scales too small to be detected by the reconstructions; a value of σ (PV) Structure larger than the fiducial value of 66 km s −1 would indicate that our estimates of the error in the reconstructed fields are understated.

Removed by Reconstructions
In two model variants (results shown in the first two rows of Table 2) we attempt to fit the data without the use of external galaxy catalog information. In these cases, we instead use the relation where P ΛCDM Structure (k) is defined This reduces to the full ΛCDM power spectrum in the case when the amplitude parameters σ Structure are fixed to their fiducial values, but allows the measurement to marginalize over unknown nonlinearities and cosmology.
Our estimated peculiar velocity covariance is then Three Cepheid hosts are identified as members of the Virgo cluster from NED, but have received different group assignments with distinct redshifts No other galaxies in our sample appear to be cluster members. To account for the difficulty of the group assignments in the dense environment with the cluster, we add an additional (550 km s −1 ) 2 of variance to the diagonal elements of the covariance matrix for these three objects, representing the virial noise of the complex (Kashibadze et al. 2020). An alternative approach could assign a single redshift for the cluster center to all three objects, but the width of the cluster is not negligible relative to the distance. Particularly given that identifying a unique estimate for the central redshift of an object with the size and complex structure of Virgo is nontrivial, we conclude our method is more robust.
Given P CDM (k) Given P (k) Figure 2. Comparison of the velocity covariance given the power spectrum of ΛCDM in blue, and given the estimated power after corrections from PV reconstructions in red. Each point represents the covariance between a pair of Cepheid host galaxies.
As we expect the overall variance of the sample is cut in half on the diagonal (zero separation), and the remaining correlations are present on much smaller scales.
We calculate the covariance matrix for our sample at fiducial distances using our estimated power spectrumP ∆ (k) as well as P ΛCDM (k) at fixed amplitude. We show the values of the covariance between each pair of objects compared to their physical separation in Figure 2. As can be seen, applying the corrections to remove power at large scales drastically reduces the correlations internal to our sample. As the window function F (u, v, θ) is a complex function of the lines-of-sight of each of the two objects and their angle with respect to to each other, the points fall within an envelope as a function of physical separation. The upper bound is set by a particular geometry of the two objects with respect to the observer with small but nonzero angular separation, while the lower bound will be set by a geometry with the same physical separation but 180 • greater angular separation, which will have a similar scale of correlation with opposite sign. The sample of observed Cepheids has few such nearby pairs on opposite sides of the sky at small scales, thus the lower half of the envelope is not seen until larger scales.

Priors & Sample Definition
We use the following priors on the scalar parameters: Unmodelled ) The fiducial quantities for the central value and uncertainties on the reconstruction parameters were taken directly from the relevant published work (i.e. C15 or LN21). For the PV amplitude parameters, we chose the inverse gamma distribution as a positive-definite but relatively uninformative distribution. The scale and shape parameters of the distribution were chosen to restrict the model from accessing pathological regions of the posterior at extreme values of these parameters; they have medians centered on the fiducial values, with 1% of probability mass at values greater than 3× and 5× their respective fiducial values . As previously given, the fiducial value of σ (PV) Unmodelled is 264 km s −1 . The fiducial value of σ (PV) Structure is dependent on whether we use reconstructions (66 km s −1 ) or assume the full ΛCDM systematic (266 km s −1 ). We additionally put an upper bound on σ Structure for computational reasons, as in some cases one MCMC chain could become stuck in the pathological region associated with high values of this parameter.

Selection & Distance Priors
For nearby galaxies, the observed redshift as a proxy for distance is significantly biased by volume effects. Since the number density of galaxies per unit distance generally goes as the square of distance, a galaxy observed with a given redshift is likely to have scattered down from the larger volume element with higher true distance than would be naively inferred by the Hubble law. For a PV scatter of 250 km s −1 , this bias is approximately given by derived from Monte Carlo simulations described below; for our sample, this estimated bias ranges from ∼ 100% for the lowest redshift objects to ∼ 1% at the edge of our sample. In comparison, the Cepheid distances exhibit little statistical noise, resulting in much smaller bias. To quantify and account for the biases of our sample, it is necessary to understand the selection of our sample of Cepheid host galaxies. Specifically, it is necessary to check whether the fact a galaxy was included in our sample provides additional information about the galaxy's distance, independent from the measured values of the redshift and Cepheid distance modulus.
The selection goals of the SH0ES program were ultimately to target every star-forming galaxy at suitable inclination that has hosted a SN of cosmological quality reported before maximum light and observed digitally in the local volume at d 40 Mpc or z 0.011. However there is a key ambiguity, in that the indicator used to evaluate this last criterion could be either the redshift of the galaxy or the SN Ia inferred distance. The SH0ES program was conducted over ∼ 15 years and without any pre-allocation of observing time, so HST proposals were submitted on a cycle-to-cycle basis to target small sets of 1 to 8 hosts of often recently discovered SNe Ia or to take advantage of an extended distance range from a new HST instrument. Near the end of the sample collection, the SH0ES Team targeted the completion of the sample to the above redshift limit, therefore it is difficult for the authors to resolve precisely which limit, distance or redshift, was the most relevant. SH0ES Cepheid program proposals and target lists mix discussion of the redshift or distance limit. Ultimately we conclude that we are unable to determine a priori the selection unambiguously, so we model both and account for this ambiguity as a systematic uncertainty.
To better understand the impact of selection, we ran simple Monte Carlo simulations, in which we sampled galaxies from a 3 dimensional volume, using the density values found along SH0ES lines of sight in the Carrick et al. (2015) reconstruction to weight the distances. We simulated redshifts with a 250 km s −1 scatter, and SN distance moduli with 0.15 mag of scatter. We then compared the observed distribution of redshifts in the SH0ES sample to the simulated redshifts observed when using a strict cut on either redshift or SN distance in Figure 3 just above the most distant point in the SH0ES sample. We show the predicted bias as a function of simulated CMB redshift from these Monte Carlo simulations in Figure 4 for both scenarios. The bias is very similar for both at z < 0.007 but diverges at higher redshifts.
While the redshift distribution of the SH0ES sample can be seen to slightly disfavor selection by a redshift cut (a Kolmogorov-Smirnov test gives a p-value= 0.11), neither scenario is inconsistent with the data. As our data cannot rule out either scenario, we produce results assuming two limiting scenarios: 1. Assume redshift-limited selection: We will assume that no external distance information of any kind was used in selection, besides the galaxy redshift. This assumption is more conservative as regards final uncertainties, as this reduces the information in our sample.
2. Assume distance-limited selection: We assume a strict cut on SN Ia distance modulus was incorporated into the selection. As this assumption entails that the most distant redshifts in our sample are are biased high compared to their true distances, this assumption produces a lower value of the Hubble constant. This assumption is more conservative as regards the ongoing Hubble tension; alternative selection models consistent with the data would reduce the dependence of the selection on SN Ia information, reducing the impact.   Bias is shown if the SH0ES sample were to have been selected in a "redshift-limited" (red line) or a "distance-limited" scenario (blue line). In both scenarios at z 0.008, volumetric bias causes more SNe to have scattered down to their observed value than scattered up. At z 0.008, the predicted bias diverges, as galaxies at larger redshifts could only have made it into the distance-limited sample past a cut on SN distance by scattering up in redshift.
We account for other aspects of selection for SH0ES in addition to the distance criterion. As SN Ia progenitors are stellar remnants formed in galaxies, our analysis will use the galaxy density distributions found by Carrick et al. (2015) and Lilow & Nusser (2021) to account for the inhomogeneous density of galaxies as a function of distance along each line of sight. Doing so additionally accounts for any bias due to the common practice (until the advent of real-time all-sky surveys) of targeting supernova searches to galaxies selected from catalogs rather than random pointings over the entire sky 2 . We do not expect the star-forming galaxies which were then targeted by SH0ES to have galaxy bias parameters significantly different from unity (Swanson et al. 2008), and thus expect the distribution of these galaxies to trace those of the reconstructions.
Lastly, we have checked whether discovery bias could affect our results; i.e., whether that a SN Ia was reported before maximum light provides additional information about the distance of the host galaxy. This would be violated if the magnitude limit for discovery is fainter than the dimmest SN in our sample. For each SN we checked who was determined as discoverer by the IAUC 3 . For each person or collaboration who discovered a SN Ia in our sample, we determined the median magnitude of the supernovae they found at discovery. We find 32 out of 35 SNe were found by discoverers with a median discovery magnitude higher than the peak magnitude of the dimmest SN Ia in our sample. We expect that discovery bias will not significantly affect our measurement, although it may more strongly affect selection at higher distances. If discovery bias were to affect the sample, the effect on the measurement of H 0 would resemble the scenario we call "distance-limited".
Given these considerations, the prior for the latent distances is proportional to the number density of galaxies per unit distance , using the reconstructed density field δ Rec. ( r). If no other distance information was incorporated into the SH0ES selection, this prior is sufficient. We use this definition of the distance prior in our analyses that "Assume redshiftlimited selection". If the SH0ES sample was selected using the information from the SNe Ia, then Equation 16 should be modified to include the constraint that the observed SN Ia distance modulus was less than some critical value. To conduct this analysis without including the actual observed values of SN magnitudes, we adopt the simple model of SN magnitudes as normally distributed with dispersion 0.15 mag and fit the cutoff distance as a free parameter. The distance prior becomes where Φ(x) is the CDF function of the standard normal distribution, and d T is the assumed distance cutoff in units of Mpc h −1 for inclusion in the SH0ES sample. The upper bound on distances is done purely for performance reasons, and 80 Mpc h −1 is well above the distances of any object in our sample.
In this scenario we also include a prior on the hyperparameter log d T log d T ∼ N (log 30, 0.4).
As we are unable to distinguish between these two models of selection a priori, we present results using both priors. In Section 5 we discuss further which model is most appropriate for the data.

Model Summary
To conduct this measurement, we have defined a hierarchical model with 7 scalar parameters and a latent parameter for each Cepheid host galaxy, as well as one hyperparameter included in some analyses: 1. The value of the Hubble constant log H 0 , in units of km s −1 Mpc −1 . This is our only parameter of interest, and the rest of our parameters may be regarded as nuisance parameters included to quantify related systematic uncertainties.
2. β, the scale parameter that relates galaxy overdensity to peculiar velocities.
3. V Ext , the magnitude of the bulk flow external to the volume of the reconstruction in units of km s −1 and (l, b) the galactic longitude and latitude of its direction.
4. χ, the latent comoving distance of each Cepheid host galaxy to the Milky Way in units of Mpc h −1 .

Two scale parameters σ (PV)
Unmodelled , σ Structure , which control the amplitude of the peculiar velocity uncertainties on scales below/above ∼ 7 Mpc h −1 respectively. 6. When assuming distance-limited, a hyper-parameter log d T which sets the assumed cutoff of the distance-limited sample (i.e. the maximum SN Ia magnitude allowed to pass cuts).
Using this methodology allows us to naturally budget for the majority of our systematic effects, account for the selection of our sample, and account for the off-diagonal effects of correlated large scale structure.
When applied to our data, the data strongly dominates the posterior distribution over log H 0 , while the prior largely determines the posterior distribution over the parameters of the velocity reconstructions. This is as expected, as with only 35 data points we do not expect to constrain the velocity field in the local volume as a whole, and rely on the values of these parameters measured in the original work. When fit σ (PV) Unmodelled is mostly well constrained, while σ (PV) Structure is well constrained if "distance-limited" selection is assumed. log d T is strongly constrained.

Computation
We use the code Stan (Carpenter et al. 2017) to sample from the posterior distribution of the parameters as defined by the prior and likelihoods given above. Accounting for the full dependence of the velocity covariance on the latent distances within Stan's HMCMC framework is computationally intractable, requiring evaluation by chain rule of the derivatives of the likelihood with respect to ∼ 1200 matrix elements for ∼ 35 parameters. As the effect is small, we use the MCMC to sample from the posterior while keeping the velocity covariance fixed, evaluated at fiducial values for the distances, then use importance sampling to reweight each sample appropriately. Given the posterior samples of the parameters θ (i) , the relative weight of the ith sample is given by the ratio of the likelihoods of the observed redshifts with the covariance varying/fixed The overall effect of this correction is small; typically ∼ 0.1σ in H 0 .

Effects of each component of modeling
To show why the complexity of the model we have constructed is required, we break down how each component of the model affects the central value and error estimate of our result in Table 1. We use the following simplifications of our model:

Include Cepheid Systematics:
We use the hierarchical model based on Equations 3 and 4 without PV corrections, and include the full systematic covariance matrix for the Cepheid measurements (see Appendix A).

Include PV Corrections:
We include the PV corrections discussed in Section 3.1.1, based on the C15 reconstructions.

Include PV Covariance Estimate:
We include the full estimate of PV covariance for corrected redshifts discussed in Section 3.1.2.

Include Selection Effects (distance-limited) (Baseline model):
We account for the model of distancelimited selection discussed in Section 3.2.1, using the prior in Eq. 18. This is the same result presented in the 7th row of Table 2. 6. Simple Estimator, cz > 2000 km s −1 : As the Simple Estimator, but removing the data at cz < 2000 km s −1 . This reduces the impact of several systematic uncertainties that are otherwise addressed by our modeling. This reduces the sample size from 35 objects to 15. This estimator gives a central value of 72.2 km s −1 Mpc −1 . We see that ignoring any component of the model will substantially bias the inferred value of the Hubble constant, and/or lead to underestimates in the inferred error budget. Note that the first and second rows of Table 1 underestimate cosmic variance and do not account for selection effects, and should not be used outside this context; results which do not make use of any local PV models are shown in the first two rows of Table 2.

Validation
To validate our approach, we wish to ensure that the model is capturing the features found in the real data, as well as compare the use of the C15 and LN21 reconstructions to a more conservative analysis which does not make use of PV information from sources external to SH0ES. To do so we use leave-one-out ("loo") cross-validation based on Vehtari et al. (2017) and Bürkner et al. (2021) to quantify the performance of different models. By refitting the model while withholding one data point from the model, we determine which model most accurately predicts the held-out data. Our metric is the expected log predictive density where µ −i is the vector of distance moduli with the ith entry removed, and M denotes the choice of peculiar velocity reconstructions and other modelling choices. . Higher elpd loo indicates that the model does a better job at predicting the left out data. As a log-likelihood based metric, elpd can be interpreted similarly to similar metrics such as the Bayesian evidence. Absolute values of this quantity are not informative in themselves, but the differences between models allow discrimination between them. We compare the elpd to a reference, which we choose as the model which uses no SH0ES-external information, assumes distance-limited selection, and fits the amplitude of PV covariance. As the estimated elpd is a summation of N independent terms, we estimate the significance of these differences by bootstrapping the estimate to evaluate a p-value which we write as which quantifies the likelihood that a given analysis would produce an elpd better than that of the reference model by chance. Small p-values indicate that the reconstructions used are informative about the underlying peculiar velocities. We can also examine the residuals of the leave-one-out tests by directly comparing the observed value of the distance moduli µ Obs. to the predicted values when that data point is left out of the model fitting. As the predictions incorporate the information of the rest of the sample, these residuals should be uncorrelated (and approximately normal in distribution). We show a comparison of the residuals relative to the predictions in Figure 5, and a comparison of predicted redshifts to observed distance moduli in Figure 6. The positive tail in the uncorrected residuals seen in the upper panel of Figure 5 at low distances is a distinctive artifact of the volumetric or Malmquist bias due to scatter in redshifts. The same artifact at these redshifts can be seen in the SN Ia Hubble residuals of the Pantheon+ analysis There is no significant linear trend, and no significant autocorrelation (r < 0.2) at any lag. (Brout et al. 2022). These data were not used in the Pantheon+ cosmology fit, which cut the sample at z = 0.01, nor in the SH0ES analysis R21 which cut the sample at z = 0.023. The bias is not expected to affect either result, as it scales as z −2 and is negligible in both samples.

RESULTS
We first measure the Hubble constant without the use of peculiar velocity reconstructions. This gives us a measurement independent of PV reconstruction procedures, and allows us to quantify the improvement from the use of this data. We use the full ΛCDM power spectrum as described in Equation 11, set v pred (χ i ) and δ(χ i ) to 0 in Equations 4 and 18, and remove the reconstruction parameters V Ext. , l, b, β from the model. Without the use of other peculiar . Hubble diagram derived from leave-one-out tests. We show observed Cepheid distances on the y-axis, and the predicted cosmological redshift inferred from the redshift of the galaxy corrected for bias and peculiar velocities on the xaxis. These predicted redshifts are determined while the Cepheid distance for each point is held out. Modeling assumes C15 reconstruction, and fits the amplitude of peculiar velocities. We show results assuming both distance-limited and redshift-limited selection. The point with large z-uncertainties is NGC 4424, whose observed redshift places it in the middle of Virgo cluster, and whose cosmological redshift is highly uncertain as a result.
velocity information , we measure H 0 to be 71.0 +4.7 −4.6 km s −1 Mpc −1 , a precision of ∼ 6.3%. Fitting the peculiar velocity amplitudes leads to a smaller estimate of the small-scale or nonlinear power, yielding H 0 = 71.5 +4.6 −4.5 km s −1 Mpc −1 . This choice is significantly favored by the data. We take the latter model as the reference for a two-rung measurement without the use of PV information. We show the values for H 0 and the other parameters in Table 2. Ignoring the galaxy catalog information in the case of an assumed redshift cut produces pathological behavior in the posterior prior due to the lack of constraining power in redshifts, and we do not produce results for this scenario.
We then fit the data using each of the two reconstructions, two choices of prior, and the choice to either fit the amplitude of peculiar velocity covariances or fix them to fiducial values. We report the appropriate parameter constraints in Table 2. Including information from the reconstructions, the fractional error in H 0 decreases from ∼ 6% to ∼ 3%, with the assumption of SN Ia selection decreasing uncertainties further. We show the corner plot of the constraints on the scalar parameters using the C15 reconstruction with fitted PV covariance in Figure 7 under both selection scenarios.
As expected, use of either reconstruction is significantly favored by the elpd metrics. The reconstruction of C15 is slightly favored by the data compared to LN21, but the difference is not significant. The choice to fit the peculiar velocity amplitudes is also strongly favored by the data. The data slightly favors the redshift-limited scenario for selection, but the difference is not significant (p > 0.2 for both LN21 and C15 when peculiar velocity amplitudes are fitted). We conclude that we are incapable of differentiating the models of selection from the data alone. As the redshift-limited scenario implies a more restricted experimental process, we consider both equally likely given the data.
Inclusion of the reconstructions shifts the central value by ∼ 1.5 km s −1 Mpc −1 , much less than the ∼ 3 km s −1 Mpc −1 reduction in the uncertainty, and thus within expectation. We note that the two reconstructions produce very similar results regardless of the other analysis choices made, consistent with our error budget. The different approaches to selection shift results by more, increasing the value of H 0 by ∼ 2-5 km s −1 Mpc −1 , depending on whether or not the peculiar velocity covariance is also fit.
The fitted values of the amplitude of PV covariance are consistent with our expectation. As we have used groupcorrected redshifts, we expect that much of the diagonal scatter has been removed, lowering the value of σ (PV) Unmodelled ; our fitted values are consistent with the estimated 150 km s −1 for galaxy halos found by comparison to N-body simulations in Carrick et al. (2015), and assumed in Betoule et al. (2014), Boruah et al. (2020), and others. However, Brout et al. (2022) finds a peculiar velocity scatter of 240 km s −1 in the Pantheon+ supernova sample. This number was found in a  mixed sample of galaxies with and without group-assignments, at larger distances where group assignments are more uncertain. Additionally PV corrections for the baseline Pantheon+ analysis were not integrated over the line of sight, which tends to increase statistical dispersion in distances (Peterson et al. 2021). The values of σ Structure are in all cases consistent with their fiducial values, showing that our estimates of the magnitude of errors in the reconstructions are reasonable and that the scale of observed local structures is consistent with that expected from ΛCDM.
As the elpd strongly favors the use of peculiar velocity reconstructions and strongly favors fitting of the peculiar velocity amplitudes, we stack the posterior estimates of H 0 from these four models, assigning equal weights to each. The resulting median value for H 0 is H 0 = 73.1 +2.6 −2.3 km s −1 Mpc −1 (with 16% and 84% percentile uncertainties). We take this as our fiducial result. We evaluate a p-value for the difference between our value and the Planck measurement of H 0 = 67.4 ± 0.5 km s −1 Mpc −1 (Planck Collaboration et al. 2018). Comparing the MCMC chains, we find p = 0.004, corresponding to a significance of 2.6σ. We compare our result to R21 and Planck Collaboration et al. (2018) in Figure 8.

Comparison to SNe Ia Hubble Diagram
As an additional crosscheck, we compare the distances we measure from our models to the Type Ia supernova Hubble diagram. We use the standardized magnitudes of 73 light-curves of cosmological SNe from the Pantheon+ sample Brout et al. 2022) with common host galaxies to our SNe. To evaluate a goodness-of-fit metric, we compute a χ 2 by comparing predicted values of µ from the model to SN magnitudes, fitting the absolute magnitude of the Type Ia supernovae, and including the covariance of the predicted µ's and the covariance of the supernovae provided by Brout et al. (2022). χ 2 values do not vary significantly from model to model, between 49.4-51.0, with 72 degrees of freedom. That the values do not change significantly is not surprising, as the Cepheid distances have the least statistical error of any indicator used in this analysis, and the predicted distances are largely driven by the Cepheid measurement. We show a comparison of these two rungs of the distance ladder in Figure 9. Our results are consistent with the results of R21.

CONCLUSION
We find that the Hubble constant can be constrained at extremely low redshifts by taking advantage of measurements of the local density and velocity fields from external sources. Our code and data will be made public as of the acceptance of this paper. Values of the Hubble constant measured when incorporating this information, depending on further modelling assumptions, range from 71.8 km s −1 Mpc −1 to 77.0 km s −1 Mpc −1 . Our data supports a credible interval of . Corner plot showing the posterior distribution of the scalar parameters given the observed redshifts, distance moduli, and the C15 reconstruction, fitting the PV amplitude. Contours of parameters assuming both distance-limited and redshiftlimited selection are shown in blue and red respectively. We show 68% and 95% contours for the 2d histograms, with 16% and 84% quantiles for the 1d histograms. H0 shows degeneracy with log dT in distance-limited case, and σ Structure in redshift-limited case, showing the extent to which selection uncertainties limit our measurement.
H 0 = 73.1 +2.6 −2.3 km s −1 Mpc −1 , at 2.6σ tension with the measurement of Planck. As a result, we conclude the cause of the Hubble tension is unlikely to be an unknown systematic of Type Ia supernovae.
Our analysis is sensitive to the modelling of the Cepheid variable stars, including reddening, crowding, and metallicity corrections. However any systematic errors or uncertainties in this stage of the analysis will be mitigated by our inclusion of a covariance matrix derived from the analysis variants of R21.
We find that the most significant uncertainty of our analysis is the selection function of the SH0ES sample. Seeking a further reduction of the Hubble constant would require informative distance priors with increased probability mass at low distances. However as our data show good consistency with the observed distributions of distances, increasing the probability of low distances will require reductions in the information of the prior. Reducing the size of the peculiar velocity beyond 150 km s −1 will also reduce the volumetric bias and thus will tend to reduce the value of H 0 ; however  . Comparison of observed SN Ia standardized magnitudes to the inferred distances measured from the combination of SH0ES Cepheid, redshifts, and the C15 reconstruction, fitting the peculiar velocity covariance, and assuming distance-limited selection. Where a galaxy is host to multiple SNe Ia, we take a mean of their magnitudes, weighted by their covariance.
doing so seems unmotivated by the data or theoretical estimates of the PV noise. We conclude that alternative selection models beyond those discussed are unlikely to significantly reduce our measured value of H 0 . As the R21 analysis is based on comparison of indicators with much lower relative scatter than the redshifts we use here, the analysis is not likely to be strongly affected by selection effects. Distance selection and biases in the Hubble sample is accounted for using the BBC methodology (Kessler & Scolnic 2017). However any analysis which makes use of high-scatter indicators, including nearby redshifts as a measure of the Hubble flow, must carefully evaluate the selection of their sample. This has been done in measurements of H 0 from "bright sirens" in Abbott et al. (2017) as well as "dark sirens" (Soares-Santos et al. 2019). Boruah et al. (2020) found a nontrivial effect from the use of a volumetric prior in a measurement of H 0 from water megamasers using data from Reid et al. (2019), and we suggest future works more closely examine the selection of megamaser, SBF, and Tully-Fisher distance measurements.
Cosmological uncertainties are unlikely to affect the result. We assume Gaussian behavior in our calculation of the peculiar velocity uncertainties, while the cosmological density field is known to show nonlinear and non-Gaussian behavior, particularly at scales k > 0.1 Mpc h −1 . However the velocity field is comparatively insensitive to these effects, as the velocity field is a factor of 1/k 2 less sensitive to small scales as compared to the density. We also neglect any velocity bias, which is absent on large scales in the absence of non-gravitational forces as we deal with in this work (see e.g., Desjacques et al. (2018)). We thus expect our result to be insensitive to the effects quantified by higher order statistics. Our error estimates also assume ΛCDM physics, but are only weakly dependent on alternative physics through changes to the power spectrum at small scales, given that the power spectrum at large scales has been constrained by numerous independent experiments. Our estimates of the uncertainties in peculiar velocity reconstructions are optimistic, insofar as they assume that the reconstructed velocity fields are no more inaccurate than the differences between the two. At scales larger than the volume of the Cepheid sample, we are unable to effectively constrain these quantities ourselves. At scales smaller than this, we have mitigated these uncertainties by introducing amplitude parameters that are marginalized over in the final result. This data and others (Said et al. 2020;Peterson et al. 2021) strongly support the use of these reconstructions as broadly applicable models of the nearby universe.
This work provides a measurement of H 0 independent of any supernovae sample. At the expense of increased uncertainties, dropping supernova measurements from the analysis has three main advantages. First, this work provides a satisfactory systematic error check of the SH0ES analysis: our result is fully compatible with the findings of R21, potentially ruling out any sizable systematic error contribution from the inclusion of the supernovae. Further, our measurement of H 0 is uncorrelated with the absolute magnitude of higher-redshift supernovae, which circumvents any potential issues when using this value as a prior in analyses which additionally use Type Ia supernovae as a constraint on the evolution of the distance-redshift relation. Lastly as the sample's maximum redshift is z ∼ 0.01, our result is robust to any late time effects such as transitions in the dark content of the Universe unless they have occurred within the last ∼ 100 Myr.
Measurements of the Hubble constant have been conducted at low redshift using a variety of distance indicators. The treatment of peculiar velocities has in some cases been a matter for discussion in the literature (Howlett & Davis 2020;Boruah et al. 2020;Mukherjee et al. 2021). Our methodology shares several elements with Boruah et al. (2020), with improvements to the treatment of off-diagonal velocity uncertainties, and a more flexible model framework parametrized to reduce cosmology dependence. We anticipate the use of similar methodology to that presented here in future Cepheid-based measurements of the Hubble constant from the SH0ES team, and suggest these elements should be incorporated into other analyses based on low-redshift distance indicators.
We are thankful to Tamara Davis and Anthony Carr for their contributions to the quality of the Pantheon+ redshifts, and comments on this paper.
Software: AstroPy (Astropy Collaboration et al. 2013, corner.py (Foreman-Mackey et al. 2021), Matplotlib (Hunter 2007), nbodykit (Hand et al. 2018), NumPy (Harris et al. 2020), pystan (Riddell et al. 2021), SciPy (Virtanen et al. 2020), Stan (Carpenter et al. 2017 In order to quantify the systematics associated with methodological choices made internal to the Cepheid analysis, we include additional systematic uncertainties from the analysis variants of R21. The baseline SH0ES analysis quantified these effects by measuring H 0 for each variant, taking the standard deviation of these as a systematic uncertainty in quadrature. This procedure in general will overestimate the uncertainty, as the analysis is not given the opportunity to mitigate the effect on the measurement by more strongly weighting the data less affected. Since the R21 analysis is generally dependent only on changes in the mean of the calibrator Cepheids relative to the anchor Cepheids, this overestimate would not significantly inflate the final error budget. However our measurement is more sensitive to changes in the relative values of the Cepheid host galaxies due to the correlations among the host galaxy peculiar velocities. By incorporating information about the effect of the analysis variants directly into the model, we minimize the effect on our measurement . For a specific effect (i.e. treatment of outliers in the Cepheid P-L relation) we define a set of analysis variants which represent reasonable alternative treatments of the baseline SH0ES Cepheid sample. For an effect X represented by n analysis variants, where the ith variant produces distance moduli µ (i) X , we define a systematic covariance matrix by the average of the outer products of the residuals relative to the baseline treatment The individual covariance matrices are added together to evaluate the total uncertainty in the distance moduli We briefly discuss the analysis variants we use here; for full details of the treatment of the data under each, refer to Section 6 and Table 5 of R21. We budget for the potential systematic impact of four different steps in the Cepheid analysis methodology: 1. Cepheid Outlier Treatment: The selection of our Cepheid samples from optical data is expected to leave some residual contamination by blended variable stars in the sample. Our baseline analysis therefore includes outlier rejection by individually discarding the largest outlier until no Cepheids deviate from the global periodluminosity (P-L) fit by > 3.3σ. We include variants 2-8 as viable alternative treatments, excluding variant 9 which discards outlier rejection entirely.
2. Reddening, Extinction, and Color Relations: Our baseline analysis uses dereddend Wesenheit magnitudes based on the Fitzpatrick (1999) reddening law with the absolute-to-selective extinction ratio R V fixed to 3.3. Alternative treatments include allowing the assumed value of R V as a free parameter, alternative reddening laws, and individually inferred values of R V for each Cepheid host galaxy using the empirical dust attenuation framework of Hahn et al. (2022). We include variants 16,17, 19, 20, 21, and 22 as viable alternative treatments. Variant 23 excludes reddening standardization entirely and is omitted from our budget.

Period Standardization:
The baseline sample includes all Cepheids with observed periods P > 5 d, and uses a linear P-L relation with a single slope. As some studies have found broken P-L relations for Cepheids at P = 10 d, we include a variant (number 24) which allows different slopes above and below P = 10 d. The difference in slope is consistent with 0 within 2σ. Other variants exclude high or low period Cepheids from the sample (variants 25 and 26). All three variants are included as viable alternative treatments.

Metallicity:
The baseline variant includes a linear standardization of Cepheid magnitudes for the metallicity of the host galaxy. The free parameter γ, the slope of the metallicity-luminosity relation, is significant at ∼ 4.6σ. The Cepheid metallicities are inferred based on Teimoorinia et al. (2021). We include a single variant analysis which uses instead the metallicity scale of Pettini & Pagel (2004), as discussed further in Section 6.6 and Appendix C of R21. We exclude variant 30 which discards metallization standardization entirely.  , and with each component of the systematic uncertainty added individually. Error budget contribution defined as σ 2 Effect − σ 2 Ref. . All results used the model variation assuming redshift-limited selection, fixed the peculiar velocity amplitude, and used the C15 reconstruction, to ensure the assumed errors in the data are kept constant apart from the changes in the Cepheid covariance.
In Figure 10, we show both the baseline covariance Σ (SH0ES) µ and the full covariance including Σ µ . We show how including each effect discussed above in the systematic covariance matrix affects the mean and variance of the H 0 measurement in Table 3. The systematic effect with the most impact on the final measurement is the behavior of the Cepheid period-luminosity relation.
R21 presented several other analysis variants. Many of these are not treated here as they impact only the SN Ia sample, rather than the Cepheid distances. We also do not make use of analysis variants which ignore large amounts of data (the optical Wesenheit variants, which wholly discards all NIR data) or ignore components of the standardization (such as removing metallicity dependence from the Cepheid luminosity relations). As we are interested in making the most accurate measurement possible given all available data, these analysis variants do not represent reasonable treatments. In general our final results will be more sensitive than the R21 result to the exclusion of large amounts of data or the removal of components of the standardization.

B. ESTIMATION OF COVARIANCE IN PV-CORRECTED REDSHIFTS
We first evaluate the power spectrum of the C15 reconstruction. As the C15 reconstruction has masked some parts of their volume due to a lack of data, we use the 212 Mpc h −1 cube centered on the Milky Way. This volume avoids any masking, and encompasses the whole of our sample. We compare with the predicted ΛCDM power in Figure 11. As can be seen, the reconstruction's power spectrum is well approximated by ΛCDM smoothed with a Gaussian kernel at

Modelled Physics Unmodelled Physics
Carrick 15 Lilow + Nusser 21 CDM CDM w/ Gaussian smoothing Figure 11. Comparison between the power spectrums of C15 and LN21 and the ΛCDM spectrum evaluated with CLASS. As would naively be expected, both are well described by a ΛCDM spectrum smoothed by a Gaussian kernel. The physics represented by the nonlinear part of the power spectrum is smoothed out of the reconstructions, and is thus unmodeled. We take this as a systematic.
a scale of 7 Mpc h −1 . The dispersion of the line-of-sight peculiar velocities within the reconstruction is smaller than the (380 km s −1 ) 2 expected from ΛCDM: < v 2 i >∼ (270 km s −1 ) 2 evaluated using the corresponding power spectrum. It is clear that the reconstruction cannot correct for the half of the dispersion that has not been modelled. More formally, our quantity of interest are the differences between the estimated and true density/velocity fields, as represented by the density residuals δ ∆ ( r) = δ C15 ( r) − δ True ( r). (B3) Our goal is to determine the power spectrum of this quantity, which will enable us to budget for unmodeled PV noise and quantify the improvement in our result from the use of the reconstructions. The power spectrum of the residuals is P ∆ (k) = P True (k) + P C15 (k) − 2P True,C15 (k), where P True,C15 is the cross-spectrum term P True,C15 (k) = d 3 k * /(2π) 3 <δ * C15 ( k * )δ True ( k) +δ C15 ( k)δ * True ( k * ) > /2.
This term is essentially a measure of the alignment in phase between the true and reconstructed density fields, while the power spectrum is phase-independent. In the case where the reconstruction is perfectly correct, this cross spectrum term is equal to the power spectrum, and as we expect the power in the residuals drops to zero. While we can assume an underlying ΛCDM power spectrum for the true density fields and the power spectrum of the reconstruction can be measured, evaluating the cross-spectral term requires us to determine the "correctness" of the reconstructions, which cannot be done from first principles.
As an estimate of this term from the data we have available, we have chosen to base our error budget on the the differences between the results of C15 and LN21. This is likely to give an underestimate of the true uncertainties, as both are based on the 2MRS redshifts published in Huchra et al. (2012). However this provides an estimate of the methodological uncertainties that includes off-diagonal terms of the covariance, not otherwise available to us. We test this method on our data in Section 5, as well as parametrizing the scale of covariance in the data and marginalize over these quantities. Ultimately the Cepheid data significantly favor the use of reconstructions with our budgeted uncertainties, and we conclude we have found a reasonable approximation of our true uncertainties. By assuming P True,C15 (k) = P C15,LN21 (k), and using the relation 2P C15,LN21 (k) = P C15 (k) − P C15−LN21 (k) + P LN21 (k), we derive an estimate of the power spectrum which describes the uncertainty in the reconstructed density fieldŝ P ∆ (k) = P Unmodelled (k) + P Structure (k) (B7) where P C15−LN21 (k) is the power spectrum of the residuals between the two reconstructions δ C15 ( r) − δ LN21 ( r). Rather than directly using the power spectrum evaluated from the data cubes, for both C15 and LN21 we use the smoothed ΛCDM power spectrum with a smoothing scale σ Smooth = 7 Mpc h −1 . Our estimate of the uncertainties then has a fairly simple and intuitive form; subtract the modelled physical information from the theoretical prediction and add the power describing the differences between our two models. The relative contribution of the two can be seen in Figure 1; the power from the differences adds (66 km s −1 ) 2 of variance, while the nonlinear term contributes (266 km s −1 ) 2 at much smaller scales. As we have no strong evidence to favor one reconstruction over the other, our error estimate is the same for both, and we produce results using both.