Constraints on the Cosmic Expansion History from GWTC–3

We use 47 gravitational wave sources from the Third LIGO–Virgo–Kamioka Gravitational Wave Detector Gravitational Wave Transient Catalog (GWTC–3) to estimate the Hubble parameter H(z), including its current value, the Hubble constant H 0. Each gravitational wave (GW) signal provides the luminosity distance to the source, and we estimate the corresponding redshift using two methods: the redshifted masses and a galaxy catalog. Using the binary black hole (BBH) redshifted masses, we simultaneously infer the source mass distribution and H(z). The source mass distribution displays a peak around 34 M ⊙, followed by a drop-off. Assuming this mass scale does not evolve with the redshift results in a H(z) measurement, yielding H0=68−8+12kms−1Mpc−1 (68% credible interval) when combined with the H 0 measurement from GW170817 and its electromagnetic counterpart. This represents an improvement of 17% with respect to the H 0 estimate from GWTC–1. The second method associates each GW event with its probable host galaxy in the catalog GLADE+, statistically marginalizing over the redshifts of each event’s potential hosts. Assuming a fixed BBH population, we estimate a value of H0=68−6+8kms−1Mpc−1 with the galaxy catalog method, an improvement of 42% with respect to our GWTC–1 result and 20% with respect to recent H 0 studies using GWTC–2 events. However, we show that this result is strongly impacted by assumptions about the BBH source mass distribution; the only event which is not strongly impacted by such assumptions (and is thus informative about H 0) is the well-localized event GW190814.


ABSTRACT
We use 47 gravitational-wave sources from the Third LIGO-Virgo-KAGRA Gravitational-Wave Transient Catalog (GWTC-3) to estimate the Hubble parameter H(z), including its current value, the Hubble constant H 0 .Each gravitational-wave (GW) signal provides the luminosity distance to the source and we estimate the corresponding redshift using two methods: the redshifted masses and a galaxy catalog.Using the binary black hole (BBH) redshifted masses, we simultaneously infer the source mass distribution and H(z).The source mass distribution displays a peak around 34 M , followed by a drop-off.Assuming this mass scale does not evolve with redshift results in a H(z) measurement, yielding H 0 = 68 +12 −8 km s −1 Mpc −1 (68% credible interval) when combined with the H 0 measurement from GW170817 and its electromagnetic counterpart.This represents an improvement of 17% with respect to the H 0 estimate from GWTC-1.The second method associates each GW event with its probable host galaxy in the catalog GLADE+, statistically marginalizing over the redshifts of each event's potential hosts.Assuming a fixed BBH population, we estimate a value of H 0 = 68 +8 −6 km s −1 Mpc −1 with the galaxy catalog method, an improvement of 42% with respect to our GWTC-1 result and 20% with respect to recent H 0 studies using GWTC-2 events.However, we show that this result is strongly impacted by assumptions about the BBH source mass distribution; the only event which is not strongly impacted by such assumptions (and is thus informative about H 0 ) is the well-localized event GW190814.
Keywords: gravitational waves, cosmology: observations, cosmological parameters 1. INTRODUCTION The discovery of a gravitational wave (GW) signal from a binary neutron star (BNS) merger (Abbott et al. 2017a) and the kilonova emission from its remnant (Coulter et al. 2017;Abbott et al. 2017b) provided the first GW standard siren measurement of the cosmic expansion history (Abbott et al. 2017c).As pointed out by Schutz (1986), the GW signal from a compact binary coalescence directly measures the luminosity distance to the source without any additional distance calibrator, earning these sources the name "standard sirens" (Holz & Hughes 2005).Measuring the cosmic expansion as a function of cosmological redshift is one of the key avenues with which to explore the constituents of the Universe, along with the other canonical probes such as the cosmic microwave background (CMB; Spergel et al. 2003Spergel et al. , 2007;;Komatsu et al. 2011;Ade et al. 2014Ade et al. , 2016;;Aghanim et al. 2020), baryon acoustic oscillations (Eisenstein & Hu 1998, 1997;Dawson et al. 2013;Alam et al. 2017), type Ia supernovae (Riess et al. 1996;Perlmutter et al. 1999;Riess et al. 2016;Freedman 2017;Riess et al. 2019), strong gravitational lensing (Suyu et al. 2010;Wong et al. 2020), and cosmic chronometers (Jimenez et al. 2019).
Even though GW sources are excellent distance tracers, using them to study the expansion history also requires measurement of their redshift.The redshift information is usually degenerate with the source masses in the GW signal, as the redshifted masses affect the GW frequency evolution.However, several techniques are proposed to infer the redshift of GW sources and break the mass-redshift degeneracy.For sources with confirmed electromagnetic counterparts, the host galaxy and its redshift can be determined directly (Holz & Hughes 2005;Dalal et al. 2006;MacLeod & Hogan 2008;Nissanke et al. 2010;Abbott et al. 2017c;Chen et al. 2018;Feeney et al. 2019).For sources without an electromagnetic counterpart, alternative techniques to infer the source redshift include comparing the redshifted mass distribution to an astrophysically-motivated source mass distribution (Chernoff & Finn 1993;Taylor & Gair 2012;Farr et al. 2019;You et al. 2021;Mastrogiovanni et al. 2021), obtaining statistical redshift information from galaxy catalogs (Schutz 1986;MacLeod & Hogan 2008;Del Pozzo 2012;Nishizawa 2017;Chen et al. 2018; * Deceased, August 2020. † Deceased, April 2021.Nair et al. 2018;Fishbach et al. 2019;Soares-Santos et al. 2019;Gray et al. 2020;Yu et al. 2020;Palmese et al. 2020;Borhanian et al. 2020;Finke et al. 2021), comparing the spatial clustering between GW sources and galaxies (Oguri 2016;Mukherjee et al. 2020;Bera et al. 2020;Mukherjee et al. 2021), leveraging external knowledge of the source redshift distribution (Ding et al. 2019;Ye & Fishbach 2021), and exploiting the tidal distortions of neutron stars (Messenger & Read 2012;Chatterjee et al. 2021).
The third LIGO-Virgo-KAGRA GW transient catalog (GWTC-3) (Abbott et al. 2021b) contains 90 compact binary coalescence candidate events with at least a 50% probability of being astrophysical in origin.Out of the events from the third observing run, a notable electromagnetic counterpart has been claimed only for the high-mass binary black hole (BBH) event GW190521 (Graham et al. 2020).However, the significance of this association has been re-assessed by several authors, who found insufficient evidence to claim a confident association (Ashton et al. 2020;De Paolis et al. 2020;Palmese et al. 2021).In this work, we do not include the redshift information from this putative electromagnetic counterpart signal.The only standard siren with an electromagnetic counterpart in GWTC-3 remains the BNS GW170817.
The remainder of this paper is organized as follows.In Sec. 2 we discuss the statistical methods used to infer the cosmological parameters with and without galaxy catalog information.In Sec. 3 we discuss the properties of the GW events and galaxy catalogs used in this paper.In Sec. 4 we present the results of our analyses and in Sec. 5 we discuss their implications for the cosmological parameters.Finally, in Sec.6 we present our conclusions.

METHOD
We use two analysis methods: (i) jointly fitting the cosmological parameters and the source population properties of BBHs, without using galaxy catalog information (Mastrogiovanni et al. 2021;Sec. 2.1), and (ii) fixing the source population properties, and inferring the cosmological parameters using statistical galaxy catalog information (Gray et al. 2020;Sec. 2.2).

Hierarchical inference without galaxy surveys
The GW event catalog can be described by two sets of parameters: a set of population hyper-parameters Φ that are common to the entire population of GW sources, and a set of intrinsic parameters that are unique for each event.The cosmological population hyper-parameters in this work are the cosmological parameters for a flat Universe.For the redshift range considered in the analysis, the contribution to the total energy density from radiation and neutrinos is negligible.Hence, we consider the dark energy density today as Ω DE (z) = 1 − Ω m .The cosmological parameters considered are, therefore, the Hubble constant H 0 , the matter density Ω m and dark energy equation of state (EOS) parameter w(z) = w 0 .(Chevallier & Polarski 2001).The dark energy EOS is defined by w = p/ρ, where for the standard ΛCDM we have w = −1.Additionally, the set of hyper-parameters Φ includes parameters describing the source mass distribution and the merger rate density as a function of redshift.
Given a set of N obs GW detections associated with the data {x} = (x 1 , ..., x N obs ), the posterior on Φ can be expressed as (Mandel et al. 2019;Thrane & Talbot 2019;Vitale et al. 2020) p(Φ|{x}, N obs ) = p(Φ) where p(Φ) is a prior on the population parameters, θ is the set of parameters intrinsic to each GW event, such as spins, masses, redshift etc, p(x i |Φ, θ) is the GW likelihood, p det (θ, Φ) is the probability of detecting a GW event with intrinsic parameters θ and for population hyper-parameters Φ, and p pop (θ|Φ) is a population modelled prior.The denominator in Eq. ( 1) correctly normalizes the posterior and takes into account selection effects (Mandel et al. 2019).We use the hierarchical statistical framework to infer the population parameters Φ and the prior that they induce on the distributions of the GW parameters.
The intrinsic GW parameters that are interesting for cosmology are those which provide information about the redshift z of the source.For sources which are detected at cosmological distances, GWs provide a measurement of the redshifted masses m det 1 , m det 2 and luminosity distance D L , rather than the redshift and source masses m 1 , m 2 , where The relation between source mass and redshifted mass can then be used to probe cosmology even in the absence of an explicit electromagnetic (EM) counterpart (Taylor et al. 2012;Taylor & Gair 2012) provided source mass scale can be well characterized.This approach is more effective if the source mass distribution displays sharp features (Farr et al. 2019;Ezquiaga & Holz 2021;You et al. 2021;Mastrogiovanni et al. 2021).

Population models
We model the BBH population since BBHs represent the majority of the detected sources.We now give a general overview of the source mass and redshift models used in this paper; see App.A for a complete description of the population models.
We describe the underlying distribution in redshift and source masses as where C is a normalization factor, p(m 1 , m 2 |Φ m ) is the source frame mass distribution, Φ m refers to all the population parameters not related to cosmology, the (1 + z) term encodes the clock difference between the source frame and detector frame, and p(z|H 0 , w 0 , Ω m ) is the redshift prior which is taken to be uniform in comoving volume.The term ψ(z|γ, k, z p ) describes the redshift evolution of the merger rate with a parameterization similar to that of Madau & Dickinson (2014), characterized by a low-redshift power-law slope γ, a peak at redshift z p , and a high-redshift power-law slope k after the peak, or The above rate evolution model is more complex than that in Fishbach et al. (2018); Abbott et al. (2021c,d); this is because when we vary the cosmological parameters, the GW observations may be pushed to higher redshift z > 2, past the peak of the star formation rate (Madau & Dickinson 2014).
The source mass models are factorized as where the secondary mass is modeled by a power-law distribution between a minimum mass m min and maximum mass m 1 .For the primary mass, we implement three phenomenological mass models used in Abbott et al. (2019aAbbott et al. ( , 2021c)).The first phenomenological model, is the Truncated model, describes the mass distribution as a power law (PL) between a minimum mass m min and a maximum mass m max (Fishbach & Holz 2017).The BBH mass distribution inferred from GWTC-2 was more complicated than a Truncated PL, and the second and third models are extensions of the Truncated model that contain more complex structures to better fit the mass distribution (Abbott et al. 2021c).The second model, Broken Power Law, consists of two PLs attached at a break-point (Abbott et al. 2021c).The third model is a superposition of a Truncated and a Gaussian component referred to as Power Law + Peak, in which the primary mass distribution is described by a PL with the addition of a Gaussian peak with mean µ g and variance σ 2 g (Talbot & Thrane 2018).Using the Broken Power Law and Power Law + Peak models, GWTC-2 revealed an excess of BBH systems with primary masses in the range ∼ 30-40 M , followed by a drop-off in the merger rate at high masses (Abbott et al. 2021c).This structure in the PL, modeled either as a break or a Gaussian peak, may represent the imprint of (pulsational) pair-instability supernovae (Fryer et al. 2001;Heger & Woosley 2002;Farmer et al. 2019;Renzo et al. 2020;Umeda et al. 2020).
In a companion paper investigating the GWTC-3 population (Abbott et al. 2021d) we show the evidence for sub-structures in the BHs primary mass spectrum around ∼ 10M .However we also find that the simpler Power Law + Peak model is still one of the models preferred by the GWTC-3 data.For this reason, and for simplicity in this paper, we only adopt models which are characterized by a single structure (Broken Power Law and Power Law + Peak) to describe the excess of BHs observed around 35M ; this also corresponds to the binaries that we can observe at higher redshifts and for which source mass assumptions could be important.
In order to infer H(z) from the BBH population, a crucial assumption is that the source mass distribution is independent of redshift (Fishbach et al. 2021).In most BBH formation scenarios, we expect some evolution of the mass distribution with redshift, due to factors such as the metallicity evolution of the Universe (Kudritzki & Puls 2000;Belczynski et al. 2010) and the dependence of the delay time between BBH formation and merger on BBH properties (Kushnir et al. 2016;Gallegos-Garcia et al. 2021;van Son et al. 2021).Nevertheless, if mass features, such as the break in the Broken Power Law model or the peak in the Power Law + Peak model, are caused by the pair-instability supernova process, their location is thought to stay constant within a few solar masses across cosmic time (Farmer et al. 2019).The presence of these sharp mass features drives our cosmological constraints (Farr et al. 2019;Mastrogiovanni et al. 2021).Moreover, the BBH mass distribution is expected to evolve only weakly over the range of red-shift accessible to current observations, at a level below current statistical uncertainties (Fishbach & Kalogera 2021;van Son et al. 2021).Although the BH mass spectrum at formation may vary with cosmic time, BBH channels typically predict a wide distribution of delay times between formation and merger, which tends to wash out any dependence of BBH mass on merger redshift (Mapelli et al. 2019).
In the following we will neglect the selection effect of spin distribution as the detection probability due to their inclusion should not vary by more than a factor of two (Ng et al. 2018b).This is indeed a negligible term with respect to the statistical uncertainties on our posteriors (see Sec. 5) and the dependence of the selection bias with respect to other parameters such as H 0 and γ, for which it nearly follows a power-law.

Statistical galaxy catalog method
We also use the gwcosmo code (Gray et al. 2020) in the pixelated sky scheme (Gray et al. 2021), i.e. using the HEALPix pixelization algorithm (Górski et al. 2005;Zonca et al. 2019), to infer H 0 using information from galaxy surveys.This method assumes a fixed source mass distribution, as well as a fixed-rate evolution for the binaries, and estimates H 0 from the GW data using galaxy catalogs to provide statistical information about the GW source redshifts.
When including galaxy catalog information, the prior on redshift can be replaced by the distribution of galaxies in the survey.However, Eq. (1) needs to be modified in order to take into account completeness corrections.These extra terms account for the impact of incompleteness, i.e. missing galaxies, due to the limited sensitivity of the catalog, in the GW localization volume.In this case, the posterior is given by where G is the hypothesis that the GW host galaxy is included in the catalog and Ḡ that it is not and p(g|H 0 , Φ m , d) expresses their probabilities for g ∈ [G, Ḡ].The term p(N obs |H 0 , Φ m ) is the probability of having N obs detections.We analytically marginalize over this by assuming a uniform in log rate prior.The notation d indicates the hypothesis that an event has been detected.The likelihoods p(x i | d, H 0 , Φ m , g) are built from the GW data and corrected for the selection effects in the case that the host galaxy is, and is not, inside the catalogue; see Gray et al. (2020).
We implement an improved version of the analysis presented in Abbott et al. (2021a) that can estimate H 0 for any given sky direction covered by the GW localization by dividing the sky into equal-area pixels.In each pixel, the apparent magnitude threshold (m thr ) is taken to be the median of the apparent magnitudes of all the galaxies inside that pixel.This assumption is a conservative choice for approximating the impact of catalog completeness: all galaxies with apparent magnitude fainter than the defined threshold are excluded from the analysis.Using this m thr the completeness is assessed and the H 0 likelihoods are calculated in each pixel.In the end, all the pixel likelihoods are combined using weights proportional to the GW posterior probability in each pixel to give the final H 0 posterior of each GW event.Pixels with no GW support make zero contribution, so only the pixels within the 99.9% sky area are used.
These improvements are necessary to correct for galaxy catalog incompleteness in the case that the galaxy surveys contained within the catalog are less sensitive in particular sky areas, such as the directions of the galactic plane.Moreover, the analysis can take into account the fact that the GW luminosity distance posterior conditioned on the sky position might significantly change between different sky positions.The combination of galaxy redshift information and luminosity distance estimation change from pixel to pixel, leading to a more robust estimation of H 0 .

GW events
For our main result, we select 47 GW events with network matched filter signal-to-noise ratio (SNR) > 11 and Inverse False Alarm Rate (IFAR) higher than 4 yr, taking their maximum across the different search pipelines from GWTC-3 (Abbott et al. 2021b), and no plausible instrumental origin.
We also consider events identified by different SNR choices to explore possible systematics in the computation of selection effects, see App.B. In the remainder of the paper we will shortly refer to these different ensembles by quoting the threshold SNR choices.Of the 47 events with SNR > 11, 42 are BBH detections, 2 are the BNS events GW170817 (Abbott et al. 2017a) and GW190425 (Abbott et al. 2020a), 2 are the NSBH events GW200105 and GW200115 (Abbott et al. 2021) and one is the asymmetric mass binary GW190814 (Abbott et al. 2020b).A visual representation of the population of BBHs that we detected is provided in Fig. 1, where we show the distribution of detector frame masses and luminosity distance of the BBHs.We have tabulated all the GW sources used in this analysis in Table 1, mentioning their source properties, sky localization error, the 3D localization volume, number of galaxies in the catalog within the localization volume and the probability that the GW host is present in the GLADE+ catalog (Dálya et al. 2018(Dálya et al. , 2021)).Note that differently from (Abbott et al. 2021b), the estimation of masses and distances are reported using a prior ∝ D 2 L and not uniform in comoving volume since we are interested to show these values using cosmology-agnostic priors.For the events detected during O1, O2 and O3 we use combined posterior samples from the IMRPhenom (Thompson et al. 2020;Pratten et al. 2021) and SEOBNR (Ossokine et al. 2020;Matas et al. 2020) families, while for the two NSBH events GW200105 and GW200115 we use posterior samples generated with low spin priors (Abbott et al. 2021). 1 1 Events in the analysis showing differences in posterior samples with different waveforms are GW191109 010717 and GW200129 065458 (Abbott et al. 2021b).These differences are mostly related to the effective and precession spin parameters which are not considered in this analysis.L prior and a uniform prior on the detector frame masses ad stacking posterior samples for each event.This figure is only representative of the events reported in Tab. 4 and does not indicate the population reconstruction.Left: Distribution of the primary detector frame masses (blue solid line) and source frame masses (orange dashed line).Middle: Same but for the secondary source mass.Right: Distribution of the luminosity distance (bottom axis) and redshift (top axis).

[!p]
Table 1.This table reports all of the GW events considered in this paper and summarizes some of their properties reported with their median values and 90% symmetric credible intervals.First, second and third columns: GW event label, detected SNR (highest among the different pipelines, see (Abbott et al. 2021b)) and inverse false alarm rate.Fourth, fifth, sixth columns: estimated primary and secondary detector frame masses, luminosity distance.Seventh, eighth and ninth columns: primary and secondary source frame masses, and redshift assuming a reference cosmology.Tenth and eleventh columns: sky localization area, and 3D localization comoving volume.The twelfth column lists the number of galaxies in GLADE+ inside the localization volume for each event with observed K-band (B J -band in parenthesis), while the thirteenth column reports the probability p(G|z, H0) that GLADE+ detected the GW event host galaxy using the K-band (B J in parenthesis) luminosity function calculated for a fiducial flat ΛCDM cosmology with H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065.The lower and upper bounds on these probabilities are derived from the completeness fractions at the boundaries of the 90% localization volume.We report these values using a distance prior proportional to D 2 L and a uniform in detector frame masses prior.The events in bold are the ones with SNR> 11 entering all the analyses while the others are the ones used only for the systematic studies in App.B. Table 1 continued

Description of the GLADE+ galaxy catalog
In one of the analyses that we perform, the redshift information is taken from galaxy surveys for all of the events apart from GW170817, for which we assume the redshift information from its EM counterpart.For the analysis taking into account galaxy surveys we use the GLADE+ (Dálya et al. 2018(Dálya et al. , 2021) ) all-sky galaxy catalog that is a revised version of the first GLADE catalog (Dálya et al. 2018) containing about 22 million galaxies.GLADE+ incorporates six different galaxy catalogs and surveys, namely the Gravitational Wave Galaxy Catalogue (GWGC, White et al. 2011), Hyper-LEDA (Makarov et al. 2014), the 2 Micron All-Sky Survey Extended Source Catalog (2MASS XSC, Skrutskie et al. 2006), the 2MASS Photometric Redshift Catalog (2MPZ, Bilicki et al. 2014), the WISExSCOS Photometric Redshift Catalogue (WISExSCOSPZ, Bilicki et al. 2016), and the Sloan Digital Sky Survey quasar catalogue from the 16th data release (SDSS-DR16Q, Lyke et al. 2020) and covers the full sky with a completeness of about 20% up to 800 Mpc.2Most of the galaxies in the GLADE+ catalog have a redshift measurement obtained photometrically using an artificial neural network algorithm (Collister & Lahav 2004) with a relative error σ z ph ∼ 0.033(1 + z ph ) (Bilicki et al. 2016).The peculiar velocity corrections are implemented for galaxies up to redshift z < 0.05 using a Bayesian technique (Mukherjee et al. 2021c) that can capture both linear and non-linear components of the velocity field.
For our main results, we use all galaxies with measured K s -band (denoted as K-band henceforth) luminosity reported in the Vega system and we assign a probability for each galaxy to be the host of a GW event that is proportional to this luminosity (luminosity weighting).We also explore possible systematics in our results by not using luminosity weighting and by using B J -band observations; see Sec. 5.2 for more details.We choose these two bands since we have found that there is a good match between the galaxy luminosity functions and the galaxy number density of the GLADE+ catalog, in particular for the K-band, see App.C for more details.The K-band galaxies in the GLADE+ catalog are the same as the one in GLADE catalog.The galaxies in the B J -band are present only in the GLADE+ catalog.
Fig. 2 presents a series of skymaps showing the directional dependence of the K-band apparent magnitude threshold for the GLADE+ galaxies, in superposition with the sky localizations of the GW events included in our analysis.Outside of the galactic plane, m thr ∼ 13.5 on average for the K-band while within the galactic plane region the apparent magnitude threshold is significantly lower (i.e.brighter).
We assume that the K-band absolute magnitude distribution for GLADE+ galaxies is well described by a Schechter function with parameters (reported for H 0 = 100 km s −1 Mpc −1 ) M * ,K = −23.39 and α K = −1.09(Kochanek et al. 2001), while for the B J -band we use M * ,BJ = −19.66 and α BJ = −1.21(Norberg et al. 2002).We set a bright cut-off high enough to include all the bright galaxies supported by the Schechter function: M min,K = −27.00and M min,BJ = −22.00.Further, we consider all the galaxies no fainter than M max,K = −19.0,M max,BJ = −16.5.These choices correspond to all galaxies with luminosity L > 0.017L * ,K and L > 0.054L * ,BJ , where L * is the characteristic galaxy luminosity of the Schechter luminosity function.
To calculate the rest-frame absolute magnitudes of the galaxies, for a given cosmology, we apply color and evolution corrections as reported in Kochanek et al. (2001) for the K-band and Norberg et al. (2002) for the B Jband.
For all events, apart from GW190814, we carry out the analysis using a pixel size of 3.35 deg 2 , while for GW190814 we use a pixel size of 0.2 deg 2 since the sky localization for this event was 10 times smaller than most of the others.These values have been chosen taking into account the average number of galaxies per square degree reported in GLADE+.Considering the bright and faint limits of the Schechter function assumed with a median apparent magnitude threshold m thr,K = 13.5 for the K-band and m thr,BJ = 19.7 for the B J -band, we find that there are ∼ 25 galaxies per square degree in GLADE+ reported in the K-band and 500 galaxies per square degree reported in the B J band considered in the analysis.Note that the actual galaxy density per square degree is higher outside the galactic plane and in the region of GW190814.
For any given redshift, the completeness fraction, which is the probability that the galaxy catalog contains the host galaxy of the GW event, P (G|z, H 0 ), is defined as the fraction of galaxies with absolute magnitudes brighter than the absolute magnitude threshold (calculated from m thr ), namely Figure 2. Skymaps showing the GLADE+ K-band apparent magnitude threshold, m thr , generated by dividing the sky into 3.35 deg 2 pixels, this is the resolution used for all the events but GW190814.A mask is applied that removes from the figures all pixels with m thr < 12.5 in order to improve the figure readability.Also shown are the 90% CL sky localizations for the GW events considered in this paper.
Here φ(L) is the assumed galaxy luminosity function, L min and L max are the minimum and maximum luminosity corresponding to M max and M min and L thr is the threshold luminosity for detection, calculated from m thr .Fig. 3 shows the completeness fraction of the GLADE+ catalog, in the K-band and B J -band respectively, as a function of redshift and for different values of m thr , assuming a fiducial cosmology with H 0 = 67.9km s −1 Mpc −1 and Ω m = 0.3065 (Ade et al. 2016).As can be seen from the figure, the GLADE+ catalog is less complete in the K-band than in the B J -band, but we decide to use the K-band data for our main results as they are better described by the Schechter function assumed in our analysis; see App. C.
For GLADE+ systematic uncertainties of the photometric redshift reconstruction are inside the statistical errors Completeness fraction of GLADE+ in the Kband, indicating the probability that the catalog contains the host galaxy of a GW event, as a function of redshift for H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065.The different lines are calculated for a given K-band apparent magnitude m thr threshold indicated in the legend.The legend also indicates the fraction of the sky, computed by dividing the sky in equal sized pixels of 3.35 deg 2 , for which the apparent magnitude threshold is brighter than the one reported in the legend.The fraction of pixels with no galaxies is ∼ 5% for the Bj and K bands.Bottom: Same as the top panel but for the BJ -band.
. associated to each galaxy (Bilicki et al. 2014) with a very small percentage of outliers.We do not consider deeper galaxy surveys with a restricted sky area footprint, such as the DES Y1 survey (Drlica-Wagner et al. 2018) or DESI Legacy Imaging Survey (Zou et al. 2019) that are supposed to be complete up to redshift z ∼ 1, since we decide to employ the same all-sky galaxy catalog for all of our events.Moreover, color corrections and photometric redshift reconstruction might need particular attention with deeper galaxy surveys.

RESULTS
The first analysis that we present in Sec.4.1 will focus on the impact of the BBH population source masses on inference of the cosmological parameters, using the formalism discussed in Sec.2.1.This analysis uses no galaxy catalog information; instead constraints on the cosmological parameters will be inferred from the mass scale set by the source mass distribution.We use only the BBHs detections as they are the majority of the sources entering in our cosmological analysis and because a joint description of NSBH, BNS and BBHs is uncertain.
The second analysis in Sec.4.2 fixes the source mass distribution and uses redshift information derived from galaxy catalogs, based on the formalism discussed in Sec.2.2.Unless stated otherwise, all the figures are generated with a uniform prior on H 0 , credible intervals are reported as maximum posterior and 68.3% highest density intervals.We use a flat-in-log prior on H 0 only when quoting results combined with GW170817 and its EM counterpart.

Implications of population assumptions for cosmology
We jointly estimate population-related GW parameters and cosmological parameters using BBH events, since these are the majority of the GW events observed to date.We use the 42 BBH detected events with SNR > 11.We exclude from this analysis GW190814 (Abbott et al. 2021c) given the current uncertainty on the nature of the secondary object in this system.
We consider two cosmological models: (i) a flat w 0 CDM model with wide priors on the Hubble constant H 0 , matter density Ω m and dark energy equation of state (EoS) w 0 parameter, and (ii) a flat ΛCDM Universe with a fixed value of Ω m = 0.3065 (Ade et al. 2016) and dark energy EoS parameter w 0 = −1 and with a restricted prior in the H 0 tension region (H 0 ∈ [65, 77] km s −1 Mpc −1 ).We refer to model (i) as the w 0 CDM model, and to model (ii) as the H 0tension model.We also adopt wide priors on the hyperparameters of the GW source mass distribution and its merger rate evolution as described in App. A. For all the phenomenological mass models assumed we obtain posteriors on the source mass distribution and merger rate parameters which are compatible with previous population studies (Abbott et al. 2021c,d) and the latest studies with O3b data (Abbott et al. 2021b).See App.B for more details.
As evident from values of Bayes factor reported in Table 2, we do not find any preference of the data in supporting any one of the cosmological models (w 0 CDM model or the H 0 -tension model) considered in the analysis.As we will see later, this is because the posteriors on Ω m and w 0 are not constrained by the GW observations and the error on the H 0 estimation extends beyond the H 0 tension region.
In Table 3 we report the Bayes factors computed between different mass models, for the case of wide pri- The marginal posterior distributions that we obtain for the cosmological parameters H 0 , Ω m and w 0 are shown in Fig. 4 for each phenomenological mass model.As anticipated by our Bayes factor results, we find that with the current BBH GW events we cannot constrain the values of these three cosmological parameters, as we obtain broad and uninformative posteriors.
With the Power Law + Peak we estimate H 0 = 50 +37 −30 km s −1 Mpc −1 , while for the Broken Power Law model we estimate H 0 = 44 +52 −24 km s −1 Mpc −1 .These constraints on H 0 , as we will see later, arise from the ability of these models to fit an excess of BBHs with masses around 35 M which sets a scale for the redshift distribution of BBHs.that is reported at 95% CI) and the green shaded area in the top panel shows the value of the Hubble constant measured in the local Universe (Riess et al. 2019).
We discuss this effect further using the Power Law + Peak model.Fig. 5 shows the joint posterior distribution between the cosmological parameters and the parameters µ g and m max defined in Eq. (A11), which govern the position of the BBH Gaussian excess and the upper end of the source primary mass distribution respectively.
The presence of a peak in the BBH source mass distribution allows us to set a characteristic source mass scale, which informs H(z) and allows us to exclude higher values of H 0 .Marginalizing over the cosmological parameters, we obtain a central value of µ g = 32 +6 −8 M for the peak position of the Gaussian BBH excess.On the other hand, the disfavoured Truncated model shows support at higher H 0 .This result is due to the fact that the Truncated model is not able to adequately fit the presence of massive binaries while producing an excess of BBHs with masses ∼ 40 M in the detector frame.For this reason, higher H 0 values are more supported since those values place events at higher redshifts, thus reducing their source masses.
When we combine the H 0 posteriors from the three mass models with the H 0 inferred from the bright standard siren GW170817 (see Fig. 6), we find a value of H 0 = 68 +12 We now discuss constraints on H 0 when we fix the source population model but employ galaxy surveys to infer statistical redshift information using the pixelated gwcosmo code (Gray et al. 2020).Our analysis incorporates 47 GW events, comprising 42 BBH detections, GW190814, the two BNS events GW170817 and GW190425, and the two NSBH events GW200105 and GW200115.We include all galaxies of the GLADE+ catalog that lie inside the 99.9% estimated sky area of each event.We use the GLADE+ K-band data in this analysis, adopting luminosity weights for each galaxy.For a more in-depth discussion about the impact of our BH population assumptions and choice of photometric bands, see Sec. 5.2.
To describe the distribution of BH primary masses, we use a Power Law + Peak source mass model where we fix population parameters to the median values obtained in the joint cosmological and population analysis described in Sec.4.1.For the rate evolution we adopt γ = 4.59, k = 2.86 and z p = 2.47, while for the Power Law + Peak model we use α = 3.78, β = 0.81, m max = 112.5 M , m min = 4.98 M , δ m = 4.8 M µ g = 32.27M , σ g = 3.88 M and λ g = 0.03.For the NS source mass model we consider a uniform distribution between m min 1M and m max = 3M consistently with (Abbott et al. 2021d).We evaluate GW selection effects using LIGO and Virgo sensitivities during the O1, O2, and O3 runs.
In Fig. 7 (page 24) we show the posteriors for all of the GW events considered in this analysis for the K-band.For many of the O3 events, the H 0 inference is dominated by the likelihood based on the hypothesis that the host galaxy is not in the catalog (referred to as out-of-catalog).The out-of-catalog term dominates for sources that are localized at redshifts at which the GLADE+ galaxy catalog has low completeness fraction (see Fig. 3).This is the case for most of the GW sources which are BBHs observed at large luminosity distances.Another interesting trend observed in Fig. 7 is that, for lower values of H 0 , the in-catalog likelihood terms tend to dominate because for low H 0 values the GW events are placed at smaller redshifts where the galaxy catalog is more complete, as shown in Fig. 3.
For most of these events, the number of galaxies present in the sky localization volume is large enough that the redshift information is still dominated by population assumptions (Section 5.2).GW190814 is the only event for which there is a sufficiently small number of galaxies in its sky localization area of about 18 deg 2 .Its small area makes this event partially more informative on the value of H 0 in comparison to the other GW events.We can see in Fig. 7 that, out of all the GW events, the most informative posterior on H 0 (compared to the zero galaxy catalog completeness posterior) is from GW190814, provided that the luminosity weighting scheme is applied.We have verified that the H 0 posterior with the K-band and using luminosity weights does not depend on the faint end magnitude limit used for the analysis.For this event, we infer an H 0 constraint of 67 +46 −28 km s −1 Mpc −1 (MAP and HDI).We quote the maximum a posteriori probability (MAP) and the corresponding highest density interval (HDI) values in the analysis.
Fig. 8 shows the redshift distribution of galaxies in the 90% CI sky area of GW190814 (top panel) and the galaxy catalog completeness (bottom panel), compared to the predicted distribution for a prior that is uniform in comoving volume.We observe that for the K-band the H 0 support results from an excess of galaxies, with respect to the uniform in comoving volume prior, around z ∼ 0.051.Switching off the luminosity weighting assumption decreases the contribution of this excess of galaxies since the completeness is estimated to be lower.The same excess is not visible in the B J -band as more galaxies are reported in this band and some luminous galaxies with measured K-band apparent magnitudes do not have measured apparent magnitudes for the B Jband.
Despite the cases where there is a significant incatalog contribution, the final H 0 result is nevertheless dominated by the BBHs population assumptions which are contributing to the out-of-catalog likelihood terms Figure 7. Plots reporting the results on the H0 inference for each event using the GLADE+ K band and luminosity weighting.Two panels are shown for each event.Top panels: Hierarchical likelihood under the hypothesis, G, that the host galaxy is in the catalog (blue solid lines) and under the hypothesis, Ḡ, that the host galaxy is not in the catalog (pink dashed lines).The different lines shown in each panel correspond to the different pixels within the sky localization area for each event.Bottom panels: The blue solid line shows the posterior obtained by summing the terms corresponding to the in-catalog and out-of-catalog hypotheses.The orange dashed line shows the posterior obtained by assuming a galaxy catalog with null completeness.In this case the H0 inference comes entirely from the population assumptions.This plot is intended to show which event is informative on the H0 value and whether the information is coming from population assumptions or galaxy catalog contribution.
(when the galaxy catalog is not complete) and in the incatalog terms when a large number of galaxies is present in the GW sky localization volume.
In Fig. 9 we show the combined H 0 posterior inferred from all of the GW events and for several different scenarios.By using all of the dark sirens, together with K-band galaxy information from GLADE+, we obtain a value of H 0 = 67 +13 −12 km s −1 Mpc −1 .This value is strongly dominated by the BH population assumptions, as can be seen in Fig. 9.The H 0 value obtained from population assumptions alone (Empty catalog case in Fig. 9) is H 0 = 67 +14 −13 km s −1 Mpc −1 .When we combine the galaxy catalog measurement with the result from the bright standard siren GW170817, we obtain H 0 = 68 +8 −6 km s −1 Mpc −1 .This value represents an improvement of 42% with respect to the corresponding Top panels: Distribution of galaxies observed in the GLADE+ K(left panels) and BJ (right panels) bands as a function of redshift z with and without galaxy luminosity weights, compared with the GW190814 redshift localization in its 90% CI sky area, assuming a cosmology with H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065 (green line) and the predicted redshift distribution for a prior that is uniform in comoving volume (blue dashed line).The redshift distribution is not intended as representative of the reconstructed galaxy distribution since it is calculated by stacking each galaxy redshift localization assumed as a normal distribution.This procedure only serves to give a rough idea where the H0 contribution is coming from.Bottom panels: Completeness calculated using the K and BJ band with and without application of the luminosity weighing scheme.

Considerations for the BBH-based population analysis
We have shown how population assumptions on BBH formation dominate the inference on cosmological parameters and, in particular, we have seen how the presence of an excess of BBHs with primary masses between 30 M and 40 M (Farr et al. 2019) sets a scale for the BBH redshifts, thus allowing for a weak constraint on H 0 .
In the BBH-based population analysis without GW170817, the (H 0 , Ω m , w 0 ) parameters are not constrained.In Fig. 10 we portray these constraints on the expansion rate of the Universe, H(z).The best constraint that we obtain on the expansion rate of the Universe has a value, and uncertainty (median and symmetric 90% CI), of 74 +34 −13 km s −1 Mpc −1 at redshift z = 0 if we include the bright siren GW170817, and 62 +87 −41 km s −1 Mpc −1 at redshift z ∼ 0.01 without it.

Considerations on the catalog analysis
As already discussed in Sec.4.2, the H 0 inference is dominated by the population assumptions of the underlying BH mass distribution.In particular, as shown in Fig. 5, the population parameter that is most strongly correlated with the value of H 0 is the position of the BHs excess µ g .
In Fig. 11 we show the H 0 posterior computed with different choices of µ g and fixing the remaining parameters.The values of µ g are spaced by ∼ 2.5M , which is roughly the uncertainty identified in Sec.2.1.It can be seen that the value of µ g has a strong impact on the inference of H 0 .For values of µ g higher than the median value of 32.55M , the posterior supports low H 0 values as we need GW events to be at a lower redshift to explain the excess of BHs at higher masses.On the other hand, for µ g < 32.55 M , the posterior supports higher H 0 value in order to place events at higher redshift compatible with an excess of BHs at lower masses.
We also explored the effect of raising the maximum mass of the black holes to m max = 150 M .As can be seen in Fig. 11, raising m max does not have a significant effect on the H 0 posterior since only few events are present at these masses.One more parameter for which ferent choices for the luminosity band and weighting scheme adopted for the GLADE+ galaxy catalog (lower panel).The pink and green shaded areas identify the 68% CI constraints on H0 inferred from the CMB anisotropies (Ade et al. 2016) and in the local Universe from SH0ES (Riess et al. 2019) respectively.
we explored the effect of its variation is the γ parameter in the rate evolution model.In the same plot one can see the H 0 posterior for γ = 2.59.This parameter has a stronger effect on the H 0 posterior, making the posterior less informative and at the same time moving its peak to higher values.
The galaxy catalog brings additional information only for GW190814, due to the much better sky localization (∼ 18 deg 2 ) for this event; this has the effect of providing more support for the H 0 tension region.
In Fig. 12, we show how population assumptions impact the hierarchical likelihood calculation as a function of H 0 , for the hypotheses that the host galaxy is (or is not) inside the catalog.Population assumptions strongly impact the out-of-catalog term of the likelihood, which is the dominant contribution to the H 0 posterior when the event is localized in an area where the galaxy catalog has a low completeness fraction which happens for most of the GW events that we consider in this analysis.On the other hand, population assumptions are less important for events with a small localization in a region of the galaxy catalog that is complete.In these cases (for example GW190814), the posterior is dominated by the in-catalog likelihood terms and hence exhibit a weak dependence on the population assumptions.
The aforementioned discussion also explains the difference between our results and those found in Finke et al. (2021) using GWTC-2 events.In that work the authors found a weak dependence of their posterior on the source population parameters.However, in their case only a few events, above a given completeness threshold of 70% for the main result, were used to explore systematic effects due to the source population.Moreover, in exploring these systematics they varied the population assumptions only within the range of uncertainties reported in Abbott et al. (2021c), which already assumes a fixed cosmological model with a value of H 0 consistent with the Planck results (Ade et al. 2016).Consequently, the results obtained by them are primarily driven by the Planck cosmological parameters.
Finally, we also explore the systematics introduced by choices related to the galaxy catalog data.In Fig. 11 we also show the H 0 posteriors obtained with the GLADE+ catalog, but using K-band galaxies without luminosity weighting and B J -band galaxies with luminosity weighting.In both cases, the H 0 posterior is not significantly affected by this choice and it is, again, dominated by the population assumptions.
However, the impact of using luminosity weights is not negligible.For instance, in the case of GW190814 (see Fig. 12), removing the extra luminosity weight suppresses the H 0 posterior peak around 70 km s −1 Mpc −1 .This arises because the luminous galaxies shown in Fig. 8, observed in the K-band, are now contributing to the GW event redshift localization with the same probability as the other 200+ galaxies included in the GW localization volume.We have verified that the H 0 posterior with the K-band and using luminosity weights does not depend on the faint end magnitude limit used for the analysis.

Additional systematic uncertainties
As we have seen, the presence of a known source mass scale can be used to measure cosmological parameters (Farr et al. 2019;Mastrogiovanni et al. 2021).However, if the source mass distribution is mismodeled, then the cosmological inference will be biased.With the current set of events, this effect contributes the dominant source of systematic uncertainty in the measurement of H(z).In the population-based method, a key assumption is that the source mass distribution does not evolve with redshift (Fishbach et al. 2021); any evolution is degenerate with the cosmological inference.Many BBH formation scenarios predict mild evolution in the mass distribution (Mapelli et al. 2019;Weatherford et al. 2021;Gallegos-Garcia et al. 2021;Mapelli et al. 2021;van Son et al. 2021), but given our broad statistical uncertainties, we expect this evolution to only weakly affect our results, and we do not attempt to calibrate the mass models to theory.In the galaxy-catalog based method, we fix the source mass distribution.In this case, the choice of the peak location µ g , associated with excess BHs in the distribution of source masses, is a main source of systematic uncertainty.If the µ g parameter is assumed to be lower (or higher) than its true value, this can lead to a higher (or lower) inferred value of H 0 .The impact of this bias can be reduced if the support from the incatalog part of the statistical galaxy catalog method is informative, which is mainly possible for sources with better three-dimensional localization error and with a more complete galaxy catalog.In the future, as more GW detectors join the network, resulting in more events with better sky localizations, and galaxy catalogs which are more complete at high redshift, the impact of the source population on galaxy catalog method can be mitigated.
One of the additional sources of contamination, when seeking to infer the true luminosity distance D L (and hence the true source masses) of a GW source in the absence of an EM counterpart, is the possible lensing of the GW signal due to the intervening matter distribution (Schneider et al. 1992;Bartelmann 2010;Nakamura 1998;Wang et al. 1996;Takahashi & Nakamura 2003;Dai et al. 2017;Broadhurst et al. 2018;Diego 2019).In the geometric optics limit, lensing modifies the GW signal by a magnification factor µ that only changes the amplitude of the GW strain, leading to a measured luminosity distance given by DL = D L /µ.In the strong lensing limit, when the value of µ is large, the inferred luminosity distance to the source dL may be substantially lower than the true luminosity distance, i.e. introducing a bias in the measurement of the luminosity distance.However, for the GW detections considered here, the probability of observing such a strongly-lensed event is less than one percent (Ng et al. 2018a;Oguri 2018;Mukherjee et al. 2021a), even for a broad range of astrophysical time-delay scenarios (Mukherjee et al. of the cosmological parameters, but will introduce additional variance on the luminosity distance of individual sirens at the level of a few percent (Holz & Wald 1998;Hirata et al. 2010) and it is sub-dominant in comparison to the intrinsic measurement error (of about 20%) on the luminosity distance even for the best source in our current GW catalog.Hence, we also ignore the uncertainty due to weak lensing in this paper.However, in a future analysis, we could include the contribution from weak lensing and its impact on the distance measurement (Holz & Wald 1998;Cutler & Holz 2009;Hirata et al. 2010;Namikawa et al. 2016;Mukherjee et al. 2020).

CONCLUSIONS
Using the 47 GW events with detected SNR > 11 reported in the third LIGO-Virgo-KAGRA Gravitational-Wave Transient Catalog (Abbott et al. 2021b), we have inferred constraints on the cosmological parameters adopting two different approaches: hierarchical inference without galaxy surveys and the statistical galaxy catalog method.We present for the first time analysis that constrains jointly the properties of the population of BBHs and the parameters of the cosmological model, and we have shown the crucial correlation that exists between the two sectors.
We have shown that an excess population of BHs in the mass range 30 -40 M , pointed out by Abbott et al. (2021c), is robust to the choice of assumed cosmological model parameters.While our constraints on the present-day matter density, Ω m , and dark energy EoS, w 0 , parameters are weak, we have measured the Hubble constant to be H 0 = 68 +12 −8 km s −1 Mpc −1 at 68% CL from combining dark sirens with information from the bright siren GW170817 and its electromagnetic counterpart (Abbott et al. 2019b).This result represents an improvement of 17% with respect to the H 0 value reported from analysis of O1 and O2 data (Abbott et al. 2021a) that made use of galaxy catalogs alone to infer statistical redshift information.In our analysis we also obtain weak constraints on the expansion history as a function of redshift.
In addition we provide a constraint on the value of H 0 adopting a fixed Power Law + Peak population model of BBHs and using statistical redshift information inferred from the GLADE+ galaxy catalog.This analysis obtained, for the K-band, H 0 = 67 +13 −12 km s −1 Mpc −1 , which represents an improvement of 42% with respect to Abbott et al. (2021a) alone, and an improvement of 20% with respect to recent H 0 studies using GWTC-2 events (Finke et al. 2021).Most of the constraining power in our H 0 inference comes from the event GW170817 using its electromagnetic counterpart.Combining the above result with information from GW170817 we obtain 68 +8 −6 km s −1 Mpc −1 .The most informative dark siren in the GWTC-3 catalog is GW190814, which alone provides an estimate of H 0 = 67 +46 −28 km s −1 Mpc −1 , provided that the luminosity weighting scheme is applied.
A summary of the different H 0 values obtained using different data sets and model assumptions can be seen in Table 4.The table is divided into two parts.The first part summarises the values that we infer for H 0 when fixing the population model to the most favorable one and then varying the luminosity band from GLADE+ used in our analysis.We used both the B J band and the K band and the results are very similar (see also Fig. 11).The second part of the table summarises the results obtained by marginalizing over the population parameters and using no galaxy catalog information.
Although we have improved our previously reported constraints on the value of H 0 using these 47 GW events, our results are still dominated by the systematic effects induced by the assumptions made about the GW source population.The choice of mass scale set by µ g , the mass at which the excess of BHs is centered, plays a crucial role in constraining the value of H 0 .
In the future, with significantly more both bright and dark sirens it will be possible to make robust measurements of H 0 and other cosmological parameters.On the one hand, measurement of bright sirens will help greatly with inferring the redshift from direct observations of EM counterparts (Holz & Hughes 2005;Dalal et al. 2006;Nissanke et al. 2010;Chen et al. 2018;Feeney et al. 2019).On the other hand, for dark sirens, the application of cross-correlation techniques to infer the clustering redshift of GW sources (Mukherjee et al. 2021;Bera et al. 2020) using spectroscopic galaxy surveys (Diaz & Mukherjee 2021), the PISN mass scale of black holes (Farr et al. 2019;Mastrogiovanni et al. 2021), the redshift distribution of the GW sources (Ding et al. 2019;Ye & Fishbach 2021), and the tidal distortion of neutron stars (Messenger & Read 2012;Chatterjee et al. 2021) will enable robust measurement of the cosmic expansion history.With the aid of more observations and further development of analysis techniques, we will be able to reduce the current systematics and proceed towards accurate and precision gravitational-wave cosmology.
• Truncated model: It describes the distribution of the primary mass m 1 with a truncated power law with slope −α between a minimum mass m min and a maximum mass m max .We provide extra details on the joint inference of cosmological and population parameters using BBHs.
In Fig. 13 we show the corner plots for the posterior associated with the Power Law + Peak model (the one implemented for the galaxy catalog analysis) adopting wide priors on the cosmological parameters and for the population of BBHs.As was also shown in Fig. 5, the main mass-related population parameters correlating with the cosmological parameters, and in particular H 0 , are the position of the Gaussian component (BBH excess in the mass distribution) and the higher end of the source mass distribution.
The other parameter that correlates with the estimation of H 0 is the rate evolution parameter γ.This parameter models a power-law increasing merger rate with the redshift.We find that higher values of γ support lower values of H 0 , which is due to the fact that lowering H 0 will place events at lower redshifts, which are incompatible with the observed mass distribution; therefore γ tries to correct this by supporting higher redshifts.However, the posterior on the rate evolution is well within the statistical uncertainties given in (Abbott et al. 2021d).
We run additional systematic studies using different SNR cuts for the selection of the BBHs to include in our analysis.We explore a higher SNR cut of 12 (more pure) that selects a sample of 35 events.We also explore a lower SNR cut of 10, allowing 60 events with a IFAR> 0.5 yr and no plausible instrumental origin.For this set of events, GW190426 152155 and GW190531 023648 are excluded as their secondary mass extends to lower masses in the NS region.
The marginal H 0 posterior for all the mass models is shown in Fig. 14, where we show that including more events always produces a posterior on the H 0 within the statistical uncertainties of other selection criteria.In all of these cases, the excess of BBHs around 35M is present for all the SNR cuts and it is responsible for the preference observed in the H 0 posterior.

C. SCHECHTER LUMINOSITY FUNCTION STUDIES
In this Appendix we show comparisons of the Schechter luminosity function (LF) for the galaxies with the K and B J bands galaxies reported in GLADE+.Wrong assumptions on the LF, or incorrect description of the selection biases, could potentially introduce a bias in the inferred H 0 .Indeed, one of the key assumptions that we have made to construct our completeness corrections is that the galaxy catalog is magnitude limited, i.e. galaxies are not detected only because they are too faint.This cannot be the case if another selection bias (based on e.g., colors or spectral features) were present, or if color and evolution corrections are not implemented properly.
In Fig. 15 we show a comparison of the assumed LF and the number density of galaxies per comoving volume present in GLADE+.In the case that the galaxy catalog can be correctly described as magnitude-limited, we expect that the   In each panel the solid lines are calculated based on the median apparent magnitude threshold, computed over all sky directions, then collecting and making a histogram of all galaxies in the GLADE+ catalog that lie in the appropriate redshift bin and are brighter than the apparent magnitude threshold.

Figure 1 .
Figure1.Distribution of the mass and luminosity distance parameters for the 42 BBH events with SNR > 11.The figure is generated using a Planck cosmology with H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065 with a D 2 L prior and a uniform prior on the detector frame masses ad stacking posterior samples for each event.This figure is only representative of the events reported in Tab. 4 and does not indicate the population reconstruction.Left: Distribution of the primary detector frame masses (blue solid line) and source frame masses (orange dashed line).Middle: Same but for the secondary source mass.Right: Distribution of the luminosity distance (bottom axis) and redshift (top axis).

Figure 3 .
Figure3.Top: Completeness fraction of GLADE+ in the Kband, indicating the probability that the catalog contains the host galaxy of a GW event, as a function of redshift for H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065.The different lines are calculated for a given K-band apparent magnitude m thr threshold indicated in the legend.The legend also indicates the fraction of the sky, computed by dividing the sky in equal sized pixels of 3.35 deg 2 , for which the apparent magnitude threshold is brighter than the one reported in the legend.The fraction of pixels with no galaxies is ∼ 5% for the Bj and K bands.Bottom: Same as the top panel but for the BJ -band.

Figure 4 .
Figure 4. Top panel : Marginal posterior distribution for H0.Middle panel : Marginal posterior distribution for Ωm.Bottom panel : Marginal posterior distribution for w0.In each panel the different lines indicate the 3 phenomenological mass models.The solid orange line identifies the preferred Power Law + Peak model.The pink shaded areas identify the 68% CI of the cosmological parameters inferred from measurements from the CMB (Ade et al. 2016) (apart for w0that is reported at 95% CI) and the green shaded area in the top panel shows the value of the Hubble constant measured in the local Universe(Riess et al. 2019).

Figure 5 .
Figure 5. Posterior probability density for H0 and the population parameters µg, mmax and γ, governing the position of the Gaussian peak, the upper end of the mass distribution and the merger rate evolution in the Power Law + Peak mass model.The solid and dashed black lines indicate the 50% and 90% CL contours.

Figure 6 .
Figure6.Posterior distributions for H0 obtained by combining the H0 posteriors from the 42 BBH detections and the H0 posterior inferred from the bright standard siren GW170817.The pink and green shaded areas identify the 68% CI constraints on H0 inferred from the CMB anisotropies(Ade et al. 2016) and in the local Universe from SH0ES(Riess et al. 2019) respectively.
Figure 8.Top panels: Distribution of galaxies observed in the GLADE+ K(left panels) and BJ (right panels) bands as a function of redshift z with and without galaxy luminosity weights, compared with the GW190814 redshift localization in its 90% CI sky area, assuming a cosmology with H0 = 67.9km s −1 Mpc −1 and Ωm = 0.3065 (green line) and the predicted redshift distribution for a prior that is uniform in comoving volume (blue dashed line).The redshift distribution is not intended as representative of the reconstructed galaxy distribution since it is calculated by stacking each galaxy redshift localization assumed as a normal distribution.This procedure only serves to give a rough idea where the H0 contribution is coming from.Bottom panels: Completeness calculated using the K and BJ band with and without application of the luminosity weighing scheme.

Figure 9 .Figure 10 .Figure 11 .
Figure9.Hubble constant posterior for several cases.Gray dotted line: posterior obtained using all dark standard sirens without any galaxy catalog information and fixing the BBH population model.Orange dashed line: posterior using all dark standard sirens with GLADE+ K-band galaxy catalog information and fixed population assumptions.Black solid line: posterior from GW170817 and its EM counterpart.Blue solid line: posterior combining dark standard sirens and GLADE+ K-band catalog information (orange dashed line) with GW170817 and its EM counterpart (black solid line).The pink and green shaded areas identify the 68% CI constraints on H0 inferred from the CMB anisotropies(Ade et al.  2016)  and in the local Universe from SH0ES(Riess et al. 2019) respectively.

π(m 1
|m min , m max , α) = P(m 1 |m min , m max , −α).(A10)• Power Law + Peak model: It describes the primary mass component as a superposition of a truncated PL, with slope −α between a minimum mass m min and a maximum mass m max , plus a Gaussian component with mean µ g and standard deviation σ g ,π(m 1 |m min , m max , α, λ g , µ g , σ g ) = [(1 − λ g )P(m 1 |m min , m max , −α) + λ g G(m 1 |µ g , σ g )].(A11) • Broken Power Law model: It describes the distribution of m 1 as a PL between a minimum mass m min and a maximum mass m max .The Broken Power Law model is characterized by two PL slopes α 1 and α 2 and by a breaking point between the two regimes at m break = b(m max − m min ), where b is a number ∈ [0, 1].The broken PL model is π(m 1 |m min , m max , α 1 , α 2 ) = P(m 1 |m min , m break , −α 1 ) + P(m break |m min , m break , −α 1 ) P(m break |m break , m max , −α 2 ) P(m 1 |b, m max , −α 2 ).(A12) B. FULL RESULTS FROM THE POPULATION ANALYSIS AND EFFECT OF DIFFERENT SNR CUTS

Figure 13 .
Figure 13.Corner plots for the preferred Power Law + Peak model parameters and cosmological parameters, fitted to BBHs with SNR > 11.

Figure 14 .
Figure14.Marginal posterior probability distributions for H0 with using BBH events with three different SNR cuts and the three mass models.

Figure 15 .
Figure15.Left: Comparison of the assumed BJ -band galaxy luminosity function (pink dashed line) and the differential number density of galaxies in different redshift bins (solid colored lines).The vertical dashed line indicates the median absolute magnitude threshold for galaxy detection, as computed in our code.Right: Same comparison, but for the K-band luminosity function.In each panel the solid lines are calculated based on the median apparent magnitude threshold, computed over all sky directions, then collecting and making a histogram of all galaxies in the GLADE+ catalog that lie in the appropriate redshift bin and are brighter than the apparent magnitude threshold.

Table 2 .
Logarithm of the Bayes factor comparing runs that adopt the same source mass model but different cosmologies: wide priors (for a general w0CDM cosmology) versus restricted priors (in the H0 tension region).ors on the w 0 CDM cosmological parameters.Consistent withAbbott et al. (2021c,d), we find that, even if we allow the cosmological parameters to vary with wide priors, the Truncated model is still strongly disfavored with respect to the Power Law + Peak and Broken Power Law models, by a factor ∼ 100.This result is consistent with the fact that, as indicated in Fig.1, the source mass distribution contains more structure than a simple Truncated model.As motivated inAbbott et al. (2021c), this comparatively poor fit for the Truncated model is due to the inability of this model to capture a moderate fraction of detected events with high masses, while predicting a large fraction of detected events with lower masses.Using the reduced set of signals with SNR > 11, we do not find any compelling evidence to prefer the Power Law + Peak model over the Broken Power Law model.

Table 3 .
Logarithm of the Bayes factor between the different mass models and the Power Law + Peak model preferred by the data, for the case of a w0CDM cosmology with wide priors.