Forecast cosmological constraints from the number counts of Gravitational Waves events

We present a forecast for the upcoming Einstein Telescope (ET) interferometer with two new methods to infer cosmological parameters. We consider the emission of Gravitational Waves (GWs) from compact binary coalescences, whose electromagnetic counterpart is missing, namely Dark Sirens events. Most of the methods used to infer cosmological information from GW observations rely on the availability of a redshift measurement, usually obtained with the help of external data, such as galaxy catalogues used to identify the most likely galaxy to host the emission of the observed GWs. Instead, our approach is based only on the GW survey itself and exploits the information on the distance of the GW rather than on its redshift. Since a large dataset spanning the whole distance interval is expected to fully represent the distribution, we applied our methods to the expected ET's far-reaching measuring capabilities. We simulate a dataset of observations with ET using the package darksirens, assuming an underlying ΛCDM cosmology, and including the possibility to choose between three possible Star Formation Rate density (SFR) models, also accounting for possible population III stars (PopIII). We test two independent statistical methods: one based on a likelihood approach on the theoretical expectation of observed events, and another applying the cut-and-count method, a simpler method to compare the observed number of events with the predicted counts. Both methods are consistent in their final results, and also show the potential to distinguish an incorrect SFR model from the data, but not the presence of a possible PopIII. Concerning the cosmological parameters, we find instead that ET observations by themselves would suffer from strong degeneracies, but have the potential to significantly contribute to parameter estimation if used in synergy with other surveys.


Introduction
Gravitational Waves Astronomy has recently emerged as one of the most fruitful predictions of General Relativity, opening new observational windows onto the Universe.Since the rigorous theoretical formulation in 1916, the first direct measurement was achieved only after a century of experimental effort, marking September 14 th 2015 as the birth of a new field of research in Physics [1].
Since this first detection, numerous other events have been observed, thanks to the LIGO, Virgo and KAGRA collaborations that provided the latest catalogue of observed GW events [2].The list of observed sources of GWs was however restricted to coalescences of combinations of Black Holes (BHs) and Neutron Stars (NSs), until very recently.On June 28 th 2023 the NANOGrav international collaboration announced the evidence for a Gravitational Wave Background (GWB), the long sought after mixed collection of low frequency GW source emissions [3].Even though the dominant contribution to the GWB has to probably be attributed to a population of supermassive black hole binaries, other cosmologically relevant sources may be hidden in the same GW emission frequency band, such as primordial GWs due to inflation, scalar-induced GWs and GWs from processes resulting from cosmological phase transitions [3].
Such new probes have recently become of paramount importance in cosmology, since increased precision brought to light the existence of tensions between measurement of cosmological parameters obtained from different observations (see e.g.[4,5]).The direct measurement of gravitational waves is exactly the new probe that cosmology needed: an abundance of previously concealed information, which could be used in conjunction with electromagnetic emissions to infer distance measurements on cosmological scales.However, observations of GWs do not provide a redshift value z, since what we observe are not electromagnetic waves, and this has led to the development of a multitude of methods to obtain such information.
In the rare case where we can measure the redshift of the binary through the GWs electromagnetic counterpart, we call these events Standard Sirens or Bright Sirens as opposed to Dark Sirens where z has to be found with alternative techniques.
Furthermore, Dark Sirens can be classified as a function of the method used to find z: Statistical Sirens, where the sky patch localized by GW events is scanned for galaxies or galaxy clusters whose redshifts are combined to obtain the most likely value of the true redshift [6][7][8][9]; Spectral Sirens, where the independence of the mass spectrum distribution from redshift can be exploited to break the mass-redshift degeneracy [10]; Love Sirens, where source frame masses of a neutron star binary system are obtained from the direct measurement of its tidal deformability, hence breaking again the mass-redshift degeneracy [11][12][13][14][15][16]; Gray Sirens, when a BH-NS system is doubly used as a Dark Siren and as a Bright Siren [17]; and finally we mention a term often used: Golden Sirens, i.e. well localized single event Dark Sirens such that the resulting Hubble constant estimate can resolve the Hubble tension (i.e. with sub-percent precision) [18].
In this work we focus instead on the possibility of obtaining information from GW surveys without the need to find the events' redshift.This necessarily requires one to work in a statistical framework and therefore it requires extremely high numbers of events to obtain cosmological information, but such a disadvantage is compensated by the fact that one can exploit all observations obtained by GW surveys, without being limited to a subset where the redshift information can be obtained.
One can obtain theoretical predictions for the number of observed events and their distribution in luminosity distance, a quantity that can be obtained from observations, which can allow one to extract cosmological information from GW surveys without the need to depend only on rare events or external data [19,20].
In this paper, we rely on the counts of GW events as our main source of information, building upon previous works where this observable was used to distinguish two different BH populations, namely astrophysical and primordial [21].Similar approaches have been taken before [19,20], but we account differently for cosmological degeneracies, without including priors on the matter content of the Universe Ω m , and we also apply the cut-and-count method [21], initially introduced for primordial black holes, but applied here for cosmological inference.Furthermore, we also explore the impact of astrophysical assumptions on the results obtained with these methods.
In order to pursue this approach, it is crucial for a high number of events to be available, as the method requires observations to trace the overall distribution of events.For such a reason, we decided not to work with currently available data, but rather to investigate next generation surveys, which will provide a significant increase in event statistics, focusing in particular on the planned European Space Agency Einstein Telescope (ET) [22], for which the projected number of detectable events will satisfy our dataset size requirements.
The paper is organized as follows.In section 2 we briefly review the main equations connecting the observable quantities of GW events to cosmology, and we provide details on the survey chosen to test our approach.We focus on the modelling of GW events and their distribution in section 3, while section 4 describes the method used to simulate data and the broad details of the analysis approach.We provide more details on the latter and the two specific methods used to obtain results in section 5, where we also show the expected outcome of the methods for the survey we chose.We draw our conclusions in section 6.

Gravitational Waves from coalescing binaries
While different mechanisms exist to produce the GWs we observe, in this work we restrict our attention to the coalescence of compact objects, i.e. objects in binary systems that emit GWs while spiralling toward each other.The GW emission of these systems at large distances r from the source can be approximated as spherical gravitational waves, hence we expect the usual 1/r decay.Moreover, it can be shown that the propagation through a cosmological background metric implies a decay in the amplitude as 1/d L , where d L is the luminosity distance of the compact binary [23].
Observing the waveform of an incoming GW, it is therefore possible to obtain a measurement of the luminosity distance between the observer and the progenitor system.We briefly review here the theoretical modelling of this relation, following the approach of [24][25][26].
In the transverse traceless gauge the time domain waveform depends on the antenna patterns of the detector and on the +, × polarizations: where θ and ϕ determine the position of the GW source on the celestial sphere, while ψ is the polarization angle, F +,× are the detector antenna patterns and h +,× are the two polarizations of the wave, arising from the independent components of the metric tensor perturbation; we also include the time dependence only in these latter terms since we are considering transient sources of emission, thus neglecting the effect of modulation in the angles due to the relative motion of the detector and the source.The two polarizations for a general inspiralling binary have the expression: with ι the inclination angle with respect to the line of sight of the binary's orbital plane, and ω other orbital parameters, included in two orbital parameter functions H +,× .As stated above, we expect the amplitude of the wave to scale with the source distance; such an effect is encoded in h 0 , which can be written as where G is the Newton constant, ν GW is the time-dependent frequency of the wave, and M is the chirp mass, which depends on the mass of the two binaries as with M = m 1 + m 2 , the total mass of the system and η the mass ratio defined as (2.5) Notice that both the GW frequency ν GW and the total mass M are redshifted from their source value.When needed one can go to the Fourier domain from the time domain waveform which can be rewritten, using the stationary phase approximation, as [24][25][26] where t 0 is a constant describing the epoch of the merger, for simplicity set to zero in our analysis, Ψ(f ; ψ 0 , ψ i , M ) is a phase function, depending on the phase at the epoch of the merger ψ 0 , the post-Newtonian expansion coefficients ψ i and the total mass, and we also have an additional function of the angles φ (2,0) (ι, F +,× ) [27].
In the following, we will use the publicly available package PyCBC131 to obtain the waveform h(f ) computation for a given event, given a set of parameters for the progenitor system, such as the binary masses or its orbital inclination.Moreover, we will be using the IMRPhenomD frequency domain waveform model [28,29].
In this work, our main goal is to extract cosmological information from GW observations.For such a reason, our main interest lies in the amplitude term A of Equation 2.7, where the distance of the source enters.This term can be written as [24][25][26] We can see in Equation 2.8 the inverse dependence of the amplitude on the distance, but we can also notice how this amplitude depends on parameters that can vary from system to system, such as the inclination ι, and the position of the source on the sky which enters in the antenna patterns F + and F × .In the following we will not consider this second class of parameters, as all the cosmological dependence lies in d L , and we generate them randomly for each system (see section 4).However, these parameters can have a significant impact on the uncertainty with which the distance can be measured; while a full treatment of their impact is left to a future work, we discuss their effect and our approximation to take them into account in Appendix A.

The Einstein Telescope
The Einstein Telescope interferometer (ET), will be part of the third generation of groundbased detectors and it is planned to start operating in the 2030s [22].The reference design of ET will have its arm length extended from the current Virgo 3 km and LIGO 4 km up to 10 km or even 15 km.The entire structure could be placed a few hundred meters underground and have a triangular shape for the three nested interferometers it will contain.Specifically, the detector could be composed of two different instruments, one optimized for low frequencies (ET-LF) with a low power laser and at cryogenic temperatures of 20 K; the other one for high frequencies (ET-HF) with high power and at room temperatures; thus each configuration is operated at its lowest noise condition in each frequency band.After several initial sensitivity curve models (ET-B and ET-C), the last one reached is the ET-D, which not only takes into account ET-LF and ET-HF, but also other refinements on noise models [22]; Figure 1 shows the comparison between the noise level of current detectors (advanced Virgo and advanced LIGO) and the expected sensitivity of the ET-D design.In the mock data generation performed in this work, we will consider the ET-D sensitivity curves to extract simulated observed events.
There is also another configuration under evaluation, where two classical L shaped (2L) detectors of an extended arm length of 15 km could be placed in two different locations.The current candidates for this alternative project would be the Sardinia site of Sos Enattos; a site in the Meuse-Rhine region, or possibly in Kamenz in German Lusatia region [30].Moreover, concerning our cosmological interest in luminosity distances, it has been shown that a 2L configuration could outperform the triangle configuration with an increased number of d L measurements with 1% error, with respect to what the reference design can observe [30].Sticking to the reference design, we can understand the potential of third generation interferometers if we consider the maximum redshift that can be reached by the survey as a function of the total coalescing mass of a binary system: total masses of 20 − 100M ⊙ , observed for black hole-black hole (BH-BH) or black hole-neutron star (BH-NS) binaries, will be detected by ET up to z ∼ 20 and higher, even probing the possible presence of primordial origin black hole mergers [22].The expected detection rates will be of orders 10 5 − 10 6 for BH-BH and 7 × 10 4 for NS-NS per year; moreover, the electromagnetic counterpart of NS-NS could be observed with 10 2 − 10 3 events per year.With respect to the current detectors, ET will be also able to observe GW from other sources rather than just compact object binaries: the stochastic gravitational wave background, gravitational emission from isolated pulsars and from supernovae events [30].

Modelling the distribution of events
The main goal of this work is to exploit GW observations to obtain information on the cosmological model.This is straightforward when a GW event is observed with an electromagnetic counterpart, and therefore its redshift can be measured, as one could fit a theoretical relation between luminosity distance d L and redshift to the observed quantities (see e.g.[31][32][33]).If we focus on Dark Sirens, instead only the measurement of d L is available, and the only observable we can exploit is the distribution of observed events in luminosity distance.While this will make extracting cosmological information more complicated, we expect the great majority of the events observed to fall in this category; developing methods that can work without any redshift measurement is therefore crucial to fully exploit the GW catalogues that will be available.
In this section we review the essential details of a widespread modelling of the GW event distribution [34,35], focusing our attention on Astrophysical Black Holes (ABH), showing how this can be connected to cosmology.
It is important to stress that in this work we neglect the impact of other possible progenitors, such as Neutron Stars (NS) or Primordial Black holes (PBH).These do not follow the same theoretical distribution that we will obtain for ABH and therefore the presence of such events in an observational catalogue could affect the results.One expects however that events originating from ABH will be the dominant population [36]; for such a reason we assume in this work that ABH is the only population present, and we leave the study of the impact of NS and PBH for a future investigation.

Distribution of astrophysical black hole binaries in luminosity distance
When dealing with a GW survey, what is obtained from observations is a catalogue of events, each with their observed features, which include the event position in the sky, mass of progenitors and luminosity distance (see section 2).
In order to extract cosmological information from such events, we need to compare what we observe with a theoretical prediction from a model that depends on cosmological parameters.A first feature of GW observations that we can connect to the cosmological model is the rate of events in time, as this will depend, other than on the astrophysical phenomena responsible of the creation of GW progenitors, on the rate of expansion of the Universe.
Indeed, what the predictions from cosmological models give us is the merger rate density R(z), which is the number of merger events per unit of comoving volume V c , per unit of proper time of the source t s , as a function of redshift z: To consider a real detection we need to change the used reference frame, moving from the source to the detector rest frame and its proper time t d .In addition to this, we also want to obtain the probability of an event in units of redshift, a quantity that is (in principle) observable, rather than in terms of the comoving volume.Therefore, we define the merger rate R(z), used throughout this work, as the number of merger events per unit proper time of the detector and per unit redshift, which can be obtained from the merger rate density through a change of variables where we used the relation We can see how the cosmological model enters in this transformation, as the relation of the comoving volume with the redshift leads to where d L (z| ⃗ θ ) is the luminosity distance and H(z| ⃗ θ ) the Hubble parameter, and we explicitly show that these involve a cosmological model with their dependence on a set of cosmological parameters ⃗ θ.
We finally have the expression for the merger rate function: With the merger rate R(z) we can compute the total number of merger events measured during an observation time T obs from an ideal detector: where events are observed only in a finite interval of redshift z With the total number of events Ntot ( ⃗ θ ) and the number of events at redshift z, we can obtain the probability distribution of the mergers, and therefore of the GW events to take place, as Notice that, while this is the theoretical distribution for events to occur, this does not coincide with the distribution of observed events.As we will detail in the following, several of these events will be too far away for surveys to detect them efficiently and therefore they will not appear in the survey catalogues.Then, in the case of a real detection the quantities can be weighted by a detection function f det (z), which represents the fraction of observable events for a given redshift z and it depends on the detector considered.Therefore, we define the detected merger rate: Furthermore, as we discussed in section 2, unless an electromagnetic counterpart is observed, GW surveys do not contain information on the redshift of the events.We can however transform the theoretical redshift probability into a probability in the space of luminosity distance, where z(d L | ⃗ θ ) is the inverse function of the luminosity distance, which is defined as

Merger rate density and the Star Formation Rate
With Equation 3.8, we have shown how we can obtain the theoretical distribution of events in the luminosity distance space, assuming a cosmological model that provides d L (z| ⃗ θ ) and H(z| ⃗ θ ).Such a distribution depends, through Equation 3.4, on the merger rate density of the progenitors of the GW events.This is necessarily related to the evolution of the ABH through the history of the Universe, since these are the seeds of the GW events we are investigating.
ABH can form at the latest stage of star evolution from the collapse of massive stars after their nuclear burning phase ends, while the formation of ABH binaries can happen through two channels: a pre-existing binary system of stars evolves into a system of two ABHs, or two separate isolated ABHs form a binary system in a later phase.For both cases the expression for the merger rate density can be written as a function of time t and black hole mass m BH [37,38] where N is a normalizing factor, P (∆t d ) is the distribution function of the time delay between ABH formation and merger (∆t d ), and R birth (t, m BH ) represents the birthrate of the ABH as a function of time and black hole mass.The latter can be expressed as here τ (m) is the lifetime of a star of mass m, ϕ(m) is the Initial Mass Function (IMF) [37], ψ SFR is the Star Formation Rate density (SFR), δ(m) the Dirac delta function, and g −1 BH is the inverse of a function that, for each Zero Age Main Sequence star mass value (the mass of a star at the start of the hydrogen burning phase), gives its corresponding black hole mass m BH .
One can notice that in order to obtain the merger rate, it is required to know the properties of the systems, like the progenitor masses and the delay between the binary formation and the merger, with the latter depending on the details of the systems through their orbital parameters.
Here, we follow the approach of [37], where a distribution P (∆t d ) ∝ 1/∆t d is considered, and the integral of Equation 3.10 is taken with ∆t min = 50 Myr and ∆t max = H −1 0 , an approach motivated by numerical simulation of binary BH formation [39].Nevertheless, we also follow the approach of [21], and account for the uncertainties in the modelling of the merger rate through the normalization factor N , choosing this by imposing that the predicted merger rate density is compatible with the observations reported in the Gravitational-Wave Transient Catalogs (GWTC) [36,40].
Following [21], we change variable from time to redshift, and we assume a monochromatic mass distribution for ABH, i.e. we assign the same mass to all ABH that will form the events we observe.This is a simplifying assumption because we expect from observations to have a distribution of masses for the components of the binary systems that produce the merger events.In this paper we chose the value for our fiducial ABH to be the peak of the observed current data, i.e. m ABH = 7M ⊙ .We expect that the effect of a distribution of masses and mass ratios would affect the SNR with respect to the monochromatic case, with this depending on the specific features of the system.We will consider our results to be the first approximation to this mass problem and discuss this systematic in a future work.
Furthermore, we assume that the birth rate is direcly proportional to the SFR [37] R birth (z) ∝ ψ SFR (z) . (3.12) The SFR can be measured through galaxy survey observations, which provide the rate of star formation at different redshifts, with the most distant ones being the most uncertain and where results can significantly depend on the type of tracer used to reconstruct the SFR [41].
For such a reason, we decide to work with a parametric approach, expressing the SFR as [42]  where the parameters a, b, ν, z m , controlling the shape of the distribution, are obtained fitting the observational data, and therefore depend on the high redshift tracers used.
In the following, we will investigate the impact of the uncertainty on the SFR on the final cosmological results, choosing different possible measurements of the parameters.In Table 1, we show the values considered for the different SFR cases, which we label as • Baseline: a SFR fit obtained using high redshift observations of galaxies in the redshift range z ∈ [8,10], corresponding to the Fiducial model of [43]; • GRB: a SFR fit obtained using the Gamma-ray burst rate as a high redshift tracer [41]; • Madau-Dickinson (MD): another common SFR model, which uses a slightly different functional form, with [44] ψ MD SFR = ν where we kept the same parameter names for the sake of simplicity.
In addition to this, we leave ourselves the possibility to include, in each of the SFR models considered, a high redshift contribution coming from population III stars (PopIII) [41].This contribution is modelled following Equation 3.13, with the parameter values reported in Table 1, and it is added to the SFR models as where i indicates the type of SFR considered.
The different redshift evolution of these three models are shown, both with and without the PopIII contribution, in Figure 2. One can notice how the main differences arise from the behaviour at high redshift, and we can therefore expect that the choice of SFR will affect the results achievable with GW surveys probing the high redshift regime.

Impact of cosmology
As we want to extract the cosmological information from the distribution of GWs in luminosity distance, we rely on the cosmological dependence of the merger rate R(z| ⃗ θ ), described by Equation 3.4.Such a dependence will propagate to the simulated dataset through the probability distribution of a "true" event P (z| ⃗ θ ), which is connected to our theoretical model as seen in Equation 3.6; hence, in this section we want to analyze the emerging cosmological dependence in our theoretical model T (d L | ⃗ θ ).To investigate this, we fix our baseline fiducial cosmology (by setting the values of Ω m and H 0 ) and settle on the Baseline SFR, with the inclusion of PopIII stars, as our default astrophysical setting.Using Equation 3.8, we can convert the prediction on the redshift distribution of events to a distribution in distance T (d L | ⃗ θ ), which is the one we are able to reconstruct from data, shown in Figure 3.We can thus investigate how changing the cosmological parameters affects this distribution.In Figure 4, we show in the left panel the effect of changing H 0 , while in the right panel we vary Ω m .In the fiducial distribution (Figure 3) we can observe the effect of PopIII only at high luminosity distances, where the distribution shows a small "bump" due to the presence of possible progenitors belonging to this stellar population.We can also notice that only at distances ≳ 150 Gpc we expect a number of events greater by an order of magnitude if PopIII stars are present; however, this happens quite far away from the peak of the distribution, where the probability of mergers is already significantly reduced, and therefore we do not expect a strong effect.For this reason, we decided to neglect PopIII when characterizing the impact of cosmological parameters in the rest of this section.
Concerning the relative differences shown in Figure 4, we notice that there is generally a higher variation with respect to Ω m than with H 0 .
This cosmological dependence in the theoretical model, necessarily propagates to the total number of observed events, which is obtained from Equation 3.5 as an integral of the distribution, including also the detection weighting of Equation 3.7 due to a real detection.In order to investigate the cosmology dependence of this total number, we generate simulated datasets (see section 4 for more details on the steps taken for this purpose) varying the values of the cosmological parameters to observe how the total number of events changes.For each of these datasets we obtain the theoretical expectation for the total number of events N tot , and we extract the observed total number from a Poisson distribution with mean value N tot ; thus, we obtain a set of values N tot i (H 0 , Ω m ).We then fit the trend of this observed number by varying alternately Ω m or H 0 and fixing the other, obtaining which can be rewritten as power law relations (3.17) In Figure 5 we compare the collection of datasets with the fitted profile of Equation 3.17 with their relative error.We notice that for both parameters the total number of events decreases with a higher value of the cosmological parameters.This is an expected result, as most of the cosmology dependence enters in the volume term of Equation 3.3.However, this is not the only cosmological dependence, as both the detection weighting (Equation 3.7), related to the observational uncertainties on the observed distances, as well as the astrophysical contribution of the merger rate density (subsection 3.2), will also change with cosmology.This leads to the deviation from the H −3 0 dependence, that one would expect from the volume term, that we observe in Equation 3.17.
Overall, we find that increasing either H 0 or Ω m decreases the number of expected events; this necessarily imply that one could obtain the same prediction with very different values of the parameters, provided that these are changed accordingly.These parameters are therefore degenerate with each other and an increase in H 0 could be compensated by a decrease in Ω m .
As we will see in section 5 this has a significant impact on the cosmological constraints one can extract from these observations.3.17.In the upper row we have the total number of simulated observed events (without PopIII) and their respective fits.In the lower row we have the relative error between simulated observations and fits δN/N .

Forecast dataset and analysis method
In the previous section we have provided the details of the theoretical modelling for the distribution of ABH merging events in luminosity distance.In order to obtain information on the cosmological parameters from which such predictions depend, we want to compare our model with observations.As pointed out in section 2, current GW observations are able to probe a very limited redshift range, and therefore cannot be used for our purpose, in particular if we want to distinguish between the high redshift behaviour of the different SFR models we introduced in subsection 3.2.For such a reason, we decide to focus our attention on the expected observations that will be performed by the Einstein Telescope, which we described in subsection 2.1.
In this section we first provide details on how we obtain a simulated catalogue for ET, and then we describe the methodology we use to compare our theoretical prediction with the simulated data to obtain cosmological constraints.

Einstein Telescope's luminosity distance mock catalogue
In order to generate a simulated catalogue of GW observations, we use the Python package darksirens2,3 [21], which generates a catalogue of luminosity distances from merger events, given a set of observational specifications, astrophysical and cosmological parameters.This Python code was initially intended to study the fraction of Dark Matter due to PBHs, but it can be easily used without them.
The darksirens package requires us to assume a set of fiducial parameters to generate the simulated dataset, parameters that are divided into four categories: • Cosmology: the code assumes a spatially flat isotropic and homogeneous Universe with a ΛCDM cosmological background.The only cosmological parameters needed are therefore the Hubble constant H 0 and the total matter density parameter Ω m .Our dataset is generated assuming Ω m = 0.32 and H 0 = 67 km s −1 Mpc −1 ; • Primordial Black Holes: the darksirens package includes the fraction of Dark Matter made up of primordial black holes f PBH as an input parameter.As we are here assuming that all the observed events come from ABH, we set f PBH = 10 −9 , effectively neglecting the contribution of PBH; • Astrophysical Black Holes: here one should enter the parameters specifying the SFR model, together with the mass of the ABH progenitors.For the latter we follow [21] and assume that all events originate from ABH with a monochromatic mass distribution, setting m ABH = 7 M ⊙ .Moreover, we produce our simulated dataset assuming the Baseline+PopIII SFR and using the corresponding parameters reported in Table 1.
While the public version of darksirens supports both the Baseline and GRB models, we modified the code to have the possibility to include the MD model and to also include the PopIII contribution; • Specifications: in order to produce a realistic dataset, darksirens requires the user to provide the observation time T obs , the limiting redshift to use to generate events and the threshold signal-to-noise ratio (SNR) below which an event is considered too noisy to be included in the catalogue and it is therefore discarded.We report the chosen value for the specifications in Table 2. Notice that darksirens also allows to account for the effect of gravitational lensing on the observed distances.With respect to the investigation of [21] where the interest lied in very high redshift events, we expect a less significant impact of this effect.Neverthelss, such an effect could still be important for events at the edge of our redshift distribution.Despite this, we neglect this effect as the data simulation code only allows to take it into account for a fixed cosmology, and we will study its impact in detail in a future work.
With all these parameters specified, we can generate the simulated dataset for ET.The process used to generate the mock is detailed in [21], and here we report the different steps only schematically: Specifications T obs [yrs] SNR min lensing z min z max 1 8 no 0.001 20 1. the total number of mergers taking place in the limiting redshift (N tot ) is computed using Equation 3.5; 2. for each of these events, a redshift z i is extracted from Equation 3.6 and associated to the event; 3. using the fiducial cosmological parameters, to each redshift z i a "true" luminosity distance Di is associated; 4. for each event, the SNR ρ i is evaluated, also accounting for the event position in the sky and the orbital inclination, quantities that are randomly drawn.All events for which ρ i < SNR min are removed from the catalogue, while for all others an observational error σ i = 2 Di /ρ i is computed (for more details on the computation of this uncertainty see Appendix A); 5. an observational scatter ∆D i is drawn randomly from a Gaussian distribution N (0, σ i ).
The measured luminosity distance of the i-th event is then defined as After this process, a mock catalogue of measured luminosity distances with their uncertainties D = (D i , σ i ) i=1,...,N det is created, where N det ≤ N tot is the number of events after the SNR cut.It is important to notice that in our work we focused on the observation of Dark Sirens; this implies that, while the redshift information is computed by our simulation code, this will not be available through real observations.For such a reason, we keep in our dataset only the observed distances D i and their error σ i , assuming this is the only information we can extract from a future ET catalogue of merger events.

Analysis method
The main focus of this paper is to quantify the constraining power of a GW survey with ET specification on cosmological parameters.Practically, this requires us to estimate the probability distribution of parameters given the data P ( ⃗ θ |D).Exploiting Bayes' theorem, this can be related to the likelihood of the data L(D| ⃗ θ ) as where π( ⃗ θ ) is the prior probability of the free parameters of the analysis.If one is able to compute L(D| ⃗ θ ) given a set of parameters, it is therefore possible to estimate the posterior by sampling the parameter space and exploiting Equation 4.1.To perform this sampling we rely on a Monte Carlo Markov Chain (MCMC) approach, making use of the Metropolis-Hastings algorithm implemented in the public code Cobaya4 [45].
We create external modules for this code that implement the likelihood calculation we describe in section 5, and we keep as free parameters the Hubble constant H 0 and the matter density Ω m , using flat priors with H 0 ∈ [60, 80] km s −1 Mpc −1 when fixing Ω m , or H 0 ∈ [50, 100] km s −1 Mpc −1 , Ω m ∈ [0.05, 0.8], when both are free.
This approach, would in principle allow us to avoid specifying a SFR model and include the parameters for this (i.e.ν, z m , a and b) as free parameters of our analysis, thus attempting to obtain information on the SFR itself from the distribution of events.However, such an analysis would be hindered by the strong degeneracies between the SFR and cosmological parameters, all entering Equation 3.10, and it would require the addition of external data able to provide information on the SFR parameters.This is beyond the scope of this work, and therefore we will keep the SFR parameters as fixed within our analyses, effectively assuming that such values are measured from an external survey [20].
We will however investigate the impact of assuming the wrong SFR model: as our dataset is simulated assuming a Baseline + PopIII SFR model, we will obtain constraints on H 0 and Ω m analysing the data with different choices of SFR to obtain the theoretical predictions.This will allow us to quantify how the constraints on cosmological parameters are affected when an incorrect modelling of the astrophysical phenomena is done.

Constraints from number counts
Now that we have a simulated dataset, mimicking what the ET will be able to provide, we want to assess how cosmological parameters will be constrained by it.For such a purpose, we use the approach detailed in subsection 4.2 for which we need to define a likelihood function.
In section 3, we saw how cosmological parameters enter in the calculation of the merger rate R(z), both through the volume term and the merger rate density, and how this can be used to compute the distribution of events in luminosity distance, as well as their total number.
As a first step in obtaining our constraints, we only focus on the total number of observed events, i.e. we want to obtain cosmological information using as the only observable the number of events that a survey like ET will measure, neglecting any other information provided by the survey, and we leave the investigation of a distance distribution based likelihood to a future work.

Number counts likelihood
We want to compare the number of events in the simulated dataset, after those below the SNR min are removed, with the number obtained by theoretical calculations when cosmological parameters are changed.This means obtaining the expression of the likelihood L(D| ⃗ θ ) to use in Equation 4.1, i.e. the probability of obtaining N obs (D) observed events given a theoretical prediction N tot ( ⃗ θ ), which depends on the model parameters through Equation 3.5.As shown in [21], when considering only the total number of events, we can work with a Poissonian distribution and write the likelihood as Notice that in our definition of the likelihood we do not directly use the theoretical prediction N tot ( ⃗ θ ), but rather the expected number of observed events N exp ( ⃗ θ ).This subtle distinction is crucial, as in the simulated dataset the low SNR events are removed and this needs to be accounted for in the theoretical prediction in order to not bias the results.As discussed in section 3, one could define a merger rate of detected events R det (z) and obtain all theoretical quantities from this.However, modelling the f det (z) term entering Equation 3.7 can be complicated, as the SNR of each GW observation depends on several features of the progenitor system.
For such a reason, we decide to obtain N exp ( ⃗ θ ) by generating a full mock dataset at each point of the parameter space, thus associating with each event an SNR that we can use to remove all events that are too faint from the theoretical prediction.In order to account for the dispersion in the generation of the mock dataset, we decide to average the number of observed counts over ten realizations, and to take this average as our theoretical prediction N exp ( ⃗ θ ).
We include this calculation in the analysis method described in subsection 4.2 and use it in the next section to obtain constraints on H 0 , both when it is considered as the only free parameter of the analysis and when it is varied alongside Ω m .

Cosmological and astrophysical constraints
As a first result, we obtain the constraints on H 0 when all other parameters are fixed.While this could be interpreted as a choice of astrophysical modelling for what concerns the SFR parameters, fixing Ω m implies that we are assuming it as known from some previous cosmological survey, and that constraints from GW observations are too loose to affect such a prior information.
We therefore obtain MCMC chains fitting our dataset (obtained with Baseline+PopIII) using the number count likelihood and varying H 0 only.Other than for the same SFR case used to obtain the dataset, we repeat the analysis also changing the astrophysical assumptions, in order to quantify the impact of such a choice on the final results.
We show in Table 3 and Figure 6 the results obtained in the different cases.
In the case where the analysis is performed using the same SFR as used for the dataset, we recover the value of H 0 used to generate the dataset, as expected, and we find a 68% confidence level bound for this parameter of H 0 = 67.01 ± 0.17 km s −1 Mpc −1 .This bound is extremely tight and would be competitive with constraints obtained from the Cosmic Microwave Background (CMB) or Supernovae Ia (SNeIa).However, as we discussed in subsection 3.3, we expect a significant degeneracy with Ω m and therefore a much looser constraint when this parameter is also free to vary.
When removing the PopIII contribution in the theoretical predictions and using these to perform the analysis, we notice no significant changes in the results, either in the recovered mean value nor in the strength of the constraint (see the Baseline entry of Table 3).This highlights how whether or not this effect is included does not significantly change the number of observations, a reasonable conclusion as PopIII only contributes at large redshift where the merger rate is already significantly decreasing (see Figure 2).
On the contrary, changing the main astrophysical assumptions and therefore switching from the Baseline to the GRB and MD star formation rates, we find no change in the error obtained on H 0 , but a significant shift on its mean value; the results obtained assuming GRB+PopIII show a ≈ 6σ deviation of the recovered H 0 from its fiducial value, while for MD such a shift increases to ≈ 15σ.This results shows the relevance of the astrophysical assumptions in this analysis, as a wrong modelling of the SFR can lead to significantly biased results.
We also repeat our analysis of GRB and MD cases removing the contribution of PopIII, again finding no significant changes.
The extremely large shifts obtained when changing the SFR seem to imply that this analysis could potentially distinguish between the models used.Once again, however, this result has to be considered carefully, as opening the other parameters of the analysis might significantly reduce the constraining power, and therefore hinder the possible study of SFR models.Given the possible relevance of degeneracies, we now study the two parameter case, where ⃗ θ ≡ (H 0 , Ω m ).We show in Figure 7 the 68% and 95% confidence level contours in the H 0 − Ω m plane, with the different contours referring to a change in the SFR assumption made in obtaining the theoretical predictions.
There is an important difference with respect to the one dimensional case; the degeneracy between Ω m and H 0 implies that a change in one of the parameters can be compensated by a change in the other and this results in a complete loss of constraining power on the two parameters.As we discussed in subsection 3.3, only a combination of these two parameters can be constrained with the number count likelihood, but not the two separately.In or-der to break such a degeneracy, one needs to either include information from other surveys, e.g.adding a prior on Ω m , or, possibly, exploit additional information coming from GW observations, such as the distribution in distance of the events.
We also notice that the SFR model assumption is still relevant and an overall shift of the contours similar to the one dimensional case is still present, although it is less statistically significant.Indeed, GRB and MD predict more events at high redshift than the Baseline model, however more of these events will be cut from the dataset due to their low SNR, hence the total number of events for GRB and MD is lower; to account for this effect the inferred cosmological parameters should be lower, since the total number of events is inversely proportional to H 0 and Ω m .
Concerning the contribution of PopIII, we also find in this case that no conclusion can be drawn concerning the presence of this population, similarly to the one parameter case.For such a reason, in Figure 7 we do not report results without the contribution of PopIII.

Testing the SFR with the cut-and-count method
As we saw in the previous section, when both cosmological parameters are free to vary, the information brought by the total count of events is not enough to obtain constraints.Nevertheless, we have seen how we could still obtain some hints on the differences between the considered SFR.
In this section, we try to obtain further information on the SFR by exploiting a different method, i.e. the cut-and-count method [21].Indeed, we can notice from Figure 2 that different SFR models have similar but not equal behaviours depending on redshift, especially at z > 10 where we have the biggest differences, especially when PopIII is considered.Therefore, if we were to divide the data in two bins (left and right of a given threshold), as it is done in the cut-and-count method, we could attempt to detect a difference in the number of counts in the right bin, depending on the SFR model, as a direct effect of the shape of these distributions; hence this method could in principle distinguish different SFRs and identify the correct one from the data.
Considering the observed luminosity distance dataset D, we choose a value D * to cut the data into two subsets and count the number of events above the cut N > (D, D * | ⃗ θ ).Such a number will be affected by the parameters of the problem, i.e. cosmological and astrophysical parameters.
In accordance to the previous section we test the effect of the Hubble parameter and the assumption on the underlying SFR model, hence we construct datasets D SFR H 0 and compute the quantities N > (D SFR H 0 , D * ).We obtain the error associated with the number above the cut for the observed dataset σ > (D, D * ) and for the test datasets σ > (D SFR H 0 , D * ), consisting of a Poissonian term, due to the discrete occurrences of the events in the distance interval observed, and a binomial term, due to a distance's true value being above or below the cut based on its measured error [21].
We quantify the discrepancy between the dataset and possible test datasets with the statistical shift: i.e. the distance between the test dataset, obtained with a specific SFR choice, and the observed one, in units of the error.As a first step, we want to compare the results obtained using this method with those found in subsection 5.2.Thus, we first take our threshold distance to be D * = 0, because this is the case when all the events are considered.As before, we vary H 0 in the interval [60, 80] km s −1 Mpc −1 and use the same mock dataset.We compute the statistical shift as a function of H 0 for all the SFR models considered.We expect the shift to yield the same results as above, with the minimum of the function lying in the same H 0 values of Table 3.
Indeed, we can notice from Figure 8 that the minimum of each curve, being the best estimation for each SFR model, has the same bias found with the previous method in Figure 6 and we still cannot distinguish the effect of PopIII stars.9 for different SFR models for the cut-and-count method.
We further verified the validity of this method obtaining a qualitative estimate of H 0 , by fitting around the minimum of each significance curve (see Figure 9), then shifted the whole fit by the minimum's value and took 1σ errors, i.e.where the shifted fit equals 1.The results are presented in Table 4 and they agree with our initial results shown in Table 3.We now apply the cut-and-count method to investigate if this approach could be used to distinguish between the SFR models.In this case we assume the cosmological parameters as fixed to the fiducial cosmology.
Since the statistical shift is a measurement of the deviation of the number of events, having a different choice of SFR could yield different counts above the chosen threshold; this is due to the different behaviour of the SFR functions at high redshifts (see Figure 2).
When choosing the cut at D * = 0 we cannot identify the differences in the shape of the SFR functions, only on the overall effect is has on the total counts.However, if we vary the cut across the redshift interval of observations z * (directly linked to the luminosity distance interval, because we are fixing the cosmological parameters, thus z * and D * are interchangeable) we can better highlight the difference in the SFR functions, and thus also notice where two SFR models could produce the same results.
We computed the statistical shift at fixed cosmology as a function of the redshift cut for all the SFR models.We can see the results in Figure 10 from which we can understand the following: • when the SFR model considered for the analysis coincides with that of data, S takes values of at most 1σ, pointing out that we recover the correct model.However, with this method we are still not able to distinguish the presence of PopIII stars; • if we consider the value of S in z * = 0 for different SFR models, we recover the same bias as shown in Figure 6, but this bias gets even larger when varying z * , reaching its maximum at the first peak in S(z * ).The value of z * where we reach this first maximum is therefore the ideal cut to perform on the data if one is interested in observing their high redshift behaviour and studying the impact of the assumptions made on the SFR.We also notice from Figure 10 how this z * is roughly the same for both the alternative SFR functions studied here (z * ≈ 2), possibly showing that such a choice is the most suited to study this effect; • At the respective first minimums of S(z * ) for GRB and MD in Figure 10, we expect a degeneracy between our fiducial model and the others, meaning that for a certain cut, indicated as z * , we have the same number of events after it, thus they should also predict the same estimate for H 0 and minimizing the bias.Such a value of z * will coincide with the cross-over point of the different SFR functions, i.e. the redshift at which the alternative model considered starts predicting a higher probability to see merger events with respect to the Baseline, as it can be seen in Figure 2.
For higher cuts than z * , such a degeneracy will be broken and the statistical shift increases once again, until the redshift becomes high enough that the low number of expected events becomes so low that the significance of the shift vanishes.This analysis shows that, if cosmology is assumed to be given by some external data, one could implement the cut-and-count method; changing the SFR and applying different z * cuts would allow to understand whether the SFR requires a shift in cosmological parameters in order to match the observed counts (i.e. when we find a maximum in the S function), or rather it is able to reproduce observations (low and flat S).
We want to stress that, in the context of a real GW survey, e.g. with ET, this method could be relevant in extracting information on the astrophysical model simply by cutting the data above a specific threshold, as prescribed by our analysis.Then, counting the number of events above this threshold would allow to find early results faster than a parameter space sampling method.

Summary and outlook
With the continuous improvement of GW observations and of the quality of the catalogue provided by observational surveys, it is becoming timely to investigate how this new window on the Universe can be used to extract cosmological information.This is straightforward in the presence of an electromagnetic counterpart to the observed GW or when other means to measure its redshift are available.However, the vast majority of GW events we will observe do not provide this information.
Given the extremely high number of events we expect to observe with next generation surveys, it is crucial to find analysis methods able to extract cosmological information even from Dark Sirens, i.e. events for which no measurement of their redshift is available.
In this paper we focused on this class of events, assuming that they are all generated from mergers of ABHs.This assumption allows us to model the merger rate as seen in subsection 3.1 and subsection 3.2, depending on their unknown redshift, from which we can obtain the theoretical luminosity distance distribution T (d L ).Notice that ABH are expected to be the dominant population of GW progenitors; assessing the impact of additional progenitor populations, such as NS and PBH, is however important to obtain accurate results, and we leave this investigation for a future work.
In subsection 3.3 we discussed the dependence of the theoretical merger rate on two cosmological parameters: H 0 and Ω m , assuming a flat ΛCDM cosmological model.By changing the distribution in distance of the events and, consequently, the capability of detecting them with GW surveys, the final observed effect is a dependence of the total number of observed events on these two parameters, which can be exploited for cosmological inference.
We performed two analysis methods: a likelihood-based method presented in subsection 5.1, and the cut-and-count method defined in subsection 5.3, both adapted from [21].We applied these methods on a simulated dataset obtained using the darksirens package, considering specifications that mimic those expected for the Einstein Telescope interferometer.In producing our simulated dataset, we assumed a fiducial cosmology specified by the parameters H 0 = 67 km s −1 Mpc −1 and Ω m = 0.32, and assumed the Baseline+PopIII model to describe the star formation rate (see subsection 4.1).
In our likelihood approach, we focused on the number counts of GW events as the observable from which to extract cosmological information.We defined the likelihood as a Poisson distribution on the expected number of events, and used this to constrain the free cosmological parameters.We first assumed Ω m as known, possibly from some other cosmological survey, and constrained H 0 alone.The results for the posterior of H 0 are presented in Figure 6 for different choices of SFR functions and the respective estimates with 68% credible intervals are presented in Table 3.
The results in Figure 6 show that Dark Siren observations from ET could constrain the Hubble constant at a sub-percent level from the number counts of mergers.This is a very the binary Black Holes, and including possible systematic effects, such as the presence of other progenitor populations that will affect the theoretical predictions for the merger rate.
Furthermore, we have highlighted how in a case where no external cosmological information is included, the number counts of GW events suffer from strong degeneracies between the cosmological parameters.It is therefore necessary to include further information coming from GW surveys, such as that coming from the statistical distribution of events in distance or from a subset of events for which redshift can be measured.The dataset we obtain from the darksirens package contains an estimate of the observational uncertainties on the luminosity distance measurements obtained as

Figure 1 .
Figure 1.Sensitivity curves for Einstein Telescope interferometer in the ET-D configuration, compared to the current capabilities of Advanced LIGO and Advanced Virgo.

Figure 2 .
Figure 2. SFR functions for the three models, depending on the high tracers considered to reconstruct the SFR and on the presence of population III stars.

Figure 3 .
Figure 3. Theoretical distribution of merger events as a function of luminosity distance, assuming a Baseline SFR model with and without PopIII and fiducial cosmological parameters (H 0 = 67 km s −1 Mpc −1 , Ω m = 0.32).

Figure 4 .
Figure 4. Plotted here the relative absolute difference δT (d L |θ)/T Fid of the theoretical distributions with respect to the fiducial cosmology, with a fixed Baseline SFR model (without PopIII) varying θ = H 0 (left panel) and θ = Ω m (right panel).

Figure 5 .
Figure 5.Comparison between the total number of the simulated datasets and those obtained with the fitting relation of Equation3.17.In the upper row we have the total number of simulated observed events (without PopIII) and their respective fits.In the lower row we have the relative error between simulated observations and fits δN/N .

Figure 6 .
Figure 6.Posterior distribution of H 0 obtained analyzing the dataset constructed with a Base-line+PopIII SFR with different assumptions.Solid lines indicate the use of Baseline (black), GRB (blue) and (orange) in obtaining the theoretical predictions, while dashed and solid lines indicate, respectively, that the contribution of PopIII is included or neglected.

Figure 7 .
Figure 7. 68% and 95% confidence level contours in the H 0 − Ω m plane obtained from the number counts likelihood changing the SFR assumption, with black, blue and yellow referring to Base-line+PopIII, GRB+PopIII and MD+PopIII respectively.

Figure 8 .
Figure 8. Statistical shift without cutting data (D * = 0) as a function of the Hubble parameter for the Star Formation Rate models considered.We can observe a statistically significant shift from the fiducial cosmological value (orange dashed line) similarly to Figure 6.

H 0 10 MDFigure 9 .
Figure 9. Detail of each statistical shift minimum with its respective fit, used to give this method's estimate of the Hubble parameter, where the fiducial cosmological value is represented by the orange dashed line

Figure 10 .
Figure 10.Statistical shift as a function of the cut in redshift for different SFR models.Here the cosmological parameters are fixed at their fiducial values.

Figure 11 .
Figure 11.Example of Fisher matrix based uncertainties on the parameters inferred from a GW observation when the luminosity distance is considered alongside the masses and the inclination of the system (red contours), compared with the approximated error on the luminosity distance obtained from Equation A.1 (gray band).

Table 1 .
SFR parameters for the different models, depending on the high tracers considered to reconstruct the SFR and on whether or not population III stars are present.

Table 2 .
Simulated survey parameters and redshift interval considered.The lensing effect is neglected in this work.

Table 4 .
Summary of the results of Figure