VAMANA: modeling binary black hole population with minimal assumptions

The population analysis of compact binaries involves the reconstruction of some of the gravitational wave (GW) signal parameters, such as, the mass and the spin distribution, that gave rise to the observed data. This article introduces VAMANA, which reconstructs the binary black hole population using a mixture model and facilitates excellent density measurement as informed by the data. VAMANA uses a mixture of weighted Gaussians to reconstruct the chirp mass distribution. We expect Gaussian mixtures to provide flexibility in modeling complex distributions and enable us in capturing details in the astrophysical chirp mass distribution. Each of the Gaussian in the mixture is combined with another Gaussian and a power-law to simultaneously model the spin component aligned with the orbital angular momentum and the mass ratio distribution, thus also wing us to capture their variation with the chirp mass. Additionally, we can also introduce broadband smoothing by restricting the Gaussian mixture to lie within a threshold distance of a predefined reference chirp mass distribution. Using simulated data we show the robustness of our method in reconstructing complex populations for a large number of observations. We also apply our method to the publicly available catalog of GW observations made during LIGO’s and Virgo’s first and second observation runs and present the reconstructed mass, spin distribution, and the estimated merger rate of binary black holes.


Introduction
The tally on the number of observed binary black hole mergers is increasing [1,2,3,4,5,6,7].Currently, data until the end of the first half of the third observation run is publicly available when there were around fifty confirmed observations [8].Multiple observations are facilitating the modeling of population level properties of these binaries.Although the first results presenting the reconstructed mass, spin and redshift distribution show large credible intervals amounting up to two orders of magnitude, the uncertainties are expected to reduce as the catalog of observations grow in number [7].However, some of the factors contributing to this uncertainty may not be completely mitigated by only increasing the catalog size.This is primarily because the GWs provide information on those signal parameters that directly impact the phase evolution.For most of the observations, the phase evolution of GWs is dominated by the chirp mass (M) and a combination of mass ratio and an effective-spin term (χ eff ).These are defined as where q ≡ m 2 /m 1 is the mass ratio corresponding to the component masses of the binary m 1 and m 2 , and s 1z and s 2z are the component of the spins aligned with the orbital angular momentum [9,10,11,12,13,14,15].Similarly, the amplitude is directly affected by a combination of the redshift and the inclination angle [16].Moreover, chirp mass is completely degenerate with the redshift.Thus, there occurs a large uncertainty in the measurement of the component masses and the spins.Yet another source of uncertainty is introduced by the methodology used in reconstructing the population properties.
If the methodology does not model the population properly, the reconstruction may miss the intricate features in the distribution.Multiple methods have been developed to model the population properties of the binary black holes [17,18,19,20,18,21,22,23,24,25,26,27,28,29].Often these methods assume a phenomenological shape for modeling the distribution of a property.For example, power-law distribution has been often used in modeling the component or the chirp mass distribution [17,18,19,20,30].An alternative approach is to perform a model-independent fitting like described in [31,32,33] ( [33] discuss efficient modeling of the masses and spins on simulated data that does not suffer from selection bias).
In this article we describe VAMANA, a mixture model framework for reconstructing the population properties.We model the chirp mass distribution using a mixture of weighted Gaussians which we expect to be capable of modeling a variety of complex distributions [34].Multiple binary formation channels can introduce features in the mass spectrum [28,35,36,37] and thus modeling the chirp mass distribution, which is not known apriori, using a mixture model makes intuitive sense.Additionally, we combine each of the Gaussian in the mixture with another Gaussian and a power-law to simultaneously model the spin component aligned with the orbital angular momentum and the mass ratio distribution.This also allows us to capture their variation with the chirp mass.We further discuss the method in section 2 and present results for modeling performed on simulated and publicly available data in section 3. VAMANA is available online [38].

Method
The methodology to model population properties of merging compact binaries has been discussed in multiple publications [39,40,41].Following [40], the posterior on model hyper-parameters is given by equation 2, where d ≡ {d 0 , • • • , d N obs } is the set of observations, λ is the population model, θ are the signal parameters and p det (θ) encodes the probability of an event with signal parameters θ to be observed with confidence.p(λ) is the prior probability on the model hyper-parameters and L is the log-likelihood.The analysis samples model hyperparameters using Markov Chain Monte Carlo (MCMC) and thus does not require the normalisation constant for equation 2. In practice equation 2 is estimated using discrete samples.Parameter Estimation (PE) analysis samples p(d i |θ) for a population model p(θ|λ PE ) [42], and large scale injection campaign are performed to estimate the sensitivity of the detector network for a population model p(θ|λ inj ) [17].Both the numerator and the denominator are then calculated for a target population p(θ|λ) using importance sampling [39,43].We note that the denominator in equation 2, V (λ) ≡ dθ p det (θ) p(θ|λ), makes correction for the selection bias and is termed the sensitive volume for the population p(θ|λ) [43].

Choice of Signal Parameters
The signal parameters are broadly categorised as a) intrinsic signal parameter: that are directly responsible for the orbital evolution of the binary, such as, masses, spins, tidal deformability, eccentricity, periastron distance, etc., and b) extrinsic signal parameters: that are observer-dependent, such as, luminosity distance, inclination of the binary from the line of sight, sky location, coalescence phase of the GW signal and coalescence time of the GW signal.Masses, spins, and redshift are signal parameters of primary interest for the binary black hole population.Usually, component masses, spin magnitude, and tilt-angle are chosen as the population property for reconstruction [19].However, only a few parameters are directly responsible for the GW signal's phase evolution.The dominant term for a binary's phase evolution is the chirp mass [9].The second leading order is a function of χ eff and q [9, 10,14].At this order, a high χ eff -low q binary is indistinguishable from a low χ eff -high q.The presence of this degeneracy between the masses and spins can be observed in most of the observations.Only for a small number of observations the individual spins or mass ratios are measurable.Taking the example of the recently announced asymmetric binaries GW190412 and GW190814 [44,45], that were observed at a high signal-to-noise ratio and with evidence of contribution from the higher harmonics in the signal, the spins on the primary mass was measurable but limited constraints were put on either the secondary spin magnitude or the tilt-angle respectively.
As not all the masses and spins are measured accurately, when reconstructing population properties, one may expect that by using priors close to the true astrophysical distributions the overall statistics will converge to the true astrophysical distribution for a large number of observations.Alternatively, it has been shown explicitly in [46] and argued in [40] that different priors on masses result in different inference on the spins and vice-versa.Moreover, the signal parameters are estimated with the assumption that the underlying noise in the instrument is Gaussian, however, as this is rarely the case, even with good priors the inference on the masses and spins may become biased and the reconstructed population may not converge to the true distribution even with a large number of observations.It is conceivable that a combination of improper prior with the non-convergence to the true distribution can result in biased inference on the population.
Most of the observations will not have signature for higher harmonics or precession [47,48].Thus, we choose signal parameters that are measured accurately.We choose chirp mass as a population property as it is measured accurately for a wide range of masses [49,14].Additionally, we choose mass ratio and component of the spins aligned with the orbital angular momentum as the other population properties.Furthermore, we assume the same distribution for both the spins.This choice abates the degeneracy in the masses and spins by favouring population models that can produce a small value of χ eff due to small value of s 1z and s 2z and disfavouring population models that can produce a small value of χ eff by a large positive value of s 1z and a large negative value of s 2z or vice-versa.Thus we choose to model the intrinsic signal parameters using distributions totaling three (θ ≡ M, q, s z ) ‡.

Modeling Using Gaussians
Gaussian mixtures are often used in classifying or modeling the probability density of the observed data.A Gaussian mixture can also approximate a function with the hyperparameters of the components calculated using expectation-maximisation [51,52].In a Bayesian setting, likelihood is expressed as a sum of mixtures with an assumed prior distribution of the mixture hyper-parameters [53].The number of components in the mixture can be fixed with the number of components chosen based on the goodness of fit and the complexity of the mixture [54,55] or can be flexible as informed by the data [56].
In this analysis we model the population using a mixture of components.To effectively capture the variation of the mass ratio and the aligned spin component with the chirp mass, each component comprises of a Gaussian to model the the chirp mass, another Gaussian to model the aligned spin components and a power-law to model the mass ratio distribution.Equation 3 describes the distributions used in modeling the population.The notations used in this model are described in table 2, 3) ‡ It has been suggested that the primary and the secondary black holes can have different spin distribution (e.g.see [50]); if future observations corroborate this suggestion, extending VAMANA to include s 2z is straightforward.
where µ = R V (λ) is the expected number of observations for the merger rate R.
The posterior on the hyper-parameters of the reconstructed population, p(θ|λ), are obtained by using Metropolis-Hastings sampling [58].Proposals of hyper-parameters are made and acceptance probability is calculated using proposal distribution and the full joint density L.

Constraints and Smoothing
The proposed chirp mass in VAMANA is a combination of Gaussians with no constraints on their scale or location and thus to contain the error encountered in importance sampling while evaluating equation 2 we constrain the Gaussians modeling the chirp mass distribution to have σ M i always larger than 0.05 µ M i and the Gaussians modeling the aligned-spin distribution to have σ sz i larger than 0.05.This condition ensures that Gaussians have scales larger than the usual standard deviation for the parameter estimates of M/χ eff for most of the observations.
To help with better observing features and trends we can also introduce a broadband smoothing by using a reference population.We can perform this by iteratively changing the hyper-parameters of a simple phenomenological model that uses, • a power-law with fixed cutoff range to model the chirp mass distribution, p(M) = P(M|M min , M max , α M reference ), where we set the cut-offs at the first percentile of the chirp mass estimates of the lightest binary black hole observation (M min s ) and at the eightieth percentile of the chirp mass estimates of the heaviest binary black hole observation (M max s ).Although seemingly arbitrary, these are broad choices and have a negligible impact on the results as the PE samples have negligible support outside this range.
• A truncated Gaussian with boundaries at s max z = ±0.99 to model the spin distribution, p(s z ) = ψ(s z |µ sz reference , σ sz reference ).• And a power-law distribution with boundaries at 0.1 and 1 to model the mass ratio distribution, p(q) = P(q|0.1,1.0, α q reference ). and identify the maximum likelihood fit as the reference population.There are four hyper-parameters in our phenomenological model.We update the values of the hyperparameters to the ones drawn using normal distributions around the current values every-time the likelihood increases.
In equation 5 we define a distance measure r eff , inspired from the idea of importance sampling, where w ≡ (w 0 • • • w n ) are ratios of the probabilities calculated on n chirp mass bins centered at M i .This measure is closely related to the Euclidean distance-squared between the reference population's chirp mass distribution and a proposed chirp mass distribution as defined in equation 6 with p = 2 [59], We can expect the reference chirp mass distribution to have an L 1 value close to zero for the true chirp mass distribution for a large number of observations.Additionally, we don't expect our simple phenomenological model to fit the data very well, and thus L 2 value will be non-vanishing and will depend on the complexity of the true distribution.
Hence we can employ the Gaussian mixture to explore the chirp mass distribution in the vicinity of the reference distribution and putting a threshold on the distance between the reference and the proposed chirp mass distribution provides broadband smoothing.Unlike a phenomenological function that gets modified throughout the chirp mass range, we explore all distributions -expressible as the sum of weighted Gaussians -that are within a distance measure from the reference chirp mass distribution.Although a threshold can be applied on either r eff or L p , we choose to put a threshold on r eff .r eff lies between zero and one with a threshold of zero allowing the mixture of Gassusians complete freedom and a threshold of close to one requiring proposed chirp mass distribution to be close to the reference chirp mass distribution.to µ M i / √ N and for the maximum value of σ sz i which is chosen inversely proportional √ N .With these choices, the mean of the chirp mass distribution corresponding to the hyper-parameter priors is approximately uniform-in-log and the priors on the chirp mass, aligned-spin, and mass ratio remain almost unchanged for a wide range of component number.This will also provide consistent scaling when the number of components is increased to model a bigger gravitational wave catalog in the future.The prior on the merger rate is scale-invariant and does not contribute to the posterior of other hyperparameters [60].The scale-invariant uniform-in-log prior on µ M i also keeps the prior intact in the event its ranges needs adjustment due to the addition of future observations with chirp masses outside the current range.

Priors and Proposals
The scales of the Gaussians modeling the chirp mass and the aligned-spins are proposed using the χ 2 distribution.To avoid Gaussians getting stuck at local maxima each proposal is made using a different value of the degrees of freedom (DOF) with these values drawn from a uniform distribution.A large value of DOF proposes closer to the current value of the scale while a small value of DOF proposes farther from the current value of the scale.The hyper-parameters α q i , q min i , and merger rate are proposed by drawing from a normal distribution around the current values.The scale of the proposing Gaussians is pre-fixed.We use Dirichlet distribution to propose the mixing weights with DOF drawn from a uniform distribution.
Astrophysical chirp mass distribution is expected to have a fall-off similar to a power-law distribution, thus a shift in the location of a Gaussian at lower chirp mass will cause a larger change in the likelihood compared to the same shift in the location of a Gaussian located at a higher chirp mass value.The spins have been measured to be low and distributed normally [6], thus a shift in the location of a Gaussian at lower aligned/anti-aligned spin value will cause a larger change in likelihood compared to the same shift in the location of a Gaussian located at a higher aligned/anti-aligned spin value.To sample the posteriors efficiently the location of Gaussians are proposed using uniform distribution but with support range adjusted according to the current location.For the current location x, the support range [x min , x max ] is calculated using the following prescription, where U is the uniform distribution, F is the cumulative density function and F −1 is the inverse distribution function corresponding to P ref (M) or ψ ref (s z ).δF determines x min and x max , and directly impacts the interval [x min , x max ].We show this pictorially in figure 1.
We have verified that the analysis reproduces the priors for the case of flat likelihood.We show this in figure 2. The number of Gaussians and the ranges or the hyper-parameters will need to be modified as catalog for the observations grows in size.We discuss this further in the context of the presented results in subsection 3.2.

Results
In this section, we discuss the robustness of VAMANA in reconstructing complex distributions using toy models that mimic the primary features of a full analysis.We also apply the analysis on the publicly available data and present the reconstructed mass and spin distribution, and the estimated merger rate for binary black holes.

Toy Model
In an analysis with the real data, the presence of measurement uncertainty requires the likelihood to be marginalised over it and presence of selection effect requires proper re-scaling of the density.We have verified that both these procedures are performed accurately.Thus, we directly focus on the modeling capability of the analysis by concocting two complex toy model populations with distribution defined as, and simulating data directly from this population.For these analysis we generate 1000 simulated data points and use 9 Gaussians to reconstruct the population.The primary goal of the analysis is to verify a bias free methodology.Figure 3 plots the reconstructed population and figure 4 plots the reconstructed M − s z and M − q distribution.The reconstructed distribution show excellent agreement with the true distribution.
We estimate the significance of any feature extracted by the analysis by making comparisons.For example, we can compare the significance of the two peaks in the reconstructed mean by comparing them with the underlying power-law distribution in the ib) distribution.For this case we obtain a log Bayes factor of 98 and 43 respectively, i.e. the peaks are highly significant compared to the underlying power-law distribution.On the other hand, our ia) distribution reconstructs a seemingly spurious peak centered at 38M .However, we calculate a log Bayes factor of around 1 when we compare this peak with the true distribution, i.e. this peak is barely worth mentioning.

Reconstruction Using Observed Gravitational Waves
In this section, we further discuss methodology by presenting the results obtained for the observations made during LIGO's and Virgo's first and second observation run [1].The analysis that uses all publicly available observations is discussed in a separate article [61].We only select the events with a false alarm rate of at most once in five years in PyCBC or GstLAL search analysis [62,63].PE samples of these observations are publicly available along with the prior used in producing these samples [64].Parameter estimation analysis samples were generated using a stochastic sampler LALInference [42].Independent searches have reported few more GW observations [2,3,4,5] but we leave these observations out until a unified framework is in place that can consistently include observations made by many independent searches.Sensitive volume is estimated on the recovered injections that follow the power-law distribution in chirp mass and mass ratio, and uniform distribution in aligned spin components.Injections are distributed uniformly in extrinsic signal parameters, except the redshift, for which they are distributed uniform-in-comoving volume.The recovered injections are defined as the ones that cross a network signal-to-noise ratio (SNR) of 9.0 on a given power spectral density (PSD).We use multiple PSDs chosen uniformly over the observation time.The threshold of 9.0 is chosen as all the observations have been observed at a higher SNR by the search analysis, moreover, a simple quadrature sum suggests that the contribution from instrument noise is low at this SNR threshold enabling our method in estimating SNR similar to an actual search analysis.However, we expect the sensitive volume estimation to be approximate and may result in a slightly biased reconstruction.
Instead of listing the hyper-parameter ranges, we show the distribution of chirp mass, mass ratio, and aligned-spin corresponding to the priors used in Figure 5.We have performed the analysis using a number of components between 3 and 8. Table 3 lists the marginal log likelihood for these analyses.The marginal likelihood remains mostly unchanged for a wide range of component number §.We present results that use six components.Moreover, we have chosen to not apply any smoothing.We perform sanity § To calculate an approximate value for the marginal likelihood we use the prescription defined in [65].checks to verify if the observed distribution is consistently predicted by the reconstructed distribution.We also check if the sampler has converged by observing the presence of any trend in the likelihood value of the posterior.Both of these checks are presented in figure 6.
Figure 7 plots the chirp mass and the primary mass distribution.The primary mass distribution is obtained by making a variable transformation.The figure shows the reconstructed primary mass distribution obtained using "model C" in the LIGO/Virgo analysis [19].It also shows effect of changing the number of components in the analysis.The reconstructed mean is consistent for most of the mass range except for the feature at around 18M which is increasingly pronounced with the increase in the number of components.This feature is primarily due to the observation GW151012.The 4 component analysis is most consistent with the LIGO/Virgo analysis but the 6 component analysis is most favoured.As the analysis is data driven it is expected that fluctuation in data can give rise to features in the reconstructed mean.However, the significance of a feature can be estimated in various ways, e.g.i) by comparing the mean distribution with feature replaced by a best fit powerlaw, ii) by making a comparison with the mean reconstruction obtained using a phenomenological model, and iii) by comparing two reconstructions obtained using a different number of components with one that shows the presence of feature and other that does not.For the feature at 18M , the Bayes factor is only 1.2 between the 6 and 4 component analysis.Thus it is not noteworthy in the presented analysis.Figure 8 plots the mass ratio and alignedcomponent spin distributions.All the observations favour a mass ratio of closer to unity.Reconstruction suggests that the formation channels for black holes prefer producing equal mass binaries with the fractional contribution declining rapidly for lower mass ratios.The measured spins on all the observations are also small.The only exception being GW151226 which has a moderate spin magnitude.VAMANA facilitates the modeling of spins and mass ratio as dependent on the chirp mass.Figure 9 shows the variation of the aligned spin with the total mass of the binary.Except for GW151226, the spins are consistent with small magnitudes and do not vary with the chirp mass of the binary black holes.As has already been reported in multiple publications, this is in contrast to the black hole spins measured in x-ray binaries or the spins expected in the hierarchical merger scenario where black holes that acquired a remnant spin during their mergers go on and merge again with other black holes [66,67,68,69,70].Finally figure 10 plots the posterior on the merger rate, the 90% confidence interval of which is 27.0 +22.3 −15.8 Gpc −3 yr −1 .

Effect of Smoothing
Figure 11 shows the effect of smoothing on the reconstructed chirp mass distribution.The apparent effect is suppression of features that are not strongly supported by the data.An optimum smoothing threshold can be chosen by a bandwidth selection method operating under some rule-of-thumb.Alternatively, a value that maximises the marginal likelihood can also be used.For the analysis performed on the real data marginal likelihood is maximised for r eff = 0.2.Top) The reconstructed chirp mass distribution.The dark salmon band is the 50 % credible interval, the light salmon band is the 90% credible interval and the salmon curve is the mean distribution, bottom) The reconstructed primary mass distribution.The salmon band is the 90 % credible interval and the salmon curve is the mean distribution.For comparison, the reconstructed primary mass corresponding to model C from the LIGO/Virgo analysis is shown in blue colour [19].The figure also includes the mean distribution obtained by analysis using 4 and 8 components.

Conclusion
In this article, we introduced VAMANA, a flexible scheme to model the properties of binary black hole population using a mixture model.We employ a mixture model in reconstructing the chirp mass, aligned-spin, and mass ratio distribution.We show that the analysis is capable of reconstructing complex distributions such as the power-law distribution and expect this flexible methodology will facilitate the extraction of any intricate features in the population.We did not introduce redshift as a signal parameter in this article but have proposed an extension in a separate article [71].Moreover, this method can be extended to include binary neutron star and neutron star-black hole .The posterior on the mass ratio (q) and aligned spin component (s z ).The salmon band is the 90% credible interval and the curve is the mean value.These distributions evidently favour closer to unity mass ratios and small magnitude for the spin-components aligned with the orbital angular momentum.binaries but including low mass, compact binaries will further increase the dynamic range of the chirp mass distribution.A limited number of Gaussians will probably not be sufficient to model a density that changes by a few orders of magnitude over the chirp mass range.Alternatively, this analysis can be broken into two on the chirp mass range to model.We plan to include some of these developments in future works.

Figure 1 .
Figure 1.An example elucidating the procedure to make proposals for µ M i and µ sz i .The locations of Gaussians modeling the chirp mass/aligned-spin are sampled by making proposals uniformly between x min and x max .The procedure to calculate x min and x max is described in equation 7. The dashed lines in the plots show the interval in which the next proposal is made for a Gaussian located at the solid line.The width of the interval is smaller where the change in density is steep as shown by the green lines and larger where the change in density is shallow as shown by the red lines.The width of the support interval depend on the reference population as well as δF.

Figure 2 .
Figure 2.An example prior distribution for the case of two Gaussians obtained by performing the analysis with a flat likelihood and no smoothing.The red curves are the expected distributions.

Figure 3 .
Figure 3.The reconstructed top) chirp mass, bottom) mass ratio and aligned spin distribution for the toy models described in equations 8.The curves are the mean distribution with shaded region representing the 90% credible interval.The dashed curves are the true distributions.The reconstructed mass ratio and aligned spin distributions distribution have been marginalised over the chirp mass.

Figure 4 .
Figure 4.The plot shows the reconstruction of aligned spin and mass ratio as dependent on the chirp mass for the model ib).There are 10 contours in each plot showing equally spaced credible intervals with the first one for the 5% confidence and the last one for the 95% confidence.

Figure 5 .
Figure 5. Prior chirp mass, mass ratio, and aligned spin-distribution were obtained by performing the analysis with a flat likelihood with no smoothing constraints.Top) The shaded region is the 90% credible interval for the chirp mass.The black curve is the mean distribution and the dashed black curve is the uniform-in-log distribution.The red dashed lines are the boundaries M min s and M max s , and the blue curve is obtained by stacking the chirp mass estimates of all the observations into a histogram, bottom) The shaded 90% credible interval for mass ratio and aligned-spin distribution corresponding to the priors used in this analysis.The curves are the mean distribution.

Figure 6 .
Figure 6.Left) The natural log of the likelihood defined in equation 2 showing no visible trend in their values indicate proper convergence of the sampler.Right) The salmon band is the 90% confidence of the cumulative probability of the posterior predictive obtained after applying selection effects to the reconstructed chirp mass distribution.The grey band is the 90% confidence obtained by bootstrapping various realisations of the observed data.Each realisation of the observed data is generated by re-weighting the chirp mass estimate of the observations to the reference population and selecting one data point from each one of them.The observed data is enclosed within the 90% confidence of the posterior's prediction.

Figure 7 .
Figure7.Top) The reconstructed chirp mass distribution.The dark salmon band is the 50 % credible interval, the light salmon band is the 90% credible interval and the salmon curve is the mean distribution, bottom) The reconstructed primary mass distribution.The salmon band is the 90 % credible interval and the salmon curve is the mean distribution.For comparison, the reconstructed primary mass corresponding to model C from the LIGO/Virgo analysis is shown in blue colour[19].The figure also includes the mean distribution obtained by analysis using 4 and 8 components.

Figure 8
Figure8.The posterior on the mass ratio (q) and aligned spin component (s z ).The salmon band is the 90% credible interval and the curve is the mean value.These distributions evidently favour closer to unity mass ratios and small magnitude for the spin-components aligned with the orbital angular momentum.

Figure 9 .
Figure 9.The variation of the aligned-spin and mass ratio with the chirp mass of.Due to the higher spin of GW151226, there is support for a positive spin for low masses.For heavier masses the spins are small.The mass ratio remains close to one throughout the chirp mass range.There are 10 contours in each plot showing equally spaced confidence intervals with the first one for 5% confidence and the last one for the 95% confidence.

Figure 11 .
Figure 11.The curves are the mean reconstructed chirp mass and the bands are the 90% credible interval.The salmon plot has no smoothing applied, while the blue curve applies r eff = 0.2.

Table 1 .
Description of notations used in describing the model.
[57]ability distribution in equation 3 is extended to include the merger rate by incorporating the Poisson term[57]

Table 1
lists the prior applied on the hyper-parameters.Except for the location of Gaussians modeling the chirp mass and the merger rate, that follow a uniform-in-log prior, a uniform prior is applied on all the remaining hyper-parameters.The range of all the priors is fixed except for the maximum value of σ M i which is chosen proportional

Table 3 .
Marginal Likelihood for analysis with different number of components.For a larger component number, marginal likelihood monotonically decreases.The marginal likelihood remains unchanged for a wide range of component number.We cannot conclude the most optimum component number and present results for reconstruction that uses six components.