
Hierarchical Inference of Binary Neutron Star Mass Distribution and Equation of State with Gravitational Waves

Jacob Golomb and Colm Talbot

Published 2022 February 15. © 2022. The Author(s). Published by the American Astronomical Society.
Citation: Jacob Golomb and Colm Talbot 2022 ApJ 926 79. DOI: 10.3847/1538-4357/ac43bc


Abstract

Gravitational-wave observations of binary neutron star mergers provide valuable information about neutron star structure and the equation of state of dense nuclear matter. Numerous methods have been proposed to analyze the population of observed neutron stars, and previous work has demonstrated that the astrophysical distribution and the equation of state must be fit jointly in order to constrain the equation of state accurately. In this work, we introduce a new framework to simultaneously infer the distribution of binary neutron star masses and the nuclear equation of state using Gaussian mixture model density estimates, which mitigates some of the limitations of previously used methods. Using our method, we reproduce previous projections for the expected precision of our joint mass distribution and equation-of-state inference with tens of observations. We also show that mismodeling the equation of state can bias our inference of the neutron star mass distribution. While we focus on neutron star masses and matter effects, our method is widely applicable to population inference problems.


1. Introduction

Over the past 6 yr, the LIGO-Virgo gravitational-wave detectors (Acernese et al. 2015; LIGO Scientific Collaboration et al. 2015) have made >50 observations of black hole and neutron star binary mergers (Abbott et al. 2021b), providing a new way to study some of the most energetic events in the universe. Growing catalogs of gravitational-wave observations allow us to study the populations from which compact binary systems originate, offering further insights into the physical nature of these systems (Abbott et al. 2019a, 2021a; Zackay et al. 2019, 2021; Nitz et al. 2020; Venumadhav et al. 2020).

Population inference from gravitational-wave observations is performed by comparing catalogs of observed events to models of the astrophysical distribution. These astrophysical models include strongly physically motivated models (e.g., Wong et al. 2021; Zevin et al. 2021), phenomenological models inspired by theoretical predictions and prior observations (e.g., Farrow et al. 2019; Wysocki et al. 2020), or data-driven models (e.g., Tiwari & Fairhurst 2021).

Such population studies are an example of hierarchical Bayesian inference: we combine a set of observed events, marginalize over the single-event parameters for each event, and extract the global properties that govern those parameters, placing the observed properties of single events into a wider astrophysical context (see, e.g., Mandel et al. 2019; Thrane & Talbot 2019; Vitale et al. 2020, for recent reviews). Using gravitational-wave observations in a hierarchical framework provides a powerful method of constraining universal properties of merging binary neutron star (BNS) systems, such as the BNS mass distribution (Farr et al. 2011; Farrow et al. 2019) and the equation of state (EOS) of dense nuclear matter (Tsui & Leung 2005; Agathos et al. 2015; Lackey & Wade 2015; Hernandez Vivanco et al. 2019; Landry & Essick 2019; Chatziioannou & Farr 2020; Wysocki et al. 2020; Ghosh et al. 2021).

While terrestrial experiments have constrained the EOS of cold nuclear matter for densities approaching the nuclear saturation density, a complete picture of the microphysics of nuclear matter above these densities has yet to be confidently determined. With central densities reaching several times the nuclear saturation density, neutron stars—observed via kilonova spectra and light curves, X-ray pulsar measurements, and gravitational waves—provide a probe of nuclear physics at supersaturation densities (Coughlin et al. 2018; Bogdanov et al. 2019; Metzger 2019; Miller et al. 2019, 2021; Silva et al. 2021).

The first detection of a BNS merger via gravitational waves (Abbott et al. 2017, 2019b) provided constraints on the EOS of dense nuclear matter, favoring softer (more compressible) EOSs over stiffer ones (Abbott et al. 2018). The second observed BNS merger (Abbott et al. 2020a) did not provide significant constraints on the EOS; however, the relatively high mass of this system suggests a tension with the galactic population of BNS systems (Abbott et al. 2020a; Galaudage et al. 2021).

While only two confident detections of BNS mergers have been made by LIGO-Virgo, further detections in the near future will provide constraints on the EOS of high-density nuclear matter through the combination of observed events (Margalit & Metzger 2019; Chatziioannou 2020). Unlike gravitational-wave signals from binary black hole mergers, signals from BNS mergers contain information about the neutron star EOS. This information is primarily encoded by the tidal deformabilities of the two bodies during late-stage inspiral, with the magnitude of this effect determined by the underlying EOS (Bildsten & Cutler 1992; Hinderer et al. 2010; Zhao & Lattimer 2018).

Specifically, the EOS directly governs the pressure–density relationship inside the star, a necessary ingredient for solving the Tolman–Oppenheimer–Volkoff equations for the mass–radius relationship of neutron stars (Lindblom 1992; Zhao & Lattimer 2018). For a given EOS, the mass determines the magnitude of the quadrupole moment a neutron star acquires in response to the external tidal field of its companion, imprinting a signature in the detected gravitational-wave signal (Thorne 1998; Hinderer et al. 2010; Abbott et al. 2018, 2019b; Chatziioannou et al. 2018; Chatziioannou 2020). This imprint is commonly expressed in terms of the dimensionless tidal deformability (Λ), which is defined as (Flanagan & Hinderer 2008; Abbott et al. 2018; Chatziioannou 2020)

$$\Lambda = \frac{2}{3}\,k_2 \left(\frac{R}{m}\right)^{5} \qquad (1)$$

where k2 is the quadrupole Love number, R is the radius of the neutron star, and m is its mass (we express this formula in units where c = G = 1). The EOS determines both k2 and R for a given neutron star mass, resulting in a unique Λ–m relationship for different (hadronic) EOSs (Hinderer et al. 2010; Wade et al. 2014; Zhao & Lattimer 2018; Chatziioannou 2020). Under the assumption of a common EOS among neutron stars, we can infer this Λ–m relationship by combining observations, constraining the underlying EOS.

Previous work has shown that hierarchical inference can be used to constrain the neutron star EOS by considering parameterized models of neutron star populations in conjunction with EOS relations (Lackey & Wade 2015; Landry & Essick 2019; Wysocki et al. 2020). In Wysocki et al. (2020), the authors emphasize the importance of simultaneously inferring the mass distribution and EOS, due to bias that results from independent analyses. In this work, we introduce and implement a new method of performing a simultaneous hierarchical analysis to infer mass distribution and EOS. Specifically, we use a Gaussian mixture model (GMM) as an estimate of single-event posterior probability densities. Using this method, we demonstrate that mismodeling the EOS can lead to a biased inference of the neutron star mass distribution.

The paper is organized as follows. In Section 2, we detail the process of our density estimation procedure and how it can be implemented in general hierarchical inference problems. We then outline our choice of parameterized BNS mass population and EOS models in Section 3. We follow this in Section 4 with the details of the simulated data we use for the proof of concept and a description of how we apply this method to simultaneous EOS and mass distribution inference. We review the results of our simulated data study in Section 5 and conclude with takeaways and motivations for future work in Section 6.

2. Methods

We begin by reviewing Bayesian inference in the context of gravitational-wave data analysis (see, e.g., Thrane & Talbot 2019; Vitale et al. 2020, for recent reviews). In Bayesian inference, one constructs the posterior distribution p(Θ∣d) for a model with parameters Θ given some data d. Bayes's theorem is typically written as

$$p(\Theta \mid d) = \frac{\mathcal{L}(d \mid \Theta)\,\pi(\Theta)}{Z(d)} \qquad (2)$$

where ${ \mathcal L }(d| {\rm{\Theta }})$ is the likelihood of the data given the model parameters; π(Θ) is the prior probability distribution, characterizing our prior beliefs on the distribution of Θ; and Z(d) is the Bayesian evidence, or the marginal likelihood for the model. 4 In gravitational-wave analysis, ${ \mathcal L }(d| {\rm{\Theta }})$ is typically taken to be a Gaussian likelihood distribution, whose mean is given by a (frequency domain) gravitational waveform characterized by Θ and variance given by the detector noise (e.g., Veitch et al. 2015). The full set of Θ typically contains parameters intrinsic to the merger event (such as masses and spins), as well as the extrinsic parameters, such as position in the sky and luminosity distance.

Because the set of parameters Θ is typically >10-dimensional, the posterior distribution in Equation (2) is commonly explored with stochastic samplers such as Markov Chain Monte Carlo (MCMC) or nested sampling (e.g., Veitch et al. 2015; Ashton et al. 2019). Once the sampler converges, we are left with a set of samples drawn from the posterior distribution. This posterior distribution represents our full knowledge about the physical parameters of the source of the gravitational-wave event, given our prior distribution.

Now, we combine multiple events hierarchically and sample the hyperposterior, in order to learn about the hyperparameters (or population parameters) Ω that describe the global distribution of a subset of single-event parameters (e.g., the distribution of neutron star masses). We denote these single-event parameters of interest as θ, a vector of length D, which is a subset of Θ.

We do the combination by replacing the fixed model above with the set of hyperparameters describing the population model (Thrane & Talbot 2019; Vitale et al. 2020)

$$p(\Omega \mid d) = \frac{\mathcal{L}(d \mid \Omega)\,\pi(\Omega)}{Z} \qquad (3)$$

In the above notation, π(Ω) is the hyperprior, and Z is the Bayesian evidence for all the observed data d marginalized over the hyperprior. Assuming that the observed events are N independent draws from the population, we express the (hyper)likelihood as

$$\mathcal{L}(d \mid \Omega) = \prod_{i=1}^{N} \int d\theta\,\mathcal{L}(d_i \mid \theta)\,p(\theta \mid \Omega) \qquad (4)$$

where for each observed event we marginalize over the single-event parameters θ conditioned on a population model p(θ∣Ω). The likelihood ${ \mathcal L }({d}_{i}| \theta )$ is implicitly already marginalized over all members of Θ not included in θ.

In order to account for selection effects, we augment Equation (4) by including a selection term:

$$P_{\rm det}(\Omega) = \frac{1}{N_{\rm inj}} \sum_{i=1}^{N_{\rm found}} \frac{p(\theta_i \mid \Omega)}{p(\theta_i \mid \Omega_0)} \qquad (5)$$

We compute this term by injecting $N_{\rm inj}$ simulated events drawn from a fiducial source population Ω0 into simulated detector noise and determining how many of those events pass our signal-to-noise ratio (S/N) detection threshold (see Section 4). In this equation, θi denotes the parameters of the ith "found" injection. Our new total likelihood is

$$\mathcal{L}_{\rm tot}(d \mid \Omega) = \frac{\mathcal{L}(d \mid \Omega)}{P_{\rm det}(\Omega)^{N}} \qquad (6)$$
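
To make the selection correction concrete, the sketch below estimates ${P}_{\det }({\rm{\Omega }})$ from a set of found injections and applies Equation (6). This is a minimal illustration: the callables `pop_prob` and `fiducial_prob` are hypothetical placeholders for the population model and the fiducial injection distribution, not part of any released analysis code.

```python
import numpy as np

def log_p_det(found_thetas, pop_prob, fiducial_prob, n_inj):
    """Monte Carlo estimate of ln P_det(Omega) from found injections.

    found_thetas  : parameters of the injections passing the S/N cut
    pop_prob      : callable evaluating p(theta | Omega)
    fiducial_prob : callable evaluating p(theta | Omega_0), the
                    distribution the injections were drawn from
    n_inj         : total number of injections performed
    """
    weights = pop_prob(found_thetas) / fiducial_prob(found_thetas)
    return np.log(np.sum(weights)) - np.log(n_inj)

def log_total_likelihood(log_like_per_event, found_thetas, pop_prob,
                         fiducial_prob, n_inj):
    """Equation (6): divide the hyperlikelihood by P_det(Omega)^N."""
    n_events = len(log_like_per_event)
    return (np.sum(log_like_per_event)
            - n_events * log_p_det(found_thetas, pop_prob,
                                   fiducial_prob, n_inj))
```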

A key challenge in population inference is efficiently evaluating Equation (4) for a large catalog of events, as Equation (3) is typically also constructed using stochastic sampling, requiring as many as several million evaluations.

Current techniques compute this integral via Monte Carlo integration by reweighting posterior samples (obtained under the fiducial prior) by the corresponding population likelihood for each sample, as in, for example, the analysis performed in Abbott et al. (2021a). This can be efficiently parallelized using graphics processing units (GPUs; Talbot et al. 2019) to control the run time. However, this method fails for very narrow population distributions, because only a limited number of samples are available from each posterior.

In this work, we consider the converse weighting for this marginalization step: we sample from the population model and compute the likelihood of the observed data for each event given these samples. This requires an efficient method for evaluating the likelihood at arbitrary points in parameter space. Previous work has used kernel density estimates (KDEs; Rosenblatt 1956) and Gaussian processes (GPs; Rasmussen & Williams 2006) for density estimation (Landry & Essick 2019; Wysocki et al. 2020; D'emilio et al. 2021). While KDEs can be constructed quickly, sharp edges and high dimensionality can cause the KDE to break down, and the cost of evaluating a KDE scales with the number of samples in the distribution. Similarly, GPs have been shown to provide good fits in small numbers of dimensions, but finite-binning effects from fitting the histogrammed samples can limit the accuracy of the density estimate. Another GP method for making density estimates is used in the parameter estimation code RIFT (Lange et al. 2018). However, this requires fixed sets of intrinsic parameters and can be unsuitable for analyzing relationships between source-frame parameters, which is needed in this work (see Section 3).

While these methods provide estimates of single-event likelihoods, they each involve assumptions and/or computational complexities that may make them suboptimal for a given hierarchical inference problem (Talbot & Thrane 2020; D'emilio et al. 2021). In the next section, we make density estimates of single-event likelihoods using GMMs for use in a hierarchical inference framework. The steps presented here provide a relatively computationally inexpensive density estimation procedure that avoids the shortcomings of the methods outlined above.

2.1. Density Estimation

Our goal is to evaluate the individual likelihoods ${ \mathcal L }(d| \theta )$ at arbitrary points in parameter space. To do this, we start from our N sets of posterior samples (one set per event) and create a functional form for each likelihood.

As an estimate of the D-dimensional marginalized likelihood for an observed event, we model the likelihood as a linear combination of several D-dimensional Gaussians, where D is the number of parameters of interest in each event's posterior. In such a GMM, a density estimate is made from the set of discrete samples from the single-event posterior, resulting in an analytic model for the likelihood and allowing evaluation of ${ \mathcal L }(d\mid \theta )$ for any θ.

2.1.1. Preprocessing

Realistic posterior distributions for parameters in gravitational-wave analysis are subject to sharp edge effects, widely differing domains between parameters, and other features that make the raw distribution unsuitable for reliable fitting with our density estimation method. In the top panel of Figure 1 we plot the posterior distribution for the chirp mass (${ \mathcal M }$), mass ratio (q), and two tidal parameters $\tilde{{\rm{\Lambda }}}$ and $\delta \tilde{{\rm{\Lambda }}}$ (see Section 4 for descriptions of these parameters) for GW170817 (Abbott et al. 2018). For this event, we note that the marginal distribution for mass ratio q exhibits a hard cutoff at q = 1, and the correlation plot between $\tilde{{\rm{\Lambda }}}$ and $\delta \tilde{{\rm{\Lambda }}}$ has a sharp triangular shape.

Figure 1. Top panel: posterior distribution for the mass and tidal parameters of GW170817 in physical space (before transformation). Bottom panel: posterior distribution for GW170817 after transforming samples into fitting space. Note that the domain of the transformed samples is much more uniform across parameter space and the sharp edges in both 1D and 2D histograms have been removed.

We follow Talbot & Thrane (2020) to map the observed distribution to one that is smoother and on a better-behaved domain, and we use this transformed distribution to train the density estimate.

First, we map each posterior sample in θ to the unit interval using the cumulative distribution function (CDF) F of the prior. Next, we transform the samples from the unit interval to the unit normal distribution. The transformed sample is

$$\theta^{\prime} = \Phi^{-1}\left(F(\theta)\right) \qquad (7)$$

where $\Phi^{-1}$ is the probit function, the inverse CDF of the unit normal distribution (Bliss 1934). 5
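
The transformation in Equation (7) is straightforward to implement with SciPy. Below is a minimal sketch for a single parameter, assuming for concreteness the uniform sampling prior on q over [0.125, 1] used in Section 4; for multidimensional θ, the map is applied to each parameter with its own prior CDF.

```python
import numpy as np
from scipy.stats import norm, uniform

# Example: uniform sampling prior on the mass ratio q over [0.125, 1]
prior = uniform(loc=0.125, scale=1.0 - 0.125)

def to_fitting_space(theta):
    """theta -> theta' = Phi^{-1}(F(theta)); Equation (7)."""
    u = prior.cdf(theta)   # map onto the unit interval via the prior CDF
    return norm.ppf(u)     # probit: unit interval -> unit normal

def from_fitting_space(theta_prime):
    """Inverse map back to physical space."""
    return prior.ppf(norm.cdf(theta_prime))

q_samples = prior.rvs(size=10000)
q_prime = to_fitting_space(q_samples)  # approximately standard normal
```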

The result of this transformation on the posterior distribution for GW170817 can be seen in the bottom panel of Figure 1. The original posterior distribution (in physical space) can be compared to the transformed distribution, which has been made more suitable for fitting to a GMM.

2.1.2. Fitting the Distribution

After mapping the samples from the posterior distribution using the method in the previous section, the transformed samples follow a distribution more suitable to fitting with a GMM.

We train the model on the posterior samples and determine the maximum likelihood means, covariances, and weights assigned to each component of the GMM. Mathematically, the density estimate of an observed posterior distribution is

$$\mathcal{D}_i(\theta^{\prime}) = \sum_{k=1}^{K} \phi_{ik}\,\mathcal{N}(\theta^{\prime} \mid \mu_{ik}, \Sigma_{ik}) \qquad (8)$$

where the kth component of the mixture is a multivariate Gaussian with mean μk and covariance Σk, weighted by ϕk. Here $\sum_{k=1}^{K}\phi_{ik}=1$ for a K-component GMM, for each event i. Note that the index k runs over the components (individual Gaussians) in the mixture model, not over the observed events, as this mixture model is unique to each event. We use the GMM as implemented in Scikit-Learn (Pedregosa et al. 2011), which uses an expectation-maximization method to fit the μ, Σ, and ϕ parameters in Equation (8) conditioned on a set of transformed samples from the posterior distribution of the ith event.

Since a GMM is a sum of individually weighted Gaussian components, we must determine how many components to use so that the fit characterizes the distribution without overfitting. To do this, we take a set of posterior samples from an event and randomly assign 80% of the transformed posterior samples to a training set and the remaining 20% to a testing set. We train the K-component GMM on the training data and evaluate the score (sample-wise average log likelihood) of the testing samples under the GMM. To determine the optimal number of components, we vary K and repeat this process until the score noticeably flattens, indicating that a larger K does not better characterize the distribution. When working with a catalog of observed events, it may be efficient to perform this step using one selected event (possibly the one with the most complex posterior distribution) and use this optimal K for all GMM density estimates in the catalog. However, a more complete fitting method would fit for the optimal Ki for each event i, rather than using the same K for all events. We note that one could also fit for the optimal Ki values for each event using a statistic such as the Bayesian information criterion (Kass & Raftery 1995). However, we find that the fits we obtain from the flattening of the score are sufficient for good recovery of our simulated distribution, as reported in Section 5.
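
A minimal sketch of this component-selection procedure using Scikit-Learn; the 80/20 split follows the text, while the candidate K values and the random seed are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

def score_vs_components(transformed_samples, k_values, seed=42):
    """Held-out average log likelihood as a function of K (cf. Figure 2)."""
    train, test = train_test_split(
        transformed_samples, test_size=0.2, random_state=seed)
    scores = []
    for k in k_values:
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(train)
        scores.append(gmm.score(test))  # mean log likelihood of the test set
    return np.array(scores)

# e.g., scores = score_vs_components(theta_prime, k_values=range(1, 16));
# choose the smallest K beyond which the score curve flattens
```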

In Figure 2, we show this curve using our GMM fits from GW170817. As the score flattens out by K = 8–10 components, this represents the optimal number of components to use in the GMM for this posterior distribution. As an illustration, we compare the GMM fit to the true posterior distribution using samples drawn from the GMM and the original transformed samples for GW170817 in Figure 3.

Figure 2. Average (natural) log likelihood of evaluation samples as a function of the number of components (K) used to generate the GMM, using posterior samples from GW170817. The score flattening out by K ≈ 8–10 indicates that the GMM does not provide a better density estimate for larger K.

Figure 3. Four-dimensional posterior distribution for GW170817. Orange shows the posterior samples, and blue shows samples drawn from the GMM fit. The overlap between the two distributions shows that the GMM provides a good density estimate. Plotted in physical (transformed) space in the top (bottom) panel.

Since the GMM is trained on transformed posterior samples, we convert the GMM density estimate ${{ \mathcal D }}_{i}(\theta ^{\prime} )$ from the ith event into a single-event likelihood via

$$\mathcal{L}_i(\theta) \propto \frac{\mathcal{D}_i(\theta^{\prime})}{\prod_{j=1}^{D} \mathcal{N}(\theta^{\prime}_j \mid 0, 1)} \qquad (9)$$

This results in the correct likelihood because we used the sampling priors for F(θ) in the original transformation $\theta \to \theta ^{\prime} $ (see Equation (7)).
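
In code, the correction in Equation (9) amounts to dividing the GMM density by the product of unit normal densities at the transformed point. A sketch, assuming `gmm` is a fitted Scikit-Learn GaussianMixture and `to_fitting_space` applies the prior-CDF and probit maps of Equation (7):

```python
import numpy as np
from scipy.stats import norm

def log_likelihood_physical(theta, gmm, to_fitting_space):
    """ln L_i(theta) up to a constant; Equation (9).

    theta : (n_samples, D) array of points in physical space
    """
    theta_prime = to_fitting_space(theta)
    log_gmm = gmm.score_samples(theta_prime)             # ln D_i(theta')
    log_unit_normal = norm.logpdf(theta_prime).sum(axis=-1)
    return log_gmm - log_unit_normal
```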

2.2. Hierarchical Likelihoods Using Density Estimates

For a given population model p(θ∣Ω), we compute the likelihood of the event i (for i ∈ (1, N)) as

$$\mathcal{L}(d_i \mid \Omega) \approx \frac{1}{M} \sum_{j=1}^{M} \mathcal{L}_i(\theta_j), \qquad \theta_j \sim p(\theta \mid \Omega) \qquad (10)$$

where ${{ \mathcal L }}_{i}$ is the likelihood of the ith event (Equation (9)), and M samples of θ are drawn from the population model p(θ∣Ω). This is a practical Monte Carlo integration scheme for the integral in Equation (4), dependent on the ability to sample from the population model and evaluate the single-event likelihoods at the corresponding points in parameter space.
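
A sketch of this per-event Monte Carlo average, evaluated in the log domain for numerical stability (a standard implementation choice, not a detail specified above); `sample_population` and `log_likelihood` are hypothetical callables standing in for the population draw and the GMM-based likelihood of Equation (9).

```python
import numpy as np
from scipy.special import logsumexp

def log_event_likelihood(omega, sample_population, log_likelihood, M=15000):
    """ln L(d_i | Omega) via the Monte Carlo sum in Equation (10).

    sample_population : callable drawing M samples theta ~ p(theta | Omega)
    log_likelihood    : callable returning ln L_i(theta) at those samples
    """
    theta = sample_population(omega, M)
    # log of the mean single-event likelihood over the population draws
    return logsumexp(log_likelihood(theta)) - np.log(M)
```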

An implicit step in the above hierarchical likelihood equation is the mapping of the population samples θ into the corresponding transformed (fitting) space samples $\theta ^{\prime} $ for each density estimate, to match the space of the density estimates.

The total likelihood (i.e., Equation (4)) for the data from N events therefore becomes

$$\mathcal{L}(d \mid \Omega) \approx \prod_{i=1}^{N} \frac{1}{M} \sum_{j=1}^{M} \mathcal{L}_i(\theta_j) \qquad (11)$$

Although Scikit-Learn provides a method of computing the log likelihood of samples under a GMM fit, a single evaluation of the total likelihood requires evaluating each of the N GMMs for M samples, which can become a computational burden for large N and M. When there are many observed events, it becomes more efficient to extract the best-fit means, covariances, and weights from the GMM fits and evaluate the likelihood matrices in Equation (8) directly on a GPU using CuPy (Okuta et al. 2017). This avoids explicitly looping over the N GMMs for each evaluation of the joint likelihood, while efficiently performing computations over the ${ \mathcal O }(N\times M\times K)$ array using array broadcasting and vectorization on a GPU.
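
A sketch of how such a batched evaluation might look, with the per-event GMM parameters stacked into arrays; the array layout and function name are illustrative rather than the actual implementation.

```python
import cupy as cp

def batched_log_gmm(theta_prime, means, precisions, log_dets, log_weights):
    """ln sum_k phi_k N(theta' | mu_k, Sigma_k) for all events at once.

    theta_prime : (N, M, D) population samples mapped into each event's
                  fitting space
    means       : (N, K, D) component means
    precisions  : (N, K, D, D) inverse covariance matrices
    log_dets    : (N, K) log determinants of the covariance matrices
    log_weights : (N, K) log component weights, ln(phi)
    """
    D = theta_prime.shape[-1]
    diff = theta_prime[:, :, None, :] - means[:, None, :, :]  # (N, M, K, D)
    maha = cp.einsum('nmkd,nkde,nmke->nmk', diff, precisions, diff)
    log_norm = -0.5 * (D * cp.log(2 * cp.pi) + log_dets)      # (N, K)
    log_prob = log_norm[:, None, :] - 0.5 * maha              # (N, M, K)
    # sum over the K mixture components
    return cp.log(cp.sum(cp.exp(log_prob + log_weights[:, None, :]), axis=-1))
```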

3. Models

For the implementation of our GMM-based hierarchical inference method, we let Ω characterize the mass distribution, as well as the EOS relating the Λ and m parameters. This requires choosing parameterized models for both the mass distribution and the EOS.

3.1. Mass Population Model

To model the distribution of neutron stars in merging binaries, we consider the observationally motivated framework in Farrow et al. (2019). In that work, the authors found evidence, based on observations of galactic BNS systems, that the two companions of a BNS system are drawn from separate population distributions. The first distribution characterizes the member of the binary that forms first and spins up owing to accretion, known as the recycled neutron star. The other member of the system is known as the slow neutron star, as it is born second and spins down quickly after formation. The authors found that the best-supported model consists of a two-component Gaussian model for the recycled neutron star mass distribution and a uniform distribution for the slow neutron star mass.

Adopting this mass distribution model, the subset of Ω describing the mass population consists of eight parameters. The lower-mass Gaussian of the recycled distribution is described by the parameters (μR,a , σR,a ) and is weighted by α, and the higher-mass component is described by (μR,b , σR,b ). The probability of observing a mass m from the recycled mass distribution is

$$p(m \mid \Omega) = \alpha\,\mathcal{N}(m \mid \mu_{R,a}, \sigma_{R,a}) + (1-\alpha)\,\mathcal{N}(m \mid \mu_{R,b}, \sigma_{R,b}) \qquad (12)$$

For the slow mass distribution, we denote the low and high limits as mS,l and mS,h, respectively. The maximum mass parameter, Mmax, represents an absolute cutoff of both distributions; we truncate the recycled (Equation (12)) and slow mass distributions at Mmax on the high end and 1 M⊙ on the low end.

By considering the observed galactic BNS systems in the context of binary formation and evolution models, Zhu & Ashton (2020) found that modeling the slow companion as nonspinning was a robust approximation for gravitational-wave data analysis. The recycled partner, though spun up by accretion, has very little support for spins of χ > 0.05 in the population and EOS models they considered. We therefore do not consider spins at all in this work and model all sources as nonspinning. Since we neglect spins, we have no way of knowing which component mass (m1, m2) represents the slow or recycled star, so each computation of the population probability must account for the possibility of either component being drawn from either distribution, under the constraint that each BNS system consists of exactly one recycled and one slow neutron star (see the sketch below).
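
A sketch of this symmetrized population probability follows; normalization of the truncated Gaussians over [1 M⊙, Mmax] is omitted for brevity, and the hyperparameter keyword names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def p_recycled(m, mu_a, sig_a, mu_b, sig_b, alpha, m_max, m_min=1.0):
    """Two-component Gaussian of Equation (12), truncated to [m_min, m_max].

    The truncation normalization is omitted in this sketch.
    """
    mix = (alpha * norm.pdf(m, mu_a, sig_a)
           + (1 - alpha) * norm.pdf(m, mu_b, sig_b))
    return np.where((m >= m_min) & (m <= m_max), mix, 0.0)

def p_slow(m, m_low, m_high):
    """Uniform slow-mass distribution."""
    in_range = (m >= m_low) & (m <= m_high)
    return np.where(in_range, 1.0 / (m_high - m_low), 0.0)

def p_masses(m1, m2, omega):
    """Average over the two labelings: either star may be the recycled one."""
    pr = lambda m: p_recycled(m, omega['mu_a'], omega['sig_a'],
                              omega['mu_b'], omega['sig_b'],
                              omega['alpha'], omega['m_max'])
    ps = lambda m: p_slow(m, omega['m_sl'], omega['m_sh'])
    return 0.5 * (pr(m1) * ps(m2) + pr(m2) * ps(m1))
```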

3.2. EOS Model: The Λ–m Relation

Several nonparametric and parametric models for EOS-sensitive observables exist in the gravitational-wave literature, based on the assumption that a neutron star of a given mass will have a corresponding Λ uniquely determined by its EOS (Read et al. 2009; Lindblom 2010; Landry & Essick 2019). Therefore, recovering parameters characterizing this mapping between the two observables, m and Λ, may provide a way to recover information about the underlying nuclear EOS.

To model the EOS-sensitive relationship, we follow the examples of Agathos et al. (2015) and Del Pozzo et al. (2013) and consider a simple expansion of λ(m) about the canonical reference value of 1.4 M⊙:

$$\lambda(m) = \sum_{j} c_j \left(\frac{m - 1.4\,M_\odot}{M_\odot}\right)^{j} \qquad (13)$$

The expansion coefficients in this model are our EOS-sensitive parameters, with different combinations approximating different EOSs. Previous work has shown that with this parameterization, LIGO observations will be unlikely to resolve terms cj for j > 1, so we only include the first two terms (c0 and c1) in our demonstration. With the component masses of observed galactic BNS systems peaking around 1.4 M⊙, this form for λ(m) can provide meaningful constraints on the EOS, as the expansion is centered at this value.
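
As a concrete check of this parameterization, the sketch below evaluates the truncated expansion and converts to the dimensionless Λ = λ/m⁵ in geometric units (with G M⊙/c³ ≈ 4.925 × 10⁻⁶ s); plugging in the Table 1 values of c0 and c1 returns Λ(1.4 M⊙) ≈ 313, matching the SLy value quoted in Section 5.

```python
MSUN_SECONDS = 4.925e-6  # G * Msun / c^3 in seconds (G = c = 1 units)

def lam_of_m(m_msun, c0, c1):
    """Dimensionful tidal deformability lambda(m) in s^5; Equation (13),
    truncated at the linear term."""
    return c0 + c1 * (m_msun - 1.4)

def dimensionless_lambda(m_msun, c0, c1):
    """Lambda = lambda / m^5 with the mass expressed in seconds."""
    return lam_of_m(m_msun, c0, c1) / (m_msun * MSUN_SECONDS) ** 5

# Fiducial SLy-fit values from Table 1 (in units of 1e-24 s^5)
c0, c1 = 4.88e-24, -5.21e-24
print(dimensionless_lambda(1.4, c0, c1))  # ~313, the SLy value at 1.4 Msun
```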

For our fiducial choice of EOS to simulate, we use the values of c0 and c1 corresponding to the relatively soft SLy EOS (Douchin & Haensel 2001). In Figure 4, we show the linear fit to the true EOS from Equation (13) for the Λ–m (top) and λ–m (bottom) relationships. The approximation breaks down for m > 1.8 M⊙, as previously noted in, e.g., Chatziioannou (2020); however, the majority of simulated events considered in this work are less massive than this. If observed BNS mergers contain high-mass components, this linear parameterization will not be valid; however, it is sufficient for our proof of principle (see Section 6).

Figure 4. Dimensionless tidal deformability (top panel) and tidal deformability (bottom panel) as a function of mass. Units of λ are seconds to the fifth power, following the convention in Agathos et al. (2015).

This model provides a weak connection between our EOS-sensitive model and the mass population model. Because c0 and c1 are allowed to vary in Equation (13), it is possible to arrive at a negative value for Λ, which is unphysical. Because Λ(m) decreases with mass, we therefore constrain Mmax such that 0 < Λ(Mmax) < 200, consistent with limits on the minimum of Λ(m) for commonly used EOSs (see Figure 1 in Chatziioannou 2020).

It is worth noting the limited significance of the Mmax parameter in this work, as the constraint $0\lt {\rm{\Lambda }}({M}_{\max })\lt 200$ is simply a cutoff to keep values of Λ physical. This does not necessarily correspond to the maximum neutron star mass as determined by the EOS; this is known as the Tolman–Oppenheimer–Volkoff mass (MTOV) and is determined from the stability conditions set by a particular EOS (Kalogera & Baym 1996). By cutting off the population model at Mmax in this work, we do not associate Mmax with the most massive possible neutron star, but instead Mmax is the upper limit on the mass of neutron stars in merging binaries. Thus, stellar binary evolution and population channel models play an additional role in the significance of Mmax in the population. A more rigorous approach may be to calculate MTOV for a given EOS and enforce the condition ${M}_{\max }\lt {M}_{\mathrm{TOV}}$.

Although this work is a proof of concept for this method, improvements to the EOS modeling and parameterization can make the results more realistic and applicable to real observations. The breakdown of our choice of EOS model at high masses could bias the entire inference if a significant number of events are observed far from the reference mass used in the Taylor expansion of Equation (13). Additionally, previous work has shown that a simple Taylor expansion, such as the model used in this work, may not be robust to EOS models with phase transitions to nonhadronic constituents. Therefore, a more general model, robust to arbitrary EOSs without deviations at high masses, may be useful for realistic observations.

4. Data and Implementation

4.1. Data

In order to demonstrate our method, we simulate 100 BNS signals drawn from the mass distribution characterized by the maximum likelihood estimate in Farrow et al. (2019). This corresponds to the combination of mass distribution parameters listed in Table 1. We specify injected tidal parameters using our linear fit to the SLy EOS, neglecting spins.

Table 1. Summary of Hyperparameters Used in Demonstration

Parameter       Value    Prior             Units
μRa             1.34     (1, 2)            M⊙
σRa             0.02     (0.005, 0.5)      M⊙
μRb             1.47     (μRa, 2)          M⊙
σRb             0.15     (0.005, 0.5)      M⊙
α               0.68     (0, 1)            N/A
MSl             1.16     (1, 1.7)          M⊙
MSh             1.42     (MSl, Mmax)       M⊙
Mmax            2.2      (1.9, 2.3)        M⊙
c0/10^−24       4.88     (10^−1, 10)       s^5
c1/10^−24       −5.21    (−10, −1)         s^5

Note. The second column indicates the value used to generate the samples, and the third column indicates the bounds of the uniform prior used for sampling. The mass distribution parameters include the means, μR, and standard deviations, σR, of the low-mass Gaussian, a, and higher-mass Gaussian, b, of the recycled mass distribution, along with the weight α of the low-mass component (b is weighted by 1 − α). The slow-mass uniform distribution is characterized by its lower bound MSl and upper bound MSh. See the dashed line in the left panel of Figure 5 for the probability density function of the input mass distribution. The c0 and c1 parameters are the EOS-sensitive parameters in Equation (13). All mass parameters are in units of M⊙.


We draw the extrinsic parameters isotropically in position and orientation with distances uniform in source frame between 20 and 300 Mpc using the cosmology from the Planck 2015 data release (Planck Collaboration 2016).

For each simulated signal, we generate 128 s of colored Gaussian noise corresponding to the three-detector Advanced LIGO-Virgo network operating at the projected design sensitivities (Abbott et al. 2020b). We employ a GPU implementation of the TaylorF2 waveform model (Buonanno et al. 2009; Talbot et al. 2019) and analyze data between 20 and 2048 Hz. We simplify our parameter estimation by marginalizing over phase, merger time, and luminosity distance when sampling the posterior distribution for single-event analyses. We reconstruct the luminosity distance marginal posterior distribution in post-processing using the method outlined in Thrane & Talbot (2019).

We impose a detection threshold of S/N > 8 in at least one detector. For the 37 events passing our detection threshold, we infer the posterior distribution using the Bilby (Ashton et al. 2019) implementation of PyMultiNest (Buchner et al. 2014).

We employ a uniform prior on detector-frame chirp mass (${{ \mathcal M }}_{d}$) of ±0.01 M⊙ around the injected value for each event. Our prior on q is uniform from 0.125 to 1. For $\tilde{{\rm{\Lambda }}}$, defined as (Wade et al. 2014)

$$\tilde{\Lambda} = \frac{8}{13}\left[\left(1 + 7\eta - 31\eta^{2}\right)\left(\Lambda_1 + \Lambda_2\right) + \sqrt{1-4\eta}\left(1 + 9\eta - 11\eta^{2}\right)\left(\Lambda_1 - \Lambda_2\right)\right] \qquad (14)$$

we use a uniform prior from 0 to 5000. Here $\eta \equiv q(1+q)^{-2}$ is the symmetric mass ratio. We construct a conditional sampling prior on $\delta \tilde{{\rm{\Lambda }}}$ as follows: for each sample θi, we analytically compute the maximum and minimum allowed values of $\delta {\tilde{{\rm{\Lambda }}}}_{i}$ conditioned on qi and ${\tilde{{\rm{\Lambda }}}}_{i}$. The parameter $\delta \tilde{{\rm{\Lambda }}}$ is defined as (Wade et al. 2014)

$$\delta\tilde{\Lambda} = \frac{1}{2}\left[\sqrt{1-4\eta}\left(1 - \frac{13272}{1319}\eta + \frac{8944}{1319}\eta^{2}\right)\left(\Lambda_1 + \Lambda_2\right) + \left(1 - \frac{15910}{1319}\eta + \frac{32850}{1319}\eta^{2} + \frac{3380}{1319}\eta^{3}\right)\left(\Lambda_1 - \Lambda_2\right)\right] \qquad (15)$$

such that $\delta \tilde{{\rm{\Lambda }}}$ deviates from 0 as the difference between the component tidal deformabilities increases. It therefore reaches a maximum (minimum) when Λ2(1) is 0. Thus, we calculate the bounds of the uniform prior on $\delta \tilde{{\rm{\Lambda }}}$ conditioned on a sample of $(q,\tilde{{\rm{\Lambda }}})$, where Λ1(2) is computed by setting Λ2(1) = 0 for fixed q and $\tilde{{\rm{\Lambda }}}$ in Equation (14), and using the resulting value of $\delta \tilde{{\rm{\Lambda }}}$ as the upper (lower) bound. We then impose a uniform prior on $\delta \tilde{{\rm{\Lambda }}}$ between these bounds.
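
This conditional bound computation can be written compactly by implementing Equations (14) and (15) directly; a sketch follows, in which the two candidate extremes (one component deformability set to zero) are returned sorted.

```python
import numpy as np

def eta_of_q(q):
    """Symmetric mass ratio."""
    return q / (1.0 + q) ** 2

def lam_tilde(l1, l2, eta):
    """Effective tidal parameter; Equation (14)."""
    return (8.0 / 13.0) * (
        (1 + 7 * eta - 31 * eta**2) * (l1 + l2)
        + np.sqrt(1 - 4 * eta) * (1 + 9 * eta - 11 * eta**2) * (l1 - l2))

def delta_lam_tilde(l1, l2, eta):
    """Second tidal parameter; Equation (15)."""
    return 0.5 * (
        np.sqrt(1 - 4 * eta)
        * (1 - 13272 / 1319 * eta + 8944 / 1319 * eta**2) * (l1 + l2)
        + (1 - 15910 / 1319 * eta + 32850 / 1319 * eta**2
           + 3380 / 1319 * eta**3) * (l1 - l2))

def delta_lam_tilde_bounds(q, lt):
    """Conditional prior bounds on delta-Lambda-tilde given (q, Lambda-tilde).

    The extremes occur when one component deformability vanishes; the two
    candidates are returned sorted as (lower, upper).
    """
    eta = eta_of_q(q)
    a = (8.0 / 13.0) * (1 + 7 * eta - 31 * eta**2)
    b = (8.0 / 13.0) * np.sqrt(1 - 4 * eta) * (1 + 9 * eta - 11 * eta**2)
    d1 = delta_lam_tilde(lt / (a + b), 0.0, eta)  # Lambda_2 = 0 branch
    d2 = delta_lam_tilde(0.0, lt / (a - b), eta)  # Lambda_1 = 0 branch
    return np.minimum(d1, d2), np.maximum(d1, d2)
```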

While ${ \mathcal M }$ is well constrained in the detector frame, the hyperparameters we consider in this work pertain only to source-frame masses. Therefore, if a given set of posterior samples contains only the detector-frame ${ \mathcal M }$, it must be converted to the source frame via the relationship ${{ \mathcal M }}_{d}=(1+z){ \mathcal M }$. We construct our corresponding prior on ${{ \mathcal M }}_{s}$ as (Thrane & Talbot 2019)

$$\pi(\mathcal{M}_s) = \int dd_L\,\pi\big(\mathcal{M}_d = (1+z(d_L))\,\mathcal{M}_s\big)\,(1+z(d_L))\,\pi(d_L) \qquad (16)$$

This is marginalized over both detector-frame mass and distance. This relation may be unique to each observed event if a different prior on ${{ \mathcal M }}_{d}$ is used for each event. We therefore associate a unique prior on ${ \mathcal M }$ with each GMM. 6

To make the density estimates of each posterior distribution, we follow the method outlined in the previous section, using our single-event sampling priors for F in Equation (7) for the transformation into fitting space for each event's posterior. As an example of GMM density estimation on BNS posterior distributions, we show the fit to GW170817 in Figure 1. We observe that BNS posteriors may include strong correlations between $\tilde{{\rm{\Lambda }}}$ and $\delta \tilde{{\rm{\Lambda }}}$ (i.e., the triangular shape in the $\tilde{{\rm{\Lambda }}}-\delta \tilde{{\rm{\Lambda }}}$ correlation plot in Figure 1), possibly impacting the quality of the density estimate. Using the CDF of our conditional prior as F(θ), the transformation decorrelates $\tilde{{\rm{\Lambda }}}$ and $\delta \tilde{{\rm{\Lambda }}}$. As can be seen in the bottom panel of Figure 1, the correlation between the tidal parameters no longer exists in transformed space when imposing this condition on the sampling prior.

To motivate our choice for the number of components in our GMMs, we apply the scoring method described in Section 2 to GMM density estimates of GW170817, as shown in Figure 2. Based on this example, we use 10 components for each GMM density estimate of our simulated BNS events.

4.2. Sampling the Hyperposterior

For each calculation of the likelihood, we sample M = 15,000 masses from the population model and compute the corresponding tidal deformabilities conditioned on the EOS-sensitive hyperparameters via the relationship in Equation (13). We then convert the M samples of (mr, ms, Λr, Λs) to $({ \mathcal M },q,\tilde{{\rm{\Lambda }}},\delta \tilde{{\rm{\Lambda }}})$, and subsequently into the fitting space of each GMM. While we sample component masses in terms of recycled and slow mass, we convert (mr, ms) to (m1, m2), adopting the convention m1 > m2. Using these transformed samples, we can evaluate Equation (11).
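
A sketch of this bookkeeping, adopting the m1 > m2 convention and reusing the `lam_tilde` and `delta_lam_tilde` helpers from the sketch in Section 4.1:

```python
import numpy as np

def to_sampling_parameters(m_r, m_s, lam_r, lam_s):
    """(recycled, slow) draws -> (chirp mass, q, Lambda-tilde, dLambda-tilde).

    Assumes lam_tilde and delta_lam_tilde are defined as in the earlier
    sketch implementing Equations (14) and (15).
    """
    m1 = np.maximum(m_r, m_s)
    m2 = np.minimum(m_r, m_s)
    l1 = np.where(m_r >= m_s, lam_r, lam_s)  # tidal labels follow the masses
    l2 = np.where(m_r >= m_s, lam_s, lam_r)
    mchirp = (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2
    q = m2 / m1
    eta = q / (1 + q) ** 2
    return mchirp, q, lam_tilde(l1, l2, eta), delta_lam_tilde(l1, l2, eta)
```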

To calculate ${P}_{\det }({\rm{\Omega }})$, we draw 20,000 binaries from a mass distribution that is uniform in [1, 2.2] M⊙. For each simulated binary, we compute the S/N in an independent noise realization and keep those that pass our threshold. We neglect the impact of tidal effects on sensitivity. Since our mass distribution model is in terms of slow and recycled components, but our analysis can only specify m1 and m2, for the purposes of computing p(θ∣Ω) we assume a priori that each object is equally likely to be the recycled or slow companion, with the constraint that each binary system contains exactly one slow and one recycled partner.

Each likelihood evaluation requires ${ \mathcal O }(M\times N\times K)\sim 6\times {10}^{6}$ computations. Running on an NVIDIA GeForce RTX 3080 GPU, each full likelihood evaluation took ≲50 ms for our 37 events, which is comparable to the evaluation time for the method currently used to infer binary black hole mass distributions in LIGO-Virgo-KAGRA analyses (Talbot et al. 2019; Abbott et al. 2021a). We note that the sampling, transformation, and selection function steps in the likelihood introduce subdominant contributions to the computation time relative to the computation of M × N × K Gaussian likelihoods. We sample the hyperposterior using the Bilby wrapper of the nested sampler PyMultiNest (Buchner et al. 2014), sampling with 250 live points.

5. Results

In the left panel of Figure 5 we show the inferred mass distribution when the mass and EOS hyperparameters are sampled simultaneously. The solid line shows the posterior predictive distribution (PPD), the shaded regions show the symmetric 68% credible region, and the dashed lines show the true simulated distribution. Although the priors on the mass distribution hyperparameters span a wide range, the posterior distribution is relatively well constrained around the input hyperparameters (see Figure 10 for the full 1D and 2D posterior distributions).

Figure 5. (Left) Inferred mass distribution from the full mass + EOS analysis of 37 simulated events. Solid lines represent the PPD. The recycled (slow) distribution is colored blue (red), with shading representing the ±1σ (68%) credible region from the posterior. Dashed lines show the input distribution. (Right) Same as the left panel, but for the mass-only analysis of the 37 simulated events. Compare with the left panel to observe the bias in the mass distribution recovery due to not including EOS inference.

Of note, we confidently recover not only the presence of the large peak in the recycled mass distribution at 1.34 M⊙ but also the small, wide peak at higher masses, as shown by the μRb and σRb panels in Figure 10, as well as in the PPD in the left panel of Figure 5. The inferred location of the large peak is ${\mu }_{{Ra}}={1.32}_{-0.03}^{+0.03}\,{M}_{\odot }$ (all ranges are 90% credible intervals), constrained to within ∼4% of the input value. It is worth noting that the hyperparameters associated with the second peak of the recycled mass distribution unsurprisingly show the poorest recovery. This is expected, as the second peak in this distribution is very small (i.e., most masses from the recycled distribution are in the lower-mass peak), and thus very few events are expected from these masses. Nevertheless, we are able to recover evidence of this second peak around its input location. The presence of a secondary peak in the recycled mass distribution is favored over a single Gaussian, with a Savage–Dickey density ratio giving a Bayes factor of 2.6 in favor of a secondary peak (α ≠ 0 or α ≠ 1).

Additionally, the bounds of the slow mass distribution are well constrained, with inferred values of ${M}_{{Sh}}={1.45}_{-0.06}^{+0.08}\,{M}_{\odot }$ and ${M}_{{Sl}}={1.17}_{-0.04}^{+0.08}\,{M}_{\odot }$. Both of these parameters are therefore constrained to within ∼10% of their input values of 1.42 and 1.16 M, respectively.

Similarly good recovery is seen in the EOS parameters (c0 and c1) around the values input for our SLy fit, which predicts Λ(1.4 M⊙) = 313. We infer ${\rm{\Lambda }}(1.4\,{M}_{\odot })={322}_{-25}^{+27}$, which is constrained to within ∼17% of the true value from the SLy fit. We also recover ${\rm{\Lambda }}(1\,{M}_{\odot })={2282}_{-333}^{+435}$ and ${\rm{\Lambda }}(2\,{M}_{\odot })={29}_{-23}^{+18}$, the inferred dimensionless tidal deformabilities of a 1 and 2 M⊙ neutron star, respectively. The relatively wide credible intervals for these parameters can be understood as a result of the vast majority of our simulated neutron stars having masses closer to 1.4 M⊙, with very little support in the mass distribution at 1 M⊙ or 2 M⊙. For reference, we also plot the Λ–m relationship for two other soft EOSs compatible with observations of GW170817 (Abbott et al. 2019b): AP4 (Akmal et al. 1998) and WFF1 (Wiringa et al. 1988). The recovery favors the input EOS over these similar EOSs, indicating that we would be able to distinguish between them after observing ∼37 events.

As noted in Section 3, the Mmax parameter is fairly insignificant in this work, with its recovered posterior distribution relatively flat (see Figure 10). There is, however, a sharp, slanted cutoff in the correlation plot between c1 and Mmax, owing to the constraint we impose on ${\rm{\Lambda }}({M}_{\max })$ (see Section 3).

The recovery of the EOS parameters obtained in this work stands in contrast to what was found in Agathos et al. (2015). Simulating component masses from a narrow Gaussian peaked at 1.35 M⊙ but assuming a flat mass prior, those authors needed ${ \mathcal O }(\gt 100)$ events in their catalog to distinguish their candidate EOSs. In contrast to this work, they aimed to distinguish between soft, moderate, and stiff EOSs, whereas we only consider three similar soft EOSs, limiting the significance of a direct comparison. Despite comparing more similar EOSs, we find a substantially lower threshold for distinguishing EOSs than in Agathos et al. (2015) by performing population inference simultaneously.

Motivated by Lackey & Wade (2015) and Hernandez Vivanco et al. (2019), we test how well this method can constrain the mass distribution and EOS hyperparameters from only lower-S/N or higher-S/N events by repeating the above analysis, but limiting ourselves to a portion of the events. The low-S/N events are the 24 of our simulated events that had a network S/N < 20 (but above our S/N threshold of 8), and the remaining 13 events are the events with S/N > 20.

Consistent with those studies, we find that the EOS-sensitive parameters show markedly worse recovery when only the low-S/N events are included. The inferred ±1σ parameter space in the left panel of Figure 6 is in this case much wider than when all the events are included, indicating that the S/N > 20 events provide a significant fraction of the EOS information in this analysis. For instance, when only considering these low-S/N events, we infer ${\rm{\Lambda }}(1.4\,{M}_{\odot })={309}_{-63}^{+61}$. On the other hand, analyzing only the 13 events with S/N > 20 gives ${\rm{\Lambda }}(1.4\,{M}_{\odot })={324}_{-29}^{+28}$ (see the right panel of Figure 6), constraints comparable to those from the full analysis with all S/N > 8 events (see the left panel of Figure 8).

Figure 6. Inferred λ–m parameter space from the analysis using only low-S/N events (left) and only high-S/N events (right). Note the better recovery of the EOS parameters from the few high-S/N events compared to the many low-S/N events.

As can be seen in Figure 7, the mass distribution recovery is poorer in the low-S/N case compared to including all events in the analysis. Specifically, we recover ${\mu }_{{Ra}}={1.33}_{-0.11}^{+0.07}\,{M}_{\odot };$ while this interval contains the true value of μRa , it is 3 times larger than what we get from the analysis using the full set of events. The 13 high-S/N events contribute less information to the mass distribution inference than the 24 low-S/N events, giving an inferred value ${\mu }_{{Ra}}={1.30}_{-0.21}^{+0.07}\,{M}_{\odot },$ a credible interval ∼4.5 times larger than that from the analysis using the full set of events. Therefore, unlike the case for the EOS, the mass distribution is not preferentially informed by high-S/N events but is most sensitive to the number of events in the population. Because the mass parameters (chirp mass in particular) tend to be relatively well constrained, several observations of lower-S/N BNS merger events can provide constraints on the mass distribution of merging neutron stars.

Figure 7. Inferred mass distribution from the full mass + EOS analysis of only low-S/N events (left) and only high-S/N events (right). Solid lines represent the PPD. The recycled (slow) distribution is colored blue (red), with shading representing the ±1σ (68%) credible region from the posterior. Dashed lines show the input distribution. Compare to the left panel of Figure 5 to note the worse recovery due to not including the full set of events.

To estimate the bias from not inferring the mass distribution and EOS hyperparameters simultaneously, we repeat the above analysis but only sample the mass distribution hyperparameters, making GMM density estimates of ${ \mathcal M }$ and q for our 37 observed events. By only considering the mass parameters in our analysis, we neglect the Λ–m relationship from the EOS and implicitly (mis)model the tidal parameters as independent draws from the prior distribution used in the single-event sampling. As seen in the right panel of Figure 5, the mass distribution becomes noticeably biased at the 68% credible level, 7 with the PPD of the mass spectrum shifted from the input distribution. In this case, we recover ${\mu }_{{Ra}}={1.32}_{-0.09}^{+0.02}\,{M}_{\odot }$, a credible interval that only marginally contains the true value. This bias arises from ignoring the correlations between the mass parameters (particularly the mass ratio) and the tidal parameters, which can be seen in the 2D posterior panels between q and $\tilde{{\rm{\Lambda }}}$ in the single-event posteriors of our simulated events (see Figure 9). Therefore, inferring the Λ–m relationship simultaneously with the mass distribution is necessary for an unbiased result.

6. Discussion

In this work, we demonstrate a new method of hierarchically combining posterior distributions from BNS merger events and inferring mass distribution and EOS parameters simultaneously. The initial step of constructing GMM density estimates in our transformed space reliably reflects the observed posterior distributions and allows for efficient evaluation of single-event likelihoods at arbitrary points in parameter space, for arbitrary subsets of the single-event parameters.

We show that our method can recover underlying population model parameters when combining BNS events simulated with realistic observational parameters and noise realizations, while also constraining parameters of the neutron star EOS. Using our new method, we confirm the importance of inferring EOS and mass distribution parameters simultaneously to avoid potential bias in both the inferred mass distribution and EOS.

Additionally, we observe that both low-S/N (<20) and high-S/N (>20) observations contribute to mass population inference, with the few high-S/N observations providing the bulk of the EOS constraints. This finding is generally consistent with the work in Lackey & Wade (2015) and Hernandez Vivanco et al. (2019). Summarized in Figure 8, the EOS recovery from the 37 simulated observations constrains the Λ–m parameter space around the input EOS.

Figure 8. Inferred Λ–m (top) and λ–m (bottom) parameter space from the population + EOS analysis. The shaded region corresponds to the ±1σ (68%) region from the c0 and c1 posterior samples. For reference, selected EOS curves are overplotted.

The fourth observing run of the LIGO-Virgo-KAGRA network is expected to begin no earlier than the summer of 2022 and last 1 yr. At the targeted upgraded sensitivity, it is estimated that there will be ${10}_{-10}^{+52}$ BNS detections, significantly raising the prospects for providing constraints on neutron star EOS and population models (Abbott et al. 2020b).

The method presented in this work is generalizable to arbitrary population models and can incorporate parameterized models linking population observables to other single-event observables (e.g., the Λ–m relationship used here). Gravitational-wave population analyses using mass and spin models (see Abbott et al. 2021a) could be similarly evaluated using this method by making GMM density estimates of the mass and spin parameters and sampling from the population models used in those studies.

We anticipate that the transformed GMM density estimation method employed here has additional potential applications, as it is robust to edge effects and has superior scaling with dimensionality compared to KDE and GP methods. In addition to being able to handle the delta function population model for tidal parameters, this method can be applied to any situation where the uncertainty in individual measurements is much larger than the domain of the population model, e.g., spin distributions that cannot be probed with the method currently employed in LIGO/Virgo collaboration analyses (Abbott et al. 2019a, 2021a; see also Wysocki et al. 2020). Further, GMMs are a generative model and can therefore be used to generate ${ \mathcal O }({10}^{5})$ additional samples per second from the posterior distribution, or to serve as a proposal distribution for subsequent MCMC reanalyses, building on the methods in Farr & Farr (2015) and Ashton & Talbot (2021).

While this proof-of-concept study used a simple toy model for the neutron star Λ–m relation, more sophisticated models can be folded into the method. Additionally, this method can be extended to include a model for the distribution of neutron star spins. Further, this study has focused on the situation when there are tens of measurements; the current population of BNS systems is limited to two. In this small population regime the specific choice of population model/prior will significantly impact the inference. We leave these extensions to future work.

We would like to thank Katerina Chatziioannou and Alan Weinstein for useful discussions. We would also like to thank Stefano Rinaldi for useful comments on the manuscript. Finally, we thank the anonymous reviewer for helpful suggestions and critiques on this manuscript. J.G. and C.T. acknowledge the support of the National Science Foundation and the LIGO Laboratory. LIGO was constructed by the California Institute of Technology and Massachusetts Institute of Technology with funding from the National Science Foundation and operates under cooperative agreement PHY-1764464. This paper carries LIGO Document Number LIGO-P2100215.

The authors are grateful for computational resources provided by the LIGO Laboratory and supported by National Science Foundation Grants PHY-0757058 and PHY-0823459. This research has made use of data, software, and/or web tools obtained from the Gravitational Wave Open Science Center (https://www.gw-openscience.org/) (Abbott et al. 2021c), a service of LIGO Laboratory, the LIGO Scientific Collaboration, and the Virgo Collaboration.

Appendix A: Single-event Posteriors

In this appendix, we show the posterior distribution and samples drawn from the GMM fits for a range of our simulated events. By overplotting the samples from the posteriors and the GMM, we show that the GMM accurately characterizes the posteriors of individual events. We also note that the GMM is able to fit various features such as peaks, correlations, and skews that appear in the transformed posterior distributions. This demonstrates the strength of using this density estimate as an analytic approximation of the likelihoods.

The posterior distributions also show the correlations between the tidal parameters and the mass parameters (center panels in Figure 9). If the mass distribution is inferred independently of the tidal parameters, the information in these correlations (other than the ${ \mathcal M }\mbox{--}q$ correlation) is lost, impacting the recovery of the mass distribution.

Figure 9. Posterior distributions of selected simulated events. Transformed samples are colored blue, and samples from the GMM density estimates are in orange. The overlap and consistency indicate that the GMMs provide good fits in transformed space. Contours correspond to standard deviations in 2D space, such that the 1σ, 2σ, and 3σ contours are 39%, 86%, and 99% confidence levels, respectively.

Appendix B: Hyperposterior

In Figure 10 we plot the recovered posterior distribution for the mass distribution and EOS hyperparameters inferred from the analysis using the 37 simulated events.

Figure 10. Inferred hyperparameter distributions for the recycled mass distribution hyperparameters (top), slow mass distribution hyperparameters (bottom left), and EOS hyperparameters (bottom right). Contours correspond to standard deviations in 2D space, such that the 1σ, 2σ, and 3σ contours are 39%, 86%, and 99% confidence levels, respectively. See Table 1 for the priors on each parameter.

Appendix C: Convergence of Monte Carlo Integrals

Population analyses such as those presented here rely on the use of Monte Carlo integrals to marginalize over the single-event parameters, either by summing over posterior samples for each event with some fiducial prior (e.g., Abbott et al. 2021a) or by summing over samples from the population model as in this work. While such Monte Carlo integrals are asymptotically unbiased estimators, for a finite number of samples there is a finite uncertainty. This uncertainty is generally neglected in the literature, although it has been discussed for the integral estimating the sensitivity function, Equation (5) (Farr 2019). For a generic Monte Carlo integral of some function f over some set of K samples xk the expectation is

$$\hat{f} = \frac{1}{K}\sum_{k=1}^{K} f(x_k) \approx \langle f \rangle \qquad ({\rm C1})$$

and the fractional uncertainty is

$$\frac{\sigma_{\hat{f}}^{2}}{\hat{f}^{2}} = \frac{\langle f^{2} \rangle - \langle f \rangle^{2}}{K\,\langle f \rangle^{2}} \approx \frac{\sum_{k} f(x_k)^{2}}{\left(\sum_{k} f(x_k)\right)^{2}} - \frac{1}{K} \qquad ({\rm C2})$$

We include the two notational forms to highlight the asymptotic form (center) and the practical method for evaluating the quantity (right). From the asymptotic form, we note that if the moments of f can be analytically computed, we would have an expected variance that is exactly inversely proportional to the number of samples being averaged over.

Neglecting the uncertainty in the estimate of the selection function, our total log likelihood is the logarithm of a product of many Monte Carlo integrals, and the standard rules of propagating uncertainties yield

$$\left(\Delta \ln \mathcal{L}_{\rm tot}\right)^{2} = \sum_{i=1}^{N} \frac{\sigma_{\hat{\mathcal{L}}_i}^{2}}{\hat{\mathcal{L}}_i^{2}} \qquad ({\rm C3})$$

Here the quantity in the sum is the equivalent of Equation (C2). We emphasize that we are interested in the absolute uncertainty in the log likelihood. This uncertainty will increase with the number of events for fixed per-event variance. In order to maintain a constant uncertainty, the required number of samples per hyperparameter sample is proportional to the number of events. Thus, the total number of samples required for constant uncertainty scales like $N^2$.
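
A sketch of these estimates: `fractional_variance` evaluates the practical form of Equation (C2) for one event's integrand samples (here, the single-event likelihoods at the population draws), and the per-event terms are accumulated per Equation (C3).

```python
import numpy as np

def fractional_variance(f_samples):
    """Fractional Monte Carlo variance of the mean estimator; Equation (C2)."""
    K = len(f_samples)
    total = np.sum(f_samples)
    return np.sum(f_samples**2) / total**2 - 1.0 / K

def delta_ln_likelihood(per_event_f_samples):
    """Uncertainty in the total log likelihood; Equation (C3).

    per_event_f_samples : iterable of arrays, one per event, containing the
    integrand evaluations f(x_k) for that event's Monte Carlo integral
    """
    return np.sqrt(sum(fractional_variance(f) for f in per_event_f_samples))
```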

C.1. Population Sample Weighting

In this work, we draw samples from the population model and evaluate Equation (9), i.e., $f({x}_{k})\to {{ \mathcal L }}_{i}(\theta ^{\prime} )$. In addition to estimating the statistical uncertainty, we note that we can directly sample the distribution of $\mathrm{ln}{ \mathcal L }$ by repeatedly evaluating the likelihood with different realizations of samples from the population model.

In Figure 11 we plot the average log likelihood and 1σ uncertainty over 100 trials as a function of an increasing number of population samples M (blue). This provides a test of convergence of the Monte Carlo integration, as a converged integral should be invariant under changes to the number of samples M. We note that using M = 15,000 samples, with an associated ${\rm{\Delta }}\mathrm{ln}{ \mathcal L }=0.23$, is sufficient for convergence of this likelihood integration. In order to confirm that this integral is well behaved for our choice of M, we perform a check by computing the likelihood again for each hyperparameter sample in our posterior distribution and reweighting our original posterior distribution by this new distribution. We do this 10 times to obtain 10 new mock realizations of our posterior distribution. Differences between the simulated realizations should therefore be due to statistical uncertainty in computing the Monte Carlo integral over random samples from the population distribution. We confirm qualitatively that these additional realizations are nearly identical to the original posterior distribution, indicating that the Monte Carlo integral for the likelihood is stable for this choice of M.

Figure 11. Log likelihood (Equation (6)) for the true population hyperparameter values as a function of the number of samples M from the population. Each data point is an average over 100 likelihood iterations, and error bars are ±1σ (68%) uncertainties. By M = 15,000, the Monte Carlo integration is stable, and increased values of M only increase the precision marginally. The red shaded region corresponds to the 68% net uncertainty resulting from using Monte Carlo integration via reweighting the single-event posterior samples in the mass population model (see Equation (C2)).

Wysocki et al. (2020) further reduce the uncertainty in their implementation of the population sample reweighting method by sampling only from regions of the population model that have nonvanishing support for the mass parameters in the single-event likelihoods. This increases the number of effective samples per Monte Carlo computation, presumably reducing the ${\rm{\Delta }}\mathrm{ln}{ \mathcal L }$ computed via Equation (C3).

C.2. Posterior Sampling Reweighting

As a comparison, we compute the uncertainty in the calculated likelihood when reweighting single-event posterior samples in the likelihood. In this case the quantity inside the sum is the ratio of the population model to the fiducial prior distribution, f(xk) → p(θ∣Ω)/π(θ∣∅). For this method, we are restricted to a single set of samples from the fiducial posterior distribution, and so we must rely on the statistical uncertainty. We are also limited to using posterior samples of the masses only (no samples of the tidal parameters), as we cannot reweight posterior samples in a population model that accounts for a Λ(m) relationship. In Figure 11, we show the uncertainty of the equivalent calculation as the red shaded region. For ease of comparison, we center the region at the mean estimator obtained using the population model sampling method with 30,000 population samples. To calculate this uncertainty, we use 4480 samples per event. We reiterate that while this reweighting method may give less uncertainty in $\mathrm{ln}{ \mathcal L }$ in this application (i.e., the single-event posteriors are much narrower than the population model in the mass parameters), it cannot account for tidal effects in the population model, as the distribution of Λ is a delta function for a given mass (i.e., the single-event posteriors are much wider than the population model in the tidal parameters).

Footnotes

4. While Equation (2) is technically conditioned on a model, we suppress the explicit dependence in our notation.

5. Note that if an analytic form of F(θ) is not known, an interpolant may be necessary to compute F(θ) for arbitrary values of θ.

6. Note that we write ${{ \mathcal M }}_{s}={ \mathcal M }$ in this work, with the detector-frame chirp mass written explicitly as ${{ \mathcal M }}_{d}$.

7. The inferred distribution is, however, consistent with the input distribution at the 90% credible level.
