Constraining the Dense Matter Equation of State with Joint Analysis of NICER and LIGO/Virgo Measurements

G. Raaijmakers; S. K. Greif; T. E. Riley; T. Hinderer; K. Hebeler; A. Schwenk; A. L. Watts; S. Nissanke; S. Guillot; J. M. Lattimer; R. M. Ludlam

doi:10.3847/2041-8213/ab822f

1. Introduction

Determining the behavior of matter at supranuclear densities is one of the major challenges of modern astrophysics and nuclear physics. Astronomical multi-messenger observations yield statistical measurements of neutron star (NS) properties such as gravitational mass, radius, and tidal deformability, providing a way to study matter under extreme conditions. Theoretical predictions for the phases of matter in NS cores span a wide range, from neutron-rich nucleonic matter to hyperon or deconfined quark formation or the emergence of a Bose–Einstein condensate or a color superconducting phase (see Hebeler et al. 2015; Lattimer & Prakash 2016; Oertel et al. 2017; Baym et al. 2018, for recent reviews on this topic). In practice, our uncertainty about dense matter is usually expressed in terms of a space of viable equation of state (EOS) models (see, e.g., Abbott et al. 2018; Raaijmakers et al. 2019, and references therein).

Recently NASA's Neutron Star Interior Composition Explorer (NICER), an X-ray telescope on board the International Space Station, has delivered a joint mass–radius measurement for the millisecond pulsar PSR J0030+0451 using pulse-profile modeling (see Watts 2019 and references therein for a description of the technique). Two independent analyses were conducted within the collaboration, each making slightly different assumptions about the modeling (including priors; Miller et al. 2019; Riley et al. 2019a). The results depended strongly on the assumed geometry for the X-ray-emitting surface hot regions, but it was possible to identify a superior configuration based principally on the likelihood. The results of the two analyses were, however, deemed consistent: Riley et al. (2019a) reported an inferred mass and equatorial radius of $M={1.34}_{-0.16}^{+0.15}$ M_⊙ and ${R}_{\mathrm{eq}}={12.71}_{-1.19}^{+1.14}$ km (for the 68% credible interval); Miller et al. (2019), on the other hand, found $M={1.44}_{-0.14}^{+0.15}$ M_⊙ and ${R}_{\mathrm{eq}}={13.02}_{-1.06}^{+1.24}$ km.

Constraints on the mass and radius have recently also been obtained from the gravitational wave (GW) observations of the binary NS merger event GW170817 (Abbott et al. 2017). The LIGO Scientific and Virgo Collaborations (LVC) reported measurements of the masses and EOS-dependent tidal deformability parameters of the NSs under different prior assumptions on the spins and using various waveform models (Abbott et al. 2017, 2019a, 2019b); see Kastaun & Ohme (2019) for a critical re-examination of the results. The corresponding radii and EOS constraints were inferred in two ways, by using a parameterized spectral EOS (Lindblom 2010) and by employing EOS-insensitive relations, both using the low-spin priors ( ${cS}/{{GM}}^{2}\lt 0.05$ where S is the spin angular momentum) and hence non-rotating stellar models. The results were consistent, leading to $R={11.9}_{-1.4}^{+1.4}$ km from the spectral EOS analysis and $R={10.8}_{-1.7}^{+2.0}$ km for the more massive NS at the 90% credible interval (Abbott et al. 2018); see also De et al. (2018), Annala et al. (2018), and Tews et al. (2018) for independent related work. A number of studies have further included additional constraints from the electromagnetic counterparts assuming a NS–NS progenitor (Bauswein et al. 2017; Gao et al. 2017; Coughlin et al. 2018; Most et al. 2018; Capano et al. 2019; Margalit & Metzger 2019; Radice & Dai 2019; Shibata et al. 2019). In this work, we remain cautious of the large uncertainties in modeling the electromagnetic counterparts and only use the fact that the observed kilonova, an ultraviolet–optical–infrared transient powered by rapid neutron-capture nucleosynthesis (see, e.g., Lattimer & Schramm 1976; Li & Paczyński 1998; Rosswog et al. 1999; Kulkarni 2005; Metzger et al. 2010; Metzger 2017) indicated that the progenitor binary involved at least one NS. We consider two possibilities, a double NS system as assumed in most analyses and a NS–black hole (BH) binary. The mass of the BH in the latter scenario would be very low and could have originated from an earlier merger of two NSs or be of primordial origin (see, e.g., Yang et al. 2018; Coughlin & Dietrich 2019; Hinderer et al. 2019).

In this Letter, we perform a joint analysis of the EOS constraints from NICER and GW170817, following the method for connecting global NS parameters to the EOS used in Raaijmakers et al. (2019). We focus on the mass–radius measurement from Riley et al. (2019a),¹² as the measurement of Miller et al. (2019) has already been used to jointly constrain the EOS with GW and radio pulsar measurements. We will compare our findings to those of Miller et al. (2019) in Section 4.

We use EOS models that incorporate prior information from nuclear physics up to around saturation density, and two different parameterized extensions at high density; one using piecewise-polytropes, and one based on physically motivated assumptions about the speed of sound. We develop the methodology for the combined interpretation of these measurements in a Bayesian framework that also takes into account the measurements of massive pulsars. Our method can readily include a larger number of NSs from anticipated future multi-messenger observations. Next, we analyze the impact of systematic uncertainties arising from different priors for the EOS. We show that the priors used for the spectral EOS inference from GWs (Abbott et al. 2018) allow for much stiffer EOSs than the priors in Raaijmakers et al. (2019) and explain the reasons for these differences. Nevertheless, we find that the resulting EOS constraints are broadly consistent. We quantify explicitly that the fact that GW measurements determine the chirp mass to high accuracy can be utilized to accelerate the parameter inference by treating it as fixed.

2. Inference Framework

We adopt the framework outlined previously in Raaijmakers et al. (2019) and Greif et al. (2019), which we will briefly summarize here, including details of some adjustments made to incorporate information from the GW data of GW170817. Note that the Bayesian methodology is very similar to that outlined in Miller et al. (2020), although there are differences in the prior assumptions (see Section 4).

2.1. Parameterizations

Two distinct parameterizations are considered: a three-piece polytropic (PP) model with varying transition densities between the polytropes (Hebeler et al. 2013), and a speed of sound (CS) model based on physical considerations both at nuclear and high densities (Greif et al. 2019). Both models are matched to an additional polytrope below 1.1 n₀ (with saturation density ${n}_{0}=0.16\,{\mathrm{fm}}^{-3}$ ) with varying normalization that captures the range of allowed EOS calculated from chiral effective field theory (chiral EFT) interactions (Hebeler & Schwenk 2010; Hebeler et al. 2013). At densities below 0.5 n₀ this polytrope is matched to the BPS crust EOS (Baym et al. 1971).

2.2. Bayesian Parameter Estimation

We use Bayesian methodology to estimate parameters in our EOS model. A more in-depth discussion on parameter estimation frameworks in the context of dense matter inference can be found in Riley et al. (2018), which we will very briefly describe here. Let us write, using Bayes' theorem, the posterior distributions of the EOS parameters ${\boldsymbol{\theta }}$ and central energy densities ${\boldsymbol{\varepsilon }}$ (together the interior parameters) as

$\begin{eqnarray}&&p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}},{ \mathcal M })\propto p({\boldsymbol{\theta }}\,| \,{ \mathcal M })\,p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }},{ \mathcal M })\,p({\boldsymbol{d}}\,| \,{\boldsymbol{\theta }},{ \mathcal M }),\end{eqnarray} \tag{ 1 }$

where ${ \mathcal M }$ is the model that includes all assumed physics, and ${\boldsymbol{d}}$ is the data set from both NICER observations and strain data of GW170817 from the GW detectors LIGO/Virgo. Note that the prior on the central energy densities ${\boldsymbol{\varepsilon }}$ is dependent on the EOS parameters ${\boldsymbol{\theta }}$ , because the maximum stable central energy density is different for each set of ${\boldsymbol{\theta }}$ . Given that the two observations are independent, we can separate the likelihood function as¹³

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}},{ \mathcal M }) & \propto & p({\boldsymbol{\theta }}\,| \,{ \mathcal M })p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }},{ \mathcal M })\\ & \times & p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}\,| \,{{\boldsymbol{d}}}_{\mathrm{GW}})\\ & \times & p({M}_{3},{R}_{3}\,| \,{{\boldsymbol{d}}}_{\mathrm{NICER}})\end{array}\end{eqnarray} \tag{ 2 }$

where the nuisance-marginalized likelihood functions of (i) M₃ and R₃, and (ii) Λ₁, Λ₂, M₁, and M₂, are equated¹⁴ to the nuisance-marginalized joint posterior density distributions inferred by Riley et al. (2019a) and Abbott et al. (2019b), respectively. The marginal GW likelihood function is degenerate under exchange of binary components, but we adopt the same convention as in Abbott et al. (2019a) and define M₁ ≥ M₂. The interior parameters ${\boldsymbol{\theta }}$ and ${\boldsymbol{\varepsilon }}$ (now containing three central energy densities, corresponding to one observed star by NICER and two by the LVC) map deterministically to the parameters M, R, and Λ (where we have assumed the rotation of the star, Ω, to be zero¹⁵ ), allowing us to sample from the prior distribution of ${\boldsymbol{\theta }}$ and ${\boldsymbol{\varepsilon }}$ and then numerically evaluate the likelihood functions using kernel density estimation (KDE) on the posterior samples. We then draw from the joint posterior distribution $p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}},{ \mathcal M })$ of all interior parameters.

However, one complication that arises when performing KDE on samples from the joint posterior distribution of the two masses associated with GW170817 is that, due to the extreme accuracy to which the chirp mass ${M}_{\mathrm{chirp}}={\left({M}_{1}{M}_{2}\right)}^{3/5}/{\left({M}_{1}+{M}_{2}\right)}^{1/5}$ is known relative to the uncertainty in the individual masses, the choice of bandwidth is difficult to make (for GW170817, M_chirp = 1.186 ± 0.001 M_⊙; Abbott et al. 2019a). A small bandwidth is necessary to accurately describe the chirp mass, while a larger bandwidth is necessary to smooth out finite sampling noise in the distribution of masses. Another complication is that when the two sampled central densities are uncorrelated—except for the assumption of a shared EOS—it is computationally expensive for samplers to find the region in the space of masses where all of the probability density is concentrated.

To avoid these complications, and at the same time utilize the small chirp mass uncertainty, we fix it to its median value of M_chirp = 1.186 M_⊙. Consequently, the mass of the secondary object is a deterministic function of the mass of the primary object, and there is one fewer free central density parameter in the vector ${\boldsymbol{\varepsilon }}$ . For details on the approximation invoked here, we refer to Appendix A, up to Equation (A12). For likelihood evaluation we use the mass ratio q = M₂/M₁ instead of the individual masses, transforming Equation (2) to¹⁶

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}},{ \mathcal M }) & \propto & p({\boldsymbol{\theta }}\,| \,{ \mathcal M })\,p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }},{ \mathcal M })\\ & \times & p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q\,| \,{{\boldsymbol{d}}}_{\mathrm{GW}})\\ & \times & p({M}_{3},{R}_{3}\,| \,{{\boldsymbol{d}}}_{\mathrm{NICER}}).\end{array}\end{eqnarray} \tag{ 3 }$

Moreover, the deformability ${{\rm{\Lambda }}}_{2}={{\rm{\Lambda }}}_{2}({\boldsymbol{\theta }};q)$ .

In Figure 1 ¹⁷ we compare the updated prior distribution after including information from GW170817 with and without fixing the chirp mass. When the chirp mass is considered as a free parameter we include it as an element of ${\boldsymbol{\theta }}$ ,¹⁸ sample from a uniform prior ${M}_{\mathrm{chirp}}\,\sim \,U(1.180,1.192)$ , and define the likelihood function as $p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q,{M}_{\mathrm{chirp}}\,| \,{{\boldsymbol{d}}}_{\mathrm{GW}})$ , thus requiring four-dimensional KDE. The two distributions are, due to the small uncertainty in the chirp mass, almost equal, apart from some finite sampling noise. In the following, we fix the chirp mass to M_chirp = 1.186 M_⊙ in order to reduce the dimensionality of the parameter vector ${\boldsymbol{\theta }}$ .

Figure 1. Refer to the following caption and surrounding text. — **Figure 1.** Comparison of the prior on EOS parameters, transformed to the space of mass and radius, as it would be updated after performing parameter estimation on GW170817 with (green contours) and without (blue contours) fixing the chirp mass to the median, M_chirp = 1.186 M_⊙, of its marginal posterior distribution. The two distributions show some small-scale differences in the 1σ contour but are globally consistent. For comparison we also show the prior distribution before including information from GW170817, but with the 2.14 M_⊙ pulsar information, in black contours. For all contours the dotted and dashed lines indicate the 68% and 95% credible regions, respectively.
Download figure:
Standard image High-resolution image

We use the nested sampling software MultiNest to draw weighted samples from the posterior distribution of ${\boldsymbol{\theta }}$ (Feroz & Hobson 2008; Feroz et al. 2009, 2013; Buchner et al. 2014).

2.3. Priors

The bounds of the prior ranges of parameters used in this analysis are identical to those discussed in Section 2.1.1 of Raaijmakers et al. (2019) and Section 3.1.1 of Greif et al. (2019). Within these bounds all EOS parameters are sampled uniformly, while the central density of a star, ${\varepsilon }_{c}$ , is uniformly sampled in logarithmic space between $\mathrm{log}({\varepsilon }_{c})=14.6$ and an upper bound that is determined by the maximum central density of each particular EOS. We consider an EOS in the PP model up until the highest density that corresponds to a stable NS or to the point where causality is no longer satisfied, i.e., where the speed of sound exceeds the speed of light, c_s > c. The CS model has slightly more restrictive requirements.

(i)
The speed of sound for all densities should be lower than the speed of light.
(ii)
At asymptotic densities ( $\sim 50{n}_{0}$ ) the speed of sound should converge to ${c}_{s}=\sqrt{1/3}c$ from below, based on theoretical calculations of c_s in the framework of perturbative quantum chromodynamics (Fraga et al. 2014).
(iii)
At low densities the speed of sound can be described by that of a normal Fermi liquid such that we require ${c}_{s}\leqslant \sqrt{0.163}\,c$ at densities below 1.5 n₀ (for more details, see Greif et al. 2019).

A notable change from the prior used in Raaijmakers et al. (2019) is how we implement information from pulsar mass measurements in our analysis.¹⁹ These high-precision measurements obtained from the timing of radio pulsars restrict softer EOSs by requiring each EOS to be able to support the heaviest NSs. There have been several massive NSs detected, with the most stringent constraints coming from PSR J0348+0432 with a mass of ${2.01}_{-0.04}^{+0.04}$ M_⊙ (Antoniadis et al. 2013) and more recently PSR J0740+6620 with a mass of ${2.14}_{-0.09}^{+0.10}$ M_⊙ (Cromartie et al. 2019). In many previous analyses the lower 1σ limit of such a mass measurement was taken as the minimum mass that an EOS has to support a priori. However, Miller et al. (2020, 2019) do not make such an assumption and emphasize that likelihood information about high-mass pulsars be treated accurately (see also Alvarez-Castillo et al. 2016). Here we approximate the highest pulsar mass measurement as a Gaussian likelihood function,²⁰ such that Equation (3) reads

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}},{ \mathcal M }) & \propto & p({\boldsymbol{\theta }}\,| \,{ \mathcal M })p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }},{ \mathcal M })\\ & \times & p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q\,| \,{{\boldsymbol{d}}}_{\mathrm{GW}})\\ & \times & p({M}_{3},{R}_{3}\,| \,{{\boldsymbol{d}}}_{\mathrm{NICER}})\\ & \times & p({M}_{4}\,| \,{{\boldsymbol{d}}}_{\mathrm{radio}}).\end{array}\end{eqnarray} \tag{ 4 }$

The vector ${\boldsymbol{\varepsilon }}$ now contains a fourth central energy density (drawn from the same prior as described at the beginning of this section) for which the corresponding mass, M₄ is used to evaluate the Gaussian likelihood function. We compare the effect of a cutoff in the prior with the implementation of the new likelihood function for the PP model in the left panel of Figure 2. The solid lines indicating the 95% credible region show that the prior distribution between the two methods is very similar, although the likelihood implementation of the 2.14 M_⊙ pulsar is slightly more constraining at lower radii due to the higher mass. In the right panel of Figure 2 we compare the prior distributions for the two parameterizations used in this Letter with the prior distribution of the spectral parameterization used in Abbott et al. (2018). The spectral parameterization allows for much larger radii than we consider here, due to using only a crust EOS without implementing nuclear physics constraints around nuclear saturation density. This is taken into account in this work (see also Hebeler et al. 2013) by adopting the EOS band based on chiral EFT up to 1.1n₀. The exact breakdown density of chiral EFT is not fully known, but many calculated and also predicted nuclear properties are consistent with experiment (Hebeler et al. 2015), including the symmetry energy and other matter properties at saturation density, suggesting that the range of possible EOSs predicted by chiral EFT is valid up to around nuclear saturation density (see also Section 4.2 of Raaijmakers et al. 2019).

Figure 2. Refer to the following caption and surrounding text. — **Figure 2.** Left panel: comparison between the full prior distribution (black shaded region, 95% credible region bounded by the dark green contour) for the PP model with a 1.97 M_⊙ cutoff (light green contour, as used in Raaijmakers et al. 2019) and when updated by parameter estimation using the 2.14 M_⊙ pulsar likelihood function (black dashed contour) from Cromartie et al. (2019). Using a cutoff in the prior allows for slightly smaller radii than using the likelihood function: this is due to both the higher mass of the center of the pulsar likelihood function and the fact that the likelihood function gives more weight to an EOS with a maximum mass of 2.14 M_⊙ than, e.g., 2.05 M_⊙. Right panel: comparison between the 95% credible regions of the prior distributions of the PP model (black, dashed contour) and the CS model (dark green contour), when updated with the 2.14 M_⊙ pulsar likelihood function. We also show the 95% credible region (black shaded region with blue contours) of the prior distribution using the spectral model (Lindblom 2018) that was used in Abbott et al. (2018). From the comparison it is clear that the prior with the spectral model allows for a much stiffer EOS but has a similar bound at low radii.
Download figure:
Standard image High-resolution image

2.4. Generalization to a Large Number of Stars

The separation of the likelihood function based on different observables in Equation (4) is a useful way of analyzing multiple sources at the same time. In the future, however, when a population of NSs is observable, this can quickly become computationally intractable—depending on the sampling algorithm applied—as the number of densities constituting the parameter vector ${\boldsymbol{\varepsilon }}$ grows linearly with the number of observed stars. One can perform the parameter estimation sequentially in this case, where the prior $p({\boldsymbol{\theta }}| { \mathcal M })$ is updated after each iteration and sampled from in the next (see also Figure 2.1 in Riley et al. 2018).²¹

3. EOS Constraints Given Multi-messenger Observations

For both parameterizations discussed in Section 2.1 we draw weighted samples from the posterior distribution $p({\boldsymbol{\theta }}\,| \,{\boldsymbol{d}},{ \mathcal M })$ using nested sampling. In order to explore the effect of different measurements on this posterior distribution, we start by only considering information from the 2.14 M_⊙ pulsar; we then include information from the binary merger GW170817, and finally we include the more recent NICER measurements of PSR J0030+0451. The GW data we use here are the publicly available posterior samples (LIGO-Virgo collaboration 2018) which assume certain priors on the GW parameters as described in Abbott et al. (2019b); a study of the impact of changing these priors is outside the scope of this work but see T. Zhao et al. (2020, in preparation).

3.1. EOS Constraints Assuming GW170817 Was a NS–NS

We illustrate the posterior distribution in two different ways in Figures 3 and 4. The lower panels show the 68% and 95% credible intervals on the pressure at each energy density given the inferred distribution on EOS parameters, i.e., $p(P\,| \,\varepsilon ,{\boldsymbol{d}},{ \mathcal M })$ . The upper panels show the updated prior distribution for a new star given the inferred distribution on EOS parameters, transformed to the space of mass and radii. More practically, this means that for each EOS in the posterior distribution $p({\boldsymbol{\theta }}| {\boldsymbol{d}},{ \mathcal M })$ we draw a new central density from the prior $p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }})$ , where we use again a uniform prior in log space with the upper bound determined by the maximum stable neutron star of that EOS. From this central energy density we then compute the corresponding mass and radius pair. By combining all pairs for each EOS sample in the posterior we obtain a distribution of masses and radii. The contours again indicate the 68% and 95% credible regions. Visually inspecting the posterior distributions for both the PP model (Figure 3) and the CS model (Figure 4) indicates that the inferred masses and tidal deformabilities for GW170817 favor softer EOSs than our prior. Folding in information about the radius of PSR J0030+0451, however, with a peak value around 12.7 km, favors stiffer EOSs. As a result the final posterior distribution is only slightly shifted toward smaller radii but otherwise closely follows the distribution when only the highest-mass pulsar is included. We also note that one should be careful when inferring the maximum mass allowed for a NS from Figures 3 and 4 because the high-mass end of these distributions depends sensitively on the prior on central energy densities. See also Section 4.2 and Figure 6 where we explicitly show the distribution of the maximum mass by taking the highest allowed central energy density from each EOS instead of sampling from the prior on ε.

Figure 3. Refer to the following caption and surrounding text. — **Figure 3.** Posterior distributions conditional on the PP model and given: (i) the 2.14 M_⊙ pulsar alone (left panels), (ii) inclusion of the GW170817 measurements (middle panels), and (iii) inclusion of the mass and radius of PSR J0030+0451 inferred by Riley et al. (2019a) given NICER data (right panels). In the top row we show how the posterior distributions update the prior distributions, by drawing a new central density given the inferred distribution on EOS parameters, $p({\varepsilon }_{c}\,| \,\mathrm{EOS})$ . This is then transformed to the space of masses and radii, with the contours indicating the 68% and 95% credible intervals. In the bottom row we show the marginal posterior distributions of the pressure P conditional on energy density ε, i.e., $p(P\,| \,\varepsilon ,{\boldsymbol{d}},{ \mathcal M })$ . The bands show the connected 68% and 95% credible intervals at each energy density ε. The gray lines in the left panels show the 95% credible interval of the full prior, while the black dotted and dashed lines in all panels show the 68% and 95% credible regions of the updated prior when including information from the 2.14 M_⊙ pulsar. The green contours show the same credible regions, but for posterior distributions that are inferred from multiple measurements of NS observables. In the lower right inset panels we illustrate the evolution of the Kullback–Leibler divergence as a function of energy density. We conclude that most information is gained from including the 2.14 M_⊙ pulsar. The binary merger GW170817 favors softer EOSs than the prior, but the measured radius from PSR J0030+0451 favors stiffer EOSs, resulting in a final posterior distribution very similar to the prior.
Download figure:
Standard image High-resolution image

**Figure 3.** Posterior distributions conditional on the PP model and given: (i) the 2.14 M_⊙ pulsar alone (left panels), (ii) inclusion of the GW170817 measurements (middle panels), and (iii) inclusion of the mass and radius of PSR J0030+0451 inferred by Riley et al. (2019a) given NICER data (right panels). In the top row we show how the posterior distributions update the prior distributions, by drawing a new central density given the inferred distribution on EOS parameters, $p({\varepsilon }_{c}\,| \,\mathrm{EOS})$ . This is then transformed to the space of masses and radii, with the contours indicating the 68% and 95% credible intervals. In the bottom row we show the marginal posterior distributions of the pressure P conditional on energy density ε, i.e., $p(P\,| \,\varepsilon ,{\boldsymbol{d}},{ \mathcal M })$ . The bands show the connected 68% and 95% credible intervals at each energy density ε. The gray lines in the left panels show the 95% credible interval of the full prior, while the black dotted and dashed lines in all panels show the 68% and 95% credible regions of the updated prior when including information from the 2.14 M_⊙ pulsar. The green contours show the same credible regions, but for posterior distributions that are inferred from multiple measurements of NS observables. In the lower right inset panels we illustrate the evolution of the Kullback–Leibler divergence as a function of energy density. We conclude that most information is gained from including the 2.14 M_⊙ pulsar. The binary merger GW170817 favors softer EOSs than the prior, but the measured radius from PSR J0030+0451 favors stiffer EOSs, resulting in a final posterior distribution very similar to the prior.
Download figure:
Standard image High-resolution image

Figure 4. Refer to the following caption and surrounding text. — **Figure 4.** Same as in Figure 3 but for the CS model. Again we conclude that constraints from GW170817 point to softer EOSs with lower radii, but the results from NICER point to stiffer EOSs with higher radii. The final posterior distribution, conditional on the three different measurements combined, is then very similar to the distribution with only information from the 2.14 M_⊙ pulsar included.
Download figure:
Standard image High-resolution image

In order to quantify this we compute the Kullback–Leibler (KL) divergence (Kullback & Leibler 1951) between the two distributions shown in the lower panels of Figures 3 and 4 at a given energy density ε (see lower right inset panels). The KL divergence is an asymmetric measure of how one probability distribution differs from another; when computed using a logarithm of base two, the divergence has units of bits. As expected, most of the information is gained from folding in the 2.14 M_⊙ pulsar constraint. The posterior distribution given GW170817 alone exhibits greater divergence from the prior than does the posterior distribution given GW170817 and NICER information; that said, both divergences are small at all densities.

Finally, we compute the Bayes' factors to investigate whether one parameterization is favored over the other by the data. Assuming the two discrete models to have equal probability a priori, the Bayes' factor reduces to the ratio of the evidences of the two posteriors. We quote the values for the three different posterior distributions in Table 1, where the Bayes' factor K is the ratio of the PP model over the CS model. To interpret the values of K we follow the table of Kass & Raftery (1995) and conclude that none of the Bayes' factors shows substantial support for one of the models over the other.

Table 1. Log-evidences (Z) and Bayes' Factors (K)

Posterior	log (Z)		K
	PP model	CS model
+ 2.14 M_⊙ pulsar	−1.69 ±0.03	−2.22 ±0.02	0.70
+ GW170817	−15.44 ±0.02	−15.08 ±0.02	1.70
+ NICER	−17.05 ±0.03	−17.10 ±0.03	1.05

Note. Log-evidences (Z) and Bayes' factors (K) for the three posterior distributions and two parameterizations. Also quoted are the Bayes' factors (K), computed as the ratio of the evidence for the PP model over that for the CS model. Following the interpretation of Kass & Raftery (1995) there is no significant support for one parameterization over the other.

Download table as: ASCII Typeset image

3.2. EOS Constraints Assuming GW170817 Was a NS–BH

Based on the GW signal and observed electromagnetic counterpart from GW170817 there is a non-negligible chance that one of the compact objects involved in the merger was a light BH (Yang et al. 2018; Ascenzi et al. 2019; Coughlin & Dietrich 2019; Hinderer et al. 2019), provided that the objects had an unequal mass ratio. Such a light BH could for example be formed during an earlier merger of two NSs, or originate from density fluctuations in the early universe (García-Bellido et al. 1996).

We show that making different assumptions related to the nature of the merger can affect the inferred EOS. More specifically we investigate the impact of GW170817 being a NS–BH by fixing the tidal deformability of the heavier object Λ₁ to zero. One complication, however, is that the posterior samples provided in Abbott et al. (2019a) do not contain enough samples when we restrict Λ₁ = 0. Instead, we perform a coordinate transformation on the posterior samples to the effective tidal deformability $\tilde{{\rm{\Lambda }}}$ , a combination of the two independent tidal deformabilities and masses. To incorporate the constraints on the NS–BH scenario from electromagnetic (EM) observations of the associated kilonova we follow the approach outlined in Hinderer et al. (2019). We use the model of Foucart et al. (2018) to compute the remnant mass outside the BH after the merger M_rem from the progenitor NS compactness, which is related to its tidal deformability Λ₂, the dimensionless spin of the BH χ₁, and the mass ratio. For the spin we only consider aligned or anti-aligned spins with the orbital angular momentum. We obtain these inputs from the GW data by transforming the posterior samples of GW170817 to $\tilde{{\rm{\Lambda }}}$ and χ_eff, the effective dimensionless spin, setting the BH tidal deformability Λ₁ = 0 and the NS spin χ₂ = 0. We apply a conservative cut to the GW170817 posterior by only considering samples that have M_rem > 0.1 M_⊙, which can produce the observed bolometric light curve of GW170817 (Kasliwal et al. 2017) within some error margins based on semi-analytic light curve modeling. Effectively this is a crude approximation to the likelihood function that is zero if M_rem is below this threshold and uniform if above. The posterior distribution of the EOS parameters is then given by

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}| {\boldsymbol{d}},{ \mathcal M }) & \propto & p({\boldsymbol{\theta }}| { \mathcal M })p({\boldsymbol{\varepsilon }}| {\boldsymbol{\theta }},{ \mathcal M })\\ & \times & p(\tilde{{\rm{\Lambda }}}({{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}),q| {{\boldsymbol{d}}}_{\mathrm{GW},\mathrm{EM}})\\ & \times & p({M}_{3},{R}_{3}| {{\boldsymbol{d}}}_{\mathrm{NICER}})\\ & \times & p({M}_{4}| {{\boldsymbol{d}}}_{\mathrm{radio}}),\end{array}\end{eqnarray} \tag{ 5 }$

in which the tidal deformability of the heavier object is fixed to zero and the chirp mass is again fixed to M_chirp = 1.186 M_⊙. Equation (5) requires that the joint prior density of $\tilde{{\rm{\Lambda }}}$ and q be flat. However, the prior density is not flat. In particular, the prior distribution of $\tilde{{\rm{\Lambda }}}$ exhibits undesirable behavior near zero, where prior support drops to zero. To minimize the induced bias we reweight the posterior samples to a flat prior in $\tilde{{\rm{\Lambda }}}$ using the method described in Abbott et al. (2020). The difference is, however, small since the EM information already excludes values of $\tilde{{\rm{\Lambda }}}$ close to zero. Finally we note that the term $p(\tilde{{\rm{\Lambda }}}({{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}),q| {{\boldsymbol{d}}}_{\mathrm{GW},\mathrm{EM}})$ is marginalized over Λ₁ and therefore the BH–NS likelihood function is contaminated with the likelihood of a system with two deformable objects. However, comparing the marginalized distribution with the conditional distribution $p(\tilde{{\rm{\Lambda }}}({{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}),q| {{\boldsymbol{d}}}_{\mathrm{GW},\mathrm{EM}},{{\rm{\Lambda }}}_{1})$ shows that the approximation is valid here in the context of illustrating the impact of different assumptions about the binary (see Appendix B).

In Figure 5 we compare the posterior distribution for the EOS for the assumption that GW170817 is a NS–NS or NS–BH merger, with the prior distribution obtained from only the 2.14 M_⊙ pulsar measurement. In the left panel we only consider GW170817 and the pulsar mass, while in the right panel we also include the NICER measurement. Based on the KL divergence as a function of energy density, we conclude that in both cases GW170817 has more support for softer EOSs when assumed to be a double NS merger. This can be explained by the fact that the relatively low inferred value of $\tilde{{\rm{\Lambda }}}$ in Abbott et al. (2019a) can be achieved with a higher value of Λ₂ (i.e., stiffer EOS) when Λ₁ = 0 and the fact that the NS–BH scenario is only consistent with the EM counterpart for unequal mass ratios and larger NSs.

Figure 5. Refer to the following caption and surrounding text. — **Figure 5.** Comparison between the posterior distributions obtained for the assumption that GW170817 is a NS–NS or NS–BH merger, without (left panel) and with (right panel) the NICER measurements of PSR J0030+0451. Here we only show results for the PP model. The lines show the connected 95% credible regions at a given energy density ε. The black lines show the same credible regions when the posterior distribution is only informed by the 2.14 M_⊙ pulsar. The lower right inset illustrates the evolution of the Kullback–Leibler divergence with energy density ε and indicates that GW170817 is slightly more constraining for the EOS when assumed to be a NS–NS merger.
Download figure:
Standard image High-resolution image

4. Discussion

In this work we have analyzed the combined constraints on the dense matter EOS given the recent inferred mass and radius of PSR J0030+0451 by Riley et al. (2019a) using NICER data, and the measurement of the GW signal from GW170817, in combination with the radio measurement of a 2.14 M_⊙ pulsar.

4.1. Multi-messenger Contributions

The posterior distributions show that the most information is gained from the most massive pulsar mass measurement. In combination with the restricted range of possible EOS at lower densities described by the chiral EFT band and, for the CS model, by the approximation of NS matter as a Fermi liquid, the pulsar mass already puts stringent constraints on the EOS. When including information from GW170817, softer EOSs are yielded a posteriori, but this is only a small effect because EOSs consistent with the 2.14 M_⊙ pulsar mass measurement are restricted by causality to higher radii. Finally, the recent NICER measurement shows more support for stiffer EOSs, causing the posterior distribution to have a narrow peak where the likelihood functions of NICER and GW170817 overlap with the information from the 2.14 M_⊙ pulsar and the chiral EFT band.

In order to quantitatively assess the prior-to-posterior information gain through sequential multi-messenger updates, we have computed KL divergences as a function of energy density. The divergences indicate that most information is gained from the radio pulsar mass measurement. The information gain from GW170817, given prior radio information, is small. Further, including the NICER likelihood yields a smaller information gain because the radio and X-ray modeling constrain a similar part of the EOS parameter space, i.e., provide less support for softer EOSs.

Note that these divergences all depend on which distribution is compared with which. For example, comparing the constraints from NICER or GW170817 with the original, more diffuse prior distribution would yield a higher KL divergence. We argue, however, that a logical, unique order of precedence would be ideal. An obvious option is to chronologically incorporate the various (astronomical) measurements that constrain the EOS.

It is however difficult to design an update order that is truly chronological, given that (i) not all constraints are compiled in any one analysis, and (ii) multi-messenger constraints are being derived contemporaneously, both given newly acquired data, and archival data when the modeling procedure is revolutionized. Typically there will not be a clear chronology, and even if there were, we might try to account for the number of physical assumptions a constraint is conditional on. If we assume that the number of assumptions anti-correlates with robustness to systematic error in our models of reality, we can attempt to compile information—and archive constraints—in loose order of robustness.²²

We therefore opt to start with information contributed from radio timing of pulsars in relativistic binaries. There are multiple constraints to consider: constraints for two systems were reported before GWs were first detected (Demorest et al. 2010; Antoniadis et al. 2013), with measurements for the very first being updated in recent years with continued timing (Fonseca et al. 2016; Arzoumanian et al. 2018; Cromartie et al. 2019; Miller et al. 2020). It is believed that radio measurements relying solely on the relativistic Shapiro delay are robust to systematic error. However, the first report of a pulsar with a mass above 2 M_⊙ is dependent on theoretical models of white dwarf evolution (Antoniadis et al. 2013). The issue of choice can of course be straightforwardly nullified by incorporating all of the radio pulsar mass measurements, each of which encodes orthogonal information in a population-level context. In this work we chose to use a single astronomical source for each class of astronomical messenger. Of the two highly informative measurements relying solely on the relativistic Shapiro delay, derived by Arzoumanian et al. (2018) and Cromartie et al. (2019), we chose to use the constraint reported by the latter.

4.2. Implications for the Maximum Mass of NSs

The maximum mass of NSs places important constraints on the EOS and can aid in determining the nature of a compact binary merger and the following merger remnant. In Figure 6 we show the posterior distributions for M_max for the PP and CS parameterizations considered in this Letter. These distributions are obtained by calculating for each EOS sample in the posterior distribution $p({\boldsymbol{\theta }}| {\boldsymbol{d}},{ \mathcal M })$ the maximum mass for that EOS. The gray dashed lines indicate the prior support on M_max, which drops down rapidly above ∼2.5 M_⊙ for both parameterizations. The bump around 2.3 M_⊙ for the PP model is a consequence of EOSs that terminate at lower maximum masses when their sound speed reaches the speed of light before reaching a maximally stable NS. This leads to a higher value for the PP model of ${M}_{\max }={2.26}_{-0.24}^{+0.16}$ M_⊙ compared to the CS model, ${M}_{\max }={2.13}_{-0.22}^{+0.26}$ M_⊙, when including information from pulsar mass measurements, GW170817, and NICER. Here we quote median values and upper and lower limits that bound the 95% credible region. We find that the maximum mass distribution presented here is broadly consistent with values found in other works (see, e.g., Margalit & Metzger 2017; Shibata et al. 2017; Alsing et al. 2018; Rezzolla et al. 2018; Ruiz et al. 2018; Tews & Schwenk 2020). We find, however, a lower upper-limit since some of these analyses also include information from the kilonova light curves that suggest the formation of a metastable NS and disfavor the prompt formation of a BH.

Figure 6. Refer to the following caption and surrounding text. — **Figure 6.** Posterior distributions of the maximum mass of a non-spinning NS M_max that is supported by the inferred EOS after including the pulsar mass measurement, GW170817 and the results from NICER. For the PP model we find a median value of ${M}_{\max }={2.26}_{-0.24}^{+0.16}$ M_⊙ while for the CS model we find ${M}_{\max }={2.13}_{-0.22}^{+0.26}$ M_⊙, where the upper and lower limit bound the 95% credible region.
Download figure:
Standard image High-resolution image

**Figure 6.** Posterior distributions of the maximum mass of a non-spinning NS M_max that is supported by the inferred EOS after including the pulsar mass measurement, GW170817 and the results from NICER. For the PP model we find a median value of ${M}_{\max }={2.26}_{-0.24}^{+0.16}$ M_⊙ while for the CS model we find ${M}_{\max }={2.13}_{-0.22}^{+0.26}$ M_⊙, where the upper and lower limit bound the 95% credible region.
Download figure:
Standard image High-resolution image

4.3. Comparison to Other Work

We first compare our results with the analysis of Miller et al. (2019), who use a similar Bayesian approach to combine results from NICER, GW170817, and radio pulsar measurements to constrain the EOS. The parameterization of the EOS used in this work differs, however, in several aspects from the two parameterizations used by Miller et al. For the latter the crust EOS of Douchin & Haensel (2001) is considered up to 0.5 ρ_s. Beyond 0.5 ρ_s, two different extrapolations to higher densities are used. One approach used by Miller et al. is to implement the spectral parameterization by Lindblom (2010, 2018). The second parameterization is a PP expansion with two more polytropic segments than used in this work, totaling up to five segments. The range of the first polytropic index of Miller et al. (2019) is chosen to be rather restrictive in the context of chiral EFT (Hebeler et al. 2013). Moreover, Miller et al. (2019) note that first-order phase transitions are not allowed in the case of the spectral parameterization, but they are permitted non-exhaustively in the case of the PP expansion (as in the present work). Overall the priors on their parameterizations, in particular the spectral model (see also Figure 1) allow for a larger range of possible EOS functions. Visually comparing our inferences with Figure 14 in Miller et al. (2019), however, we deem the posterior distributions to be consistent.

Next, we compare to previous analyses of EOS constraints conditional on GW170817. The results of the LVC (Abbott et al. 2018) are broadly consistent with our analysis here, yet there are noticeable differences in the lower bound of the 90% confidence interval in radius (comparing Figures 3 and 4 with Figure 3 in Abbott et al. 2018). We attribute this discrepancy primarily to the different assumptions on the speed of sound: the spectral EOSs in Abbott et al. allowed models with up to 10% violation of causality, while both parameterized EOSs used in our analysis were strictly causal with c_s ≤ c. Additional differences are that Abbott et al. use as the crust EOS a SLy model up to ∼0.5, ρ_s while we use the BPS model, and we incorporate nuclear physics constraints based on chiral EFT up to around saturation density (Hebeler et al. 2013). The latter affects mainly the prior at the large radius end.

A number of independent analyses of the GW data also found broadly consistent results. De et al. (2018) analyzed the GW data with the source location and distance fixed to those determined from the electromagnetic observations, identified scaling relations between tidal parameters and mass ratio using PP models, and determined EOS and radius constraints for a 1.4 M_⊙ star to 8.9 < R_1.4 < 13.2 km consistent with our results here. Most et al. (2018) computed a large catalog of PP EOSs, parameterizing both hadronic models and those with phase transitions. They analyzed the subset of these consistent with high-mass NSs and constraints on tidal deformability parameter $\tilde{{\rm{\Lambda }}}$ , both of which were imposed as hard cutoffs. Their results for the case M_max > 2.01 M_⊙ and ${\tilde{{\rm{\Lambda }}}}_{1.4}\lt 800$ (e.g., their Figure 1, top left panel) are consistent with our findings here. Capano et al. (2019) used a different speed of sound parameterization based on similar chiral EFT constraints, as well as different priors, in particular a uniform distribution in radius. Thus, their results using the GW data alone are skewed more toward smaller radii 9.2 ≲ R_1.4 ≲ 12.3 km when imposing the chiral EFT limits up to ρ_s. Essick et al. (2020) used a nonparametric EOS inference under different priors, also finding broadly consistent results.

In conclusion, the large statistical uncertainties in the available NICER and LIGO/Virgo likelihood functions lead to broad agreements on the EOS and radius constraints across different analyses. However, the impact of priors, assumptions, and parameterizations is starting to become discernible, as we have shown. Highly anticipated upcoming observations with LIGO/Virgo, NICER, and radio pulsars will constrain the EOS for populations of NSs, and yield unique insights into the properties of cold, dense matter. Our method can readily ingest the additional information from multiple sources, as well as incorporate new constraints from subatomic experiments and theory.

We thank the anonymous referee for comments that helped to improve this work. This work was supported in part by NASA through the NICER mission and the Astrophysics Explorers Program. G.R., T.H., and S.N. are grateful for support from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) through the VIDI and Projectruimte grants (PI Nissanke). T.E.R. and A.L.W. acknowledge support from ERC Starting grant No. 639217 CSINEUTRONSTAR (PI Watts). This work was sponsored by NWO Exact and Natural Sciences for the use of supercomputer facilities, and was carried out on the Dutch national e-infrastructure with the support of SURF Cooperative. S.K.G., K.H., and A.S. acknowledge support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Project-ID 279384907—SFB 1245. S.G. acknowledges the support of the CNES. R.M.L. acknowledges the support of NASA through Hubble Fellowship Program grant HST-HF2-51440.001. J.M.L. acknowledges support by NASA through the NICER mission with Grant 80NSSC17K0554 and by the U.S. DOE through Grant DE-FG02-87ER40317. This research has made extensive use of NASA's Astrophysics Data System Bibliographic Services (ADS) and the arXiv. We thank Jocelyn Read, Wynn Ho, and Cole Miller for comments on a draft manuscript.

Software: Python/C language (Oliphant 2007), GNU Scientific Library (GSL; Gough 2009), NumPy (van der Walt et al. 2011), Cython (Behnel et al. 2011), SciPy (Jones et al. 2001), MPI (Forum 1994), MPI for Python (Dalcín et al. 2008), Matplotlib (Hunter 2007; Droettboom et al. 2018), IPython (Perez & Granger 2007), Jupyter (Kluyver et al. 2016), MultiNest (Feroz et al. 2009), PyMultiNest (Buchner et al. 2014).

Appendix A: GW Likelihood and Prior Implementation

The posterior distribution derived by the LVC, assuming both binary components are NSs, is

$\begin{eqnarray}&&p\left({\boldsymbol{M}},{\boldsymbol{\Lambda }}| {\boldsymbol{d}}\right)\propto p\left({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }}\right)\end{eqnarray} \tag{ A1 }$

where ${\boldsymbol{M}}={[{M}_{1},{M}_{2}]}^{\top }$ , ${\boldsymbol{\Lambda }}={[{{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2}]}^{\top }$ , ${\boldsymbol{d}}$ is the strain data vector, and we omit the conditional argument representing the model. The joint prior density $p({\boldsymbol{M}},{\boldsymbol{\Lambda }})$ is flat on compact support, such that the posterior density is proportional to the nuisance-marginalized likelihood function $p\left({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }}\right)$ . This function is symmetric under exchange of the binary components—i.e., ${M}_{1}\leftrightarrow {M}_{2}$ and ${{\rm{\Lambda }}}_{1}\leftrightarrow {{\rm{\Lambda }}}_{2}$ . We are therefore free to define one mass as being associated with the most massive component: M_i ≥ M_j for $i\ne j$ . If we opt not to, a numeric label is always associated with the same physical object. We are free to transform spaces, such that the likelihood function is $p\left({\boldsymbol{d}}| q,{M}_{\mathrm{chirp}},{\boldsymbol{\Lambda }}\right)$ , where q := M₂/M₁, and M_chirp is a symmetric combination of M₁ and M₂. Here, $q\in {{\mathbb{R}}}^{+}$ if no ordering of masses is enforced, in which case the nuisance-marginalized likelihood for q = Q is equal to that for q = 1/Q under exchange of properties. On the other hand, 0 < q ≤ 1 if M₁ is associated with the more massive component, which is the definition we choose.

The joint posterior distribution of interior parameters (omitting the global model as a conditional argument) is

$\begin{eqnarray}&&p({\boldsymbol{\theta }},{\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{d}})\propto p({\boldsymbol{d}}| {\boldsymbol{\theta }},{\boldsymbol{\varepsilon }})p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }})p({\boldsymbol{\theta }}),\end{eqnarray} \tag{ A2 }$

where ${\boldsymbol{\theta }}$ are EOS parameters, and ${\boldsymbol{\varepsilon }}$ are central (energy) densities, such that $({\boldsymbol{\theta }},{\varepsilon }_{i})\mapsto ({M}_{i},{{\rm{\Lambda }}}_{i})$ . The EOS parameters also operate as hyperparameters, in the simplest mode as an upper bound on the prior support of $\varepsilon \in [a,b({\boldsymbol{\theta }})]$ . The population-level prior density of binaries $p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }})$ is assumed to be separable when one density is associated with a particular physical component:

$\begin{eqnarray}&&p({\boldsymbol{\varepsilon }}\,| \,{\boldsymbol{\theta }})=\prod _{i}p({\varepsilon }_{i}\,| \,{\boldsymbol{\theta }}),\end{eqnarray} \tag{ A3 }$

where $p({\varepsilon }_{i}\,| \,{\boldsymbol{\theta }})$ is identical $\forall i$ . Let us assume $f({\epsilon }_{i})\sim U(a,b)$ for $b=b({\boldsymbol{\theta }})$ , where $f({\varepsilon }_{i})$ is a monotone transformation such as a logarithm, which is our choice in the main text.

We decided to define one density parameter as that of the binary component with the highest central density (and thus total mass), and therefore $f({\varepsilon }_{1})\geqslant f({\varepsilon }_{2})$ . Transforming the joint prior density function above, assuming $f({\epsilon }_{i})={\epsilon }_{i}$ for notational simplicity, yields

$\begin{eqnarray}&&p({\varepsilon }_{1},{\varepsilon }_{2}\,| \,{\boldsymbol{\theta }})=p({\varepsilon }_{2}\,| \,{\varepsilon }_{1},{\boldsymbol{\theta }})p({\varepsilon }_{1}\,| \,{\boldsymbol{\theta }})\end{eqnarray} \tag{ A4 }$

where

$\begin{eqnarray}&&p({\varepsilon }_{1}\,| \,{\boldsymbol{\theta }})=\displaystyle \frac{2({\varepsilon }_{1}-a)}{{\left(b-a\right)}^{2}}\quad \mathrm{and}\quad p({\varepsilon }_{2}\,| \,{\varepsilon }_{1})=\displaystyle \frac{1}{({\varepsilon }_{1}-a)},\end{eqnarray} \tag{ A5 }$

for support ${\varepsilon }_{1}\in [a,b]$ and ${\varepsilon }_{2}\in [a,{\varepsilon }_{1}]$ . We then require (according to standard prior implementation for nested sampling)

$\begin{eqnarray}&&{x}_{1}({\varepsilon }_{1};{\boldsymbol{\theta }})={\int }_{a}^{{\varepsilon }_{1}}\displaystyle \frac{2(t-a)}{{\left(b-a\right)}^{2}}{dt}=\displaystyle \frac{{\left({\varepsilon }_{1}-a\right)}^{2}}{{\left(b-a\right)}^{2}};\end{eqnarray} \tag{ A6 }$

inverting yields

$\begin{eqnarray}&&{\varepsilon }_{1}({x}_{1};{\boldsymbol{\theta }})=a+(b-a)\sqrt{{x}_{1}}.\end{eqnarray} \tag{ A7 }$

Further,

$\begin{eqnarray}\begin{array}{rcl}{x}_{2}({\varepsilon }_{2};{\varepsilon }_{1}) & = & \displaystyle \frac{1}{({\varepsilon }_{1}-a)}{\int }_{a}^{t}{dt}=\displaystyle \frac{{\varepsilon }_{2}-a}{({\varepsilon }_{1}-a)}\\ & \Rightarrow & {\varepsilon }_{2}({x}_{2};{x}_{1})=a+({\varepsilon }_{1}-a){x}_{2}.\end{array}\end{eqnarray} \tag{ A8 }$

Returning to the likelihood function $p\left({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }}\right)$ , the function is almost degenerate, with support (defined in terms of some small threshold value of the likelihood normalized to the global maximum) only for $| g({\boldsymbol{M}})| \leqslant \epsilon$ , where $g({\boldsymbol{M}})$ is the chirp combination minus some (now) constant M_chirp and $\epsilon \in {{\mathbb{R}}}^{+}$ is small. In the limit $\epsilon \to 0$ , the likelihood function is degenerate in the ${\boldsymbol{M}}$ -subspace. We enforce no order in M₁ and M₂. Generally, the marginal posterior density is then

$\begin{eqnarray}&&p({\boldsymbol{\theta }},{\varepsilon }_{1}| {\boldsymbol{d}})\propto p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})\times \int p({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }})p({\varepsilon }_{2}| {\boldsymbol{\theta }})d{\varepsilon }_{2}.\end{eqnarray} \tag{ A9 }$

If $\epsilon \to 0$ , then

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\varepsilon }_{1}| {\boldsymbol{d}}) & \propto & p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})p({\boldsymbol{d}}| {M}_{1},{\boldsymbol{\Lambda }})\\ & & \times \int \delta ({\varepsilon }_{2}-h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}))p({\varepsilon }_{2}| {\boldsymbol{\theta }})d{\varepsilon }_{2}\\ & = & p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})p({\boldsymbol{d}}| {M}_{1},{\boldsymbol{\Lambda }})\mathop{\underbrace{p(h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})| {\boldsymbol{\theta }})}}\limits_{\mathrm{support}\,{\varepsilon }_{2}\in [{\rm{a}},{\rm{b}}]},\end{array}\end{eqnarray} \tag{ A10 }$

where M₂ is the total mass yielded via inversion of the chirp combination of ${\boldsymbol{M}}$ , and $h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})$ is the associated value of ε₁ for the EOS specified by ${\boldsymbol{\theta }}$ . Further, ${{\rm{\Lambda }}}_{2}={{\rm{\Lambda }}}_{2}({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})$ . The last factor implies that we need to take the prior support of ε₂ into account given hyperparameters ${\boldsymbol{\theta }};$ note that if $h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})$ has no solution, it will by definition not be within the prior support, and note also that the set of solutions to the inversion problem is generally a proper superset of the prior support.

However, because we enforced an order on the total masses, such that M₁ ≥ M₂, then

$\begin{eqnarray}&&p({\boldsymbol{\theta }},{\varepsilon }_{1}| {\boldsymbol{d}})\propto p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})\times \int p({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }})p({\varepsilon }_{2}| {\varepsilon }_{1},{\boldsymbol{\theta }})d{\varepsilon }_{2}.\end{eqnarray} \tag{ A11 }$

If $\epsilon \to 0$ , then

$\begin{eqnarray}\begin{array}{rcl}p({\boldsymbol{\theta }},{\varepsilon }_{1}| {\boldsymbol{d}}) & \propto & p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})p({\boldsymbol{d}}| {M}_{1},{\boldsymbol{\Lambda }})\\ & & \times \int \delta ({\varepsilon }_{2}-h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}))p({\varepsilon }_{2}| {\varepsilon }_{1},{\boldsymbol{\theta }})d{\varepsilon }_{2}\\ & = & p({\varepsilon }_{1}| {\boldsymbol{\theta }})p({\boldsymbol{\theta }})p({\boldsymbol{d}}| {M}_{1},{\boldsymbol{\Lambda }})\\ & & \times \mathop{\underbrace{p(h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})| {\varepsilon }_{1},{\boldsymbol{\theta }})}}\limits_{\mathrm{support}\,{\varepsilon }_{2}\in [{\rm{a}},{\varepsilon }_{1}]}.\end{array}\end{eqnarray} \tag{ A12 }$

This means that if the most massive binary component is such that $h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})\geqslant {\varepsilon }_{1}$ , then the last factor imposes zero (approximate) posterior density at point ${[{\boldsymbol{\theta }},h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})]}^{\top }$ . Note that if the hyperparameters ${\boldsymbol{\theta }}$ merely delimit stability against radial perturbations, then $p(h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})| {\varepsilon }_{1},{\boldsymbol{\theta }})$ loses dependence, becoming $p(h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}})| {\varepsilon }_{1})$ . When transforming M₁ in Equation (A12) to the mass ratio q we obtain the GW-related part of Equation (3) in the main text.

In reality the likelihood function is finite for $| g({\boldsymbol{M}})| \leqslant \epsilon$ with finite, with the marginal posterior distribution of the chirp combination having 90% credible interval about the median of ${M}_{\mathrm{chirp}}={1.186}_{-0.001}^{+0.001}$ M_⊙. If we are to account for this likelihood information accurately, but avoid sampling from a relatively very diffuse prior in the central densities and thus in the chirp mass, then we need to perform fast marginalization over the central density ε₂ to decrement the dimensionality of the sampling space. To proceed, we need to examine the conditional likelihood function

$\begin{eqnarray}&&L({\varepsilon }_{2};{\varepsilon }_{1},{\boldsymbol{\theta }})=p({\boldsymbol{d}}| {\boldsymbol{M}},{\boldsymbol{\Lambda }}).\end{eqnarray} \tag{ A13 }$

Given M₁, there is one maximum in the conditional likelihood function when slicing through the chirp-sensitive likelihood function. Further, ${M}_{2}={M}_{2}({\varepsilon }_{2};{\boldsymbol{\theta }})$ , while not a monotone transformation in the presence of unstable branches, is ordered, meaning there should exist one maximum with respect to ε₂. In order to marginalize we therefore aim to first approximate the central density ε₂ that maximizes the conditional likelihood function. We thus approximate—via inversion— $h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}^{{\prime} })$ , as above, where ${M}_{\mathrm{chirp}}^{{\prime} }$ is now the median chirp mass in the marginal posterior distribution of M_chirp. Given this estimator of the maximum, we can generate bounds for numerical integration. For instance, we can integrate on the interval ${\varepsilon }_{2}\in [\alpha ,\beta ]$ , where $\alpha :=h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}^{{\prime} }-n{{ \mathcal C }}_{l})$ and $\beta :=h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}^{{\prime} }+n{{ \mathcal C }}_{u})$ where n is some integer and the X% marginal credible interval on the chirp mass is CI ${}_{{\rm{X}} \% }=\{{M}_{\mathrm{chirp}}:{M}_{\mathrm{chirp}}\in [{M}_{\mathrm{chirp}}^{{\prime} }-{{ \mathcal C }}_{l},{M}_{\mathrm{chirp}}^{{\prime} }\,+{{ \mathcal C }}_{u}]\}$ . We can then perform numerical quadrature where the integrand requires numerical integration to transform $({\varepsilon }_{2},{\boldsymbol{\theta }})\,\mapsto ({M}_{2},{{\rm{\Lambda }}}_{2})$ . If an order is imposed on the central densities, then the upper-bound for quadrature is $\min [h({\varepsilon }_{1},{\boldsymbol{\theta }};{M}_{\mathrm{chirp}}^{{\prime} }+n{{ \mathcal C }}_{u}),{\varepsilon }_{1}]$ .

Another approach would be to inject likelihood information into the prior density function of the mass M₂ (and thus of the central density ε₂). The prior support of M₂ (and thus ε₂) is restricted to some narrow interval corresponding to a narrow interval in chirp mass about the marginal median value. This is uncomfortable to have to rely on, one reason being that prior predictive probabilities may be compromised for model comparison; another is that a prior may be defined on the space of M₂ that is inconsistent with the prior density $p({\varepsilon }_{1}| {\boldsymbol{\theta }})$ , or a prior is defined directly on the space of the chirp mass, leading to a similar inconsistency. Given this modification of the prior support relative to the protocol outlined above, (nested) sampling proceeds without decrementing the dimensionality, but at far lower cost and effectively without risk of error due to insufficient resolution.

In this work we compared two approaches quantitatively: (i) the delta-function approximation—specifically Equation (A12), and (ii) the approach wherein likelihood information is injected into the prior via modification of the support, in the form of a narrow prior on the chirp mass about the median in marginal posterior mass; see the text following Equation (3). The joint posterior distribution of the EOS parameters is consistent for both approaches.

Appendix B: Approximations to the Likelihood Function

B.1. The NS–NS Scenario

The posterior distribution on the two binary components from GW170817 derived by the LVC (see Abbott et al. 2019a) is related to the nuisance-marginalized likelihood function through

$\begin{eqnarray}&&\begin{array}{l}p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}| \,{{\boldsymbol{d}}}_{\mathrm{GW}})\\ \quad \propto \,p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2})p({{\boldsymbol{d}}}_{\mathrm{GW}}| \,{{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2}),\end{array}\end{eqnarray} \tag{ B1 }$

where we have only shown the four parameters of interest here. In this work we have equated the two by arguing that the prior distribution $p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},{M}_{1},{M}_{2})$ is jointly flat. Furthermore we have performed a coordinate transformation from the two component masses to the mass ratio q and the chirp mass M_chirp to get

$\begin{eqnarray}&&p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q,{M}_{\mathrm{chirp}}| {{\boldsymbol{d}}}_{\mathrm{GW}}).\end{eqnarray} \tag{ B2 }$

As the chirp mass is constrained to a very small range we fix its value to M_chirp = 1.186 M_⊙ such that we can write a conditional distribution

$\begin{eqnarray}&&p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q| {{\boldsymbol{d}}}_{\mathrm{GW}},{M}_{\mathrm{chirp}}),\end{eqnarray} \tag{ B3 }$

which is a slice through the posterior distribution. We use, however, the marginalized posterior distribution $p({{\rm{\Lambda }}}_{1},{{\rm{\Lambda }}}_{2},q| {{\boldsymbol{d}}}_{\mathrm{GW}})$ in this Letter, which we compare to the conditional distribution in the left panel of Figure B1.

Figure B1. Refer to the following caption and surrounding text. — **Figure B1.** Left panel: comparison of the distribution of the mass ratio q when we use the posterior distribution marginalized over the chirp mass (green) or the posterior distribution conditional on a given chirp mass (blue). Both distributions are very similar, which is why we argue that the usage of the marginalized distribution throughout the Letter is justified. Right panel: prior distribution on the mass ratio q in the black dashed lines, the posterior distribution on q in green, and the reweighted posterior that corresponds to a flat prior in q in blue. Note that the prior distribution on q is only for a small interval in chirp mass where there is posterior support, which is why there is more support toward lower- than higher-mass ratios. We conclude that there is slightly more posterior support for equal mass ratios when reweighting but this is negligible compared to the statistical uncertainty in GW170817.
Download figure:
Standard image High-resolution image

We furthermore note that when transforming from component masses to mass ratio and chirp mass the prior on these quantities is no longer jointly flat. We therefore check that the posterior distribution on q does not change significantly when applying a weighting to the distribution to ensure a jointly flat prior. We show the difference between the distributions in the right panel of Figure B1.

B.2. The NS–BH Scenario

Finally we have approximated the likelihood function for the NS–BH case by transforming Λ₁ and Λ₂ to the parameter $\tilde{{\rm{\Lambda }}}$ and marginalizing over Λ₁, which is different from taking a slice through the posterior at Λ₁ = 0. However, to compute the constraints from the EM counterpart the latter is not possible since there are not enough samples in the posterior that have Λ₁ = 0. As an alternative we approximate Λ₁ = 0 by taking the samples in the distribution where Λ₁ < 30, such that we are not dominated by stochastic noise of having too few samples. In Figure B2 we compare the two approaches when we set the posterior support for parameter sets that result in M_rem < 0.1 M_⊙ to zero, while setting uniform support everywhere else. Including these constraints from the EM counterpart, we conclude that there is a negligible change in the posterior distribution of $\tilde{{\rm{\Lambda }}}$ .