Detecting the Undetected: Overcoming Biases in Gravitational-Wave Population Studies

In the flourishing field of gravitational-wave astronomy, accurately inferring binary black hole merger formation channels is paramount. The Bayesian hierarchical model selection analysis offers a promising methodology (see, e.g., One Channel to Rule Them All, Zevin et al. 2021). However, recently, Cheng et al. (2023) highlighted a critical caveat: observed channels absent in known models can bias branching fraction estimates. In this research note, we introduce a test to detect missing channels in such analyses. Our findings show a commendable success rate in identifying these elusive channels. Yet, in scenarios where missing channels closely overlap with recognized ones, discerning the difference remains challenging.


INTRODUCTION
Gravitational-wave (GW) observations of binary black hole (BBH) mergers offer an invaluable benchmark for population synthesis studies, which aim to unravel the formation mechanisms leading to these exotic objects. By utilizing binary population synthesis tools like POSYDON (Fragos et al. 2023), researchers can predict the observable properties of merging BBHs. Different formation scenarios imprint different signatures in the intrinsic distributions of the observable BBH properties, such as their chirp mass, mass ratio, effective spin parameter, and merger redshift.
The discovery of merging compact binaries with GWs has led to a number of studies that attempt to infer the branching fractions of relevant formation channels (e.g., Stevenson et al. 2015; Zevin et al. 2017; Farr et al. 2018). More recently, Zevin et al. (2021) expanded this methodology to include five formation channels modeled self-consistently. Nonetheless, the recent study of Cheng et al. (2023) cautions that branching fraction estimates can be substantially biased if an unaccounted-for formation channel contributes to the observed data.
This research note introduces a test to detect the presence of an unknown, unaccounted-for formation channel in the Bayesian hierarchical model selection analysis. In the next section, we showcase the test procedure using as a reference the five formation channels used in the analyses of Zevin et al. (2021) and Cheng et al. (2023).

PROCEDURE AND RESULTS
Our objective is to retrieve branching fractions from observed merging BBH populations. This entails expanding the observed distribution in a set of known population models. Because these models form a non-orthonormal basis, the expected coefficients, i.e., the branching fractions, are determined via a system of linear equations. We denote the observed population, $P_O$, as a normalized probability density function (PDF) composed of formation channels $p_i$ for $i = 1 \ldots N$, with corresponding branching fractions $b_i$, i.e.,

$$P_O(\vec{x}) = \sum_{i=1}^{N} b_i \, p_i(\vec{x}).$$

Using the observed population and the modeled formation channels, we can compute the dot product of the observed population with a given formation channel model, $R_j$, and the dot products of the different modeled formation channels, $M_{ij}$, as defined below. Here, all PDFs are defined within an $n$-dimensional space, $\vec{x} \in \mathbb{R}^n$. We have

$$R_j = \sum_{i=1}^{N} b_i \, M_{ij},$$

where, for convenience, we define

$$R_j \equiv \int P_O(\vec{x}) \, p_j(\vec{x}) \, d^n x, \qquad M_{ij} \equiv \int p_i(\vec{x}) \, p_j(\vec{x}) \, d^n x.$$

The distributions $P_O$ and $p_i$ can be derived by fitting PDFs to the observed BBH population and the Monte Carlo-sampled formation channel models through methods like kernel density estimation (KDE), normalizing flows, etc. Once $M_{ij}$ and $R_j$ are known, one can solve the system of linear equations for the branching fractions $b_i$. In the case of non-degenerate formation channels $p_i$, the expansion of $P_O$ in terms of $b_i \, p_i(\vec{x})$ is unique in the branching fractions $b_i$. However, when the formation channels are highly degenerate, the solutions become non-unique, and the algorithm can struggle to recognize whether a degenerate formation channel is missing.
arXiv:2310.10736v1 [astro-ph.HE] 16 Oct 2023
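The linear system described above can be illustrated with a minimal sketch. It is not the released analysis code: it assumes one-dimensional toy Gaussian channels tabulated on a uniform grid instead of KDE fits to four-dimensional BBH samples, and the names `gaussian_pdf` and `solve_branching_fractions` are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normalized 1D Gaussian, used here as a toy stand-in for a channel PDF."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def solve_branching_fractions(observed_pdf, channel_pdfs, dx):
    """Solve R_j = sum_i b_i M_ij for b, with all PDFs tabulated on a uniform grid."""
    pdfs = np.asarray(channel_pdfs)
    M = pdfs @ pdfs.T * dx        # M_ij: dot products between channel models
    R = pdfs @ observed_pdf * dx  # R_j: dot products of the observed PDF with each model
    return np.linalg.solve(M, R)  # fails if the channels are exactly degenerate

# Toy setup (assumption): three well-separated channels in one dimension.
x = np.linspace(-20.0, 40.0, 6001)
dx = x[1] - x[0]
channels = [gaussian_pdf(x, 0.0, 2.0), gaussian_pdf(x, 8.0, 3.0), gaussian_pdf(x, 20.0, 2.5)]
true_b = np.array([0.5, 0.3, 0.2])
observed = true_b @ np.asarray(channels)  # P_O = sum_i b_i p_i

b_est = solve_branching_fractions(observed, channels, dx)
print(b_est)
```

When every channel contributing to the observed PDF is included in the model set and the channels are not degenerate, the recovered coefficients match the input fractions up to numerical precision. As two channels become more similar, $M_{ij}$ approaches a singular matrix and the solution becomes ill-conditioned, which mirrors the non-uniqueness noted above.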
To validate our approach, we employed models for the formation of merging BBHs that include different formation channels from isolated binary evolution and dynamical formation, as released by Zevin et al. (2021). These channels are: common envelope (CE) and stable mass transfer (SMT; Bavera et al. 2021), chemically homogeneous evolution (CHE; du Buisson et al. 2020), nuclear star cluster (NSC; Antonini et al. 2019), and globular cluster (GC; Rodriguez et al. 2019). To illustrate how model uncertainties are treated, for the CE channel we consider five different values for the efficiency of CE ejection, $\alpha_{\rm CE} \in \{0.2, 0.5, 1, 2, 5\}$, while for all models we consider four different values for the birth spin of isolated black holes as a proxy for the efficiency of angular momentum transport in the interior of the black hole progenitor stars, namely $\chi \in \{0.0, 0.1, 0.2, 0.5\}$. These parameters are an arbitrary subset of a larger set of model uncertainties and reflect physical uncertainties in stellar and binary evolution simulations.
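For orientation, the model grid described above can be enumerated as follows. This is only a bookkeeping sketch: the variable names are hypothetical, and pairing every parameter set with every channel as the omitted one is an assumption about how the test cases are organized.

```python
from itertools import product

# Values quoted in the text; the variable names are illustrative assumptions.
ALPHA_CE_VALUES = (0.2, 0.5, 1.0, 2.0, 5.0)  # CE ejection efficiency (affects the CE channel only)
CHI_BIRTH_VALUES = (0.0, 0.1, 0.2, 0.5)      # birth spin of isolated black holes
CHANNELS = ("CE", "SMT", "CHE", "NSC", "GC")

# Every (alpha_CE, chi) combination defines one parameter set.
parameter_sets = [
    {"alpha_CE": alpha, "chi": chi}
    for alpha, chi in product(ALPHA_CE_VALUES, CHI_BIRTH_VALUES)
]

# Assumption: each channel is treated in turn as the omitted one for each parameter set.
test_cases = [(params, missing) for params in parameter_sets for missing in CHANNELS]
print(len(parameter_sets), len(test_cases))
```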
For each parameter set, we conducted five simulation rounds. In each round, we generated a sample of $10^5$ BBHs as our observed population by drawing from all channels, while omitting one channel from the set of known models. This is done in the detectable space, with the detection effects accounted for. The true branching fraction $b_u$ of the excluded channel ranged between 0 and 1, and the remaining channels had equal branching fractions summing to $1 - b_u$. The aim of our test is to detect discrepancies between the known channels and those actually present in the data. It must be noted that the analysis presented here uses a large sample size to obtain an accurate measurement of the orthogonality of the distributions. This implies that the method will only be of use with next-generation gravitational-wave detectors, which will provide a much greater number of detections than available at the time of writing. The need for large sample sizes comes from the difficulty of computationally fitting a PDF to the observed data. Alternatively, improved techniques for fitting a PDF, rather than the KDE method used here, might reduce the number of samples required for accurate results. Finally, the analysis presented here does not include measurement uncertainties in the mock observed population. A careful treatment of measurement uncertainties in the illustrated test is left for future work.
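A minimal sketch of how such a mock observed population could be assembled is given below. It assumes toy Gaussian samples in four dimensions in place of the Zevin et al. (2021) channel models, and it ignores detection weighting. The function names are hypothetical.

```python
import numpy as np

def mixture_fractions(n_channels, missing_index, b_u):
    """b_u for the omitted channel, equal shares of 1 - b_u for the others."""
    fractions = np.full(n_channels, (1.0 - b_u) / (n_channels - 1))
    fractions[missing_index] = b_u
    return fractions

def draw_mock_population(channel_samples, fractions, n_total, rng):
    """Draw n_total systems from the channels according to the branching fractions."""
    counts = rng.multinomial(n_total, fractions)  # number of systems per channel
    draws = [
        samples[rng.integers(0, len(samples), size=k)]  # resample with replacement
        for samples, k in zip(channel_samples, counts)
    ]
    return np.concatenate(draws), counts

rng = np.random.default_rng(seed=0)
channel_names = ["CE", "SMT", "CHE", "NSC", "GC"]
# Toy samples (assumption): columns stand in for chirp mass, mass ratio,
# effective spin, and redshift; real inputs would be the released channel models.
channel_samples = [
    rng.normal(loc=5.0 * k, scale=1.0, size=(5000, 4)) for k in range(len(channel_names))
]

missing = channel_names.index("GC")
fractions = mixture_fractions(len(channel_names), missing, b_u=0.3)
population, counts = draw_mock_population(channel_samples, fractions, 100_000, rng)

# The omitted channel contributes to the data but is withheld from the known models.
known_samples = [s for k, s in enumerate(channel_samples) if k != missing]
```

The omitted channel still contributes to `population`. It is only excluded from `known_samples`, the set of models subsequently used to infer the branching fractions.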
We present the test results in Figure 1. The dashed black curve serves as our baseline, denoting no bias; i.e., an unknown channel does not skew the inferred branching fractions. This benchmark is predominantly recovered when the CE or CHE channel is absent, implying that these channels exhibit distinct signatures that set them apart from the others, e.g., their characteristic effective spin parameter distributions. Conversely, in the other instances, specifically the scenarios without NSC or GC, the curves fluctuate between the no-bias line ($\sum_i e_i = 1 - \beta$) and the fully biased line ($\sum_i e_i = 1$), where $e_i$ are the estimated branching fractions of the known channels and $\beta$ is the true branching fraction of the unknown channel. While this shows that certain unseen channels can inflate the estimated fractions, the curves do not reach complete bias, suggesting that our method can still detect the missing channels. However, it might underestimate the branching fraction of the unknown channel or overestimate those of the known ones. Exceptionally, in a few cases, such as the $\alpha_{\rm CE} = 0.2$ and $\chi = 0$ scenario without GC, the method falters, barely diverging from the fully biased curve ($\sum_i e_i = 1$). This anomaly can be attributed to degeneracies between formation channels that predict similar distributions of the considered BBH observational properties: chirp mass, mass ratio, effective spin parameter, and merger redshift. Hence, the procedure presented here can be used as a test to reject the results of a Bayesian hierarchical model selection analysis, but not to confirm them.
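The two limiting behaviours discussed above can be reproduced in a toy setting. The sketch below assumes one-dimensional Gaussian channels on a grid (not the actual channel models), with a hidden channel that is either well separated from, or nearly degenerate with, a known channel. The function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normalized 1D Gaussian, a toy stand-in for a formation channel PDF."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def sum_of_estimated_fractions(observed_pdf, known_pdfs, dx):
    """Fit the observed PDF with the known channels only and return sum_i e_i."""
    pdfs = np.asarray(known_pdfs)
    M = pdfs @ pdfs.T * dx
    R = pdfs @ observed_pdf * dx
    return float(np.linalg.solve(M, R).sum())

x = np.linspace(-40.0, 80.0, 12001)
dx = x[1] - x[0]
known = [gaussian_pdf(x, 0.0, 2.0), gaussian_pdf(x, 15.0, 2.0)]
beta = 0.3  # true branching fraction of the hidden channel

def observed_with_hidden(hidden_pdf):
    """Observed PDF: hidden channel with weight beta, known channels share 1 - beta equally."""
    return beta * hidden_pdf + 0.5 * (1.0 - beta) * (known[0] + known[1])

# Hidden channel far from the known ones: the sum falls to about 1 - beta (no bias).
s_distinct = sum_of_estimated_fractions(observed_with_hidden(gaussian_pdf(x, 50.0, 2.0)), known, dx)
# Hidden channel almost identical to a known one: the sum stays close to 1 (full bias).
s_degenerate = sum_of_estimated_fractions(observed_with_hidden(gaussian_pdf(x, 15.2, 2.0)), known, dx)

print(s_distinct, s_degenerate)
print("inferred missing fraction:", 1.0 - s_distinct, 1.0 - s_degenerate)
```

In this toy case a sum clearly below unity flags a missing channel, whereas a sum close to unity is inconclusive: it is compatible both with a complete model set and with a missing channel that is degenerate with a known one.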

CONCLUSION
We introduced a procedure to assess whether branching fraction estimates from studies of observed gravitational-wave populations might be skewed by unaccounted-for formation channels. When no degeneracies exist between channels, the sum of the branching fractions of the known populations falls short of unity, which points to potentially overlooked channels. Current methodologies (e.g., Zevin et al. 2021) assume comprehensive knowledge of the formation channels, risking biases. The present research note proposes a test to detect the presence of such bias.
A thorough interpretation of branching fractions from hierarchical Bayesian model selection remains crucial, ensuring that astrophysical insights are accurate and not artifacts of a biased analysis.

CODE AND DATA RELEASE STATEMENT
All necessary code to reproduce this work can be found in: https://github.com/rraikman/popsynth-dotproducts. This study made use of the BBH formation channel models released by Zevin et al. (2021) on https://zenodo.org/record/4947741.

ACKNOWLEDGEMENTS

Figure 1. Biasing plot: ability of our method to measure the bias caused by removing a certain formation channel from the set of known distributions. Across different values of $\alpha_{\rm CE}$ and $\chi$, we show the sum of estimated branching fractions from the known channels as a function of the branching fraction of the unknown channel. The optimal case is the sloped dashed black line, where the branching fraction of the unknown channel is perfectly predicted. The flat dotted black line represents no knowledge of the removed formation channels, and an overestimation of the prevalence of the known channels.