Constraining the Properties of Black Hole Seeds from the Farthest Quasars

Over 60 yr after the discovery of the first quasar, more than 275 such sources have been identified in the epoch of reionization at z > 6. JWST is now exploring higher redshifts (z ≳ 8) and lower-mass ($\lesssim 10^7\,M_\odot$) ranges. The discovery of progressively farther quasars is instrumental to constraining the properties of the first population of black holes (BHs), or BH seeds, formed at z ∼ 20–30. For the first time, we use Bayesian analysis of the most comprehensive catalog of quasars at z > 6 to constrain the distribution of BH seeds. We show that the mass distribution of BH seeds can be effectively described by combining a power law and a lognormal function tailored to the mass ranges associated with light and heavy seeds, assuming Eddington-limited growth and early seeding time. Our analysis reveals a power-law slope of $-0.70^{+0.46}_{-0.46}$ and a lognormal mean of $4.44^{+0.30}_{-0.30}$. The inferred values of the Eddington ratio, the duty cycle, and the mean radiative efficiency are $0.82^{+0.10}_{-0.10}$, $0.66^{+0.23}_{-0.23}$, and $0.06^{+0.02}_{-0.02}$, respectively. Models that solely incorporate a power law or a lognormal distribution within the specific mass range corresponding to light and heavy seeds are statistically strongly disfavored, unlike models not restricted to this specific range. Our results suggest that including both components is necessary to comprehensively account for the masses of high-redshift quasars, and that both light and heavy seeds formed in the early Universe and grew to form the population of quasars we observe.

1. INTRODUCTION
Sixty years ago, the first quasar, cataloged as 3C 273, was identified at z = 0.158 (Schmidt 1963). At the time of discovery, this source was the farthest ever observed and led to a race to identify the mechanism responsible for such an efficient transformation of matter into energy (see, e.g., Salpeter 1964). Since then, quasar discoveries have exploded in number and consistently broken distance records. With the Sloan Digital Sky Survey (SDSS), quasars were discovered at z ∼ 5–6 (Fan et al. 1999, 2001), and then well into the reionization epoch (Fan et al. 2006).
Nowadays, we routinely detect quasars at z ≳ 7 (Wang et al. 2021). According to a recent and comprehensive review, we have detected 275 quasars at z > 6 and 8 at z > 7 (Fan et al. 2022). To date, the farthest supermassive black hole (SMBH) ever detected, GN-z11, was discovered in the JADES survey by the James Webb Space Telescope (JWST) at z = 10.6 (Maiolino et al. 2023a), or only 440 Myr after the Big Bang, assuming Planck cosmological parameters (Planck Collaboration et al. 2020). Note that this SMBH, while being significantly farther than previous record-holders, is characterized by a mass of only $\log_{10}(M_\bullet/M_\odot) = 6.2 \pm 0.3$, offering for the first time a glimpse of the lower end of the SMBH distribution at such redshifts.
Upcoming surveys with new observational facilities (e.g., Euclid, the Nancy Grace Roman Space Telescope) will extend the reach of our searches and detect even farther quasars (Tee et al. 2023). Current forecasts predict that the first data release of the near-infrared telescope Euclid will lead to 13–25 new quasar discoveries in the redshift range 7 < z < 9 (Euclid Collaboration et al. 2019). These forecasts are based on linear extrapolations to higher redshifts of the rate of decrease of the spatial density of quasars currently detected. Based on current estimates of the rate of decline, Fan et al. (2019) and Wang et al. (2019b) predict that the farthest quasar with a mass $M_\bullet > 10^9\,M_\odot$ should be detected in the redshift range 9 < z < 12.
By pushing the frontier of the farthest quasar detected, we automatically gain insights into the properties of the first population of BHs, or BH seeds (Pacucci & Loeb 2022). As BH seeds should be formed in the redshift range z ∼ 20–30 (Barkana & Loeb 2001), detections of higher-redshift quasars shorten the time between formation and observation, thus significantly shrinking the uncertainty associated with the size of the parameter space that describes the properties of the first BHs.
In this study, we expand on the work by Pacucci & Loeb (2022) by using Bayesian analysis, as well as a far more comprehensive catalog of quasars at z > 6, to derive the parameters of the distribution of BH seeds. For the first time, we place statistical constraints on the population of BH seeds that accreted gas to form the quasars we observe at redshift z > 5, which carries crucial insights into the formation of the first BHs in the Universe.
This paper is organized as follows. In Section 2, we discuss our statistical framework to constrain the seed population of BHs. In Section 3, we present our results. Finally, in Section 4, we discuss the implications of our results and draw our conclusions.

2. METHODS
In this Section, we outline our method to derive statistical constraints on the population of BH seeds.
We use the recently published catalog (Fan et al. 2022) of 113 quasars at z > 5.9 with robust BH mass estimates from the Mg II line. Their mass and redshift distributions are displayed in Figure 1. The Supplementary Material section of Fan et al. (2022) contains a table with detailed information on all the quasars included. The highest-redshift sample (i.e., z > 7) is derived from Mortlock et al. (2011), Wang et al. (2018), Bañados et al. (2018), Matsuoka et al. (2019), Yang et al. (2019, 2020), and Wang et al. (2021), and was selected using a variety of ground-based and space-based facilities.
In the catalog, no quasars are detected with an absolute magnitude at λ = 1450 Å fainter than $M_{\rm thr} = -24.4$, which we adopt as our detection threshold to model observational completeness. We relate the average absolute magnitude at λ = 1450 Å to the black hole mass $M_\bullet$ and the Eddington ratio $f_{\rm edd}$ (defined as the ratio between the actual mean accretion rate and the mean Eddington accretion rate) via a bolometric correction $C_{1450} = 4.2$ (Runnoe et al. 2012), defined through $L_{\rm bol} = C_{1450}\,\lambda L_\lambda$. Hence, in the AB magnitude system, we derive a relation that expresses the absolute magnitude at 1450 Å as a function of $M_\bullet$ and $f_{\rm edd}$. We note that the bolometric correction indicates the average ratio between the bolometric luminosity and the in-band luminosity of a specific class of objects. Of course, the effective ratio for individual members of that class varies.
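As an illustration of how such a magnitude relation could be evaluated, the sketch below converts a BH mass and an Eddington ratio into an AB absolute magnitude at 1450 Å. It assumes the standard Eddington luminosity for ionized hydrogen ($L_{\rm Edd} \approx 1.26 \times 10^{38}\,(M_\bullet/M_\odot)$ erg s$^{-1}$), $L_{\rm bol} = f_{\rm edd} L_{\rm Edd}$, and the textbook AB zero point of −48.6. The function names and constants are illustrative choices and not necessarily the exact form adopted in this work.

```python
import math

L_EDD_PER_MSUN = 1.26e38      # Eddington luminosity per solar mass [erg/s] (assumed)
C_LIGHT_CM_S = 2.99792458e10  # speed of light [cm/s]
PC_CM = 3.0857e18             # one parsec [cm]
C_1450 = 4.2                  # bolometric correction (Runnoe et al. 2012)


def abs_mag_1450(m_bh_msun, f_edd, c_1450=C_1450):
    """AB absolute magnitude at 1450 A for a BH of mass m_bh_msun [Msun]."""
    l_bol = f_edd * L_EDD_PER_MSUN * m_bh_msun   # bolometric luminosity [erg/s]
    nu = C_LIGHT_CM_S / 1450e-8                  # frequency at 1450 A [Hz]
    l_nu = l_bol / (c_1450 * nu)                 # uses lambda*L_lambda = nu*L_nu
    f_nu = l_nu / (4.0 * math.pi * (10.0 * PC_CM) ** 2)  # flux density at 10 pc
    return -2.5 * math.log10(f_nu) - 48.6


def is_detectable(m_bh_msun, f_edd, m_thr=-24.4):
    """True if the source is brighter (more negative) than the threshold."""
    return abs_mag_1450(m_bh_msun, f_edd) < m_thr
```

Under these assumptions, a $10^9\,M_\odot$ BH accreting at the Eddington rate has an absolute magnitude of about −26.3, brighter than the adopted threshold, while a $10^7\,M_\odot$ BH at the same Eddington ratio falls below it.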
The SMBH mass at a cosmic time $t(z)$ (where $z$ is the detection redshift) is related to the initial BH seed mass, $m_{\rm seed}$, through the growth relation of Eq. 2, in which $t_{\rm seed} = 130$ Myr is the assumed formation time (z = 25; see, e.g., Barkana & Loeb 2001), $t_{\rm edd} = 450$ Myr is the growth time scale (see, e.g., Pacucci & Loeb 2022), $D$ is the duty cycle (the fraction of time that the BH has accreted since its formation time), and $\epsilon$ is the mean radiative efficiency factor over that time.
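A minimal sketch of such a growth calculation follows. It assumes the standard Eddington-limited exponential growth law, $M_\bullet = m_{\rm seed} \exp[f_{\rm edd} D \,(1-\epsilon)/\epsilon \; (t(z) - t_{\rm seed})/t_{\rm edd}]$, of the kind used in Pacucci & Loeb (2022); the $(1-\epsilon)/\epsilon$ factor is an assumption about the exact form. The cosmic time is computed with the analytic age of a flat ΛCDM universe, neglecting radiation, with Planck-like values $H_0 = 67.4$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.315$ (assumed here). Function names are hypothetical.

```python
import math

T_SEED_MYR = 130.0  # assumed seeding time (z = 25) [Myr]
T_EDD_MYR = 450.0   # growth time scale [Myr]


def cosmic_age_myr(z, h0=67.4, omega_m=0.315):
    """Age of a flat LambdaCDM universe at redshift z [Myr], radiation neglected."""
    omega_l = 1.0 - omega_m
    hubble_time_myr = 977.8e3 / h0  # 1/H0 in Myr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time_myr * math.asinh(x)


def final_mass(m_seed, z, f_edd, duty, eps):
    """BH mass at redshift z grown from m_seed (assumed exponential growth law)."""
    dt = cosmic_age_myr(z) - T_SEED_MYR           # time available for growth [Myr]
    exponent = f_edd * duty * (1.0 - eps) / eps * dt / T_EDD_MYR
    return m_seed * math.exp(exponent)
```

With these inputs, `cosmic_age_myr(25)` returns roughly 130 Myr, consistent with the adopted $t_{\rm seed}$. Because $f_{\rm edd}$ and $D$ enter only through their product, swapping their values leaves the final mass unchanged.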
Note that $f_{\rm edd}$ and $D$ are degenerate.

Given our set of data of observed absolute magnitudes, $\boldsymbol{d} = \boldsymbol{M}_{\rm obs}$, the posterior probability on the parameters $\boldsymbol{\lambda}$ that describe our model is given by Eq. 3. There, $\pi(\boldsymbol{\lambda})$ is the prior on the parameters $\boldsymbol{\lambda}$ describing the population model $p_{\rm pop}(\boldsymbol{\theta}|\boldsymbol{\lambda})$, $\mathcal{L}(d_i|\boldsymbol{\theta})$ is the likelihood of observing the data set given the population properties, and $P_{\rm det}(\boldsymbol{\theta})$ is the fraction of events in the Universe that would be detected for a particular population model, characterized by the population parameters $\boldsymbol{\lambda}$. In our analysis, we have $\boldsymbol{\lambda} = \{f_{\rm edd}, \boldsymbol{\Upsilon}, \boldsymbol{\Xi}\}$, where $\boldsymbol{\Upsilon} = \{D, \epsilon\}$ and $\boldsymbol{\Xi}$ is the set of parameters describing the shape of the distribution of $m_{\rm seed}$.

In our model for the likelihood of observing the data set given the population properties, $\sigma_i$ represents the uncertainty in the measurement, whose upper limit we conservatively take to be 10%, to agree with the uncertainty reported in Fan et al. (2022). This error accounts for uncertainties in the determination of the absolute magnitude of the source, which is derived from the apparent magnitude, estimated with great accuracy, and from the redshift; in the case of Mg II line redshift measurements, typical estimates are well within 0.5% of the true value.

The fraction of events in the Universe that would be detected for a particular population model, characterized by the population parameters, represents the number of events that would pass a threshold and, therefore, the completeness of the observed sample. In our case, any SMBH with an absolute magnitude larger (i.e., fainter) than $M_{\rm thr}$ is undetected.
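For orientation, a hierarchical population posterior with selection effects is commonly written in the textbook form below. This is a sketch consistent with the quantities defined above, not necessarily the exact expressions adopted here: the Gaussian form of the likelihood, the symbol $M_{1450}(\boldsymbol{\theta})$ for the model-predicted magnitude, and the step-function detection probability are assumptions.

```latex
\begin{align}
p(\boldsymbol{\lambda}\,|\,\boldsymbol{d}) &\propto \pi(\boldsymbol{\lambda})
  \prod_{i=1}^{N_{\rm obs}}
  \frac{\int \mathcal{L}(d_i\,|\,\boldsymbol{\theta})\,
        p_{\rm pop}(\boldsymbol{\theta}\,|\,\boldsymbol{\lambda})\,
        {\rm d}\boldsymbol{\theta}}
       {\int P_{\rm det}(\boldsymbol{\theta})\,
        p_{\rm pop}(\boldsymbol{\theta}\,|\,\boldsymbol{\lambda})\,
        {\rm d}\boldsymbol{\theta}}, \\
\mathcal{L}(d_i\,|\,\boldsymbol{\theta}) &\propto
  \exp\!\left[-\frac{\left(M_{{\rm obs},i}
        - M_{1450}(\boldsymbol{\theta})\right)^2}{2\sigma_i^2}\right], \\
P_{\rm det}(\boldsymbol{\theta}) &=
  \begin{cases}
    1, & M_{1450}(\boldsymbol{\theta}) \le M_{\rm thr}, \\
    0, & \text{otherwise}.
  \end{cases}
\end{align}
```

In this form, the denominator of the first expression plays the role of the sample completeness: it down-weights population models that predict many sources fainter than the threshold.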
Finally, the population model is taken to be the mass distribution of BH seeds, whose characteristics are described by the set of parameters $\boldsymbol{\Xi}$.
Since BH seeds come in two flavors (see Section 1), we take the mass distribution to be described by the sum of two components ("Power Law + Lognormal", PLN), which are reminiscent of the theoretical distributions of light seeds ("Power Law", PL) and heavy seeds ("Lognormal", LN), as given in Eq. 7. There, $\boldsymbol{\Xi} = \{\boldsymbol{\Xi}_{\rm PL}, \boldsymbol{\Xi}_{\rm LN}, f_1\}$, $H$ is the Heaviside function, $\{m^{\rm min}_{\rm seed,PL}, m^{\rm max}_{\rm seed,PL}\}$ are the minimum and maximum seed masses for the PL distribution, and $\{m^{\rm min}_{\rm seed,LN}, m^{\rm max}_{\rm seed,LN}\}$ are the minimum and maximum seed masses for the LN distribution, which can have values in the range $[10, 10^5]\,M_\odot$. The PL and LN components are parameterized by $\boldsymbol{\Xi}_{\rm PL} = \{\alpha\}$ and $\boldsymbol{\Xi}_{\rm LN} = \{\mu, \Sigma\}$, respectively.
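The two-component distribution can be sketched numerically as follows. This is an illustration under stated assumptions rather than the exact parameterization of this work: the power law is taken as $dN/dm \propto m^{\alpha}$, the lognormal as a Gaussian in $\log_{10} m$ with mean $\mu$ and width $\Sigma$, each component is normalized over its own truncated range, and $f_1$ is assumed to weight the power-law component. All function names are hypothetical.

```python
import math


def pl_pdf(m, alpha, m_min=10.0, m_max=1e3):
    """Truncated power law, dN/dm ∝ m**alpha, normalized on [m_min, m_max]."""
    if not (m_min <= m <= m_max):  # Heaviside cut-offs
        return 0.0
    if abs(alpha + 1.0) < 1e-12:
        norm = math.log(m_max / m_min)
    else:
        norm = (m_max ** (alpha + 1.0) - m_min ** (alpha + 1.0)) / (alpha + 1.0)
    return m ** alpha / norm


def _phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ln_pdf(m, mu, sigma, m_min=1e3, m_max=1e5):
    """Truncated lognormal: Gaussian in log10(m), expressed as a density in m."""
    if not (m_min <= m <= m_max):  # Heaviside cut-offs
        return 0.0
    x = math.log10(m)
    gauss = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    trunc = _phi((math.log10(m_max) - mu) / sigma) - _phi((math.log10(m_min) - mu) / sigma)
    return gauss / (m * math.log(10.0) * trunc)


def pln_pdf(m, f1, alpha, mu, sigma):
    """Mixture: fraction f1 in the power law, (1 - f1) in the lognormal."""
    return f1 * pl_pdf(m, alpha) + (1.0 - f1) * ln_pdf(m, mu, sigma)


def integral_over_mass(pdf, m_lo, m_hi, n=20000):
    """Midpoint-rule integral of pdf(m) dm, evaluated on a log10(m) grid."""
    a, b = math.log10(m_lo), math.log10(m_hi)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        m = 10.0 ** (a + (i + 0.5) * h)
        total += pdf(m) * m * math.log(10.0) * h
    return total
```

Normalizing each component separately means that $f_1$ can be read directly as the fraction of seeds drawn from the light-seed branch, which is the interpretation assumed in this sketch.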
In our analysis, we use the following priors: uniform in the range [0, 1] for $f_{\rm edd}$ (i.e., we do not allow mean accretion rates that are super-Eddington), uniform in the range [0, 1] for $D$, and uniform in the range [0.01, 1] for $\epsilon$. For the parameters that describe the BH mass function, we adopt uniform priors in the ranges [0, 1] for $f_1$, [−5, 0] for $\alpha$, [3, 5] for $\mu$, and [0, 3] for $\Sigma$.
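Nested samplers typically draw points in the unit hypercube and map them to physical parameters through a prior transform. The sketch below encodes the uniform priors listed above in that form; the parameter ordering and the names `PRIOR_BOUNDS` and `prior_transform` are illustrative choices, not taken from the analysis code.

```python
# Parameter order (an illustrative choice): f_edd, D, eps, f_1, alpha, mu, Sigma
PRIOR_BOUNDS = [
    (0.0, 1.0),    # f_edd: no super-Eddington mean accretion
    (0.0, 1.0),    # D: duty cycle
    (0.01, 1.0),   # eps: mean radiative efficiency
    (0.0, 1.0),    # f_1: mixture weight
    (-5.0, 0.0),   # alpha: power-law slope
    (3.0, 5.0),    # mu: lognormal mean
    (0.0, 3.0),    # Sigma: lognormal width
]


def prior_transform(u):
    """Map a point u in the unit hypercube to the physical parameters."""
    return [lo + (hi - lo) * ui for (lo, hi), ui in zip(PRIOR_BOUNDS, u)]
```

For a uniform prior the transform is a linear rescaling, so the center of the unit cube maps to the midpoint of every range (for instance, $\mu = 4$). In practice, returning a NumPy array instead of a list may be more convenient for the sampler.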

3. RESULTS
Using the data on the farthest quasars from Fan et al. (2022), we fit their masses as a function of the parameters that describe their accretion ($f_{\rm edd}$, $D$, $\epsilon$) and of the set of parameters $\boldsymbol{\Xi}$ that describe the mass distribution of BH seeds. We use the nested-sampling code nestle to maximize the log-likelihood of our model and to infer the confidence regions of our parameters. Notably, the nested-sampling algorithm also supplies us with the marginalized likelihood, which can be used for model selection.
For our primary model, we assume that the mass distribution of BH seeds is described by the sum of power-law and lognormal distributions, with $\{m^{\rm min}_{\rm seed,PL}, m^{\rm max}_{\rm seed,PL}\} = \{10, 10^3\}\,M_\odot$ and $\{m^{\rm min}_{\rm seed,LN}, m^{\rm max}_{\rm seed,LN}\} = \{10^3, 10^5\}\,M_\odot$. This distribution is likely the most accurate from a physical point of view, aligning with the typical predicted form of the distributions for light and heavy BH seeds within their respective mass categories (see, e.g., Volonteri 2010; Ferrara et al. 2014, and the discussion in Pacucci & Loeb 2022).
We show the hyper-posterior distribution of the parameters that describe the mass distributions of the BH seeds, along with the inferred values of the parameters that describe their growth, in Figure 2. The inferred values of the Eddington fraction, the duty cycle, and the mean radiative efficiency are $0.82^{+0.10}_{-0.10}$, $0.66^{+0.23}_{-0.23}$, and $0.06^{+0.02}_{-0.02}$, respectively. We find that the slope of the power-law portion of the mass distribution is $-0.70^{+0.46}_{-0.46}$, while the mean and variance of the lognormal portion are $4.44^{+0.30}_{-0.30}$ and $1.02^{+0.60}_{-0.60}$, respectively.

For comparison, we consider the case that the mass distribution of BH seeds is described by a simple power law (in this case, we fix $f_1 = 1$) over the whole mass range $[10, 10^5]\,M_\odot$. We also use another form of the distribution, a pure lognormal (in this case, we fix $f_1 = 0$) over the whole mass range $[10, 10^5]\,M_\odot$. Table 1 reports the parameters that describe the mass distributions of the BH seeds and the inferred values of the parameters that describe their growth. We observe that the Eddington fraction, the duty cycle, and the mean radiative efficiency remain consistent throughout our models.

Table 1. Summary of the results for various initial mass functions of BH seeds. The first column lists the models' names, followed by their parameters' inferred values. The last column reports the log evidence of the model.
Figure 3 shows the inferred mass distribution of BH seeds in the three cases described above. We observe that the PLN distribution exhibits a clear overlap with the LN distribution for masses greater than $\sim 10^3\,M_\odot$, while it transitions towards a power-law distribution for masses below $\sim 10^3\,M_\odot$, albeit with a gentler slope.
We also report the results for two more models, where we restrict the PL model to BH seed masses in the range $[10, 10^3]\,M_\odot$ and the LN model to BH seed masses in the range $[10^3, 10^5]\,M_\odot$, reminiscent of the typical mass ranges of light and heavy seeds discussed in the literature. These models prefer Eddington fractions close to unity, with the duty cycle and the mean radiative efficiency consistent with the other models.
In Table 1, we also report the log evidence (log Z) of each model, defined as the marginalized likelihood in Eq. 3. The log evidence serves as a tool for conducting a Bayesian ratio test to determine the preferable model: the model with the highest log evidence is favored. Given the values of their log evidence, the PL model restricted to the range $[10, 10^3]\,M_\odot$ and the LN model restricted to the range $[10^3, 10^5]\,M_\odot$ are strongly disfavored statistically with respect to our reference model, which represents the typical predicted form of the distributions for light and heavy BH seeds within their respective mass categories.
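The comparison described above amounts to taking differences of log evidences. A minimal sketch follows, assuming natural logarithms; the numerical values are placeholders chosen for illustration only and are not the entries of Table 1, and the function names are hypothetical.

```python
import math


def log_bayes_factor(log_z_a, log_z_b):
    """Natural-log Bayes factor of model A over model B."""
    return log_z_a - log_z_b


def preferred_model(log_evidences):
    """Return the name of the model with the highest log evidence."""
    return max(log_evidences, key=log_evidences.get)


# Placeholder values for illustration only; NOT the values of Table 1.
example = {"PLN": -100.0, "PL restricted": -110.0, "LN restricted": -112.0}
best = preferred_model(example)
delta = log_bayes_factor(example["PLN"], example["PL restricted"])
odds = math.exp(delta)  # evidence ratio implied by the log difference
```

Working with differences of log Z rather than with the evidences themselves avoids numerical underflow, since the evidences are typically very small numbers.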

4. DISCUSSION AND CONCLUSIONS
In this Letter, we developed a state-of-the-art Bayesian analysis to infer the properties of the mass distribution of the BH seeds that gave rise to the quasars we observe in the high-redshift Universe. By combining data on their redshift and mass, derived from the review by Fan et al. (2022), we have obtained accurate constraints on the properties of the seed population.
We have shown that the distribution of BH seeds' masses can be best characterized by the combination of a power law and a lognormal function within the mass intervals of $[10, 10^3]\,M_\odot$ and $[10^3, 10^5]\,M_\odot$, respectively. This combination corresponds appropriately to the light and heavy seeds. Our analysis yielded a power-law slope of $-0.70^{+0.46}_{-0.46}$ and a lognormal mean of $4.44^{+0.30}_{-0.30}$. Models that exclusively incorporate either a power law or a lognormal function within the respective mass ranges for light and heavy seeds are statistically strongly disfavored. This implies that both components are necessary to explain the mass distribution of high-redshift quasars.
Constraining the properties of the population of BH seeds, such as $f_{\rm edd}$, $D$, $\epsilon$, and the parameters describing the shape of the distribution of $m_{\rm seed}$, is crucial for a variety of astrophysical and cosmological applications. For example, most large-scale cosmological simulations, e.g., ASTRID (Ni et al. 2022; Bird et al. 2022) and IllustrisTNG (Springel et al. 2018; Pillepich et al. 2018; Nelson et al. 2019), include active galactic nuclei feedback, with significantly different seeding prescriptions. Recent studies have shown that seeding prescriptions profoundly impact the evolution of individual galaxies (e.g., Weinberger et al. 2017; Wang et al. 2019a).
The mass distributions of BH seeds are in the mass range of the elusive population of intermediate-mass BHs. The attributes of these distributions, along with their evolution across different redshifts, play a crucial role in influencing the rates and characteristics of gravitational wave detections using upcoming observatories. Both the Laser Interferometer Space Antenna (LISA; Amaro-Seoane et al. 2023), for BH masses $\gtrsim 10^3\,M_\odot$, and the third-generation, ground-based gravitational wave detectors, with an emphasis on low frequencies and thus on masses $\lesssim 10^3\,M_\odot$ (Cosmic Explorer, e.g., Reitze et al. 2019, and Einstein Telescope, e.g., Punturo et al. 2010), are predicted to detect this population of high-z sources systematically and to further constrain their properties (Pacucci & Loeb 2020; Chen et al. 2022; Fragione & Loeb 2023).
For the first time, our study provided a framework to utilize the complete knowledge of the population of currently detected quasars to infer the properties of BH seeds in the early Universe. As new, more distant quasars become known, especially with JWST (see, e.g., Larson et al. 2023; Maiolino et al. 2023c), our constraints will become more robust and more descriptive of this, thus far, elusive population of ancient BHs.

Figure 2. Hyper-posterior for the parameters describing the accretion of SMBHs (Eq. 2) and the mass distribution of BH seeds (Eq. 7) for our model "Power Law + Lognormal", representative of forming heavy seeds in the mass range $[10^3, 10^5]\,M_\odot$ following a lognormal distribution, and light seeds in the mass range $[10, 10^3]\,M_\odot$ following a power-law distribution.

Figure 3. Inferred mass distribution of BH seeds in the case of a "Power Law + Lognormal" (green), "Power Law" (black), and "Lognormal" (orange) function. The solid line corresponds to the median at each mass, while shaded bands denote 50 per cent and 90 per cent credible intervals.