Mapping the Growth of Supermassive Black Holes as a Function of Galaxy Stellar Mass and Redshift

The growth of supermassive black holes is strongly linked to their galaxies. It has been shown that the population mean black hole accretion rate ( BHAR¯ ) primarily correlates with the galaxy stellar mass (M ⋆) and redshift for the general galaxy population. This work aims to provide the best measurements of BHAR¯ as a function of M ⋆ and redshift over ranges of 109.5 < M ⋆ < 1012 M ⊙ and z < 4. We compile an unprecedentedly large sample with 8000 active galactic nuclei (AGNs) and 1.3 million normal galaxies from nine high-quality survey fields following a wedding cake design. We further develop a semiparametric Bayesian method that can reasonably estimate BHAR¯ and the corresponding uncertainties, even for sparsely populated regions in the parameter space. BHAR¯ is constrained by X-ray surveys sampling the AGN accretion power and UV-to-infrared multiwavelength surveys sampling the galaxy population. Our results can independently predict the X-ray luminosity function (XLF) from the galaxy stellar mass function (SMF), and the prediction is consistent with the observed XLF. We also try adding external constraints from the observed SMF and XLF. We further measure BHAR¯ for star-forming and quiescent galaxies and show that star-forming BHAR¯ is generally larger than or at least comparable to the quiescent BHAR¯ .


INTRODUCTION
Supermassive black holes (SMBHs) and galaxies appear to be fundamentally linked (e.g., Kormendy & Ho 2013).Especially, the SMBH mass (M BH ) and the galaxy bulge mass are found to be tightly correlated in the local universe, and the cosmic evolution of the SMBH accretion density and the star-formation density are broadly similar and both peak at z ≈ 2. Therefore, it is a fundamental question in extragalactic astronomy to understand how cosmic SMBH growth depends on galaxy properties.
SMBHs grow primarily through rapid accretion when they are observed as active galactic nuclei (AGNs); mergers are an additional growth mode.X-ray emission is known to be a good indicator of AGN activity because of its universality among AGNs, high penetrating power through obscuration, and low dilution from galaxy starlight (e.g.,  Alexander 2015; Brandt & Yang 2022).Therefore, X-ray surveys can be used to constrain the accretion distribution and the black-hole accretion rate (BHAR = dM BH /dt) of SMBHs (e.g., Aird et al. 2012Aird et al. , 2018;;Bongiorno et al. 2012Bongiorno et al. , 2016;;Georgakakis et al. 2017;Wang et al. 2017;Yang et al. 2017Yang et al. , 2018Yang et al. , 2019;;Ni et al. 2019Ni et al. , 2021a)).However, the duration of a galaxy within the AGN phase is relatively short, and AGNs also have strong variability (e.g., Hickox et al. 2014;Yang et al. 2016); thus, BHAR is often averaged over a large galaxy sample as an estimator of BHAR, the long-term population mean BHAR.
BHAR has been shown to be redshift-dependent and correlated with both stellar mass (M ⋆ ) and star-formation rate (SFR), while the M ⋆ dependence is more fundamental for the general galaxy population (e.g., Hickox et al. 2014;Yang et al. 2017Yang et al. , 2018)).In recent years, it has been found that BHAR is also strongly related to galaxy morphology (e.g., Ni et al. 2019Ni et al. , 2021a;;Yang et al. 2019;Aird et al. 2022), which may be more fundamental than the BHAR − M ⋆ re-lation.However, such morphological measurements are expensive to obtain and require superb image resolution from, e.g., the Hubble Space Telescope, which inevitably are limited to small sky areas and thus can only provide a limited sample size covering a limited parameter space.In contrast, M ⋆ and SFR are much more accessible.Notably, modern multi-wavelength photometric surveys have provided extensive photometric data, based on which M ⋆ and SFR can be estimated by fitting the corresponding spectral energy distributions (SEDs; e.g., Zou et al. 2022).Therefore, a well-measured BHAR − M ⋆ relation is still essential to link SMBHs and galaxies.
The latest version of BHAR as a function of (M ⋆ , z) was derived in Yang et al. (2018) using the data from the Cosmic Evolution Survey (COSMOS; Laigle et al. 2016;Weaver et al. 2022), Great Observatories Origins Deep Survey (GOODS)-S, and GOODS-N (Grogin et al. 2011;Koekemoer et al. 2011).Although the relation in Yang et al. (2018) has proved to be successful over the years, there are still some limitations.First, although COSMOS, GOODS-S, and GOODS-N are sufficiently deep to probe BHAR up to z ≈ 4, they cannot effectively sample the last half of cosmic time (z ≲ 0.8) because of their limited sky areas.Second, Yang et al. (2018) adopted strong assumptions when parametrically estimating BHAR, which may lead to underestimated BHAR uncertainties.Built upon Yang et al. (2018), this work aims to provide the best available map of BHAR(M ⋆ , z) with the currently best data and statistical methodology.Now is indeed the right time to re-measure BHAR given the fact that the past five years have witnessed the completion of several large, sensitive X-ray surveys in fields together with deep optical-to-IR surveys (e.g., Chen et al. 2018;Ni et al. 2021b).These new X-ray surveys, when combined with previous ones, can return a large AGN sample over ten times larger than previous ones, as will be discussed in Section 2. In this work, we compile an unprecedentedly large sample from nine well-studied survey fields, many of which were finished after Yang et al. (2018) and even within ≲ 2 yrs before this work.Our adopted surveys follow a weddingcake design and contain both deep, pencil-beam and shallow, wide ones, allowing us to effectively explore a wide range of parameter space.We further develop a semiparametric Bayesian approach that can return reasonable estimations and uncertainties, even for sparsely populated regions in the parameter space.
This work is structured as follows.Section 2 describes the data.Section 3 presents our methodology and BHAR measurements.In Section 4, we discuss the implications of our results.Section 5 summarizes this work.We adopt a flat ΛCDM cosmology with H 0 = 70 km s −1 Mpc −1 , Ω Λ = 0.7, and Ω M = 0.3.

DATA AND SAMPLE
We use the data within the Cosmic Assembly Near-Infrared Deep Extragalactic Legacy Survey (CANDELS) fields, four of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) Deep-Drilling Fields (DDFs), and eROSITA Final Equatorial Depth Survey (eFEDS) field.CANDELS and the LSST DDFs both contain several distinct fields, and we put those individual fields sharing similar areas and depths under the same umbrella ("CANDELS" or "LSST DDFs") for convenience.Our adopted fields have X-ray coverage to provide AGN information and quality multiwavelength data, which are essential for measuring galaxy properties.We summarize our field information in Table 1 and discuss them in the following subsections.

CANDELS Fields
CANDELS (Grogin et al. 2011;Koekemoer et al. 2011) is a pencil-beam survey covering five fields -GOODS-S (0.05 deg 2 ), GOODS-N (0.05 deg 2 ), Extended Groth Strip (EGS; 0.06 deg 2 ), UKIRT Infrared Deep Sky Survey Ultra-Deep Survey (UDS; 0.06 deg 2 ), and a tiny part of COSMOS (denoted as CANDELS-COSMOS hereafter; 0.06 deg 2 ).All the fields have ultra-deep UV-to-IR data (see, e.g., Yang et al. 2019 and references therein), allowing for detections of galaxies up to high redshift and low M ⋆ and reliable measurements of these galaxies' properties.The first four have deep Chandra observations reaching ∼Ms depths from Luo et al. (2017;GOODS-S), Xue et al. (2016;GOODS-N), Nandra et al. (2015;EGS), and Kocevski et al. (2018;UDS) and can thus effectively sample AGNs at high redshift and/or low luminosity.However, CANDELS-COSMOS shares the same X-ray depth as the full COSMOS field, and CANDELS-COSMOS is much smaller.Therefore, we do not use CANDELS-COSMOS but will instead directly analyze the full COSMOS field in Section 2.2.
We adopt the galaxy-property catalog in Yang et al. (2019), who have measured M ⋆ and SFRs by fitting SEDs for all the CANDELS sources.

LSST DDFs
The LSST DDFs (e.g., Brandt et al. 2018;Zou et al. 2022) include five fields -COSMOS, Wide Chandra Deep Field-South (W-CDF-S), European Large-Area Infrared Space Observatory Survey-S1 (ELAIS-S1), XMM-Newton Large Scale Structure (XMM-LSS), and Euclid Deep Field-South (EDF-S).EDF-S has been selected as a LSST DDF only recently in 2022 and currently does not have sufficient data available, and we thus only use the former four original LSST DDFs with superb data accumulated over ∼ a decade.Note that this work does not use any actual LSST data because the Vera C. Rubin Observatory is still under construction at the time of writing this article.(1) Field names.GOODS-S, GOODS-N, EGS, and UDS belong to CANDELS and are discussed in Section 2.1.COSMOS, ELAIS-S1, W-CDF-S, and XMM-LSS belong to the LSST DDFs and are discussed in Section 2.2.eFEDS is discussed in Section 2.3.(2) Sky areas of the fields, only accounting for the regions we are using.
(3) The limiting AB magnitudes we adopted in Section 2.4 to calculate the M ⋆ completeness curves, and the reference bands are written within parentheses.(4) The typical depths in exposure time of the X-ray surveys, and the parentheses list the observatories with which our adopted X-ray surveys were conducted.For XMM-Newton, the reported exposure is the typical flare-filtered one for a single EPIC camera.All the three EPIC cameras (one EPIC-pn and two EPIC-MOS) were used for the XMM-Newton observations, adding to ≈ COSMOS is a deg 2 -scale field with deep multi-wavelength data (e.g., Weaver et al. 2022).Civano et al. (2016) presented medium-depth (≈ 160 ks) Chandra observations in COSMOS.The galaxy properties measured through SED fitting covering the X-ray to far-IR are taken from Yu et al. (2023).We only use the region with "FLAG COMBINED = 0" (i.e., within the UltraVISTA region and far from bright stars and image edges) in Weaver et al. (2022) to ensure quality multi-wavelength characterizations.Ni et al. (2021b) presented ≈ 30 ks XMM-Newton observations in ELAIS-S1 and W-CDF-S, and Chen et al. (2018) presented ≈ 40 ks XMM-Newton observations in XMM-LSS.The galaxy properties in these three fields are taken from Zou et al. (2022).We limit our analyses to the overlapping region between the X-ray catalogs and Zou et al. (2022) because quality multi-wavelength data are essential for estimating photometric redshifts (photozs), M ⋆ , and SFRs.Besides, GOODS-S and UDS in Section 2.1 are embedded within W-CDF-S and XMM-LSS, respectively, and we remove the regions covered by GOODS-S and UDS to avoid double counting sources.Due to these reasons, our areas are slightly smaller than those reported in Chen et al. (2018) and Ni et al. (2021b).

eFEDS
eFEDS is a 10 2 deg 2 -scale field covered by eROSITA with ≈ 2 ks observations (Brunner et al. 2022).We focus on the 60 deg 2 GAMA09 region (Driver et al. 2022) within eFEDS because the remaining parts do not have sufficient multi-wavelength data to constrain the host-galaxy properties (e.g., Salvato et al. 2022).Unlike Chandra or XMM-Newton, eROSITA mostly works at < 2.3 keV, which is more sensitive to obscuration.We thus rely on the X-ray properties cataloged in Liu et al. (2022) for eFEDS sources, which are measured through detailed X-ray spectral fitting and thereby can largely overcome obscuration effects.As suggested in Liu et al. (2022), we only use sources with detection likelihoods > 10 because fainter sources generally do not allow meaningful X-ray spectral analyses.

Sample Construction
Sources in these fields all have either spectroscopic redshifts (spec-zs) or high-quality photo-zs, as have been extensively examined in previous literature.Representative examples are listed in Column 7 of Table 1.Many more successful works built upon these redshifts have also indirectly justified their general reliability.When compared to the available spec-zs, the photo-zs are of high quality -our sample has a σ NMAD of 0.03 (0.04) and an outlier fraction of 4% (15%) for galaxies (AGNs).1 Spec-zs are adopted when available, and half of the involved AGNs have spec-zs.
We select sources with 0.05 < z < 4 and log M ⋆,min = 9.5 < log M ⋆ < log M ⋆,max = 12 .Sources labeled as stars are removed, as has been presented in the references in Column 6 of Table 1.Only ≲ 15% of sources in each field are classified as stars.We apply a lower cut for z because photo-zs are less reliable when too small (e.g., see Appendix C of Zou et al. 2021), and the peculiar motions become non-negligible as well.We limit log M ⋆ > 9.5 because dwarf AGNs usually have much less reliable measurements and require special treatment (e.g., Zou et al. 2023).We apply the same upper cuts as in Yang et al. (2018) for both M ⋆ and z because very few sources can exceed these thresholds.
We further construct a complete sample by applying redshift-dependent M ⋆ cuts.To estimate the M ⋆ depth for each field, we first adopt a reference band and denote its limiting magnitude as m lim .Following Pozzetti et al. (2010), we convert the magnitude depth to the expected limiting M ⋆ for each galaxy with a magnitude of m: log M lim = log M ⋆ + 0.4(m − m lim ).At each redshift, we adopt the M ⋆ completeness threshold as the value above which 90% of the M lim values lie.Sources below the M ⋆ completeness curves are removed.For the CANDELS fields, we adopt the H band with a limiting magnitude of 26.5 mag, and almost all the sources above our log M ⋆ cut of 9.5 are above the CANDELS M ⋆ completeness curves, enabling constraints upon BHAR in the low-M ⋆ and high-z regime.For the LSST DDFs, we adopt the K s band, and their limiting K s magnitudes are 24 for COSMOS (Laigle et al. 2016) and 23.5 for W-CDF-S, ELAIS-S1, and XMM-LSS (Jarvis et al. 2013), respectively.For eFEDS, we adopt the Z band with a limiting magnitude of 22.These M ⋆ completeness cuts also automatically ensure the general SED-fitting reliability.The typical i-band magnitudes of sources at these limiting magnitudes are i ≈ 24.8 at K s = 23.5, i ≈ 25.3 at K s = 24, and i ≈ 22.4 at Z = 22.These i-band magnitudes are roughly equal to the nominal "depths" of SEDs in Zou et al. (2022; see their Figure 30) and Yu et al. (2023), below which the number of available photometric bands may become small.
We then define λ = L X /M ⋆ , where L X is the intrinsic 2 − 10 keV luminosity, and we always adopt erg s −1 M −1 ⊙ as the unit for λ.We use the X-ray surveys mentioned in the previous subsections to select AGNs.Following Aird et al. (2012) and Yang et al. (2018), we only use sources detected in the hard band (HB)2 for CANDELS and the LSST DDFs.The reason is to minimize the effects of obscuration.Selecting AGNs in soft bands (< 2 keV) is biased toward little or no absorption.Since the obscuration level is known to be correlated with λ (e.g., Ricci et al. 2017), soft-band-selected AGNs are expected to be biased in terms of λ.Besides, our analyses need intrinsic L X , and HB fluxes are the least affected by obscuration.To calculate L X and, consequently, λ of these HB-detected sources, we use Equation A4 in Zou et al. (2022) and adopt a photon index of 1.6.As discussed in Yang et al. (2018), a photon index of 1.6 returns L X agreeing the best with those from X-ray spectral fitting.For eFEDS, as mentioned in Section 2.3, we use the de-absorbed 0.5−2 keV flux in Liu et al. (2022) and convert it to L X assuming a photon index of 1.8.Although eROSITA observations are more prone to obscuration effects, and it is less accurate to measure L X with soft X-rays below ≈ 2 keV, we have verified in Appendix C that our median results remain similar when excluding eFEDS.It should be noted that we do not exclusively rely upon eFEDS to provide constraints at low-z and/or high-M ⋆ .The LSST DDFs, especially with the X-ray coverage in Chen et al. (2018) and Ni et al. (2021b) added, already have 12.6 deg 2 of coverage with useful HB sensitivity (see Table 1), and thus can also provide beneficial constraints.We define AGNs as those with log λ > log λ min = 31.5 and neglect the contribution of SMBHs with λ ≤ λ min to BHAR.This is because few of the X-ray-detected AGNs are below λ min , and the emission from X-ray binaries may become nonnegligible for low-λ sources.As we will show in Section 3.3, BHAR is indeed dominated by sources above λ min .
In total, we have eight thousand AGNs and 1.3 million normal galaxies, and they are plotted in the z − M ⋆ and z − λ planes in Figure 1, where each column presents fields with comparable depths and areas.Note that Yang et al. (2019), Zou et al. (2022), andYu et al. (2023), from which our adopted galaxy properties are taken, all have appropriately considered the AGN emission for AGNs.We will also assess the impact of AGNs that dominate the SEDs in Appendix D.

METHOD AND RESULTS
Denoting p(λ | M ⋆ , z) as the conditional probability density per unit log λ of a galaxy with (M ⋆ , z) hosting an AGN with λ and k bol (L X ) as the L X -dependent 2 − 10 keV bolometric correction (i.e., the ratio between the AGN bolometric luminosity and L X ), BHAR can be expressed as follows.

BHAR(M
where ϵ is the radiative efficiency of the accretion.The key step in measuring BHAR is hence to derive p(λ | M ⋆ , z).Some literature models the L X distribution instead of λ (e.g., Aird et al. 2012).These two approaches are equivalent, and p(λ | M ⋆ , z) and p(L X | M ⋆ , z) are interchangeable.The only reason for choosing one instead of the other is for convenience, as λ is a scaled parameter that can serve as a rough proxy for the Eddington ratio.
We require λ c > λ min because, otherwise, the model will always degenerate to a single power law and has no dependence on γ 1 once λ c lies below λ min .We also require γ 2 > 0; otherwise, p(λ | M ⋆ , z) will not be a probability measure, and the model-predicted number of AGNs will diverge. 3It has been shown that a double power-law can indeed approximate p(λ | M ⋆ , z) well (e.g., Bongiorno et al. 2016;Aird et al. 2018;Yang et al. 2018).Similarly, the observed AGN X-ray luminosity function (XLF) also follows a double power-law (e.g., Ueda et al. 2014), and a p(λ | M ⋆ , z) roughly with a double power-law shape is needed to reproduce the XLF (Section 3.2).

The Detection Probability
We denote P det ( f X ) as the probability that a source with a 2 − 10 keV flux of f X is detected by a given X-ray survey.Following Section 3.4 in Zou et al. (2023), we adopt the following functional form for P det ( f X ): where a and b are parameters determining the shape of P det ( f X ).We follow the same procedures as in Zou et al. (2023) to calibrate a and b and report the results in Table 1.Briefly, we compared the f X distribution with the 2−10 keV log N −log S relation in Georgakakis et al. (2008), which is the well-determined expected surface number density per unit f X with the detection procedures deconvolved.
The comparison can return best-fit (a, b) parameters such that the convolution between the log N − log S relation and P det best matches the observed f X distribution.It is necessary to adopt a functional form because it improves the computational speed by several orders of magnitude, as will be discussed below.The form of Equation 3 has been shown to be appropriate for X-ray surveys (e.g., Yan et al. 2023;Zou et al. 2023) because its overall shape is similar to X-ray sensitivity curves, and, in our case, it indeed returns consistent BHAR as in Yang et al. (2018), who did not adopt this functional form for P det .
There is a subtle difference between eFEDS and the other fields.For the latter, their f X is the observed value taken from the original X-ray catalogs.The log N − log S relation is also for the observed f X ; thus, P det is for the observed f X .For eFEDS, we adopt the intrinsic, de-absorbed 0.5 − 2 keV flux in Liu et al. (2022) and multiply it by 1.57 to convert it to the intrinsic 2 − 10 keV flux assuming a photon-index of Γ = 1.8.For consistency, we should correct the log N − log S relation such that it works for the intrinsic f X .We use the XLF (ϕ L ) in Ueda et al. (2014) to derive the correction.The XLF-predicted intrinsic log N − log S relation is where A all sky is the all-sky solid angle, V C is the comoving volume within a redshift of z, η(z) is a function of z converting L X to the intrinsic 2 − 10 keV flux for a power-law X-ray spectrum with a power-law photon index of Γ = 1.8, and D L is the luminosity distance.We limit the integration to z < 5 because the contribution of higher-redshift sources to the total source number is negligible.Similarly, the predicted observed log N − log S relation is where the N H function p (N H | L X , z) is the conditional probability density per unit log N H of an AGN with (L X , z), as given in Section 3 of Ueda et al. (2014).This function is normalized such that is the absorption factor for a source with Γ = 1.8 and is calculated based on photoelectric absorption and Comptonscattering losses (i.e., zphabs×cabs) in XSPEC.
The XLF-predicted N( f X,obs > S ) is similar to the observed log N − log S relation, with an absolute difference generally below 0.2 dex.
We found that log N( f X,int > S )/N( f X,obs > S ) is almost a constant around 0.15 dex at log S > 10 −14 erg cm −2 s −1 , and thus we add 0.15 dex to the observed log N−log S relation in Georgakakis et al. (2008) to approximate the intrinsic relation.Applying this intrinsic relation for our calibration in Equation 3, we can obtain the eFEDS P det as a function of the intrinsic f X .Given that the intrinsic f X instead of the observed f X is always adopted in our analyses of eFEDS, the fact that eFEDS is more easily affected by absorption has been appropriately accounted for and absorbed into P det .For example, the fact that obscured AGNs may be missed by eFEDS causes the a value to slightly shift to a larger value due to the correction applied to the observed log N −log S relation.One may wonder why we convert the 0.5 − 2 keV flux to 2 − 10 keV flux instead of directly using 0.5 − 2 keV flux.Since the intrinsic flux is always adopted, the conversion, in principle, would not cause systematic biases.The main reason is that the correction to the log N − log S relation is considerably smaller for the 2 − 10 keV band than for the 0.5 − 2 keV band.
One caveat is that we limit the integration range of N H in Equation 7 to be below 10 24 cm −2 , which equivalently means that we neglect the contribution from Compton-thick (CT) AGNs with N H > 10 24 cm −2 for eFEDS.Similarly, in our other fields observed by Chandra or XMM-Newton, we also implicitly neglect most CT AGNs because they can hardly be detected even in the HB.Generally, X-ray observations below 10 keV cannot provide effective constraints for the CT population, and the intrinsic fraction of CT AGNs is highly uncertain (e.g., Ananna et al. 2019).Therefore, any attempt to measure the intrinsic CT population properties using X-rays below 10 keV is likely prone to large systematic uncertainties.The CT population might indeed contribute to the SMBH growth and is missed by our measurements, especially at high redshift (e.g., Yang et al. 2021), but observations in the regime insensitive to the CT obscuration are necessary to reveal it (e.g., Yang et al. 2023).

The Likelihood
When compared with the observed data, the log-likelihood function (e.g., Loredo 2004) where η(z) is given in Equation 6.We adopt Γ = 1.8 and 1.6 for eFEDS and the other fields, respectively.Different Γ values are adopted because the adopted f X inside our P det function is the intrinsic value for eFEDS while being the observed one for the other fields (Section 3.1.1).Equation 10involves an integration, and Equation 9 computes Equation 10 many times in the summation for a single evaluation of L. Numerically integrating Equation 10 is slow, making it impractical to sample more than one or two dozen free parameters (as will be shown later, we will have 10 4 free parameters).Fortunately, as previously suggested in Zou et al. (2023), Equation 10 can be analytically solved when choosing appropriate functional forms for p(λ | M ⋆ , z) and P det ( f X ), and our Equations 2 and 3 enable this.This is one of the most important steps enabling our whole semiparametric analyses.We define Using Equation 21in Zou et al. (2023), Equation 12 can be reduced as follows.
Equation 10 can then be expressed as follows.
The above equations express L as a function of (A, λ c , γ 1 , γ 2 ), which themselves are functions of (M ⋆ , z).The dependences of (A, λ c , γ 1 , γ 2 ) on (M ⋆ , z) lack clear guidelines, and we use a nonparametric approach to model them.We divide the (M ⋆ , z) plane into N M × N z grids and adopt the (A, λ c , γ 1 , γ 2 ) values in each grid element as free parameters; i.e., we have 4N M N z free parameters in total.Such an approach is conceptually similar to and was indeed initially inspired by the "gold standard" nonparametric starformation history (e.g., Leja et al. 2019) in SED fitting.In a strict statistical sense, a method is called "nonparametric" only if the number of free parameters scales with the number of data points.In contrast, we used a fixed number of free parameters, which does not exactly satisfy the statistical definition.Although we can easily adjust N M and N z so that the number of free parameters scales with the number of data points, this makes the computation infeasible because we have millions of galaxies; besides, with our continuity prior in Section 3.1.3,further increasing the number of free parameters does not improve our results materially.In our context, we use the word "nonparametric" because our number of free parameters is far larger than that of typical parametric methods, and our method is effectively similar to the fully nonparametric approach.This same argument also works for "nonparametric star-formation history" in, e.g., Leja et al. (2019).
This method has an important advantage over a parametric one in our case.As Figure 1 shows, most of our data are clustered within a small region of the (M ⋆ , z) plane -the number of sources significantly decreases at both low z (≲ 0.8) and high z (≳ 2), the number of galaxies strongly depends on M ⋆ , and most AGNs are confined within 10 10.5 ≲ M ⋆ ≲ 10 11.2 M ⊙ .This indicates that if we assume any parametric form for (A, λ c , γ 1 , γ 2 ), the fitted parameters will be dominated by the small but well-populated region in the (M ⋆ , z) plane.Especially, one strong argument disfavoring parametric fitting is that our ultimate goal is to derive BHAR across all redshifts, but any parametric fitting will return results dominated by sources in a small redshift range (e.g., Yang et al. 2018).Our semiparametric settings avoid this problem.
Equation 9 then becomes where n gal i j and n AGN i j are the numbers of galaxies and AGNs within the (i, j) bin, respectively.L is defined for each individual survey field, and they are added together to return the final likelihood.

The Prior
We adopt a continuity prior: where X denotes each one of (log A, log λ c , γ 1 , γ 2 ), and σ X is our chosen a priori parameters to quantify the overall variations of X across the whole fitting ranges.The goal of this continuity prior is to transport information among grid elements.Without this prior, the fitted parameters in each grid element become unstable and vary strongly.This prior is defined in a way such that the information flow is roughly independent of the grid size.The continuity prior is defined only for the differences, and we need a further prior for the X's in a single cell and adopt it as flat in the (log A, log λ c , γ 1 , γ 2 ) space.We set bounds for these parameters to ensure propriety of the prior (Tak et al. 2018): −10 < log A < 10, log λ min < log λ c < 40, −5 < γ 1 < 10, and 0 < γ 2 < 10.These ranges are sufficiently large to encompass any reasonable parameter values.Our posterior (Section 3.1.4)may also become less numerically stable outside these bounds.The resulting prior is explicitly shown below.
Note that it is defined in the (log A, log λ c , γ 1 , γ 2 ) space, and an appropriate Jacobian determinant should be added when transforming the parameter space.For sampling purposes, variable transformations are usually needed.
In fact, our prior setting is essentially a rasterized approximation to the continuous surface of a Gaussian process (GP) regression (e.g., Rasmussen & Williams 2006).This is because the blocky prior surface over the (M ⋆ , z) plane becomes the non-parametric GP-based continuous surface as the resolution of the grid increases (i.e., increasing N M and N z to the infinity).Therefore, a full GP regression involves a large number of free parameters scaling with the galaxy sample size (≈ 10 6 ), while our rasterized approach only involves 10 4 parameters.GP also requires computations of O(n 3 ) for matrix inversions, while our approach turns the matrix-inversion problem into products of multiple univariate Gaussian densities.Due to these reasons, a full GP regression is computationally infeasible in our case, but our approach effectively works similarly and is much less computationally demanding.

The Posterior
The posterior is We call our overall modeling "semiparametric" because we adopt p(λ | M ⋆ , z) as a parametric function of λ, while the dependences of (A, λ c , γ 1 , γ 2 ) on (M ⋆ , z) are nonparametric.
Readers may wonder why we do not also adopt a nonparametric function for p(λ | M ⋆ , z).In principle, it could be done and was presented in Georgakakis et al. (2017) and Aird et al. (2018).Since any model contains subjective assumptions, the choice of the methodology should be guided by the assumptions we want to retain or avoid.Compared to nonparametric modeling, the assumptions of parametric models are much stronger.We nonparametrically model (A, λ c , γ 1 , γ 2 ) as functions of (M ⋆ , z) because we genuinely do not know their dependencies and thus want to minimize assumptions.However, we are satisfied with and thus want to retain the inherent assumption of our parametrization of p(λ | M ⋆ , z) that the true dependence is indeed well-approximated by a double power-law when λ > λ min .Previous works have shown that a double power-law indeed works, and, as far as we know, there is no clear evidence suggesting that this assumption would fail.Especially, the nonparametric form of p(λ | M ⋆ , z) inferred from Aird et al. (2018) is also roughly a double power-law.The adopted approach essentially depends on our ultimate goal.It is certainly better to minimize the assumption for p(λ | M ⋆ , z) and adopt a nonparametric form for it if the ultimate goal is to derive the shape of p(λ | M ⋆ , z).However, our goal is different -we are ultimately interested in BHAR and thus want to assume a double power-law form for p(λ | M ⋆ , z).

Hamiltonian Monte Carlo Sampling of p(λ | M ⋆ , z)
Given the high dimensionality, Hamiltonian Monte Carlo (HMC; e.g., Betancourt 2017) should be one of the most practical methods to sample the posterior.As far as we know, other sampling methods either cannot work efficiently in our high-dimension case (e.g., the traditional Metropolis-Hastings algorithm) or do not have well-developed packages readily available (e.g., Bayer et al. 2023).HMC needs both the posterior and its gradient in the parameter space.The posterior has been presented in the previous subsections, and we present the gradient in Appendix A. We use DynamicHMC.jl5 to conduct the HMC sampling.We adopt N M = 49 and N z = 50.The sampling results are presented in Figure 2.These parameter maps will be released online.
To examine our fitting quality, we compare the model p(λ | M ⋆ , z) with the observed values.We use the n obs /n mdl method to obtain binned estimators of p(λ | M ⋆ , z), as outlined in Aird et al. (2012).For a given (z, M ⋆ , λ) bin ranging we denote the number of observed AGNs as n obs and calculate the modelpredicted number as n mdl : where the summation runs over all the sources within [z low , z high ]×[M ⋆,low , M ⋆,high ].The observed binned estimator of p(λ | M ⋆ , z) is then the fitted model evaluated at the bin center scaled by n obs /n mdl , and its uncertainty is calculated from the Poisson error of n obs following Gehrels (1986).We present our model p(λ | M ⋆ , z) and the binned estimators in Figure 3, and they are consistent.The uncertainties become large especially in the high-z/low-M ⋆ and low-z/high-M ⋆ regimes because of a limited number of AGNs being available.In the high-z/low-M ⋆ regime, most of the con- Comparing ϕ L,mdl and the observed XLF, ϕ L,obs , can thus further assess our fitting quality.We adopt ϕ M in Wright et al. (2018) and the median parameter maps in Figure 2 to calculate ϕ L,mdl .We present the comparison between ϕ L,mdl and ϕ L,obs from Ueda et al. (2014) in Figure 4, and they agree well.Note that for comparison purposes here, we do not need to optimize the computation of Equation 22; however, we will present a more optimized computation algorithm later in Section 4.1, where we do need fast computational speed.Also, note that Equation 22 ignores the contribution from sources with M ⋆ below M ⋆,min = 10 9.5 M ⊙ or above M ⋆,max = 10 12 M ⊙ to the XLF.This is appropriate because the XLF is dominated by AGNs with 10 9.5 < M ⋆ < 10 12 M ⊙ .
As a simple check, for the parameters in Figure 2, if we extrapolate the integration in Equation 22to (−∞, +∞), the typical ϕ L,mdl will only increase by 0.01 dex at 43 < L X < 43.5, the lowest L X bin that we will later adopt in Section 4.1.This increment is even smaller for higher L X bins.

Measuring BHAR
Equation 1 converts p(λ | M ⋆ , z) to BHAR.We adopt ϵ = 0.1 and k bol from Equation 2 in Duras et al. (2020).In principle, ϵ may depend upon other factors such as the accretion state (e.g., Yuan & Narayan 2014), but it is infeasible to accurately measure ϵ for our individual sources.We adopt ϵ as 0.1 because it is a typical value for the general AGN population (e.g., Brandt & Alexander 2015) and has been widely used in previous literature (e.g., Yang et al. 2017Yang et al. , 2018Yang et al. , 2019;;Ni et al. 2019Ni et al. , 2021a)).The k bol relation in Duras et al. (2020) diverges at high L X .To avoid it, we cap k bol not to exceed 363, the value when the bolometric luminosity is 10 14.5 L ⊙ , which is roughly the brightest sample used in Duras et al. (2020) to calibrate the k bol relation.We show the L X − k bol relation in Figure 5, in which we also plot the relation used in Yang et al. (2018), derived from Lusso et al. (2012), for a comparison.The two relations are similar, with a small offset of ≈ 0.07 dex that is almost negligible compared to the BHAR uncertainty (Figure 6).The deviation of the two relations at L X ≳ 10 45 erg s −1 has little impact on BHAR because BHAR has little contribution from log λ ≳ 35 (see Figure 3).
Equation 1 ignores the contribution to BHAR from sources at log λ < 31.5 because X-ray binaries may not be negligible at lower λ, and our X-ray surveys can hardly provide strong constraints in the low-λ regime.However, this will not cause material biases because BHAR is dominated by sources at log λ ≳ 31.5 (e.g., Section 3.2.3 in Yang et al. 2018).We logp( M , z) 9.5 < logM < 10.0 10.0 < logM < 10.5 10.5 < logM < 11.0 This work 11.0 < logM < 11.5 0.1 < z < 0.5 11.5 < logM < 12.0 logp( M , z)   have also tried pushing the lower integration bound in Equation 1 down by 2 dex, and the returned BHAR would only increase by a typical value of ≈ 0.02 dex and no more than ≈ 0.1 dex.Such a difference is much smaller than the fitted BHAR uncertainty.This exercise may even overestimate the influence because p(λ | M ⋆ , z) may bend downward or quickly vanish at very small λ (Aird et al. 2017(Aird et al. , 2018)).Therefore, the cut at log λ = 31.5 is not expected to cause material biases to BHAR.We show our sampled BHAR results in Figure 6, and the BHAR maps will be released online.The median map clearly shows that BHAR increases with both M ⋆ and z, qualitatively consistent with the conclusions in Yang et al. (2018).The uncertainty map reveals that the BHAR constraints at both the low-z/high-M ⋆ and the high-z/low-M ⋆ regime are relatively more limited.We will present more quantitative comparison with Yang et al. (2018) and other works in Section 4.2.Besides, we verified that AGN-dominated sources do not cause material biases to our BHAR measurements in Appendix D.
There are slight, local fluctuations in BHAR that are caused by the statistical noise of the data and are confined within the extent allowed by our prior, and the BHAR map is smooth globally, as can be seen in the top panel of Figure 6.The fluctuation levels and BHAR uncertainties depend on our prior settings but almost not on our bin size because our bins are set to be correlated (Section 3.1.3).For example, relaxing the prior by choosing larger σ X would return larger fluctuations and uncertainties.This "arbitrariness" is inherent in modeling. 6Overall, our prior is reasonably constructed (Sec- tion 3.1.3)and provides beneficial regularizations.We have assessed the potential issue of whether such arbitrary choices affect the following posterior inferences and the resulting scientific conclusions qualitatively.For example, we have con-ducted a sensitivity check of our priors and confirmed that the impact of lower or higher resolution of the prior surface (corresponding to larger or smaller bin sizes) does not influence the resulting posterior inference in a noticeable way, and changing σ X generally would not cause material changes of the median BHAR map.

DISCUSSION
Given that this article is already lengthy and full of technical details, we decide to present more scientific investigations of our results in future dedicated works.However, we would like to present brief, immediate, but sufficiently informative explorations of our results in this section, which helps justify the quality and serves as a precursor of further detailed scientific investigations.

Adding External Constraints from the SMF and XLF
Section 3.2 uses the SMF and XLF to examine the fitting quality of p(λ | M ⋆ , z).It is also possible to follow a reversed direction -we can add external constraints from the SMF and XLF (named "the SMF-XLF constraints" hereafter) into our posterior.This approach was adopted in Yang et al. (2018).As a start, we revisit the numerical computations of ϕ L,mdl in Equation 22. Again, numerical integrations should be avoided whenever possible, and we hence derive an analytical formula for ϕ L,mdl .ϕ M is expressed as a two-component Schechter function in Wright et al. (2018): where (M c , α 1 , α 2 , ϕ 1 , ϕ 2 ) are redshift-dependent parameters.We further define an auxiliary function ψ such that the modelpredicted XLF in Equation 22 can be simplified as summations of ψ (see below). where dt is the generalized incomplete Gamma function.The contribution of each grid element to the integration in Equation 22is 26) ) is the lower bound of the i th M ⋆ -grid element, and j z is the index of the z-grid element containing z.
We then follow the procedure in Yang et al. (2018) to compare ϕ L,mdl and ϕ L,obs in Ueda et al. (2014).ϕ L,obs is evaluated at several (L X , z) values, and the number of sources (n XLF ) in Ueda et al. (2014) contributing to ϕ L,obs is recorded.Following Yang et al. (2018), we use the soft-band XLF at L X > 10 43 erg s −1 in Ueda et al. (2014).Their soft-band XLF has been corrected for obscuration and spans a wider L X range compared to their HB XLF, and their soft-band and HB XLFs are also consistent (see Figure 4).The L X cut at 10 43 erg s −1 is adopted to avoid contamination from X-ray binaries.The log-likelihood when comparing ϕ L,mdl and ϕ L,obs is ln where k runs over all the L X and z bins of the observed XLF in Ueda et al. (2014).This term is called the SMF-XLF likelihood in Yang et al. (2018).
To add the SMF-XLF constraints, Equation 20 should be modified as follows.
Its gradient is presented in Appendix B for HMC sampling.
We then sample the above posterior with HMC and present the resulting BHAR in Figure 7.The BHAR curves with or without the SMF-XLF constraints are largely consistent with a small (< 1σ) difference.This is expected because Figure 4 shows that our BHAR without the SMF-XLF constraints leads to consistent XLFs with those in Ueda et al. (2014).
Although there is a good consistency after adding the SMF-XLF constraints in our case, extra cautions are generally needed.The SMF and XLF taken from other literature works usually involve inherent assumptions about their parametric forms.When putting the SMF and XLF into our posterior, we will inevitably "absorb" these assumptions.Besides, the original data used to measure the XLF may overlap with one's dataset, especially given that the X-ray data in GOODS-S are also necessary to constrain the XLF at low-L X and/or high-z.Such an overlap causes double counting of the involved sources.Especially, more considerations would be needed if the posterior is dominated by the SMF-XLF constraints.

Comparison with Previous Works
Figure 3 compares our p(λ | M ⋆ , z) with some representative results in previous literature.Aird et al. (2012; black solid lines in Figure 3) used a single power-law to fit p(L X | M ⋆ , z) at z < 1, which is converted to a single power-law p(λ | M ⋆ , z) in Figure 3.The single powerlaw curves broadly follow our double power-law ones, and the single power-law index lies within the range between γ 1 and γ 2 .This indicates that a single power-law model can serve well as the first-order approximation of p(λ | M ⋆ , z), as has been widely adopted in other works (e.g., Bongiorno et al. 2012;Wang et al. 2017;Birchall et al. 2020Birchall et al. , 2023;;Zou et al. 2023), especially when the data are limited.However, the real p(λ | M ⋆ , z) is more complicated, and a double power-law model can return better characterizations.As Figure 3 shows, the binned p(λ | M ⋆ , z) estimators generally do not show systematic deviations from our double power-law curves (e.g., no further breaks are visible), and thus a double power-law model is sufficient to capture the main structures of p(λ | M ⋆ , z) at λ > λ min .Bongiorno et al. (2016) and Yang et al. (2018) adopted a double power-law model similar to ours, and we plot their results as the dash-dotted and dashed lines in Figure 3, respectively.Our p(λ | M ⋆ , z) curves are nearly identical to those in Yang et al. (2018) at 10 ≲ log M ⋆ ≲ 11.5 and 1 ≲ z ≲ 2.5 but begin diverging in other parameter ranges.In the lowest-mass bin (9.5 < log M ⋆ < 10), our p(λ | M ⋆ , z) is still similar to those in Yang et al. (2018) at log λ ≲ 33.5 but is lower than theirs at higher λ.In the highest-mass bin (11.5 < log M ⋆ < 12), our p(λ | M ⋆ , z) is larger at z ≲ 2 and smaller at z ≳ 2 than for Yang et al. (2018).It should be noted that these parameter regions with noticeable p(λ | M ⋆ , z) differences generally have limited data and are far away from the bulk of other data points, and the results in these regions are subject to large uncertainties.For Bongiorno et al. (2016), their p(λ | M ⋆ , z) is similar to ours at 10 ≲ log M ⋆ ≲ 11.5 and 1.5 ≲ z ≲ 2 but has a much steeper low-λ power-law index at z < 1.5.Two reasons may be responsible for the difference -the data used in Bongiorno et al. (2016) are not sufficiently deep to effectively probe the low-λ regime; their model always fixes the breakpoint at log λ = 33.8 when M ⋆ = 10 11 M ⊙ , while our results suggest that the breakpoint tends to become smaller as redshift decreases.
Georgakakis et al. ( 2017) and Aird et al. (2018) adopted nonparametric methods to measure p(λ | M ⋆ , z) without assuming a double power-law form.Our results show good agreement with theirs, especially in regimes well-covered by the data, suggesting that a double power-law is indeed a good approximation of p(λ | M ⋆ , z).Nonetheless, some differences are worth noting.At log λ ≳ 34 where the data become limited, the p(λ | M ⋆ , z) in Aird et al. (2018) tends to be flatter than ours, while that in Georgakakis et al. (2017) tends to be steeper than ours.This high-λ regime is highly uncertain and subject to the adopted methodology -for instance, the prior adopted in Aird et al. ( 2018) prefers a flatter slope at high λ.Another feature is that the p(λ | M ⋆ , z) in Aird et al. (2018) sometimes shows downward bending at log λ ≈ 32 − 33, while that in Georgakakis et al. (2017) does not show a clear bending, although the large uncertainty may be responsible for the lack of bending.In principle, a downward bending at some low λ is expected; otherwise, p(λ | M ⋆ , z) would diverge.Such bending can also be seen in Georgakakis et al. (2017), but below log λ min = 31.5(see, e.g., their Figure 7).Our double power-law model is unable to capture this feature, and Figure 3 shows that the bending in Aird et al. (2018) mainly becomes prominent at high redshift (z ≳ 3).
Another metric that can be measured from p(λ | M ⋆ , z) is the fraction of galaxies hosting accreting SMBHs above a given λ threshold (λ thres ), as calculated below.
For a consistent comparison with Aird et al. (2018), we adopt the same λ thres = 32 as theirs.We calculate f AGN at several (M ⋆ , z) values and plot the results in Figure 8.Our results generally agree well with those in Aird et al. (2018) and follow similar evolutionary trends with respect to M ⋆ and z.At log M ⋆ ≳ 10, f AGN increases with z up to z ≈ 1.5 − 2 and reaches a plateau at higher redshift; while for less-massive galaxies, the redshift evolution is weaker.At low redshift (z ≲ 0.5), f AGN is similar regardless of M ⋆ , and this conclusion can be further extended down to log M ⋆ < 9.5, as .BHAR as a function of M ⋆ at several redshifts.The red curves are our median BHAR with the SMF-XLF constraints added, and the orange shaded regions represent the corresponding 1σ and 2σ uncertainty ranges.The blue curves are our median BHAR and 1σ uncertainties without the SMF-XLF constraints.Zou et al. (2023) showed that the λ-based f AGN in the dwarfgalaxy population in this redshift range is also similar to f AGN in massive galaxies.At higher redshift (z ≳ 1), the dependence of f AGN on M ⋆ becomes more apparent due to M ⋆ -dependent redshift evolution rates of f AGN , and there is a positive correlation between f AGN and M ⋆ at log M ⋆ ≲ 10.5.However, for massive galaxies with log M ⋆ ≳ 10.5, f AGN nearly does not depend on M ⋆ .A full physical explanation of these complicated correlations between f AGN and (M ⋆ , z) will require further detailed analyses of p(λ | M ⋆ , z) with at least partially physically driven modeling, and we leave these analyses for future work.We further compare our BHAR with those from Yang et al. (2018) in Figure 9.Our median relation is largely similar to theirs, but some subtle differences exist.Our low-mass BHAR at log M ⋆ ≲ 10 is slightly smaller across all redshifts, though not very significant.Our high-mass BHAR at log M ⋆ ≳ 11.5 differs the most with Yang et al. (2018), and ours tends to be smaller at z ≳ 3 while being larger at z ≲ 2. These differences originate from different p(λ | M ⋆ , z), as discussed earlier in this section.As shown in Figure 3, our low-mass p(λ | M ⋆ , z) is smaller than for Yang et al. (2018) only at high λ, and thus the low-mass BHAR difference is small.Our high-mass p(λ | M ⋆ , z) at log M ⋆ ≳ 11.5, instead, shows a redshift-dependent difference in the normalization.Nevertheless, the uncertainties in these extreme regimes are large, and they are also subject to model choices.Dedicated analyses of these extreme-mass sources with deeper or wider 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 z data may be necessary to further pin down the uncertainty.Another important difference is that the BHAR − M ⋆ relation in Yang et al. (2018) flattens at low redshift, but ours do not show such a trend.Therefore, the BHAR in Yang et al. (2018) is less reliable at z ≲ 0.8, as they noted; if their relation is further extrapolated below z = 0.5, their BHAR − M ⋆ relation would become flat and is thus unphysical.Our BHAR uncertainties are also considerably larger than those in Yang et al. (2018), even though we used more data.This is because Yang et al. (2018) adopted a parametric modeling method, which includes strong a priori assumptions.In contrast, this work minimizes such assumptions, and thus the fitted uncertainties reflect those directly inherited from the data.

Star-forming versus Quiescent Galaxies
Star-forming galaxies generally have stronger AGN activity than quiescent galaxies (e.g., Aird et al. 2018Aird et al. , 2019)).We hence examine if star-forming and quiescent galaxies have the same BHAR in this section.
To separate star-forming and quiescent galaxies, we adopt the star-forming main sequence (MS) in Popesso et al. (2023) and define quiescent galaxies as those lying at least 1 dex below the MS; the remaining galaxies are star-forming ones.Since the star-forming and quiescent subpopulations do not individually follow the SMF and XLF, we do not apply the SMF-XLF constraints as in Section 4.1.We measure their BHAR and present the results in Figure 10.The BHAR of both star-forming and quiescent galaxies increases with M ⋆ and z.When comparing the BHAR of these two subpopulations, star-forming galaxies generally have larger BHAR, suggesting that star-forming galaxies indeed host more active SMBHs, possibly due to more available cold gas for both star formation and SMBH accretion.The BHAR difference between the two populations also depends on M ⋆ and z.At log M ⋆ ≲ 10.5, the difference is generally small across most of the redshift range.At higher mass, the difference is small at low redshift but becomes apparent when z increases to 1 and further decreases at higher redshift.There is also tentative evidence suggesting that the redshift at which the difference reaches its peak might also shift with M ⋆ , with the peak of the BHAR difference of higher-mass galaxies occurring at higher redshift.
One caveat that should be noted is that our results depend on the classification between star-forming and quiescent galaxies.Such a classification is more reliable at M ⋆ ≲ 10 10.5 M ⊙ , but it may become sensitive to the adopted method at higher M ⋆ and lower z (e.g., Donnari et al. 2019).Cristello et al. (2024) show that the star-forming and quiescent subpopulations cannot be safely defined for massive galaxies, and Feldmann (2017) also argued that the bimodal separation is not necessarily appropriate.The proposed redshiftdependent maximum M ⋆ values for reliable classifications in Cristello et al. (2024) can be well described by the following equation: and they are explicitly plotted in Figure 10.We also plot the BHAR of the whole population in Figure 10, and it is similar to the star-forming BHAR below the M ⋆ threshold in Equation 31 and becomes more in the middle between the star-forming and quiescent BHAR with rising M ⋆ .Therefore, Equation 31 can also serve as an approximate threshold of whether the contribution of the SMBH growth in quiescent galaxies to the overall SMBH growth is important.
Our results suggest that the BHAR(M ⋆ , z) function may also depend on SFR, with star-forming galaxies hosting enhanced AGN activity (e.g., Aird et al. 2018Aird et al. , 2019;;Birchall et al. 2023).However, such a dependence is only secondary (Yang et al. 2017), and SFR is usually more challenging to measure and more subject to confusion with AGN emission.Still, more physical insights can be gained by incorporating SFR-based parameters, especially when probing p(λ | M ⋆ , z) instead of BHAR (Aird et al. 2018).We leave further analyses on including SFR into the BHAR(M ⋆ , z) function to the future, in which different classification schemes from binary (star-forming versus quiescent) up to four categories (starburst, star-forming, transitioning, and quiescent) will be explored.

SUMMARY AND FUTURE WORK
In this work, we mapped BHAR as a function of (M ⋆ , z) over the vast majority of cosmic time, and our main results are summarized as follows: 1.We compiled an unprecedentedly large sample from nine fields -CANDELS (including GOODS-S, GOODS-N, EGS, and UDS), the LSST DDFs (including COSMOS, ELAIS-S1, W-CDF-S, and XMM-LSS), and eFEDS.These fields include both deep, small-area surveys and shallow, large-area ones.The former provide rich information in the high-z, low-M ⋆ , and/or low-λ regime, while the latter provide complementary information in the low-z, high-M ⋆ , and/or high-λ regime.Therefore, our sample can effectively constrain BHAR across a large range of the relevant parameter space.See Section 2.
2. We developed a semiparametric Bayesian method to measure BHAR, where a double power-law model with respect to λ is used to measure p(λ | M ⋆ , z), and the relevant parameters nonparametrically depend on (M ⋆ , z).This method has two main advantages.It avoids the "extrapolation" of parameters from wellpopulated regions in the parameter space to lesspopulated regions.It also adopts much weaker assumptions than parametric methods, enabling more flexible constraints and more representative fitting uncertainties from the data.See Section 3.1.
3. We sampled p(λ | M ⋆ , z) and measured BHAR at 10 9.5 < M ⋆ < 10 12 M ⊙ and z < 4. We have verified the fitting quality by comparing our model p(λ | M ⋆ , z) and the corresponding binned estimators and also by comparing our inferred XLF with the observed one.We showed that BHAR increases with both M ⋆ and z. Figure 3 visually illustrates that p(λ | M ⋆ , z) evolves over (M ⋆ , z).Observationally, it is still unclear what the exact evolution pattern is, let alone the physics driving such an evolution.It is also unknown from a theoretical perspective because no simulations appear to produce consistent evolution patterns of p(λ | M ⋆ , z) with the observed ones (e.g., Habouzit et al. 2022).It even complicates matters further that p(λ | M ⋆ , z) may evolve differently for star-forming and quiescent galaxies, as proposed in a phenomenological scenario in Aird et al. (2018).We leave detailed analyses of the p(λ | M ⋆ , z) evolution to a subsequent future work.We will .BHAR for star-forming (blue) and quiescent (red) galaxies.The shaded regions represent 1σ uncertainty ranges.The black dashed curves are the BHAR with all the galaxies included, i.e., those in Figure 6.The vertical black lines mark the maximum M ⋆ values where star-forming and quiescent galaxies can be reliably classified at the corresponding z (Cristello et al. 2024).Star-forming galaxies have larger BHAR.
first identify the qualitative evolution pattern of the dependence of p(λ | M ⋆ , z) on M ⋆ and z for different galaxy populations and then develop a quantitative, parametric model to depict the identified evolution pattern.With the clearly understood p(λ | M ⋆ , z), we will address the following scientific questions.Is the broad decline in SMBH growth below z ≈ 1 due to the shift of accretion activity to smaller galaxies or reduction of the typical λ? How large is the AGN "duty cycle", which is an integration of p(λ | M ⋆ , z), in different galaxy populations?Does M ⋆ modulate the duty cycle or modulate the typical outburst luminosity in the AGN phase?Is there any difference in the SMBH feeding in star-forming and quiescent galaxies?APPENDIX A. GRADIENT OF THE POSTERIOR This Appendix presents the gradient of our posterior in Equation 20.We found that, at least in our case, analytical differentiation enables a much higher computational speed and/or less memory compared with other differentiation algorithms.We thus adopt the analytically-derived gradient and directly present the derivation results below.
The partial derivatives of ln π cont in Equation 19are in which X denotes each one of (log A, log λ c , γ 1 , γ 2 ), and we define X 0 j ≡ X 1 j , X N M +1, j ≡ X N M , j , X i0 ≡ X i1 , and X i,N z +1 ≡ X i,N z to incorporate X's at the boundary.The gradient of the log-posterior in Equation 20 is thus When transforming the parameter space, the gradient of the corresponding Jacobian should also be added.

B. GRADIENT OF THE POSTERIOR WITH THE SMF-XLF CONSTRAINTS ADDED
This Appendix presents the gradient of our posterior after adding the SMF-XLF constraints in Equation 29.First, the partial derivatives of ψ(γ, M 1 , M 2 , A, λ c ; L X ) in Equation 24are ∂ψ ∂A = ψ A (B14) where ∂Γ GI ∂ζ (ζ, x 1 , x 2 ) is the partial derivative relative to the first argument of Γ GI (ζ, x 1 , x 2 ).The partial derivatives of ψ DP (A, λ c , γ 1 , γ 2 , M 1 , M 2 ; L X ) in Equation 25are Based on Equation 26, we have ∂ϕ L,mdl ∂X i j (L X , z) = δ j j z × ∂ψ DP ∂X (A i j z , λ c,i j z , γ 1,i j z , γ 2,i j z , M LB,i , M LB,i+1 ; L X ), (B22) where δ j j z = 0 (1) if j j z ( j = j z ), and X denotes each one of (A, λ c , γ 1 , γ 2 ).The partial derivatives of ln L SMF−XLF in Equation 27are The gradient of the posterior in Equation 29is where ∇ ln L and ∇ ln π cont were presented in Appendix A.
C. RESULTS WITHOUT EFEDS eFEDS is primarily observed through soft X-rays below 2 keV, which are more prone to obscuration compared to our other fields.To examine if our results are biased by this effect, we try excluding eFEDS in this appendix, and the corresponding BHAR results are shown in Figure 11.There is not any material systematic difference in the median BHAR after excluding eFEDS, and the uncertainty becomes larger in certain parameter ranges; e.g., the difference in width of the shaded regions in Figure 11 is apparent at M ⋆ ≈ 10 10.8 M ⊙ and z = 0.5.The uncertainties generally grow by no more than 60%.Therefore, no strong systematic biases are introduced by eFEDS, and eFEDS also helps constrain BHAR.This verifies that the absorption effects have been appropriately considered, as detailed in Section 3.1.1.Besides, given that the LSST DDFs already cover 12.6 deg 2 with sensitive HB data, eFEDS provides useful constraints but is not fully dominant.

D. IMPACT OF AGN-DOMINATED SOURCES
It is generally more challenging to reliably measure M ⋆ from the galaxy component for sources with SEDs dominated by the AGN component.We assess if the less-reliable M ⋆ measurements for such sources have strong impact on our BHAR results.It has been shown that the CANDELS fields are largely free from this potential issue (Aird et al. 2018;Yang et al. 2018) due to their small solid angles, superb multi-wavelength coverage, and deep X-ray surveys.For the LSST DDFs and eFEDS, their X-ray surveys are wider and shallower, and thus a larger fraction of the detected AGNs are luminous and may dominate the SEDs.We thus primar-ily focus on the AGN-dominated sources in the LSST DDFs and eFEDS.
M ⋆ is largely constrained by the rest-frame near-infrared (NIR) data because the old-star emission peaks in the NIR.For the purpose of assessing the M ⋆ measurements, we define a source to be "AGN-dominated" if its AGN component contributes > 50% of the rest-frame 1 µm light, as measured from its decomposed SED.A similar definition was also adopted in Aird et al. (2018).About 10 − 15% of our AGNs are classified as AGN-dominated.Note that this definition significantly overlaps but is not the same as the broadline AGN definition.In a general sense, broad-line AGNs are sources with strong AGN signatures (e.g., spectroscopi-cally detected broad emission lines) in the optical.However, a large fraction of broad-line AGNs are not necessarily AGNdominated in the NIR because the galaxy emission usually reaches a peak while the AGN emission reaches a valley in the NIR.We found that around half of the broad-line AGNs in Ni et al. (2021b) are classified as AGN-dominated under our definition, and the non-AGN-dominated ones indeed generally have lower L X .We adopt our current definition because it is simpler and also more physically related to the M ⋆ measurement.
We remove AGN-dominated sources in the LSST DDFs and eFEDS and measure BHAR again following Section 3.3.We further estimate the AGN number-density maps in the (M ⋆ , z) plane using kernel density estimations before and after excluding these AGN-dominated sources and apply the number-density ratio as a function of (M ⋆ , z) as a correction of BHAR to account for the fact that fewer AGNs are in-cluded after removing AGN-dominated sources.These procedures are conducted for the whole population as well as star-forming and quiescent galaxies.We compare BHAR with the original ones in Figure 12.The quiescent curves almost do not change after removing AGN-dominated sources, while the whole-population and star-forming BHAR become slightly smaller.The difference at high redshift is slightly larger than that at low redshift because high-z sources need higher L X to be detected in the X-ray and are hence more likely to be AGN-dominated, but the difference is still generally no more than the 1σ uncertainties.Besides, our numberbased correction underestimates the real loss of accretion power because AGN-dominated sources, by construction, tend to have higher λ than the remaining ones.The difference in BHAR should be even smaller.Therefore, the relatively larger M ⋆ uncertainties of AGN-dominated sources are not expected to cause material biases to our BHAR.

Figure 1 .
Figure 1.Our sample in the z − M ⋆ (top) and z − λ (bottom) planes.The left, middle, and right panels are for CANDELS, the LSST DDFs, and eFEDS, respectively.The points are AGNs.The background grey-scale cells in the left panel are the 2-D histogram of the number of normal galaxies, with darker cells representing more galaxies.The apparent deficiency of sources in the high-z and/or low-M ⋆ regime in the middle and right panels is due to our M ⋆ completeness cuts.

Figure 2 .
Figure2.The sampled maps of (A, λ c , γ 1 , γ 2 ).The top panels are the median posteriors, and the bottom panels are the 1σ uncertainties, defined as the half width of the posterior's 16th-84th percentile range.straintsare from deep CANDELS fields, especially GOODS-S, because the other fields are not sufficiently deep in both X-rays and other multi-wavelength bands.For example, 60% (80%) of AGNs in our sample with M ⋆ < 10 10 M ⊙ and z > 2 (z > 3) are from GOODS-S.At z < 0.5, ≳ 60% of AGNs are from eFEDS, and even the 60 deg 2 eFEDS is not sufficiently large to effectively sample high-M ⋆ sources at low redshift.We also plot several p(λ | M ⋆ , z) results from previous works and leave more detailed discussions on the comparison between our p(λ | M ⋆ , z) and previous ones to Section 4.2.As another independent check, p(λ | M ⋆ , z), by definition, can connect the SMF (ϕ M ) and XLF (ϕ L ).That is, the SMF and p(λ | M ⋆ , z) can jointly predict the XLF (e.g.,Bongiorno et al. 2016;Georgakakis et al. 2017):

Figure 3 .
Figure3.Comparison between our p(λ | M ⋆ , z) and other measurements.The red points are the binned estimators with 1σ error bars based on our data.The blue curves are our fitted median p(λ | M ⋆ , z), evaluated at the bin centers, and the blue shaded regions are the corresponding 90% confidence ranges.The black solid straight lines are the single power-law models inAird et al. (2012).The dash-dotted and dashed curves are the double power-law models inBongiorno et al. (2016) andYang et al. (2018), respectively.The cyan and orange shaded regions are the 90% confidence intervals of the nonparametric p(λ | M ⋆ , z) inAird et al. (2018)  andGeorgakakis et al. (2017), respectively.

Figure 4 .Figure 5 .
Figure4.The XLFs at different redshifts.The red (blue) data points are the soft-band (hard-band) XLFs inUeda et al. (2014).The cyan curves are the best-fit XLF models inUeda et al. (2014), and the black curves are our ϕ L,mdl based on the median parameter maps in Figure2and the SMF inWright et al. (2018).The absorption correction has been appropriately applied for both our measurements (see Section 3.1.1)and the XLFs inUeda et al. (2014).Our models agree with the observed XLFs well.

Figure 6 .
Figure 6.The top and middle panels are the sampled BHAR maps, where the unit of BHAR is M ⊙ yr −1 .The bottom panel shows the BHAR − M ⋆ relation at several redshifts, where the solid curves are the median values, and the shaded regions are the corresponding 1σ uncertainty ranges.BHAR generally increases with both M ⋆ and z.
Figure7.BHAR as a function of M ⋆ at several redshifts.The red curves are our median BHAR with the SMF-XLF constraints added, and the orange shaded regions represent the corresponding 1σ and 2σ uncertainty ranges.The blue curves are our median BHAR and 1σ uncertainties without the SMF-XLF constraints.

Figure 8 .
Figure 8. f AGN evaluated at several (M ⋆ , z) values versus z.Our results are plotted as open circles with 1σ error bars, where different colors represent different M ⋆ .The dashed lines are those in Aird et al. (2018).

Figure 9 .
Figure9.The comparison of our BHAR with those inYang et al. (2018).The blue curves are our median BHAR, and the cyan shaded regions represent the corresponding 1σ and 2σ uncertainty ranges.The BHAR and the corresponding 1σ uncertainty inYang et al. (2018) are plotted as the black curves.
Figure10.BHAR for star-forming (blue) and quiescent (red) galaxies.The shaded regions represent 1σ uncertainty ranges.The black dashed curves are the BHAR with all the galaxies included, i.e., those in Figure6.The vertical black lines mark the maximum M ⋆ values where star-forming and quiescent galaxies can be reliably classified at the corresponding z(Cristello et al. 2024).Star-forming galaxies have larger BHAR.

Figure 11 .
Figure 11.Comparison between BHAR with eFEDS included (red) and excluded (blue) in the fitting.The shaded regions represent 1σ uncertainty ranges.The red curves are similar to the blue ones, and the red uncertainties are smaller than the blue ones in certain regimes, indicating that eFEDS does not cause systematic biases and helps constrain BHAR.

Table 1 .
Basic Information for the Fields Used in This Work