The DESI PRObabilistic Value-added Bright Galaxy Survey (PROVABGS) Mock Challenge

The PRObabilistic Value-added Bright Galaxy Survey (PROVABGS) catalog will provide measurements of galaxy properties, such as stellar mass (M *), star formation rate (SFR), stellar metallicity (Z), and stellar age (t age), for >10 million galaxies of the Dark Energy Spectroscopic Instrument (DESI) Bright Galaxy Survey. Full posterior distributions of the galaxy properties will be inferred using state-of-the-art Bayesian spectral energy distribution (SED) modeling of DESI spectroscopy and Legacy Surveys photometry. In this work, we present the SED model, the neural emulator for the model, and the Bayesian inference framework of PROVABGS. Furthermore, we apply the PROVABGS SED modeling on realistic synthetic DESI spectra and photometry, constructed using the L-Galaxies semi-analytic model. We compare the inferred galaxy properties to the true values of the simulation using a hierarchical Bayesian framework to quantify accuracy and precision. Overall, we accurately infer the true M *, SFR, Z, and t age of the simulated galaxies. However, the priors on galaxy properties induced by the SED model have a significant impact on the posteriors, which we characterize in detail. This work also demonstrates that a joint analysis of spectra and photometry significantly improves the constraints on galaxy properties over photometry alone and is necessary to mitigate the impact of the priors. With the methodology presented and validated in this work, PROVABGS will maximize information extracted from DESI observations and extend current galaxy studies to new regimes and unlock cutting-edge probabilistic analyses. https://github.com/changhoonhahn/provabgs/


INTRODUCTION
Large galaxy surveys have been transformational for our understanding of galaxy evolution.With surveys such as the Sloan Digital Sky Survey (SDSS; York et al. 2000), Galaxy and Mass Assembly survey (GAMA; Driver et al. 2011), and PRIsm MUlti-object Survey (PRIMUS; Coil et al. 2011), we have now established the global trends of galaxies in the local universe.Population statistiscs, such as the stellar mass function (Li & White 2009;Marchesini et al. 2009;Moustakas et al. 2013) or quiescent fraction (Kauffmann et al. 2003;Blanton et al. 2003;Baldry et al. 2006;Taylor et al. 2009), and their evolution are now well understood.Many global scaling relations of galaxy propreties such as the mass-metallicity relation (Tremonti et al. 2004) or the "star formation sequence" (Noeske et al. 2007;Daddi et al. 2007;Salim et al. 2007) have also been firmly established.Despite their importance in building our current understanding, however, the empirical relations from existing observations are inadequate for shedding further light on how galaxies form and evolve.
More precise and accurate measurements have the potential to reveal new trends among galaxies undetected by previous observations.So do new approaches that go beyond observed relations.Empirical prescriptions for physical processes can be combined with N -body simulations that capture hierarchical structure formation in empirical models (e.g.UniverseMachine; Behroozi et al. 2019).The predictions of these models can be compared to the observed distributions of galaxy properties to derive insights into physical processes, such as the timescale of star formation quenching (Wetzel et al. 2013;Hahn et al. 2017;Tinker et al. 2017).Predicted distributions of galaxy properties of large-scale cosmological hydrodyanmical simulations can also be compared to observations (e.g.Genel et al. 2014;Somerville & Davé 2015;Davé et al. 2017;Trayford et al. 2017;Dickey et al. 2021;Donnari et al. 2021).Though such comparisons are currently limited by the computation of costs of simulations, advances in machine learning techniques for accelerating and emulating simulations will enable such comparisons to explore a broad range of galaxy formation models (e.g.Villaescusa-Navarro et al. 2021).Soon we will be able to compare detailed galaxy formation models directly against observations and explore the parameter spaces and physical prescriptions of the models.While many different approaches are available for expanding our understanding of galaxies, they all require more statistically powerful galaxy samples with well controlled systematics and well understood selection functions.
Better observations, however, must be accompanied by better and more consistent methodogy.The statistical power of large galaxy surveys are squandered when they are analyzed inconsistently with a hodgepodge of methodologies since analyses cannot take advantage of new techniques and approaches.In this regard, value-added catalogs (VACs) that provide consistently measured galaxy properties for entire galaxy surveys are instrumental and have been used by hundreds of galaxy studies (see Blanton & Moustakas 2009, for a review).For SDSS galaxies, the NYU-VAGC (Blanton et al. 2005) provided photometric properties (e.g.absolute magnitudes) and the MPA-JHU catalog (Brinchmann et al. 2004)1 provided spectral properties (e.g.emission line luminosities).Despite being released over a decade ago, these VACs are still widely used today (e.g.Alpaslan & Tinker 2021;O'Donnell et al. 2021;Trevisan et al. 2021).
Probabilistic VACs are the next advancement in VACs that will extract even more the information from galaxy surveys.Unlike previous VACs that provide point estimates and rough estimates of uncertainties, probabilistic catalogs provide full posterior distributions of galaxy propertiesp(θ | X i ), the probability of galaxy properties θ given observations, X i , of galaxy i. Posteriors offer more accurate measurements of galaxy properties because they estimate the uncertainties and any degeneracies among them more accurately.They also open the doors for principled population inference.Given observations of a set of galaxies, {X i }, we can combine individual posteriors of the galaxies to rigorously derive the distribution of their physical properties: p(θ|{X i }).For example, from posteriors on stellar mass, M * , and star formation rate, SFR, we can infer p(M * , SFR|{X i }) by combining the posteriors, or with the latest machine learning techniques (Leja et al. 2021).This M * -SFR distribution can then be used to measure the intrinsic width star formation sequence with unprecedented accuracy and provide key insight into star formation and stellar and AGN feedback in star-forming galaxies (e.g.Davies et al. 2021).
With probabilistic catalogs, we can also include galaxies with less tightly constrained properties in our analyses since posteriors accurately quantify uncertainties.This means we can probe less explored, low signal-to-noise, regimes that may shed new light on galaxy evolution, such as dwarf galaxies.We can also more reliably quantify the fraction of extreme/outlier galaxies, e.g.quiescent fraction of field dwarf galaxies (Geha et al. 2012).Probabilistic catalogs also open the door for Bayesian Hierarchical approaches and improve the statistical power of BGS through Bayesian shrinkage: the joint posterior Figure 1.DESI will conduct the largest spectroscopic survey to date covering ∼14, 000 deg 2 .During dark time, DESI will measure >20 million spectra of luminous red galaxies, emission line galaxies, and quasars out to z > 3.During bright time, DESI will measure the spectra of ∼10 million galaxies out to z∼0.6 with the Bright Galaxy Survey (BGS).Left: BGS (blue) will cover ∼2× the SDSS footprint (orange) and ∼45× the GAMA footprint (red).Right: We present the redshift distribution of BGS as predicted by the Millenium-XXL simulation (blue; Smith et al. 2017).We include the redshift distribution of SDSS and GAMA multiplied by 10× for comparison.BGS will be roughly two orders of magnitude deeper than the SDSS main galaxy sample and 0.375 mag deeper than GAMA.BGS will provide spectra for a magnitude limited sample of ∼10 million galaxies down to r < 19.5 (BGS Bright) and a deeper sample of ∼ 5 million galaxies as faint as r < 20.175 (BGS Faint).
of the galaxy sample can be used as the prior to shrink the uncertainties on the properties of individual galaxies.Overall, probabilistic VACs will enable a new level of statistical robustness in galaxy studies and more fully extract the statistical power of galaxy surveys.The PRObabilistic Value-Added Bright Galaxy Survey (PROVABGS) catalog will be a probabilistic VAC constructed from the next pivotal large galaxy survey: the Dark Energy Spectroscopic Instrument (DESI).Over the next five years, DESI will use its 5000 robotically-actuated fibers to provide redshifts of ∼30 million galaxies over ∼14, 000 deg 2 , a third of the sky (DESI Collaboration et al. 2016a,b).The redshifts will be spectroscopically measured from optical spectra that spans the wavelength range 3600 < λ < 9800Å with spectral resolutions R = λ/∆λ = 2000 − 5000.In addition, DESI targets will also have photometry from the Legacy Imaging Surveys Data Release 9 (LS; Dey et al. 2019), used for target selection.LS is a combination of three public projects (Dark Energy Camera Legacy Survey, Beijing-Arizona Sky Survey, and Mayall z-band Legacy Survey) that jointly imaged the DESI footprint in three optical bands (g, r, and z).It also includes photometry in the Wide-field Infrared Survey Explorer W 1, W 2, W 3, and W 4 infrared bands, derived from all imaging through year 4 of NEOWISE-Reactivation force-photometered in the unWISE maps at the locations of LS optical sources (Meisner et al. 2017a,b).
During bright time, when the night sky is ∼2.5× brighter than nominal dark conditions, DESI will conduct the Bright Galaxy Survey (BGS).BGS will provide a r < 19.5 magnitude-limited sample of ∼10 million galaxies out to redshift z < 0.6 -the BGS Bright sample.It will also provide a surface brightness and color-selected sample of ∼5 million faint galaxies with 19.5 < r < 20.175 -the BGS We include the M * distribution of MXXL galaxies with r < 20 (blue) for reference.Many such fainter galaxies will be included in the BGS Faint sample, which will observe galaxies as faint as r < 20.175.BGS will observe a broad range of galaxies with high completeness and provide galaxy samples with unprecedented statistical power.This includes a large sample of <10 9 M dwarf galaxies.We will apply state-of-the-art Bayesian SED modeling to all BGS galaxies to construct the PRObabilistic Value-Added BGS (PROVABGS) catalog, which will unlock more sophisticated statistical approaches for galaxy evolution studies.
Faint sample.The selection and completeness as well as the effect of systematics of the BGS samples are characterized in detail in Hahn et al. (in prep.).Compared to the seminal SDSS main galaxy survey, BGS will provide optical spectra two magnitudes deeper, over twice the sky, and double the median redshift z∼0.2 (Figure 1).It will observe a broader range of galaxies than previous surveys with unprecendented statistical power.For all >10 million BGS galaxies, PROVABGS will provide full posterior probability distributions of physical properties such as M * , SFR, metallicity (Z), and stellar age (t age ).These properties will be inferred from both the LS photometry and DESI spectroscopy using a state-of-the-art Bayesian modeling of the galaxy spectral energy distribution (SED).PROVABGS will enable conventional analyses to be extended to a more statistically powerful spectroscopic galaxy sample.Population statistics such as the stellar mass function or the star formation sequence will be measured with higher precision than previously possible and over a much wider range of galaxies (Figure 2).In particular, with the faint apparent magnitude limit of BGS (r < 20.175), PROVABGS will include low mass (<10 9 M ) dwarf populations, which provide important probes of the physics of dark matter and star formation feedback.The high completeness and simple selection function of the BGS Bright sample will also facilitate comparisons to empirical models or galaxy formation simulations with novel approaches.
In this paper, we present the mock challenge for PROVABGS conducted by the DESI Galaxy Quasar Physics working group.We present the state-of-the-art SED modeling that will be used to infer the galaxy properties of BGS galaxies and construct the PROVABGS.We use an SED model with non-parametric prescriptions for galaxy star formation and metallicity histories and accelerate the parameter inference using neural emulators.Moreover, we validate our SED modeling on realistic mock BGS observations constructed using the L-Galaxies semi-analytic model (Henriques et al.LGAL SED forward modeled photometry broadband filter response We forward model DESI optical g, r, and z band photometry (red) for our simulated galaxies (Section 2.1) by convolving their SEDs (black dotted) with the broadband filters (dashed) and then applying an empirical noise model based on BGS objects in LS (Section 2.3).Right: The g−r and r−z color distribution of the forward modeled LGal photometry is in good agreement with the color distribution of LS BGS objects (black contours).We plot a subsample of the total 2,123 simulated galaxies from LGal that we use in this work.

2015
) and DESI survey simulations.By applying our SED model on mock observations, where we know the true galaxy properties, we demonstrate that we can accurately infer galaxy properties for PROVABGS and highlight the advantages of jointly analyzing photometry and spectra.Furthermore, we characterize, in detail, the limits of our SED modeling so that future studies using PROVABGS can use this work as a reference in interpreting their results.
In Section 2, we describe the L-Galaxies semi-analytic model and how we use them to construct synthetic BGS observations.We then present the SED model, our Bayesian parameter inference framework with neural emulators, and the mock challenge in Section 3. We present the results of the mock challenge in Section 4 and discuss their implications in Section 5.

SIMULATIONS
In this Section, we describe how we construct mock observations from simulated galaxies of the L-Galaxies semi-analytic galaxy formation model (SAM).We use a forward model that includes realistic noise, instrumental effects, and observational systematics to produce DESI-like photometry and spectra.Later, we apply Bayesian SED modeling to these mock observations and demonstrate that we can accurately infer the true galaxy propertries.

L-Galaxies
L-Galaxies (hereafter LGal; Henriques et al. 2015) is a state-of-the-at semi-analytic galaxy formation model run on subhalo merger trees from the Millennium (Springel et al. 2005)  LGal includes prescriptions for gas infall and cooling, star formation, disc and bulge formation, stellar and black hole feedback, and the environmental effects of tidal and ram-pressure stripping.Feedback from active galactic nuclei (AGN), which prevents hot gas from cooling, is the major mechanism for quenching star formation in massive galaxies.
LGal model parameters are calibrated against the observed stellar mass function and passive (quiescent) fraction at four different redshifts from z = 3 to 0. We refer readers to Henriques et al. (2015) for further detail on LGal.

Spectral Energy Distributions
For each simulated galaxy, LGal provides the star formation histories (SFHs) and chemical enrichment histories (ZH) for its bulge and disk components, separately, in approximately log-spaced lookback time bins.We treat each lookback time bin, i, as a single stellar population (SSP) of age t i .We then derive the luminosities of the bulge and disk components by summing up the luminosities of their SSPs: ). (1) i are the star formation rate and metallicity of the bulge or disk component in lookback time bin i. ∆t i is the width of the bin.L SSP corresponds to the luminosity of the SSP, which we calculate using the Flexible Stellar Population Synthesis (FSPS; Conroy et al. 2009;Conroy & Gunn 2010) model.For FSPS, we use the MIST isochrones (Paxton et al. 2011(Paxton et al. , 2013(Paxton et al. , 2015;;Choi et al. 2016;Dotter 2016) and the Chabrier (2003) initial mass function (IMF).Also, we use the default spectral library in FSPS: the MILES spectral library (Sánchez-Blázquez et al. 2006) over the wavelength range 3800 − 7100Å and the BaSeL library (Lejeune et al. 1997(Lejeune et al. , 1998;;Westera et al. 2002) outside of those limits.
Next, we apply velocity dispersions to L comp.(λ).For the disk, we apply a fixed 50 km/s velocity dispersion.For the bulge, we derive its velocity dispersion using the Zahid et al. (2016) empirical relation that depends on the total bulge mass.Afterwards, we apply dust attenuation to stellar emission in the disk component (L disk ) based on the cold gas content and orientation of the disk.The attenuation curve is derived using a mixed-screen model with the Mathis (1983) dust extinction curve.Stellar emission from stars younger than 30Myr are further attenuated with a uniform dust screen and a wavelength dependent optical depth.No dust attenuation is applied to the bulge component.We use the same dust attenuation that Henriques et al. (2015) uses to construct galaxy colors from LGal that match observations.Finally, we combine the attenuated disk component and the bulge component to construct the total luminosity of the simulated galaxy and then convert this rest-frame luminosity to observed-frame SED flux using its redshift, z.
A(λ) here is the dust attenuation for the disk component described above and d L (z) is the luminosity distance.In the left panel of Figure 3, we present an example of the SED flux constructed for an arbitrary LGal galaxy (black dotted).

Forward Modeling DESI Photometry
Hahn et al.
In this section, we describe how we construct realistic LS-like photometry from the SEDs of simulated galaxies described in the last section.First, we convolve the SEDs with the broadband filters of the LS to generate broadband photometric fluxes: (3) f SED is the galaxy SED (Eq.2) and R X is the transmission curve for filter in the X band.We generate photometry for the LS g, r, and z optical bands.Next, we apply realistic measurement uncertainties to the derived photometry by sampling the noise distribution of BGS targets from LS DR9.We do this by matching each simulated galaxy to a BGS target with the nearest r-band magnitude and g − r and r − z colors.The photometric uncertainties (σ X ) and r-band fiber flux (f fiber r ) of the BGS object are then assigned to the simulated galaxy.We apply photometric noise by sampling a Gaussian distribution with standard deviation σ X : Finally, we impose the target selection criteria of BGS (Ruiz-Macias et al. 2021, Hahn et al. in prep.).
In the left panel of Figure 3, we overplot the forward modeled photometry (red) on top of the SED flux (black) for an arbitrary LGal galaxy.For reference, we also plot R X for the g, r, and z bands of LS in blue, orange, and green, respectively.On the right panel, we compare the g −r versus r −z color distribution for the forward modeled LGal galaxies (red) to the color distribution of BGS objects in LS (black contour).The errorbars represent the photometric uncertainties.The LGal galaxies have already been validated against observations, including U V J-band photometry Henriques et al. (2015).However, we further confirm that the forward modeled photometry show good agreement with LS BGS targets in optical color space.

Forward Modeling DESI Spectra
Next, we construct realistic DESI-like spectroscopy from the SEDs of simulated galaxies.We begin by forward modeling the fiber aperture effect.DESI uses fiber-fed spectrographs with fibers that have angular radii of 1".Hence, only the light from a galaxy within this fiber aperture is collected by the instrument.Among BGS targets in LS, 40% have r e < 1" and 81% have r e < 2" so the fiber aperture effect significantly impacts the majority of BGS galaxies (r e is the half-light radius of the galaxy surface brightness model fit by Tractor2 ).To model this fiber apertuer effect, we use LS measurements of photometric fiber flux within a 1" radius aperture (f fiber X ), which estimates the flux that passes through to the fibers.When we assigned photometric uncertainties to our simulated galaxies based on r, g − r, and r − z in Section 2.3, we also assigned r-band fiber flux.We model the flux that passes through the fiber by scaling the SED flux by the r band fiber fraction, the ratio of f fiber r over the total r band flux: LGal simulated galaxies by applying a fiber aperture correction to the SED (dashed) and a realistic DESI noise model.We apply a fiber aperture correction by scaling down the full SED (dotted) by the r-band fiber fraction derived from LS imaging.The noise model accounts for the DESI spectrograph response and the bright time observing conditions of BGS (Hahn et al. in prep., Schlafly et al. in prep.).We represent the spectra from the b, r, and z arms of the DESI spectraphs in blue, orange, and green respectively.Our forward model produces realistic DESI-like spectra that accurately reproduce the noise levels and characteristics of actual BGS spectra.
This fiber aperture correction assumes that there is no significant color dependence.It also assume that there are no significant biases in the fiber flux measurements in LS due to miscentering of objects.We discuss the implications of these assumptions later in Section 5 and will investigate them further in Ramos et al. (in prep.).In addition to the aperture correction, we also use f fiber r to derive "measured" f fiber r , since we do not know the true fiber fraction in actual observations: We later use f fiber r to set the prior on the nuisance parameter of our SED modeling (Section 3).Next, we apply a noise model that simulates the DESI instrument response and bright time observing conditions of BGS.We use the same noise model as the spectral simulations3 used for the BGS survey design and validation (Hahn et al. in prep.).We refer readers to Schlafly et al. (in prep.) for details about the survey operations and simulations and Guy et al. (in prep.) for details on the DESI spectroscopic data reduction pipeline.Specifically, we use nominal dark time observing conditions with a 180s exposure time, which accurately reproduce the spectral noise and redshift success rates of observed BGS spectra in early DESI observations.In Figure 4, we present the forward modeled BGS spectrum of an arbitrary LGal galaxy (solid).We mark the spectrum from each arm of the three DESI spectrographs separately (blue, orange, green).For reference, we include the full SED (dotted) and fiber fraction scaled SED (dashed) of the galaxy.
Hahn et al.

Stellar Population Synthesis Modeling
PROVABGS will provide galaxy properties inferred from joint SED modeling of DESI photometry and spectra.For the SED modeling, we use a state-of-the-art stellar population synthesis (SPS) model that uses a non-parametric SFH with a starburst, a non-parametric ZH that varies with time, and a flexible dust attenuation prescription.
The form of the SFH is one of the most important factors in the accuracy of an SPS model.In general, the form of the SFH requires balancing between being flexible enough to describe the wide range of SFHs in observations while not being too flexible that it can describe any SFH at the expense of constraining power.If the model SFH is not flexible enough to describe actual SFHs of galaxies, then unbiased galaxy properties cannot be inferred using the SPS model.For instance, most SPS models (e.g.CIGALE, Serra et al. 2011;Boquien et al. 2019;BAGPIPES, Carnall et al. 2018) use parametric SFH such as the exponentially declining τ -model.Such functional forms, however, produce biased estimates of galaxy properties (e.g.M * and SFR) when used to fit mock observations of simulated galaxies (Simha et al. 2014;Pacifici et al. 2015;Ciesla et al. 2017;Carnall et al. 2018).On the other hand, many non-parametric forms of the SFH are overly flexible and allow unphysical SFHs (Leja et al. 2019), which unncessarily increases parameter degeneracies and discards constraining power.
In our SPS model, we use a non-parametric SFH with two components: one based on non-negative matrix factorization (NMF; Lee & Seung 1999;Cichocki & Phan 2009;Févotte & Idier 2011) basis functions and a starburst component.For the first component, SFH is a linear combination of four NMF SFH bases: {s SFH i } are the NMF basis functions and {β i } are the coefficients.The integral in the denominator normalizes the NMF basis functions to unity.We constrain i β i = 1, so the total SFH of the component over the age of the galaxy (t age ) is normalized to unity.{s SFH i } are derived from the Illustris cosmological hydrodynamic simulation (Vogelsberger et al. 2014;Genel et al. 2014;Nelson et al. 2015).We compile, rebin, and smooth the SFHs of Illustris galaxies and then perform NMF on them to derive {s SFH i }.We find that 4 components is sufficient to accurately reconstruct the SFHs from Illustris.We present the NMF SFH bases as a function of lookback time in left panel of Figure 5.By using NMF instead of e.g.Principal Component Analysis (PCA), we ensure that all of the SFH bases are non-negative and, thus, physically meaningful.For further details on the derivation of the NMF bases, we refer readers to Appendix A. Assuming that the SFHs of Illustris galaxies resemble the SFHs of real galaxies, our NMF form provides a compact and flexible representation of the SFHs.
The NMF basis functions are derived from smooth SFHs, which means that it does not include any stochasticity.However, observations and high resolution zoom-in hydrodyanmical simulations both find significant stochasticity in galaxy SFHs (Sparre et al. 2017;Caplar & Tacchella 2019;Hahn et al. 2019;Iyer et al. 2020).To include stochasticity in our SPS model, we include a starburst component that consists of a SSP.Thus, for the total SFH, we use f burst is the fraction of total stellar mass formed during the starburst; t burst is the time at which the starburst occurs; δ D is the Dirac delta function.In total we use 6 free parameters in our SFH: 4 NMF basis coefficients (β i ), f burst , and t burst .
Another key part of an SPS model is the chemical enrichment history, or ZH.Current SPS models mostly assume a flat ZH, constant metallicity over time (Carnall et al. 2019;Leja et al. 2019).Since galaxies do not have constant metallicities throughout their history, this assumption can significantly bias the inferred galaxy properties (Thorne et al. 2021).Instead, we take a similar approach to the SFH and use NMF basis functions for ZH: {s ZH i (t)} are the ZH NMF basis functions and {γ i } are the coefficients.{s ZH i (t)} are fit using the ZHs of simulated galaxies from Illustris in the same fashion as the SFH.In the right panel of Figure 5, we present the ZH NMF bases as a function of lookback time.We use two NMF components, so our ZH prescription has 2 free parameters.
We use the SFH and ZH above to model the unattenuated rest-frame luminosity as a linear combination of multiple SSPs, evaluated at logarithmically-spaced lookback time bins.We use a fixed log-binning with the bin egdes starting with (0, 10 6.05 yr), (10 6.05 , 10 6.15 yr), and continuing on with bins of width 0.1 dex.The binning is truncated at the age of the model galaxy.For a z = 0 galaxy, this binning produces 43 t lookback bins.We use log-spaced t lookback bins because it better reproduces galaxy luminosities evaluated with much higher resolution t lookback binning than linearly-spacing, for the same number of bins.At each of the 43 t lookback bin i, we evaluate the luminosity of a SSP with ZH(t i ), where t i is the center of t lookback bin, and total stellar mass calculated by resampling the SFH in Eq. 8. We use FSPS to evaluate the SSP luminosities and use the MIST isochrones, the combination of MILES and BaSeL spectral libraries, and the Chabrier ( 2003) IMF (same as in Section 2.2).Since we use MIST isochrones, we impose a minimum and maximum limit to ZH based on its coverage: 4.49 × 10 −5 and 4.49 × 10 −2 , respectively.These metallicity values are in units of absolute metallicity and can be converted to solar metallicity using Z = 0.019.We note that our stellar metallicity range is significantly broader than previous studies for additional flexibility (e.g.Leja et al. 2017;Carnall et al. 2019;Tacchella et al. 2021).Since we model galaxies solely as a linear combination of SSPs, we do not model nebular emission.We, therefore, exclude emission lines in our SED modeling by masking the wavelength ranges of emission lines.
Before we combine the SSP luminosities, we apply dust attenuation.We use a two component Charlot & Fall (2000) dust attenuation model with birth cloud (BC) and diffuse-dust (ISM) components.The BC component represents the extra dust attenuation of young stars that are embedded in modecular clouds and HII regions.For SSPs younger than t i < 100Myr, we apply the following BC dust attenuation: τ BC is the BC optical depth that determines the strength of the BC attenuation.Afterwards, all SSPs are attenuated by the diffuse dust using the Kriek & Conroy (2013) attenuation curve parameterization: τ ISM is the diffuse dust optical depth.n dust is the Calzetti (2001) dust index, which determines the slope of the attenuation curve.k Cal (λ) is the Calzetti (2001) attenuation curve and D(λ) is the UV dust bump, parameterized using a Lorentzian-like Drude profile: where λ 0 = 2175Å, ∆λ = 350Å, and E b = 0.85 − 1.9 n dust are the central wavelength, full width at half maximum, and strength of the bump, respectively.Once dust attenuation is applied to the SSPs, we sum them up to get the rest-frame luminosity of the galaxy.In total, our SPS model has 12 free parameters: M * , 4 SFH basis coefficients, f burst , t burst , 2 ZH basis coefficients, τ BC , τ ISM , and n dust .
In practice, each model evaulation using FSPS requires ∼340 ms.Though this is not a prohibitive computational cost on its own, sampling a high dimensional parameter space for inference requires > 100, 000 evaluationsi.e.
10 CPU hours per galaxy.For the >10 million BGS galaxies, this would require >100 million CPU hours.Instead, we use an emulator for the model luminosity, which uses a PCA neural network (NN) following the approach of Alsing et al. (2020).
To construct our emulator, we first generate N model = 1, 000, 000 model luminosities, L(λ; θ), from unique SPS parameters, θ, sampled from the prior (Section 3.2, Table 1).We then split the model luminosities into four wavelength bins: 2000 -3600, 3600 -5500, 5500 -7410, and 7410 -60000Å with N spec = 127, 2109, 2113, and 549 resolution elements, respectively.For each wavelength bin, a PCA is done in the N spec -dimensional space to yield PCA basis functions, or eigenspectra.We represent the model luminosity using the first N basis = 50, 50, 50, and 30 eigenspectra and their corresponding PCA coefficients.A NN is then trained on the set of N model models to derive a mapping from the 12 SPS parameters to the N basis PCA coefficients for each wavelength bin.
Once trained, our emulator works as follows.For a given set of SPS parameters, the NN for each wavelength bin predicts PCA coefficients.The coefficients are then linearly combined with the eigenspectra to predict the model luminosity in the wavelength bin.The luminosity in all four wavelength bins are concatenated to produce the full model luminosity.Throughout the wavelength range relevant for BGS, 3000 < λ < 9800Å, we achieve < 1% accurate with the emulator.For details on the training, validation, and performance of our PCA NN emulator, we refer readers to Kwon et al. (in prep.).With the neural emulator, each model evaluation only requires ∼2.9 ms -100× faster than with FSPS.
From the rest-frame luminosity, we obtain the observed-frame, redshifted, flux in the same way as Eq. 2. In our case, redshift is not a free parameter since we will have high quality spectroscopic redshifts for every DESI BGS galaxy.BGS redshifts will have small redshift error, σ z < 0.0005(1 + z) (150 km/s), and <5% catastrophic failures, ∆z/(1 + z) < 0.003 (<1000 km/s).To model DESI photometry, we convolve the model flux with the LS broadband filters as in Eq. 3. To model DESI spectra, we first apply Gaussian velocity dispersion.In this work, we keep velocity dispersion fixed at 0 km/s as a conservative test for our SED modeling when we use an explicitly incorrect velocity dispersion.Later when we apply our SPS model to observations, the velocity dispersion will be set to a more realistic value.It can also be set as a free parameter.After velocity dispersions, the broadened flux is resampled into the DESI wavelength binning.Since DESI spectra do not necessarily include all the light of a galaxy, we include a nuisance parameter f fiber , a normalization factor on the spectra to account for fiber aperture effects.Next, the model photometry and spectrum can be directly compared to observations.

Bayesian Parameter Inference
Using the SPS model above, we perform Bayesian parameter inference to derive posterior probability distributions of the SPS parameters from photometry and spectroscopy.From Bayes rule, we write down the posterior as where X is the photometry or spectrum and θ is the set of SPS parameters.p(X | θ) is the likelihood, which we calculate independently for the photometry and for the spectrum 1 .

.
3 .The contours mark the 68 and 95% percentiles.We use a Gaussian likelihood and the prior specified in Table 1 to evaluate the posterior and sample the distribution using ensemble slice MCMC.With our Bayesian SED modeling approach, we accurately quantify uncertainties and capture complexities ( e.g.parameter degeneracies and multimodality) in the posterior distribution.Bottom: We compare the best-fit model observables (orange) to the mock observations (black).We find excellent agreement for both the LS photometry (left) and the DESI spectrum (right).m photo and m spec represent SPS model photometry and spectroscopy.σ photo and σ spec respresent the uncertainties on the measured photometry and spectrum.In calculating L spec , we exclude wavelength ranges of width 40Å surrounding the OII, Hβ, OIII, and Hα emission lines since our SED model does not model gas emissions.We consider the photometry indepedent from the spectrum so we combine the likelihoods when jointly modeling the spectrophotometry: p(θ) in Eq. 13 is the prior on the SPS parameters.For most of our parameters, we use uninformative uniform priors with conservatively chosen ranges that are listed in Table 1.However, for the priors of {β 1 , β 2 , β 3 , β 4 }, the NMF coefficients for the SFH, we use a Dirichlet distribution to maintain the normalization of the SFH in Eq. 7.With Dirichlet priors, β i are within 0 < β i < 1 and satisfy the constraint i β i = 1.Now that we can evaluate the posterior at given θ, we estimate the posterior distributions using Markov Chain Monte Carlo (MCMC) sampling.We use the Karamanis & Beutler (2020) ensemble slice sampling algorithm with the zeus Python package4 .Ensemble slice sampling is an extension of standard slice sampling that does not requires specifying the initial length scale or any further handtuning.It generally converges faster than other MCMC algorithms (e.g.Metropolis) and generates chains with significantly lower autocorrelation.
When we sample the posterior, we do not directly sample our 12 dimensional SPS parameter space because we use a Dirichlet prior on the SFH NMF coefficients.Dirichlet distributions are difficult to directly sample so we instead use the Betancourt (2012) sampling method, which transforms an N dimensional Dirichlet distribution into an easier to sample N − 1 dimensional space.Hence, we sample the posterior in the transformed 11 dimensional space.Given this dimensionality, we run our MCMC sampling with 30 walkers.Overall, we find that the sampling converges after 2,500 iterations with a 500 iteration burn in.Deriving the posterior distribution from a joint SED modeling of photometry and spectra, with the emulator, takes ∼10 CPU minutes per galaxy.In principle, since our emulator uses a PCA NN, we can further expedite our paremeter inference using more efficient sampling methods that exploit gradient information, such as Hamiltonian Monte Carlo.We will explore further speed ups to our SED modeling in future works.
In Figure 6 we present the posterior distribution of our 12 SPS model parameters for an arbitrarily chosen LGal mock observation.We mark the 68 and 95 percentiles of the distribution with the contours.The posterior distribution reveal there are significant degeneracies between SPS parameters: e.g.β SFH 2 and f burst .Furthermore, the distribution is multimodal (see f burst panels).With our Bayesian SED modeling, we are able to capture such complexities in the posterior that would be lost with point estimates or maximum likelihood approaches.In the bottom panels, we compare our SPS model evaluated at the best-fit parameters (orange) with the LGal mock observations (black).On the left, we compare the g, r, z band magnitudes; on the right, we compare spectra.We find excellent agreement between the best-fit SPS model and mock observations.The entire PROVABGS SED modeling pipeline, including the neural emulators and parameter inference framework, is publicly available at https://github.com/changhoonhahn/provabgs/.

RESULTS
The goal of this work is to demonstrate the precision and accuracy of inferred galaxy properties for PROVABGS.We apply our SED modeling to the mock observables of 2,123 LGal galaxies.From the posterior distributions of the SPS parameters, we derive the following physical galaxy properties: stellar mass (M * ), SFR averaged over 1 Gyr (SFR 1Gyr (17) In Figure 7, we compare the galaxy properties inferred from SED modeling the mock observations, θ, to the true (input) galaxy properties, θ true , of the simulated galaxies.From left to right, we compare log M * , log SFR 1Gyr , log Z MW , t age,MW , and τ ISM in each column.The inferred properties in the top, middle, and bottom rows are derived from SED modeling of spectra, photometry, and spectrophotometry, respectively.In each panel, we represent θ by plotting 10 samples from the marginalized posterior for each simulated galaxy.We also include violin plots of θ for a handful of randomly selected galaxies.The width of the violin plot represents the marginalized posterior distribution of θ.We note that in our SED modeling of spectra only, we do not include f fiber so the true stellar mass in this case corresponds to f fiber × M * , which has a different range than for the photometry and spectrophotometry cases.The comparison demonstrates that overall we robustly infer galaxy properties using the PROVABGS SED modeling.For each simulated galaxy, we plot 10 samples drawn from the marginalized posterior of θ.We also include violin plots, whose widths represent the marginalized posteriors, for a handful of randomly selected galaxies.The posteriors demonstrate that, overall, we can derive accurate and precise constraints on certain galaxy properties from joint SED modeling of DESI photometry and spectra.
In more detail, we find that we infer unbiased and precise constraints on M * throughout the entire M * range.We also infer robust SFR 1Gyr above log SFR 1Gyr > −1 dex; below this limit, however, the inferred SFR 1Gyr are significantly less precise and overestimate the true SFR 1Gyr .This bias at low SFR 1Gyr is caused by model priors, which we discuss in further detail later in Section 5 and Appendix B. Both Z MW and t age,MW are not precisely constrained.The violin plots suggest that the inferred Z MW overestimate the true Z true .For t age,MW , the posteriors are less precise for galaxies with older stellar populations and they reveal the log-spaced t lookback binning used in our SPS model for t age,MW > 6 Gyr.Lastly, τ ISM is overall accurately inferred for galaxies with low τ ISM but appears to be underestimated for high τ ISM .
The overall constraints on galaxy properties for the mock observations is encouraging due to the significant differences in the forward model used to generate the observations and the SPS model used in the SED modeling.First, the SFHs and ZHs in the mock observations are taken directly from LGal simulation outputs while the SFH and ZH parameterization in the SPS model is based on NMF bases fit to Illustris galaxies.Second, in the forward model, we construct the SED of the bulge and disk components of the simulated galaxies separately: the components have separate SFHs and ZHs.The SPS model treats all galaxies as having one component.Third, we fix velocity disperions to 0 km/s in our SPS model.Lastly, we use different dust prescriptions: Mathis (1983) dust attenuation curve in the forward model and the Kriek & Conroy (2013) curve in the SPS model.Despite these significant differences, our constraints on certain galaxy properties are unbiased and precise.
Figure 7, also highlights the advantages of jointly modeling spectra and photometry.Comparing the constraints from spectrophotometry (bottom) versus photometry alone (middle), we find that including spectra significantly tightens the constraints for all properties.In addition, including spectra also appears to reduce biases of the constraints.For instance, with only photometry, we derive significantly more biased SFR 1Gyr constraints.This is due to the limited constraining power of photometry, which allows the posteriors to be dominated by model priors.Adding spectra, significantly increases the contribution of the likelihood and ameliorates this effect.
Beyond qualitative comparisons of the posterior, we want to quantify the precision and accuracy of the inferred galaxy properties.Let ∆ θ,i be the discrepancy between the inferred and true parameters for each galaxy: the mean (µ ∆ θ ) and standard deviation (σ ∆ θ ) of the distribution that represent the accuracy and precision of the inferred posteriors for the galaxy population.We can infer the population hyperparameters, µ ∆ θ and σ ∆ θ , using a hierarchical Bayesian framework (e.g.Hogg et al. 2010;Foreman-Mackey et al. 2014;Baronchelli et al. 2020).
Let {X i } represent the photometry or spectrum of a galaxy population and η ∆ = {µ ∆ θ , σ ∆ θ } represent the population hyperparameters.Our goal is to constrain η ∆ from {X i }i.e. to infer p(η ∆ | {X i }).We expand θ i is the SPS parameters for galaxy i and p({X i } | {θ i }) is likelihood of the set of observations {X i } given the set of {θ i }.Since the likelihoods for each of the N galaxies, p(X i | θ i ), are not correlated, we can factorize and write the expression above as .The accuracy and precision of galaxy property posteriors from our joint SED modeling of spectrophotometry, quantified using population hyperparameters η ∆ = {µ ∆ θ , σ ∆ θ }, as a function of true galaxy property (green).We derive η ∆ from the posteriors using a Hierarchical Bayesian approach.We plot θ true +µ ∆ θ in solid line and represent σ ∆ θ with the shaded region.We include η ∆ for SED modeling of photometry alone (orange) for comparison.Including DESI spectra significantly improves both the accuracy and precision of the inferred galaxy properties.log SFR p(θ i | X i ) is the posterior for an individual galaxy, so the integral can be estimated using the Monte Carlo samples from the posterior: S i is the number of posterior samples and θ i,j is the j th sample of galaxy i. p(θ i,j | η ∆ ) = p(∆ θ,i,j | η ∆ ) is a Gaussian distribution and, hence, easy to evaluate.p(θ i,j ) = 1 since we use uninformative and Dirichlet priors (Table 1).Finally, we derive the maximum a posteriori (MAP) value of η ∆ by maximizing the p(η ∆ | {X i }) posterior distribution.This type of population inference is a major advantage of inferring full posteriors distributions of the galaxy properties.We discuss the derivation and interpretation of the hyperparameters in more detail in Appendix C.
In Figure 8, we present the accuracy (µ ∆ θ ) and precision (σ ∆ θ ) of our joint SED modeling of spectra and photometry (green) as a function of true galaxy property.µ ∆ θ (solid) and σ ∆ θ (shaded region) are the MAP values of p(η ∆ | {X i }) posterior.In each panel, we derive p(η ∆ | {X i }) for log M * , SFR 1Gyr , log Z MW , t age,MW , and τ ISM in bins of widths 0.2 dex, 0.5 dex, 0.05 dex, 0.5 Gyr, and 0.1, respectively.We only include bins with more than ten galaxies.For comparison, we include η ∆ for SED modeling of photometry alone (orange).We also include η ∆ for log Z MW of galaxies with r fiber > 20 (black dot-dashed) and η ∆ for τ ISM of galaxies without bulges (black dotted), which we discuss later.
In Figure 9, we examine how the accuracy and precision of our galaxy parameter constraints are impacted by signal-to-noise ratio (SNR) or photometric color.We present η ∆ of our joint SED modeling of spectra and photometry as a function of r fiber , r, g − r, and r − z. r fiber and r magnitudes serve as proxies of the SNR for the spectra and photometry, respectively.In each row, we plot η ∆ for a different galaxy property: log M * , SFR 1Gyr , log Z MW , t age,MW and τ ISM (from top to bottom).
Lastly, in Figure 10, we investigate whether there are any underlying dependences in the inferred galaxy properties on the M * -SFR plane.In the top and bottom panels, we present µ ∆ θ and σ ∆ θ in (log M * , log SFR 1Gyr ) bins for log M * , log SFR 1Gyr , log Z MW , t age,MW and τ ISM (left to right).We use log M * bins of width 0.225 dex and log SFR 1Gyr bins of width 0.25 dex for log SFR 1Gyr > 0 dex and 0.5 dex for log SFR 1Gyr < 0 dex.We only show bins with more than 10 galaxies.On the M * − SFR plane, we can examine whether the accuracy and precision of the inferred properties have significant dependencies for galaxy type.
Based on Figures 8, 9, and 10, we draw the following conclusions on the accuracy and precision of the inferred posteriors for each galaxy property: Inferred log M * : Overall, we infer accurate and precise log M * from the PROVABGS SED modeling.There is no significant dependence in µ ∆ θ and σ ∆ θ with true log M * throughout the M * range.We accurately infer the true M * throughout ∼10 9 − 10 12 M with uniform precision of σ ∆ log M * ∼0.1 dex.We also find no significant dependence on SNR -neither r fiber nor r magnitudes significantly affect µ ∆ log M * and σ ∆ log M * .There is a noticeable correlation with g − r and r − z color, which also appears in the M * − SFR plane.However, this correlation is small compared to the precision of our inferred posterior on log M * .When we compare the η ∆ from spectrophotometry to η ∆ from photometry we find that including DESI spectra increases both the accuracy and precision of the constraints, especially at high M * > 10 11 M .
Inferred log SFR 1Gyr : We infer accurate log SFR 1Gyr for galaxies with log SFR 1Gyr > −1 dex with ∼0.1 dex precision.In fact, we find a log SFR 1Gyr ∼ −1 dex lower bound for the inferred log SFR 1Gyr .Below this limit, we significantly overestimate log SFR 1Gyr , consistent with the bias in Figure 7, and the constraints are also significantly broader, σ ∆ log M * ∼0.25 − 0.3 dex.Comparing µ ∆ θ and σ ∆ θ from spectrophotometry versus from only photometry, we confirm that including spectra significantly improves the accuracy and tightens the log SFR 1Gyr constraints.For SFR 1Gyr below log SFR 1Gyr < −1 dex, including spectra reduces the bias ∼1 dex -an order of magnitude. .Accuracy and precision of the galaxy properties inferred from joint SED modeling of spectrophotometry as a function of r fiber , r, g − r, and r − z. r fiber and r magnitudes are proxies for spectral and photometric SNR.From the top to bottom rows, we present η ∆ for log M * , log SFR 1Gyr , log Z MW , t age,MW and τ ISM .We find a significant dependence on spectral SNR in the inferred log Z MW .When the spectral SNR is low (r fiber > 20), the prior on log Z MW imposed by the SPS model dominate the posterior and cause Z MW to be overestimated.We find a significant color dependence on log SFR 1Gyr , log Z MW , and t age,MW .
For log Z MW and t age,MW , the dependence is driven by underlying correlations with spectral SNR and true t age,MW .Meanwhile, log SFR 1Gyr is overestimated for the reddest galaxies with r − z > 0.6, which correspond to quiescent galaxies with log SFR 1Gyr < −1 dex.Otherwise we find no significant dependence on SNR or optical color.
We find no significant correlation between the accuracy and precision of SFR 1Gyr with spectral or photometric SNR.However, there is a more significant color dependence where we overestimate log SFR 1Gyr by µ ∆ log SFR 1Gyr >0.5 dex for the reddest galaxies with g − r > 1.5 and r − z > 0.6.The constraints for these galaxies are also significantly less precise: σ ∆ log SFR 1Gyr ∼ 0.5 dex.The bias is also apparent in Figure 10, where we significantly overestimate SFR 1Gyr for quiescent galaxies.SFR 1Gyr is and precisely constrained for all types of galaxies.log SFR 1Gyr is accurately and precisely constrained for all galaxies except for quiescent galaxies with log SFR 1Gyr < −1 dex.log Z MW is overestimated for star-forming galaxies, due to their overall lower spectral SNR.t age,MW is accurately and precisely constrained for starforming galaxies that have overall younger stellar populations.τ ISM is accurately and precisely constrained for all galaxies except massive star-forming galaxies, which have high true τ ISM .
also slightly underestimated for the most massive (M * > 10 11 M ) star-forming galaxies.These biases are consequences of our SPS model priors.SFR 1Gyr is a derived quantity; hence, the uninformative priors we impose on SPS parameters induce non-uniform priors on them.Our SPS model imposes a prior on log SSFR 1Gyr that is skewed towards the peaks at ∼-10.4 dex (Appendix B, Figure 15).Consequently, the posterior overestimates SFR 1Gyr at low SFR 1Gyr (red, quiescent galaxies) and underestimates SFR 1Gyr at the highest SFR 1Gyr .
Inferred log Z MW : Unlike in Figure 7, η ∆ in Figure 8 clearly reveals the accuracy and precision of the posteriors on log Z MW .We find that µ ∆ θ depends significantly on the true Z MW : inferred log Z MW is overestimated by ∼0.2 dex below log Z MW < −2 dex and slightly underestimated at the highest log Z MW > −1.6 dex.σ ∆ θ ∼ 0.15 dex is uniform throughout the Z MW range.Similar to SFR 1Gyr , the bias in inferred Z MW is a consequence of our SPS model priors.The prior skews log Z MW constraints towards the peak of the prior at log Z MW ∼ −1.5. Figure 8 also includes η ∆ for posteriors derived from photometry alone (orange), which demonstrates that including DESI spectra substantially improves the accuracy of the log Z MW constraints.Including spectra reduces the overall bias on Z MW by ∼0.3 dex.The improvement comes from the likelihood contribution from DESI spectra reducing the relative contribution of the prior on the posterior.This is also why we find that the posteriors overestimate log Z MW at r fiber > 20 in Figure 9.These correspond to mock observations with low spectral SNR where the contribution of the likelihood from the spectra is lower and the prior on log Z MW has a larger effect.The color dependence of µ ∆ θ for Z MW in Figure 9 is also a consequence of this spectral SNR dependence; so is the M * − SFR dependence in Figure 10.If we exclude galaxies with low spectral SNR, both the color and M * − SFR dependences are substantially reduced: for r fiber < 20 galaxies, we infer log Z MW with µ ∆ θ < 0.15 dex and σ ∆ θ ∼ 0.1 (Figure 8; black dot-dashed).The Z MW posteriors further underscore the constraining power of DESI spectra.
Inferred t age,MW : Figure 8 confirms that we derive unbiased and precise constraints on t age,MW out to t age,MW < 8 Gyr.Below this limit, we infer t age,MW with σ ∆ θ ∼0.5 Gyr.For galaxies with older stellar populations above this limit, the log-spaced t lookback binning in our SPS model (Section 3.1) expectedly underestimates t age,MW constraints and produces larger uncertainties (σ ∆t age,MW 1 Gyr).Meanwhile, we find no significant SNR or color dependence in Figure 9.At r − z > 0.6, t age,MW is underestimated, but this is driven by the correlation between r − z and true t age,MW : simulated galaxies with r − z > 0.6 have overall older stellar populations.In Figure 10, we do not find a clear M * − SFR dependence; however, |µ ∆t age,MW | is larger and constraints are significantly less precise for galaxies with older stellar populations below the star-forming sequence.
Inferred τ ISM : Lastly, we find that both the accuracy and precision of our τ ISM depend significantly on the true τ ISM value.The inferred constraints increasingly underestimate τ ISM with lower precision for greater τ ISM .The bias is due to discrepancies between the dust prescriptions of SPS model and the mock observations.First, we use a dust prescription with a different attenuation curve in the SPS model than in the forward model.This places a strict limit on how accurately we can derive τ ISM .We intentially introduce this discrepancy since we do not know the "true" attenuation curve of observed galaxies in practice.Another reason for the biased τ ISM constraints is that we only attenuate the stellar emission in the disk component of the simulated galaxies and not the bulge component (Section 2.2).The true τ ISM is the optical depth for the disk component while our τ ISM constraints correspond to the optical depth of dust attenuation for the entire galaxies, a quantity that will be lower than the true τ ISM depending on how much the bulge contributes to the SED.Given these discrepancies, in this work we are primarily testing whether the PROVABGS SPS modeling can successfully marginalize over the effect of dust and derive robust constraints on the other galaxy properties.
Nevertheless, we find no significant SNR or color dependence on the accuracy and precision of τ ISM constraints (Figure 9).Furthermore, we find unbiased and precise τ ISM constraints for all galaxies except star-forming galaxies above M * > 10 11 M where we underestimate τ ISM .Massive star-forming galaxies in this regime mainly have τ ISM > 1.In Figure 8, we present a more apples-to-apples comparison of the τ ISM constraints, where we present η ∆ for only galaxies without bulge contributions (black dotted).For these galaxies, the bias in our τ ISM constraints is reduced and µ ∆ θ < 0.5 throughout the τ ISM range.Our constraints are still biased, however, due to the discrepant attenuation curves.

Hahn et al.
We emphasize that the primary goal of dust prescription in our SPS model is to marginalize out the effect of dust.Based on the accuracy and precision of the constraints on other galaxy properties, the PROVABGS SPS model achieves this objective.

Impact of Model Priors
The most significant limitation of the PROVABGS SED modeling in inferring the true galaxy properties is the prior on galaxy properties imposed by the model.The effect of such priors is a major limitation for any SED modeling method (e.g.Carnall et al. 2019;Leja et al. 2019) and is a consequence of the fact that galaxy properties are not parameters of the SPS model.For instance, SFR 1Gyr , Z MW , and t age,MW are derived by integrating the SFH and ZH (Eq.17), which are parameterized by β 1 , β 2 , β 3 , β 4 , f burst , t burst , and γ 1 , γ 2 .The uniform and Dirichlet priors on these parameters (Section 3.2 and Table 1) do not translate into uniform priors on SFR 1Gyr , Z MW and t age,MW .Other galaxy properties (e.g.SFH, and ZH) likewise have non-uniform, and undesireable, priors.
One way to address this issue is to choose an SED model parameterization that does not impose extreme priors on galaxy properties and to characterize the priors in detail so that final posteriors can be appropriately interpreted.For the PROVABGS model, we explicitly chose our SFH prescription so that the prior on log SSFR 1Gyr spans the range −12 to −9 dex.Furthermore, we fully characterize the prior on SSFR 1Gyr , Z MW , t age,MW , SFH, and ZH in Appendix B (Figures 15 and 16).This way, we understand exactly how the model prior impacts the derived posteriors as we discuss in Section 4. Beyond mitigating the effect of the priors, we can alternatively impose uniform prior (or any other desired prior distribution) on the derived galaxy properties by adjusting the priors on the SED model parameters.Handley & Millea (2019) recently demonstrated that maximum-entropy priors can be used for this purpose to impose uniform priors on the inferred sum of neutrino masses in cosmological analyses.In an upcoming paper, Hahn (in prep.),I will demonstrate that maximum-entropy priors can also be used in Bayesian SED modeling to correct for the impact of priors on infer posteriors on derived galaxy properties.

Aperture Effects
In this work, we use forward modeled mock observations to demonstrate that we can infer accurate and precise posteriors on certain galaxy properties.The mock observations are constructed from LGal and include photometry and spectra.In the mock spectra, we model the fiber aperture effect i.e. spectra only include light from a galaxy collected within its fiber diameter -by scaling the SED flux (Section 2.4).In our SED modeling, we account for this fiber aperture effect using a normalization factor, f fiber (Section 3.1).Hence, our mock observations and SED modeling have a consistent treatment of the fiber aperture effect.In observations, however, aperture effects can be wavelength dependent (Gerssen et al. 2012;Richards et al. 2016) and if the dependence is strong, an overall f fiber factor would not be sufficient.We examine the wavelength dependence for BGS by comparing the ratio of the fiber aperture flux over total flux, f fiber X /f X , in g, r, and z bands of BGS targets from LS.We find find no significant difference in the flux ratios of the different bands, which suggests that the fiber aperture effect does not have a strong wavelength dependence for BGS galaxies.
Flux calibration performed by the DESI spectral pipeline can also induce wavelength dependent residuals.DESI spectra are measured using three-arm spectrographs that split the spectra into three b, r, and z channels with overlapping wavelength ranges: 3600 − 5930, 5660 − 7720, and 7470 − 9800Å.After flat fielding and sky subtraction, flux calibration is performed on each channel of the spectra by matching physical stellar models to spectra of spectrohotometric standard stars observed in the same exposure (Guy et al. in prep.).Since the calibration is performed for each channel separately, imperfections can imprint a wavelength dependent residual.In a subsequent paper, Ramos et al. (in prep.), we examine the fiber aperture effect and wavelength dependent imprints on DESI spectra using BGS spectra from the DESI Survey Validation data and observations from the Mapping Nearby Galaxies at APO (MaNGA) survey.Using galaxy properties derived using the PROVABGS pipeline for spectra from integrated field unit MaNGA observations, we will present aperture corrections that can be applied on derived BGS galaxy properties.We also note that the PROVABGS SED modeling pipeline already includes flux calibration models beyond a single f fiber and can easily be extended to include more sophisticated models (e.g.Chebyschev polynomial; Carnall et al. 2019;Tacchella et al. 2021).

Stellar Model Choices
In both our PROVABGS SED model and mock observations, we use the MIST isochrones, the combined MILES+BaSeL spectral library, and the Chabrier (2003) IMF.With the same set of choices, our analysis does not consider how different choices for stellar evolution or IMF can affect the inferred galaxy properties.Yet, it is well-established that there are major uncertainties in each of these choices (Conroy et al. 2009;Conroy 2013).For instance, recent observational works suggest that there may be significant variations in IMF (e.g.Treu et al. 2010;van Dokkum & Conroy 2010;Rosani et al. 2018;Sonnenfeld et al. 2019).Different SPS model choices can also significantly impact the derived galaxy propeties (e.g.Ge et al. 2019).We reserve a detailed examination of this effect for future work.In the meantime, for the PROVABGS catalog we will release multiple catalogs each with different sets of choices for isochrone, spectral library, and IMF.

Advantages of PROVABGS
We demonstrate with the mock challenge that we can derive accurate and precise constraints on specific galaxy properties using the PROVABGS SED modeling.The PROVABGS catalog will have a number of key advantages over other value-added galaxy catalogs.First, PROVABGS will provide full Bayesian posteriors on galaxy properties instead of "best-fit" point estimates from maximizing the likelihood.Posterior distributions are essential for accurately estimating uncertainties on galaxy properties.These uncertainties are significant, especially for properties such as Z MW (Figure 7).Ignoring them dramatically overestimates the statistical precision of the derived galaxy properties and can significantly bias any galaxy study.
Furthermore, the PROVABGS posteriors will be derived from MCMC sampling rather than gridbased methods often used in the past (e.g.da Cunha et al. 2008;Moustakas et al. 2013; Boquien et al.With the PROVABGS SPS model, we can infer posteriors on the full star formation and metallicity histories.We present the inferred SFH and ZH for an arbitrarily chosen star-forming (blue) and quiescent galaxy (orange).The shaded region represent the 64 and 95% confidence intervals of the SFH and ZH posteriors.For comparison, we include the true SFH and ZH (dashed).The inferred SFH and ZH show good agreement with the true values; however, similar to the inferred SFR 1Gyr and Z MW , the SFH and ZH are significantly impacted by priors imposed by the SPS model.

2019).
As a result, they can accurately estimate posterior distributions with significant parameter degeneracies or multiple modes (peaks).For instance, in the posterior of Figure 6 we find degeneracies between f burst and {β 1 , β 2 , β 3 , β 4 } and between {γ 1 , γ 2 } and {β 1 , β 2 , β 3 , β 4 }.The posterior is also multi-modal.Accurate estimates of the full posterior distribution are especially important, as they enable the maximum-entropy method, mentioned earlier, to correct for the significant impact of priors on derived galaxy properties.Grid-based methods also scale exponentially with the number of SPS parameters so they quickly become infeasible as the dimensionality of SPS models increase.MCMC, on the other hand, scales approximately linearly with the number of parameters.
In this work, we primarily focus on the following physical properties of galaxies: log M * , log SFR 1Gyr , log Z MW , t age,MW , and τ ISM .The PROVABGS SPS model, however, can constrain galaxy properties beyond these properties.Posteriors on the SPS model parameters can, thus, be used to derive constraints on the SFH and ZH.In Figure 11, we present the inferred SFH and ZH of two simulated galaxies from our LGal sample: a star-forming (blue) and a quiescent galaxy (orange).We mark the 68 and 95% confidence intervals in the shaded regions.For comparison, we include the true SFH and ZH from LGal (dashed).The inferred SFH and ZH is able to generally recover the true histories.We emphasize that current SPS models typically assume constant ZHs that does not vary over time (Carnall et al. 2019;Leja et al. 2019).Hence inferring ZH over time is a key advantage of the PROVABGS SPS model.Similar to the inferred SFR 1Gyr and Z MW , the SFH and ZH constraints are also impacted by the priors imposed by our SPS model (Appendix B, Figure 16).
Another key advantage of PROVABGS is that it will infer galaxy properties from joint SED modeling of photometry and spectra.Our results illustrate the advantages of including spectra in SED modeling.Galaxy spectra provide substantial statistical power for constraining galaxy properties.In addition to tightening constraints overall, their statistical power is essential for mitigating the effect of the model priors.For instance, including spectra in the SED modeling significantly reduces the bias of our Z MW and t age,MW constraints (Figure 8).It also reduces the lower bound on the inferred SFR 1Gyr .In fact, without spectra, we are dominated by priors on SFR 1Gyr and cannot robustly infer galaxy properties of quiescent galaxies with log SFR 1Gyr < 0 dex.

Applications of PROVABGS
PROVABGS will be a value-added galaxy catalog with unprecedented statistical power.With physical galaxy properties of over 10 million DESI BGS galaxies, PROVABGS will provide a transformational galaxy sample to extend previous statistical galaxy studies.For example, we will be able to make the most precise measurement of the stellar mass function (Li & White 2009;Moustakas et al. 2013, SMF), star-forming sequence (Noeske et al. 2007;Curtis-Lake et al. 2021), mass-metallicity relation (Tremonti et al. 2004), or any other summary statistic of galaxy populations.PROVABGS will also include large sample of dwarf galaxies thanks to the faint apparent magnitude limit of BGS.Dwarf galaxies are dark matter dominated and, thus, probe the physics of dark matter; they are also sensitive to star formation feedback and can help distinguish different aspects of galaxy formation (Mao et al. 2021).Galaxy studies examining the galaxy-halo connection can also be extended to exploit the additional statistical power of PROVABGS (e.g.Tinker et al. 2011;Wetzel et al. 2013;Zu & Mandelbaum 2015;Hahn et al. 2017Hahn et al. , 2019)).With detailed galaxy properties, PROVABGS will also enable multiple-tracer galaxy clustering analyses that can circumvent cosmic variance in inferring cosmological parameters (Seljak 2009;McDonald & Seljak 2009;Wang & Zhao 2020).Analyses exploiting new forward modeling approaches, such as Hahn et al. (2021), will also greatly benefit from the statistical power of PROVABGS.
In addition to the applications above, PROVABGS will also unlock applications that can exploit the full posteriors of the probabilistic catalog.In this work, we utilized the posteriors in order to quantify accuracy and precision of galaxy population constraints using population inference with a hierarchical Bayesian approach.This is only the simplest illustration of such an approach.Another application is to use posteriors on M * , p(M * | X i ), to measure p(M * | {X i }) -the probabilistic SMF.With full posteriors, we can probe even the lowest signal-to-noise regime accurately so the SMF will be reliable at the lowest mass end, down to ∼10 7 M (Figure 2).This will constrain the SMF of dwarf galaxies and have important implications for both galaxy evolution and cosmology.
Probabilistic analyses can extend to higher dimensions.Joint posteriors on M * and SFR, p(M * , SFR | X i ) can be used to measure the probabilistic star formation sequence.Since the posteriors reliably estimate the uncertainties and parameter degeneracies, we will more accurately infer the intrinsic width of the SFS, which encodes information about star formation and stellar and AGN feedback in galaxies (Davies et al. 2021).We can even extend the approach to infer the distribution of all galaxy properties given observations, p(θ|{X i }), which would exploit the full statistical power of observations and reveal new trends among galaxy properties.This is only possible with population inference using the posterior distributions of every galaxy.
Population inference also allows us to avoid stacking observations.Stacking makes the strong assumption that galaxies that are grouped together in some e.g.color-space are from a subpopulation Hahn et al. with the same properties.This assumption fails if, for instance, there are contaminants or multiple disparate galaxy subpopulations that are degenerate in color-space and therefore are included in the stack.With all of the applications listed above, PROVABGS will enable us to fully extract the statistical power of >10 million BGS galaxies.

SUMMARY
Over the next five years, DESI will measure spectra for >30 million galaxies, each with optical photometry from the Legacy Surveys.BGS, which will extend out to z ∼ 0.6, will provide a r < 19.5 magnitude-limited sample of ∼10 million galaxies spanning a wide range of galaxy properties with high completeness.It will also include a sample of ∼5 million fainter galaxies down to r < 20.175 selected based on a fiber magnitude and color.This upcoming dataset offers a unique opportunity to leverage its statistical power for galaxy evolution and maximize its scientific impact.Accurate galaxy properties for such a galaxy sample, for instance, would enable us to measure population statistics and empirical relations of galaxies with unprecedented precision.It would also enable more complete and precise comparisons between observations and galaxy formation models, which will shed light into the physical processes of galaxy evolution.To exploit this opportunity, we will construct the PRObabilistic Value-Added Bright Galaxy Survey (PROVABGS) catalog, where we will apply state-of-the-art Bayesian SED modeling to jointly analyze DESI photometry and spectroscopy.PROVABGS will provide full posterior distributions of galaxy properties, such as stellar mass (M * ), star formation rate (SFR), stellar metallicity (Z MW ), and stellar age (t age,MW ), for all >10 million BGS galaxies.
In this work, we present and validate the SED model, Bayesian inference framework, and other methodology that will be used to construct PROVABGS5 .We use 2,123 galaxies in the L-Galaxies semi-analytic model to construct realistic synthetic DESI spectra and photometry.We build SEDs using SPS based on the star formation and chemical enrichment histories of the simulated galaxies.Then, we simulate the SEDs using the forward modeling pipeline used in the BGS survey design.Afterwards, we apply the PROVABGS SED modeling on the mock DESI observations to derive posteriors on M * , SFR 1Gyr , Z MW , and t age,MW .From the posteriors and the population inference we conduct to quantify accuracy and precision, we find: • Overall, we derive posteriors of galaxy properties that are in good agreement with the true properties of the simulated galaxies.Furthermore, with posteriors rather than point estimates we accurately estimate the uncertainties on the galaxy properties.We infer posteriors with the following levels of precision: σ log M * ∼ 0.1 dex, σ log SFR 1Gyr ∼ 0.1 dex, σ log Z MW ∼ 0.15 dex, and σ t age,MW ∼ 0.5 Gyr.Our results also demonstrate that we successfully marginalize over the effect of dust and other nuisance parameters.
• Like any SED model, the PROVABGS SED model imposes significantly non-uniform priors on galaxy properties.We find that these priors impose a lower bound on SFR 1Gyr of SFR 1Gyr > 10 −1 M /yr.It also biases Z MW by ∼0.3 dex for observations with low spectral signal-to-noise and imposes an upper bound of t age,MW < 8 Gyr.We characterize the priors in detail so that constraints on galaxy properties can be interpreted in future studies that use PROVABGS.
• We compare the posteriors derived from DESI spectrophotometry to those derived from photometry alone.Including DESI spectra substantially improves the constraints on galaxy properties.Moreover, jointly analyzing spectra is essential for mitigating the impact of the SED model priors.For example, with photometry alone, the priors impose a more restrictive SFR 1Gyr > 1M /yr lower bound and bias Z MW ∼0.5 dex.
We demonstrate with our mock challenge that we will derive accurate and precise constraints on specific galaxy properties in PROVABGS.Beyond M * , SFR 1Gyr , Z MW , and t age,MW , which we focus on in this work, PROVABGS will also constrain star formation and metallicity histories.With galaxy properties of over >10 million BGS galaxies, current galaxy studies will be able to use the PROVABGS catalog to exploit the statistical power of BGS for the most precise measurements of various galaxy relations.Since the BGS samples span a wide range of galaxies, PROVABGS will also enable galaxy studies to investigate less explored regimes, such as dwarf galaxy populations.
Furthermore, PROVABGS will be a fully probabilistic catalog.With posteriors for all the galaxy properties, we can conduct more rigorous statistical analyses using new techinques such as population inference and hierarchical Bayesian modeling.We demonstrate one such approach in this work by using population inference to estimate the overall accuracy and precision of our galaxy property constraints.These methods will not only improve the accuracy of our analyses but they will also allow us to fully exploit the statistical power of DESI observations.Despite the overall success of the PROVABGS methodologies that we demonstrate, there are some limitations.For instance, we only consider a simple model for the effect of the DESI fiber aperture and flux calibration.A more detailed investigation will be presented in Ramos et al. (in prep.).We also do not consider varying the isochrones, stellar library, or IMF.Instead, we will release multiple versions of PROVABGS with different sets of assumptions.Lastly, we find that the most significant limitation to deriving accurate galaxy properties comes from the prior imposed by the SED model.We will address this limitation and present a method to impose uniform priors on galaxy properties in Hahn (in prep.).
DESI has started its main 5 year operation.Already, as part of survey validation, DESI has collected over 400,000 spectra of BGS galaxies that will be released in the Survey Validation Data Assembly (SVDA).The SVDA release will also be accompanied by papers describing the data reduction pipeline, redshift fitting algorithm, fiber assignment, survey operation and simulations, visual inspection, and target selection for the various tracers.Finally, using BGS observations in the SVDA, we will construct and release the PROVABGS-SV catalog and present the probabilistic stellar mass function measured from it in the subsequent paper.
The entire PROVABGS SED modeling pipeline, including the neural emulators and Bayesian inference framework, is publicly available at: https://github.com/changhoonhahn/provabgs/.All of the software and scripts used in our analysis are publicly available at: https://github.com/changhoonhahn/gqp_mc.We mark the contributions of each of the NMF components in the faded colored lines.The middle and bottom panels compare the spectra obtained from integrating the original and reconstructed SFH and ZH.In this case, the NMF basis offers a good reconstruction of the SFH and ZH, which results in a small residuals in the corresponding spectra.
take the distribution of stellar ages in 400 bins, logarithmically distributed between 8.6 Myrs and 13.65 Gyrs, and compute the stellar mass formed in each bin.For the ZHs, we take the mass-weighted metallicity in each of the bins.Next, the vectors for the SFHs and ZHs are normalized independently i.e. we do not keep information of which ZH corresponds to each SFH.Therefore we do not impose the mass-metallicity relation of the simulation onto our basis vectors (see Thorne et al. 2021 for a parameterization that links SFH with ZH throught he mass-metallicity relation).We take each set of simulated SFHs and ZHs as a reasonable representation of possible SFHs and ZHs in the Universe.Prior to decomposition, each individual vector is smoothed on a scale of 400 Myr, which removes any information on smaller timescales.We decompose the set of SFHs into 4 independent components, and the set of ZHs into 2 independent components.The resulting components are shown in the main text (Figure 5).12 but for another Illustris galaxy.In this case, the NMF basis fails to reproduce a burst of star formation at recent times, leading directly to an underestimation of the luminosity, especially towards the bluer wavelengths.
Figures 12 and 13 show two examples of the NMF direct reconstruction on two galaxies.The two galaxies are chosen as examples of a 'fair' and a 'poor' reconstruction.In all cases the reconstructions can be improved by increasing the number of components, and doing so effectively improves our ability to model shorter timescale features in the SFH and ZHs.In this work, we instead include a stochastic burst component in the SFH (Section 3.1).
In Figure 14, we present how NMF reconstruction projects onto certain derived properties: total stellar mass formed, mass-weighted age, mass-weighted metallicity and mass in young stars (mass formed in the last 200 Myr).Besides the total stellar mass, the other derived properties are impacted by the lack of short timescale features.Our stochastic burst component directly addresses this limitiation.Therefore, the NMF basis can be seen as a reasonable and minimal set to recover the broad shape of the star-formation and metallicity histories, which is complemented in our SPS model by the stochastic burst component.On the left column we present the distribution of each quantity, and on the right column we show direct comparisons in scatter plots.The orange dashed lines shows the one-to-one line.Total stellar mass is well recovered, as expected given its lack of sensitivity to smaller bursts.Mass-weighted ages are poorly recovered at young and old ages, as a direct consequence of the lack of resolution of our basis.The mass-weighted metallicity is well recovered on the mean, though with a large scatter.The mass formed in young stars is again affected by the lack of resolution of our basis.In our SPS model, we include a stochastic burst component to account for this limitation (Section 3.1).

B. SPS MODEL PRIORS
SED models impose undesireable non-uniform priors on galaxy properties that significantly impact their ability to infer unbiased galaxy properties such as SFR 1Gyr , Z MW , and t age,MW .For the PROVABGS SED modeling we present in this work, model imposed priors place a lower bound on SFR 1Gyr , bias Z MW for observations with low spectral SNR, and place an upper bound of t age,MW < 8 Gyr.Given their significant impact, we quantify and characterize the model imposed prior in further detail below.
Model imposed priors are a consequence of the fact that many of the galaxy properties of interest are not explicit parameters of the SPS model.Out of the properties we focus on in this work, only M * is a parameter in our SPS model.M * determines the overall amplitude of the SED.Meanwhile, specific-SFH [yr −1 ] z = 0.1 Figure 16.Priors imposed by our SPS model on the specific-SFH (sSFH) for galaxies at z = 0.1.We represent 68, 95, 99.7% of the sSFH distribution with the shaded regions (dark to light).Our SPS model imposes a prior on sSFH that is asymmetric and peaked at ∼5 × 10 −10 yr −1 .For observations where the likelihood distribution is diffuse (e.g.low SNR), the inferred SFH will be significantly skewed towards the peak of the distribution.Overall, this prior will cause the inferred SFHs to be flatter over t lookback and skew towards the peak as we see in the comparison between the inferred and true SFHs in Figure 11.Any analysis of SFHs based on SED modeling must take account the effect of model imposed priors.
We illustrate and quantify the model imposed priors on the galaxy properties in Figure 15.We present the probability distribution of the priors on log M * , log SSFR 1Gyr , log Z MW , and t age,MW for galaxies at z = 0.1.The distribution is derived by first sampling SPS parameters from prior specified in Table 1, θ ∼ p(θ).Then for each θ , the galaxy properties are derived using Eq. 17.We present log SSFR 1Gyr = log(SFR 1Gyr /M * ) instead of log SFR 1Gyr to remove the correlation with M * .The contours mark the 68 and 95% of the distribution.We note that the prior distribution depends on redshift since it determines t age .The dependence is relatively small over the BGS z range so we only show z = 0.1 for simplicity.
We confirm that the prior on log M * is uniform as we specify in Table 1.For the other parameters, however, the model imposed prior is not uniform.For SSFR 1Gyr , the prior spans −13.5 < log SSFR 1Gyr < −9; however, it skews toward the primary peak at log SSFR 1Gyr = −10.4.The secondary peak near −9 dex is a consequence of the starburst component that we include in the SFH.By definition SSFR 1Gyr cannot exceed 10 −9 yr −1 .For log Z MW and t age,MW , the priors are also skewed distributions that peak near -1.6 dex and 6 Gyr, respectively.Furthermore, for t age,MW , the prior reveals the imprint of the log-spaced t lookback bins (see t age,MW versus log SSFR 1Gyr panel).As we discuss in the main text, the shape of the model imposed priors on SSFR 1Gyr , Z MW , and t age,MW explains the limitations of the posteriors we derive from our SED modeling.
In addition to the galaxy properties above, we also characterize the model imposed prior on specific-SFH (sSFH) in Figure 16.The sSFH is the SFH normalized by total stellar mass.The shaded regions represent 68, 95, 99.7% of the SFH distribution (dark to light).We show the prior for galaxies at z = 0.1.Throughout the t lookback range, the sSFH prior is asymmetric and peaks at ∼5 × 10 −10 yr −1 .Since this prior is implicitly included, the SFH posterior will also be skewed towards this sSFH peak depending on the relative amplitude and width of the likelihood distribution.In other words, the inferred SFH will generally be flatter as a function of t lookback than the true SFH.We can see this effect in Figure 11.For the star-forming galaxy with a relatively flat SFH at intermediate values, the inferred SFH is in good agreement with the true SFH.However, for the quiescent galaxy, which has high SFRs at early times (t lookback > 6 Gyr) and low SFRs at late times (t lookback < 2 Gyr), the inferred SFH is flatter and skewed towards intermediate values.We note that the prior on SFH is similar to the priors on SFHs by various nonparametric SPS models in Leja et al. (2019).Any detailed analysis of SFHs (e.g.quenching timescale or star formation variable) based on SED modeling must take the impact of model imposed priors on SFH into account or taken with a grain of salt.
We emphasize that all SPS models impose undesirable priors on derived galaxy properties.And any deviation of the priors on galaxy properties from a uniform distribution impacts the posteriors on the galaxy properties.Galaxy properties derived from SED modeling must, therefore, characterize and account for the priors imposed on them by the model for unbiased and accurate analyses.In this appendix, we characterize the model imposed priors of our PROVABGS SED model for the main galaxy properties that we explore in this work.This allows us to interpret the posteriors of galaxy properties for PROVABGS and qualitatively disentangle the effect of the prior.Upcoming work in Hahn in prep.will demonstrate that maximum-entropy priors can be used to substantially mitigate the impact of model impose priors on the posteriors of galaxy properties (see Section 5).

C. POPULATION INFERENCE
We quantify the accuracy and precision of the inferred galaxy properties from our SED modeling using population hyperparameters η ∆ = {µ ∆ θ , σ ∆ θ } (Section 4).These hyperparameters describe the distribution of the difference between the inferred and true parameters, ∆ θ , assuming that the distribution has a Gaussian functional form (Eq. 18).The η ∆ values we present in this work are MAP estimates of p(η ∆ |{X i }), the probability distribution of η ∆ given some galaxy population observations.They are inferred using population inference as described in the main text and Eqs 19 -24.Our approach for quantifying the accuracy and precision has a number of key advantages over other methods.For instance, a naive way to quantify the accuracy and precision would be to estimate the median and standard deviations of individual posteriors then averaging them.This assumes that each individual posterior is close to a Gaussian.As we later demonstrate, this is an incorrect assumption that reduces the posterior distribution to point estimates.Another approach would be to stack the posteriors by summing up all of the individual posteriors.Neither of these approaches mathematically estimate the distribution we are actually interested in estimating: p(η ∆ |{X i }).Moreover, both approaches are biased.Malz & Hogg (2020) recently demonstrated this in the context of combining photometric redshift posteriors.
We illustrate the population inference approach in Figure 17 where we present the distribution of log M * described by the accuracy and precision hyperparameters derived for galaxies with 10.6 < M * ∼ 10 10.7 M N (µ ∆ θ , σ ∆ θ ) Figure 17.The log M * distribution described by the accuracy and precision hyperparameters for galaxies with 10.6 < log M * < 10.8: N (10.7 + µ ∆ θ , σ ∆ θ ) (black dashed).The hyperparameters are MAP estimates of p(µ ∆ θ , σ ∆ θ |{X i }) derived from population inference (Eq.19-24).We include individual log M * posteriors of several galaxies with M * ∼ 10 10.7 M for comparison.The individual posteriors have a wide variety of shapes, which can bias naive estimates of their accuracy and precision.The comparison illustrates that N (µ ∆ θ , σ ∆ θ ) provides a robust estimate of the overall accuracy and precision of the inferred posteriors.log M * < 10.8: N (µ ∆ θ + 10.7, σ ∆ θ ) (black dashed).For comparison, we plot posteriors of log M * for several individual galaxies with log M * ∼ 10.7.There is significant variation in the individual posteriors and many of them are not well described by a Gaussian distribution.This variation is an expected consequence of noise in the observables and MCMC sampling.We note that estimating the accuracy and precision by stacking the posteriors, for instance, significantly underestimates the precision.Meanwhile, the accuracy and precision hyperparameters capture the overall accuracy and precision of the individual posteriors.

Figure 2 .
Figure 2. Stellar mass (M * ) distribution as a function of redshift of the r < 19.5 magnitude-limited BGS Bright sample (orange) as predicted by the MXXL simulation.We include the M * distribution of MXXL galaxies with r < 20 (blue) for reference.Many such fainter galaxies will be included in the BGS Faint sample, which will observe galaxies as faint as r < 20.175.BGS will observe a broad range of galaxies with high completeness and provide galaxy samples with unprecedented statistical power.This includes a large sample of <10 9 M dwarf galaxies.We will apply state-of-the-art Bayesian SED modeling to all BGS galaxies to construct the PRObabilistic Value-Added BGS (PROVABGS) catalog, which will unlock more sophisticated statistical approaches for galaxy evolution studies.

Figure 4 .
Figure4.We construct simulated DESI spectra (solid) for LGal simulated galaxies by applying a fiber aperture correction to the SED (dashed) and a realistic DESI noise model.We apply a fiber aperture correction by scaling down the full SED (dotted) by the r-band fiber fraction derived from LS imaging.The noise model accounts for the DESI spectrograph response and the bright time observing conditions of BGS(Hahn et al. in prep., Schlafly et al. in prep.).We represent the spectra from the b, r, and z arms of the DESI spectraphs in blue, orange, and green respectively.Our forward model produces realistic DESI-like spectra that accurately reproduce the noise levels and characteristics of actual BGS spectra.

Figure 5 .
Figure5.Non-negative matrix factorization basis functions for the SFH (left) and ZH (right) used in the non-parametric SFH and ZH prescriptions of our SPS model.These basis functions are derived from the SFHs and ZHs of simulated galaxies in the Illustris cosmological hydrodynamic simulations.With the NMF basis functions, we can reproduce the wide range of SFHs and ZHs of Illustris galaxies (Appendix A).

Figure 6 .
Figure6.Top: Posterior probability distribution of our 12 SPS model parameters derived from joint SED modeling of the mock DESI photometry and spectrum.The contours mark the 68 and 95% percentiles.We use a Gaussian likelihood and the prior specified in Table1to evaluate the posterior and sample the distribution using ensemble slice MCMC.With our Bayesian SED modeling approach, we accurately quantify uncertainties and capture complexities ( e.g.parameter degeneracies and multimodality) in the posterior distribution.Bottom: We compare the best-fit model observables (orange) to the mock observations (black).We find excellent agreement for both the LS photometry (left) and the DESI spectrum (right).

Figure 7 .
Figure 7.Comparison between the true galaxy properties, θ true , and those inferred from SED modeling of mock observations, θ.From the left to right columns, we compare log M * , log SFR 1Gyr , log Z MW , t age,MW and τ ISM .The inferred galaxy properties are derived from SED modeling of mock spectra (top), photometry (middle), and spectrophotometry (bottom).For each simulated galaxy, we plot 10 samples drawn from the marginalized posterior of θ.We also include violin plots, whose widths represent the marginalized posteriors, for a handful of randomly selected galaxies.The posteriors demonstrate that, overall, we can derive accurate and precise constraints on certain galaxy properties from joint SED modeling of DESI photometry and spectra.
Figure8.The accuracy and precision of galaxy property posteriors from our joint SED modeling of spectrophotometry, quantified using population hyperparameters η ∆ = {µ ∆ θ , σ ∆ θ }, as a function of true galaxy property (green).We derive η ∆ from the posteriors using a Hierarchical Bayesian approach.We plot θ true +µ ∆ θ in solid line and represent σ ∆ θ with the shaded region.We include η ∆ for SED modeling of photometry alone (orange) for comparison.Including DESI spectra significantly improves both the accuracy and precision of the inferred galaxy properties.log SFR 1Gyr , log Z MW , and t age,MW constraints are significantly impacted by priors imposed by the SPS model (Appendix B).Discrepancies in the dust prescriptions between our SPS model and the mock observations drive the bias in τ ISM .Nevertheless, we accurately and precisely infer: log M * for all M * , log SFR 1Gyr above log SFR 1Gyr > −1 dex, and t age,MW below 8 Gyr.
Figure9.Accuracy and precision of the galaxy properties inferred from joint SED modeling of spectrophotometry as a function of r fiber , r, g − r, and r − z. r fiber and r magnitudes are proxies for spectral and photometric SNR.From the top to bottom rows, we present η ∆ for log M * , log SFR 1Gyr , log Z MW , t age,MW and τ ISM .We find a significant dependence on spectral SNR in the inferred log Z MW .When the spectral SNR is low (r fiber > 20), the prior on log Z MW imposed by the SPS model dominate the posterior and cause Z MW to be overestimated.We find a significant color dependence on log SFR 1Gyr , log Z MW , and t age,MW .For log Z MW and t age,MW , the dependence is driven by underlying correlations with spectral SNR and true t age,MW .Meanwhile, log SFR 1Gyr is overestimated for the reddest galaxies with r − z > 0.6, which correspond to quiescent galaxies with log SFR 1Gyr < −1 dex.Otherwise we find no significant dependence on SNR or optical color.

Figure 10 .
Figure10.Accuracy and precision of the galaxy properties inferred from joint SED modeling of spectrophotometry as a function of the galaxies' true M * and SFR 1Gyr .We present µ ∆ θ (top) and σ ∆ θ (bottom) in (M * , SFR 1Gyr ) bins for log M * , log SFR 1Gyr , log Z MW , t age,MW and τ ISM from left to right.log M * is accurately and precisely constrained for all types of galaxies.log SFR 1Gyr is accurately and precisely constrained for all galaxies except for quiescent galaxies with log SFR 1Gyr < −1 dex.log Z MW is overestimated for star-forming galaxies, due to their overall lower spectral SNR.t age,MW is accurately and precisely constrained for starforming galaxies that have overall younger stellar populations.τ ISM is accurately and precisely constrained for all galaxies except massive star-forming galaxies, which have high true τ ISM .
Figure11.With the PROVABGS SPS model, we can infer posteriors on the full star formation and metallicity histories.We present the inferred SFH and ZH for an arbitrarily chosen star-forming (blue) and quiescent galaxy (orange).The shaded region represent the 64 and 95% confidence intervals of the SFH and ZH posteriors.For comparison, we include the true SFH and ZH (dashed).The inferred SFH and ZH show good agreement with the true values; however, similar to the inferred SFR 1Gyr and Z MW , the SFH and ZH are significantly impacted by priors imposed by the SPS model.

Figure 12 .
Figure12.The original and NMF-reconstructed SFHs (top left) and ZHs (top right) of a galaxy in the Illustris simulation.The original SFH is shown after smoothing on a scale of 400 Myr.We mark the contributions of each of the NMF components in the faded colored lines.The middle and bottom panels compare the spectra obtained from integrating the original and reconstructed SFH and ZH.In this case, the NMF basis offers a good reconstruction of the SFH and ZH, which results in a small residuals in the corresponding spectra.

Figure 14 .
Figure14.Comparison of original versus NMF reconstructed total stellar mass formed, mass-weighted age, mass-weighted metallicity, and mass formed in the last 200 Myr (from top to bottom).On the left column we present the distribution of each quantity, and on the right column we show direct comparisons in scatter plots.The orange dashed lines shows the one-to-one line.Total stellar mass is well recovered, as expected given its lack of sensitivity to smaller bursts.Mass-weighted ages are poorly recovered at young and old ages, as a direct consequence of the lack of resolution of our basis.The mass-weighted metallicity is well recovered on the mean, though with a large scatter.The mass formed in young stars is again affected by the lack of resolution of our basis.In our SPS model, we include a stochastic burst component to account for this limitation (Section 3.1).
Hahn et al.

Table 1 .
Parameters of the PROVABGS SPS model and their priors used for joint SED modeling of DESI photometry and spectroscopy.
The accompanying data used in this work, including the mock DESI ob-servations and posteriors derived from PROVABGS, is available at: https://doi.org/10.5281/zenodo.5910635.