The following article is Open access

Deep Realistic Extragalactic Model (DREaM) Galaxy Catalogs: Predictions for a Roman Ultra-deep Field

Published 2022 February 24 © 2022. The Author(s). Published by the American Astronomical Society.
Citation: Nicole E. Drakos et al 2022 ApJ 926 194. DOI: 10.3847/1538-4357/ac46fb

0004-637X/926/2/194

Abstract

In the next decade, deep galaxy surveys from telescopes such as the James Webb Space Telescope and Roman Space Telescope will provide transformational data sets that will greatly enhance the understanding of galaxy formation during the epoch of reionization (EoR). In this work, we present the Deep Realistic Extragalactic Model (DREaM) for creating synthetic galaxy catalogs. Our model combines dark matter simulations, subhalo abundance matching, and empirical models, and includes galaxy positions, morphologies, and spectral energy distributions. The resulting synthetic catalog extends to redshifts z ∼ 12 and galaxy masses ${\mathrm{log}}_{10}(M/{M}_{\odot })=5$, covering an area of 1 deg$^2$ on the sky. We use DREaM to explore the science returns of a 1 deg$^2$ Roman ultra-deep field (UDF), and to provide a resource for optimizing ultra-deep survey designs. We find that a Roman UDF to ∼30 $m_{\mathrm{AB}}$ will potentially detect more than $10^6$ galaxies with $M_{\mathrm{UV}} < -17$, with more than $10^4$ at redshifts z > 7, offering an unparalleled data set for constraining galaxy properties during the EoR. Our synthetic catalogs and simulated images are made publicly available to provide the community with a tool to prepare for upcoming data.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The basic picture of galaxy formation is well established. Galaxies first form within the gravitational potential wells of dark matter halos, and continue to grow through the accretion of surrounding matter. Galaxies eventually produce enough radiation to ionize the intergalactic medium (IGM)—an epoch called reionization (for a review, see Robertson et al. 2010; Stark 2016). Constraints from a variety of probes, including the cosmic microwave background (CMB; e.g., Planck Collaboration et al. 2020) and quasar absorption lines (e.g., Becker et al. 2015), indicate that reionization happens between z = 6 and z = 9. If galaxies dominate the contribution of photoionizing radiation, the cosmic star formation rate density (CSFRD) provides a measure of the photoionizing rate. Observing high-redshift galaxies, to study galaxy formation and their role in reionization, requires very deep imaging.

Extragalactic ultra-deep surveys such as the Hubble Ultra-Deep Field (e.g., Beckwith et al. 2006) and Hubble Frontier Fields (HFFs; e.g., Lotz et al. 2017) have detected galaxies to magnitudes $m_{\mathrm{AB}}$ ∼ 30 and have begun to measure galaxy properties out to redshifts of z ∼ 10. However, there are still many open questions at these high redshifts, including the emergence of quiescent galaxies (QGs), the evolution of the UV luminosity function (UVLF; e.g., Bouwens et al. 2021), and the exact timeline and mechanism of cosmic reionization (e.g., Bunker et al. 2004; Finkelstein et al. 2012; Robertson et al. 2015). Upcoming telescopes, including the James Webb Space Telescope (JWST) and the Nancy Grace Roman Space Telescope (Roman), will produce a large influx of data in the coming years that will greatly advance our understanding of galaxy evolution in the epoch of reionization (EoR). Given that Roman is scheduled to launch in a few years, the purpose of this paper is to examine the science returns of an ultra-deep survey with Roman.

The main advantage of Roman compared to other space telescopes is its wide field of view (FOV); the Roman Wide-Field Instrument (WFI) FOV is more than 100× larger than those of the Hubble Space Telescope's (HST's) Wide Field Camera 3 (WFC3) and JWST's NIRCam. This large area will increase the number of detected galaxies, discover bright and rare sources, reduce cosmic variance, and probe the environment around galaxies and active galactic nuclei (AGNs) at unprecedented redshifts. As outlined in Koekemoer et al. (2019), a potential Roman ultra-deep field (UDF) survey could cover ∼1 deg$^2$ and image to $m_{\mathrm{AB}}$ ∼ 30 in ∼600 hr of exposure time per filter. This survey would elucidate the properties of the dominant ionizing sources at the time of reionization, allow tests for variations in the high-z faint-end slope of the UVLF with environment, and likely provide the first galaxy clustering constraints at early times for faint galaxies.

A prerequisite to understanding in detail what a Roman UDF will be able to detect is accurate modeling of the expected observations. In particular, synthetic galaxy catalogs are useful for predicting the science returns of an upcoming survey, testing analysis tools, and identifying potential observational biases (e.g., Williams et al. 2018; Korytov et al. 2019; Yung et al. 2019a; Behroozi et al. 2020; Somerville et al. 2021). To make accurate predictions, the quality and complexity of synthetic observations need to increase with expanding theoretical and observational knowledge. This work presents the Deep Realistic Extragalactic Model (DREaM), a model for generating synthetic galaxies out to redshifts past the EoR. DREaM accurately reproduces a wide range of theoretical and observational trends, including stellar mass functions (SMFs) and the CSFRD. We use DREaM to create synthetic data for a potential Roman UDF, to provide synthetic catalogs for the community to help develop analysis and pipeline tools, and to quantify the potential scientific returns of a Roman UDF.

To accurately capture the environment around each galaxy, we begin with dark matter simulations, and then create an observed lightcone by stitching together the discrete simulation outputs. The outline of this paper is as follows: an overview of DREaM is given in Section 2. The underlying dark matter simulations and the method for assigning galaxies to dark matter halos are given in Section 3, and the process of creating the observed lightcone is presented in Section 4. The galaxy morphologies and spectral energy distributions (SEDs) are assigned as outlined in Sections 5 and 6, respectively. The resulting star formation history (SFH) of the universe is discussed in Section 7, and preliminary predictions for the science returns of a 1 deg$^2$ UDF with Roman are given in Section 8. Finally, the implications of this study and future work are discussed in Section 9.

2. Overview of Methods

While knowledge of galaxy distributions and properties comes primarily from observations, dark matter structure is typically studied through numerical simulations. The galaxy–halo connection describes how visible galaxies relate to the underlying dark matter structure (for a recent review, see Wechsler & Tinker 2018). Many different approaches have been used to link halo properties to galaxies, including hydrodynamical simulations (e.g., Katz & Gunn 1991; Katz 1992; Vogelsberger et al. 2014), semianalytic models (SAMs; e.g., White & Frenk 1991; Kauffmann et al. 1993; Somerville & Primack 1999; Guo et al. 2013), halo occupation distribution (HOD) models (Jing et al. 1998; Berlind & Weinberg 2002; Wechsler et al. 2002), abundance matching (AM; e.g., Kravtsov et al. 2004; Conroy et al. 2006; Vale & Ostriker 2006), and machine learning (e.g., Jo & Kim 2019; Moster et al. 2020; Wechsler et al. 2021). These different techniques range from physically driven, computationally expensive approaches to empirical models designed to reproduce known observational trends.

More physically based approaches are theoretically more predictive than empirical models. However, both hydrodynamical simulations and SAMs typically struggle to reproduce observed trends to the same accuracy as empirical models (which match observations by construction). Therefore, to successfully reproduce observational trends (including luminosity functions, SFHs, and galaxy clustering), synthetic galaxy catalogs most commonly use empirical models that place galaxies in dark matter structure using AM-related methods (e.g., Moster et al. 2018; Behroozi et al. 2019, 2020; DeRose et al. 2019, 2021) or HOD models (e.g., van den Bosch et al. 2005; Zu & Mandelbaum 2015).

For the synthetic catalog presented in this work, we begin with dark matter simulations, and then use subhalo abundance matching (SHAM) to model the galaxy–halo connection, as outlined in Section 3. The main advantages to this approach are that by beginning with dark matter simulations, we can accurately capture the large-scale structure, and provide host dark matter halo properties for each galaxy. SHAM methods reproduce the proper SMFs by construction, and are known to reproduce the spatial distribution of galaxies in the local universe (e.g., Kravtsov et al. 2004; Reddick et al. 2013; Lehmann et al. 2017).

To simulate a 1 deg$^2$ survey that extends past z ∼ 10, we use a simulation volume with sides of co-moving length 115 $h^{-1}$ Mpc. Figure 1 shows the co-moving transverse size of a patch of sky covering 1 deg$^2$ as a function of redshift, and the corresponding volume of the survey. To reach the desired depth of the synthetic realization, we tile 60 boxes in the line-of-sight direction, as detailed in Section 4.1.

Figure 1. Co-moving transverse size, L, and volume, V, of a survey with a square 1 deg$^2$ FOV. We calculate distances assuming a Planck 2018 cosmology (Planck Collaboration et al. 2020). A redshift of z = 10 corresponds to a co-moving size of ∼115 $h^{-1}$ Mpc in the transverse direction.
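The transverse size quoted in the caption can be checked with a short numerical integration. The sketch below is illustrative only: it assumes a flat ΛCDM model with the Planck 2018 values of Ω0 and ΩΛ given in Section 3.1, neglects radiation, and uses hypothetical function names (`comoving_distance`, `transverse_size`) that are not part of any DREaM pipeline.

```python
import math

# Planck 2018 parameters quoted in Section 3.1 (flat LambdaCDM; radiation neglected).
OMEGA_M = 0.3111
OMEGA_L = 0.6889
HUBBLE_DISTANCE = 2997.92458  # c / H0 in h^-1 Mpc


def comoving_distance(z, steps=2000):
    """Line-of-sight co-moving distance to redshift z in h^-1 Mpc (Simpson's rule)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)

    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return HUBBLE_DISTANCE * total * h / 3.0


def transverse_size(z, angle_deg=1.0):
    """Co-moving transverse size subtended by angle_deg at redshift z (flat universe)."""
    return comoving_distance(z) * math.radians(angle_deg)


if __name__ == "__main__":
    print(f"D_C(z=10)      = {comoving_distance(10.0):.0f} h^-1 Mpc")
    print(f"L(1 deg, z=10) = {transverse_size(10.0):.1f} h^-1 Mpc")
```

Under these assumptions the 1 deg transverse size at z = 10 comes out within a few percent of the 115 $h^{-1}$ Mpc box size, which is why a single box width suffices in the transverse direction out to roughly that redshift.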

When constructing synthetic observations from simulations, the discrete time snapshots of the simulations need to be related to the observable sky, in which the distance of the object corresponds to its observed time. This relation between simulation data and the observable sky can be achieved by creating a lightcone. Our lightcone pipeline is described in detail in Section 4, but, in brief, given merger histories for each halo, we calculate if, and at which time, each halo crosses the observer's past lightcone. If the position of the galaxy on the lightcone falls within the survey volume, the galaxy is included in our synthetic catalog.

After creating the lightcone, we assign galaxy morphological properties and SEDs in a manner similar to the phenomenological model from Williams et al. (2018). Details of these procedures are outlined in Sections 5 and 6. We use the publicly available flexible stellar population synthesis (FSPS; Conroy et al. 2009; Conroy & Gunn 2010) code to generate galaxy SEDs and calculate galaxy fluxes in each of the proposed Roman filters.

A summary of the methods used to generate the synthetic catalogs is shown in Figure 2. Section 3.1 describes the underlying dark matter simulation and creation of the halo catalog. Sections 3.2 and 4 outline the SHAM and lightcone procedures, respectively. We assign morphologies and SEDs to the galaxies as described in Sections 5 and 6, respectively. Finally, Section 8 outlines the Roman photometry.

Figure 2. Overview of methods used to create the Roman DREaM galaxy catalog, with the corresponding paper sections labeled. The galaxy catalog is based on a dark matter simulation. SHAM is used to assign galaxy masses to each dark matter halo, and then galaxy properties are assigned using empirical relations. Data products are shown with light blue rectangles, and pipelines are shown with dark blue ovals. The SED pipeline is shown in more detail in Figure 9.

3. Generating Synthetic Galaxies

As described above, we begin our catalog by running dark matter only simulations, and then use SHAM to assign stellar masses, resulting in halo catalogs with corresponding galaxy masses for every simulation time output. This section describes the underlying dark matter simulations, halo catalogs, and SHAM procedure. In the following section, we use these galaxies to create a lightcone realization of the catalog. We treat host halos and their subhalos (i.e., halos that exist inside a larger halo as a self-bound structure) separately, and use the term "halo" to refer to both host and subhalos.

3.1. Dark Matter Simulation

We use a Planck 2018 cosmology (Planck Collaboration et al. 2020; $\Omega_b = 0.04893$, $\Omega_0 = 0.3111$, $\Omega_\Lambda = 0.6889$, $H_0 = 67.66$ km/s/Mpc, $\sigma_8 = 0.8102$, and $n_s = 0.9665$), with a box size of 115 $h^{-1}$ Mpc and $N = 2048^3$ particles. This corresponds to a particle mass of $1.5 \times 10^{7}\,M_\odot\,h^{-1}$. The initial conditions are created using Music (Hahn & Abel 2011), and the simulations are run in Gadget-2 (Springel 2005), with a softening length of 1.13 co-moving $h^{-1}$ kpc. The simulation outputs are shown in Figure 3. We generate halo catalogs and merger trees using Rockstar (Behroozi et al. 2013a) and Consistent Trees (Behroozi et al. 2013b). We define halo masses using the Bryan & Norman (1998) virial definition. Halos are required to have at least 20 particles, which corresponds to a minimum mass of $10^{8.50}\,M_\odot\,h^{-1}$.
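The particle mass and the 20-particle halo mass limit follow directly from the box size, particle number, and matter density. A minimal sketch of that arithmetic is given below; it assumes that all matter (Ω0, baryons included) is carried by the simulation particles, as is usual for dark-matter-only runs, and the function name `particle_mass` is introduced here purely for illustration.

```python
import math

RHO_CRIT = 2.77536627e11  # critical density today in h^2 Msun / Mpc^3


def particle_mass(omega_m, box_size, n_per_side):
    """Mass per particle in h^-1 Msun for a cubic box of side box_size (h^-1 Mpc)."""
    return omega_m * RHO_CRIT * box_size ** 3 / n_per_side ** 3


m_p = particle_mass(0.3111, 115.0, 2048)  # values quoted in the text
m_min = 20 * m_p                          # 20-particle halo mass threshold

if __name__ == "__main__":
    print(f"particle mass     = {m_p:.2e} h^-1 Msun")
    print(f"20-particle limit = 10^{math.log10(m_min):.2f} h^-1 Msun")
```

The result is consistent with the quoted particle mass of about $1.5 \times 10^{7}\,M_\odot\,h^{-1}$ and a minimum halo mass close to $10^{8.5}\,M_\odot\,h^{-1}$.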

Figure 3. Cosmological N-body simulation with a box size of 115 $h^{-1}$ Mpc. The figure shows the projected mass density at various redshifts (as labeled), in a 7 $h^{-1}$ Mpc slice in the radial direction. These simulations serve as the basic input for finding halos and modeling the galaxy population.

3.2. Galaxy Masses

Given the halo catalogs constructed in Section 3.1, we assign a galaxy stellar mass, $M_{\mathrm{gal}}$, to each halo. The SHAM procedure we use can be expressed mathematically as:

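$$\int_{M_{\mathrm{gal}}}^{\infty} \phi(M'_{\mathrm{gal}}, z)\,dM'_{\mathrm{gal}} = \int_{x}^{\infty} n(x', z)\,dx' \qquad (1)$$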

where ϕ(Mgal, z) is the SMF (i.e., the co-moving number density of galaxies per unit galaxy mass, per unit redshift), and n(x) is the number density of halos as a function of some mass proxy, x.

In the simplest formulation of SHAM, galaxy mass is directly mapped to halo mass. However, subhalo properties at the time of accretion are a better indicator of galaxy mass, since as subhalos undergo tidal stripping, they lose a significant amount of dark matter before the galaxy is disrupted (e.g., Conroy et al. 2006; Vale & Ostriker 2006). There are many different choices of mass proxies, and each of these choices produces slightly different spatial distributions of galaxies. We adopt the peak of the maximum circular velocity over the entire merger history, Vpeak, as the halo mass proxy, because SHAM with Vpeak as the mass proxy is known to reproduce the small-scale clustering of galaxies (Hearin et al. 2013; Reddick et al. 2013; Lehmann et al. 2017; Campbell et al. 2018).

We use the SMFs from Williams et al. (2018), which are continuously evolving double-Schechter functions for both star-forming and QGs out to redshifts z > 10. By construction, these SMFs match observational constraints at low redshifts (z ≤ 4), and reproduce known UVLFs when convolved with the distribution of rest-frame UV magnitudes, MUV, i.e.,

Equation (2)

where ${ \mathcal N }[{M}_{\mathrm{UV}},{\bar{M}}_{\mathrm{UV}}({M}_{\mathrm{gal}},z),{\sigma }_{\mathrm{UV}}]$ is a normal distribution centered on the average UV magnitude ${\bar{M}}_{\mathrm{UV}}$, and ${\sigma }_{\mathrm{UV}}$ is the scatter in the ${M}_{\mathrm{gal}}$–${\bar{M}}_{\mathrm{UV}}$ relation. Williams et al. (2018) provided a relation to describe the ${M}_{\mathrm{gal}}$–${\bar{M}}_{\mathrm{UV}}$ relation and scatter, which we use to model our galaxies (see Section 6.4).

For redshifts z ≤ 4, Williams et al. (2018) fit the double-Schechter function to the data from Tomczak et al. (2014). For QGs, above z ∼ 4, the Schechter parameters are extrapolated to higher redshifts. This extrapolation is in agreement with the few constraints for QGs at z > 3.5. For star-forming galaxies (SFGs), the high-redshift SMFs are inferred from UVLFs. Specifically, Williams et al. (2018) used data from Bouwens et al. (2015) for 4 ≤ z ≤ 8 and from Oesch et al. (2018) at z = 10 to fit the SMF in Equation (2). Beyond z ∼ 10, the SMFs for the SFGs are extrapolated.

We perform SHAM on the halo catalogs using the SMFs described above. Specifically, given N halos, we first find the minimum galaxy mass, ${M}_{\min }$, such that ${N}_{\mathrm{gal}}(\gt {M}_{\min })=N$, where

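$$N_{\mathrm{gal}}(>M_{\min}) = V\int_{M_{\min}}^{\infty} \phi(M_{\mathrm{gal}}, z)\,dM_{\mathrm{gal}} \qquad (3)$$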

and V is the co-moving volume of the simulation. We then sample N galaxy masses from the SMF above ${M}_{\min }$. Finally, we rank-order the galaxy masses and assign galaxy masses to halos (such that the halo with the largest Vpeak value gets the largest galaxy mass).
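The final rank-ordering step can be illustrated with a few lines of code. This is a minimal sketch, not the DREaM implementation: it assumes the galaxy masses have already been sampled from the SMF above ${M}_{\min }$, and the function name `abundance_match` is hypothetical.

```python
def abundance_match(vpeak, galaxy_masses):
    """Assign sampled galaxy masses to halos by rank, with no scatter.

    The halo with the largest Vpeak receives the largest galaxy mass,
    the second-largest Vpeak the second-largest mass, and so on.
    """
    if len(vpeak) != len(galaxy_masses):
        raise ValueError("need exactly one sampled galaxy mass per halo")
    # Halo indices ordered from largest to smallest Vpeak.
    halo_order = sorted(range(len(vpeak)), key=lambda i: vpeak[i], reverse=True)
    masses_desc = sorted(galaxy_masses, reverse=True)
    assigned = [0.0] * len(vpeak)
    for rank, halo_index in enumerate(halo_order):
        assigned[halo_index] = masses_desc[rank]
    return assigned


if __name__ == "__main__":
    vpeak = [120.0, 300.0, 50.0]   # km/s, illustrative values
    masses = [1e8, 1e10, 1e9]      # Msun, illustrative values
    print(abundance_match(vpeak, masses))
```

Because the mapping is monotonic, the output galaxy masses reproduce the input SMF exactly, which is the sense in which SHAM matches the SMF by construction.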

Scatter is commonly introduced in the relation between galaxy and halo properties by either deconvolving the SMF (e.g., Behroozi et al. 2010) or by directly adding scatter to the stellar masses, re-ranking, and iteratively solving for the galaxy mass (Hearin et al. 2013). Since reproducing the observed scatter in the stellar-mass–halo-mass relation (SHMR) is not one of our main goals, we do not include scatter in the SHAM procedure. As illustrated in Section 4.3, our approach still produces realistic statistical galaxy properties, such as galaxy clustering.

We classify each galaxy as either an SFG or a QG by randomly generating a number and comparing it to the probability that a galaxy of that mass is star-forming, as calculated from the SMFs. This method ignores a possible correlation between star formation rate (SFR) and mass accretion histories (e.g., Behroozi et al. 2019), which could potentially impact galaxy clustering with SFR. In particular, QGs are more likely to be found in denser environments (e.g., Kauffmann et al. 2004; Kimm et al. 2009). Finally, we remove all galaxies with $M_{\mathrm{gal}} < 10^{5}\,M_\odot$ from the catalog. As shown in Section 8, the fraction of detectable galaxies below this mass is zero for redshifts z > 1.
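The random classification step can be sketched as follows. The example assumes that the star-forming probability at a given mass and redshift is the ratio of the star-forming SMF to the total SMF; the function name `classify` and the numerical SMF values are illustrative, not taken from Williams et al. (2018).

```python
import random


def classify(phi_sf, phi_q, rng):
    """Label one galaxy 'SFG' or 'QG' given the SMF values at its mass and redshift.

    Assumption: P(star-forming) = phi_sf / (phi_sf + phi_q).
    """
    p_sf = phi_sf / (phi_sf + phi_q)
    return "SFG" if rng.random() < p_sf else "QG"


rng = random.Random(42)  # fixed seed so the example is reproducible
labels = [classify(3.0e-3, 1.0e-3, rng) for _ in range(10000)]
frac = labels.count("SFG") / len(labels)

if __name__ == "__main__":
    print(f"star-forming fraction in the sample: {frac:.3f}")
```

Because each draw is independent of the halo's environment, this scheme cannot by itself reproduce the tendency of QGs to live in denser regions, which is the limitation noted above.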

4. Lightcone Pipeline

Thus far we have created halo catalogs with galaxy masses at different time outputs. In this section we detail how we construct a lightcone from these discrete snapshots to create a 1 deg$^2$ survey of galaxies (the "Lightcone Catalog"). In Section 4.3, we demonstrate that we are able to reproduce essential statistical properties of galaxy populations with this Lightcone Catalog.

4.1. Lightcone Crossing

For this work, we consider a survey with a 1 deg$^2$ FOV. To reach redshifts z ≳ 10, we tile 60 boxes in the line-of-sight direction. This corresponds to a maximum redshift of 13 and a co-moving distance of 6900 $h^{-1}$ Mpc. As in, e.g., Bernyk et al. (2016), to avoid the replication of structures viewed by the observer, we randomly translate, reflect, and permute the axes of every tiled simulation box. We construct the halo lightcone by finding where each halo first crossed the observer's past lightcone (i.e., the location at which light has had just enough time to reach the observer). Technical aspects associated with this procedure are outlined extensively in the literature (e.g., Evrard et al. 2002; Blaizot et al. 2005; Kitzbichler & White 2007; Merson et al. 2013; Bernyk et al. 2016; Smith et al. 2017; Korytov et al. 2019). In this section, we closely follow the procedure used to generate the CosmoDC2 sky catalog for the Vera C. Rubin Observatory (Hollowed 2019; Korytov et al. 2019).
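The random translation, reflection, and axis permutation of a tiled box can be sketched as below. This is one plausible way to implement the idea under periodic boundary conditions; the function name `randomize_box` and the order in which the three operations are applied are assumptions made for illustration.

```python
import random


def randomize_box(positions, box_size, rng):
    """Apply one random translation, reflection, and axis permutation to a periodic box.

    positions is a list of (x, y, z) tuples with coordinates in [0, box_size).
    The same transformation is applied to every point, so the structure inside
    the box is preserved while its orientation relative to the observer changes.
    """
    shift = [rng.uniform(0.0, box_size) for _ in range(3)]
    flip = [rng.choice((False, True)) for _ in range(3)]
    axes = [0, 1, 2]
    rng.shuffle(axes)

    transformed = []
    for p in positions:
        # Periodic translation along each axis.
        q = [(p[k] + shift[k]) % box_size for k in range(3)]
        # Optional mirror reflection along each axis (kept inside the box).
        q = [(box_size - q[k]) % box_size if flip[k] else q[k] for k in range(3)]
        # Permute the axes.
        transformed.append(tuple(q[a] for a in axes))
    return transformed


if __name__ == "__main__":
    demo = [(0.0, 10.0, 114.0), (57.5, 57.5, 57.5)]
    print(randomize_box(demo, 115.0, random.Random(3)))
```

Drawing a fresh transformation for each of the 60 tiled boxes means the observer never looks along the same sightline through the same structure twice.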

We assume that the observer is at the origin of the coordinate system (x, y, z) = (0, 0, 0). For each tiled box, we begin by placing all of the host halos on the lightcone. For every host halo in snapshot j, we determine its position in the subsequent snapshot, $r_{j+1}$. We extrapolate the halo position from j by assuming constant velocity to find its extrapolated position in snapshot j + 1, $r_{j+1,\mathrm{extrap}}$. If the halo did not have a descendant located in snapshot j + 1, we set $r_{j+1} = r_{j+1,\mathrm{extrap}}$. Otherwise, we use:

Equation (4)

where L is the length of the simulation box, "int" represents integer, and $r_{j+1,\mathrm{desc}}$ is the position of the descendant in snapshot j + 1. This approach allows the halo to cross the edge of the simulation box between snapshots j and j + 1, and thus $r_{j+1}$ might be outside the domain of the tiled box. In these cases, we also consider the scenario where the halo crosses the lightcone on the other side of the box; i.e., we apply periodic boundary conditions, such that

Equation (5)

to allow the halo to cross the observer lightcone between positions ${r}_{j}^{p}$ and ${r}_{j+1}^{p}$.
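One common way to handle a descendant that has wrapped around a periodic box is the nearest-image convention: shift the descendant by whole box lengths until it lies closest to the extrapolated position. The sketch below illustrates that convention only. It is an assumption about how such a step might be coded, not a transcription of Equations (4) and (5), and the name `nearest_image` is hypothetical.

```python
import math


def nearest_image(r_desc, r_extrap, box_size):
    """Shift each coordinate of r_desc by whole box lengths to lie closest to r_extrap.

    The result may fall outside [0, box_size), which signals that the halo
    crossed a box face between the two snapshots.
    """
    return [
        d + box_size * math.floor((e - d) / box_size + 0.5)
        for d, e in zip(r_desc, r_extrap)
    ]


if __name__ == "__main__":
    # A halo extrapolated just past the upper x face and just below the lower z face.
    print(nearest_image([2.0, 50.0, 114.0], [116.0, 51.0, -0.5], 115.0))
```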

Given the positions $r_j$ and $r_{j+1}$ for each host halo, we calculate the time the halo would cross the past lightcone, $t_e$ (Equations (27)–(29) in Korytov et al. 2019). If the halo crosses between the snapshot times $t_j$ and $t_{j+1}$, we calculate the position on the lightcone, $r_e$, from:

Equation (6)

We do not include any halos where $r_e$ is beyond the domain of the tiled box.
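To make the crossing calculation concrete, the sketch below finds the crossing point numerically under two simplifying assumptions that are ours rather than the paper's: the halo moves linearly between the two snapshot positions, and the co-moving radius of the past lightcone also varies linearly between the snapshot times (reasonable only for closely spaced snapshots). Korytov et al. (2019) give an analytic treatment; the bisection here is a stand-in, and `lightcone_crossing` is a hypothetical name.

```python
import math


def lightcone_crossing(r_j, r_j1, d_j, d_j1, iterations=60):
    """Find where a halo moving linearly from r_j to r_j1 meets the past lightcone.

    d_j and d_j1 are the co-moving lightcone radii at the two snapshot times
    (d_j > d_j1 because snapshot j is earlier). Returns (fraction, position),
    where fraction is the fractional time between the snapshots, or None if
    the halo does not cross in this interval.
    """
    def position(f):
        return [a + f * (b - a) for a, b in zip(r_j, r_j1)]

    def gap(f):
        # Halo distance from the observer minus the lightcone radius.
        radius = math.sqrt(sum(c * c for c in position(f)))
        return radius - (d_j + f * (d_j1 - d_j))

    if gap(0.0) > 0.0 or gap(1.0) <= 0.0:
        return None  # the lightcone does not sweep past the halo here
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    f = 0.5 * (lo + hi)
    return f, position(f)


if __name__ == "__main__":
    # A static halo at 100 h^-1 Mpc while the lightcone shrinks from 110 to 90.
    print(lightcone_crossing([100.0, 0.0, 0.0], [100.0, 0.0, 0.0], 110.0, 90.0))
```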

The final step in generating the lightcone is to assign halo properties (e.g., mass and substructure) to each object. One approach to assigning halo properties is to allow the merger of halos to happen at a random time between the two snapshots (e.g., Smith et al. 2017). However, this approach results in halos being double counted, so the lightcone needs to be carefully pruned. Since our simulations have very fine time resolution (500 snapshots between redshifts z = 0 and z = 20), we can avoid the technical difficulties and assumptions introduced by pruning the catalog, and instead follow the approach of Korytov et al. (2019). Specifically, every halo that crosses the lightcone between snapshots j and j + 1 is assigned properties from snapshot j. When including the substructure, we ensure subhalos have the same position and velocity offset from the host as in snapshot j.

4.2. Survey Volume

Given the lightcone catalog consisting of halo properties, galaxy masses, and positions, we cut out a wedge in the survey volume corresponding to 1 deg$^2$. Following Bernyk et al. (2016), we convert the co-moving (x, y, z) positions of each galaxy to an angular position as follows:

Equation (7)

We only consider galaxies with R.A. < 1 deg and decl. < 1 deg, and then center the galaxies on (R.A., decl.) = (0, 0).
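A conventional Cartesian-to-sky conversion can be sketched as follows. The choice of the x-axis as the direction of (R.A., decl.) = (0, 0) and the lower bound of 0 deg on both angles are assumptions made for this illustration; the function names `to_sky` and `in_survey` are hypothetical.

```python
import math


def to_sky(x, y, z):
    """Convert co-moving Cartesian coordinates to (R.A., decl.) in degrees.

    Assumes the observer is at the origin and the x-axis points to
    (R.A., decl.) = (0, 0).
    """
    r = math.sqrt(x * x + y * y + z * z)
    ra = math.degrees(math.atan2(y, x))
    dec = math.degrees(math.asin(z / r))
    return ra, dec


def in_survey(x, y, z, size_deg=1.0):
    """True if the point falls inside a size_deg x size_deg patch starting at (0, 0)."""
    ra, dec = to_sky(x, y, z)
    return 0.0 <= ra < size_deg and 0.0 <= dec < size_deg


if __name__ == "__main__":
    print(to_sky(1000.0, 10.0, 10.0), in_survey(1000.0, 10.0, 10.0))
```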

Figure 4 shows the resulting galaxy mass density as a function of position. The lightcone procedure results in a realistic distribution of galaxies that traces the underlying cosmic web. Since the survey extends to very large distances, the figure is split into six rows, with each row showing 10 tiled boxes. The survey wedge is complete to a maximum distance of 6588 $h^{-1}$ Mpc (which corresponds to a cosmological redshift of ∼10.5); at higher redshifts, the angle of the survey is larger than the width of the simulation box. There are discontinuities in the galaxy density where the simulation boxes are tiled, which is a common feature for lightcones. As addressed in Bernyk et al. (2016), statistical properties of the galaxy catalog will be accurate on scales smaller than the box size.
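The completeness distance and total tiled depth quoted above follow from simple geometry: a 1 deg wedge opening from the observer fills the box width L when the distance equals L / tan(1 deg). A short check, with variable names chosen here for illustration:

```python
import math

BOX_SIZE = 115.0         # h^-1 Mpc, box side quoted in Section 3.1
SURVEY_ANGLE_DEG = 1.0   # opening angle of the survey wedge
N_TILES = 60             # boxes tiled along the line of sight

# Distance at which the wedge becomes as wide as one simulation box.
d_complete = BOX_SIZE / math.tan(math.radians(SURVEY_ANGLE_DEG))
# Total co-moving depth covered by the tiled boxes.
d_tiled = N_TILES * BOX_SIZE

if __name__ == "__main__":
    print(f"wedge complete out to {d_complete:.0f} h^-1 Mpc")
    print(f"tiled depth           {d_tiled:.0f} h^-1 Mpc")
```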

Figure 4. Mass density of galaxies in survey volume, with an FOV of 1 deg$^2$. The x-axis shows the cosmological redshift (top) and the co-moving distance (bottom). We have plotted 10 tiled boxes in each row, for a total of 60 boxes to reach the desired redshift range. This figure demonstrates the depth and distribution of galaxies of a 1 deg$^2$ UDF.

4.3. Statistical Properties of Galaxy Lightcone

To verify the lightcone pipeline results in a realistic galaxy population, we calculate the halo mass functions (HMFs), SMFs, SHMRs, and galaxy clustering of the synthetic galaxy catalog.

4.3.1. Halo Mass Functions

To demonstrate that the HMF of the halo lightcone catalog is correct, we plot the recovered HMF of host halos in Figure 5. The theoretical curve uses the parameterization of Despali et al. (2016) and is calculated using the HMF routine in Colossus (Diemer 2018). The theoretical curve and simulation results agree very well. The catalog was constructed to be complete above galaxy masses $M_{\mathrm{gal}} = 10^{5}\,M_\odot$, which corresponds to a halo mass limit of approximately $M_{\mathrm{halo}} \sim 10^{9.5}\,h^{-1}\,M_\odot$.

Figure 5. Expected HMF from the Despali et al. (2016) parameterization (dotted lines) compared to the host halos in the lightcone catalog (solid lines). Shaded regions are Poisson uncertainties. The theoretical curve is the volume-weighted average over the redshift band. The lightcone catalog agrees with the expected HMF, verifying that the lightcone procedure reproduces the correct redshift and mass distribution of dark matter halos.

4.3.2. Stellar Mass Functions and the Stellar-to-halo Mass Relations

Additionally, we show the SMF of the lightcone catalog in Figure 6 for both SFGs and QGs. For comparison, we show the volume-weighted average SMFs from Williams et al. (2018; dotted lines), averaged over the redshift bin. The catalog is complete at all redshifts for galaxy masses greater than $10^{5}\,M_\odot$. The galaxies in the lightcone agree with the SMFs from Williams et al. (2018), which verifies that the lightcone pipeline and abundance matching procedure reproduce realistic galaxy counts. As discussed in Section 3.2, producing the desired SMF is important for reproducing observed luminosity functions.

Figure 6. SMFs for the lightcone catalog. SFGs and QGs are shown in the top and bottom panels, respectively. Shaded regions are Poisson uncertainties. The Williams et al. (2018) SMFs that were used in the abundance matching procedure are shown with dotted lines, calculated as a volume-weighted average over the redshift bin. The lightcone pipeline and abundance matching procedure reproduce the desired SMFs.

4.3.3. Galaxy Clustering

Galaxy clustering is commonly described using the two-point correlation function (2PCF), ξ(r), defined as

Equation (8)

where dP is the excess probability above the Poisson noise, $n_g(r)$ is the mean density of galaxies at separation r, and dV is the differential volume. In practice, the projected 2PCF, $w_p$, is often used. The 2PCF depends on the projected separation, $r_p$, and the line-of-sight separation, π:

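$$w_p(r_p) = 2\int_{0}^{\pi_{\max}} \xi(r_p, \pi)\,d\pi = 2\int_{0}^{\pi_{\max}} \xi\left(\sqrt{r_p^{2}+\pi^{2}}\right)d\pi \qquad (9)$$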

where the second equality assumes isotropy. The upper limit, ${\pi }_{\max }$, needs to be chosen large enough to nullify the effect of peculiar velocities, but not so large that it creates artificial edge effects from the survey boundary. The optimal value for ${\pi }_{\max }$ depends on the underlying survey volume, but is typically in the range 40–80 Mpc (Zehavi et al. 2005; van den Bosch et al. 2013).

We calculate ξ based on the commonly used Landy-Szalay estimator (Landy & Szalay 1993),

$$\xi = \frac{DD - 2\,DR + RR}{RR} \qquad (10)$$

where DD, DR, and RR are the normalized number of counts from data–data, data–random, and random–random pairs. The random catalog contains $10^3$ points per ${\mathrm{arcmin}}^{2}$, with distances assigned such that the random catalog has the same redshift distribution as the synthetic catalog, i.e.,

Equation (11)

where θ is the survey angle, and ϕ is the Williams et al. (2018) SMF. Values for R.A. and decl. coordinates for each random galaxy were selected so that galaxies are distributed isotropically:

Equation (12)

where r1, r2 are random numbers selected uniformly between 0 and 1. The random catalog is then centered on (R.A., decl.) = (0,0), in the same way as the synthetic catalog.
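A standard way to draw isotropically distributed points in a small patch is to sample R.A. uniformly and decl. uniformly in its sine. The sketch below uses that recipe with two uniform random numbers per point; whether it matches the exact form of Equation (12) is an assumption, and `random_sky_points` is a hypothetical name.

```python
import math
import random


def random_sky_points(n, size_deg, rng):
    """Draw n isotropic points with 0 <= R.A., decl. <= size_deg (degrees).

    Uniform sampling in sin(decl.) gives equal probability per unit solid angle.
    """
    theta = math.radians(size_deg)
    points = []
    for _ in range(n):
        r1, r2 = rng.random(), rng.random()
        ra = r1 * theta                        # uniform in R.A.
        dec = math.asin(r2 * math.sin(theta))  # uniform in sin(decl.)
        points.append((math.degrees(ra), math.degrees(dec)))
    return points


if __name__ == "__main__":
    for ra, dec in random_sky_points(3, 1.0, random.Random(1)):
        print(f"R.A. = {ra:.4f} deg, decl. = {dec:.4f} deg")
```

For a 1 deg patch the difference from sampling decl. uniformly is tiny, but the sine weighting keeps the recipe correct for larger survey angles such as the 10 deg × 10 deg comparison field.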

To measure the projected 2PCF, we use the package Corrfunc (Sinha & Garrison 2020), with ${\pi }_{\max }\,=60\,{h}^{-1}\,\mathrm{Mpc}$. We estimate errors in the measured 2PCF by bootstrapping the galaxy catalog data with 200 subsamples. We present this measured 2PCF in Figure 7, along with Sloan Digital Sky Survey (SDSS) data for redshift z ≈ 0.1 galaxies from Yang et al. (2012). To compare our synthetic catalog 2PCF to the data, we use galaxies in the redshift range 0.1 < z < 0.2 within a 10 deg × 10 deg survey. The low-redshift clustering of the synthetic galaxies agrees with observations within the 1σ error bars, indicating that the lightcone halo clustering is realistic.
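For reference, the Landy–Szalay estimator in a single separation bin can be written in a few lines once the raw pair counts are known. The sketch below assumes unique-pair counts for DD and RR and cross-pair counts for DR; it is independent of Corrfunc, and the function name `landy_szalay` is chosen here for illustration.

```python
def landy_szalay(dd, dr, rr, n_data, n_random):
    """Landy-Szalay estimate of xi in one separation bin from raw pair counts.

    dd, rr: numbers of unique data-data and random-random pairs in the bin.
    dr: number of data-random cross pairs in the bin.
    Each count is normalized by the total number of possible pairs of that type.
    """
    dd_norm = dd / (n_data * (n_data - 1) / 2.0)
    dr_norm = dr / (n_data * n_random)
    rr_norm = rr / (n_random * (n_random - 1) / 2.0)
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm


if __name__ == "__main__":
    # Illustrative counts: twice as many data pairs as expected for a random field.
    print(landy_szalay(dd=99.0, dr=1000.0, rr=4995.0, n_data=100, n_random=1000))
```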

Figure 7. 2PCF in a 100 deg$^2$ lightcone catalog for all galaxies at 0.1 < z < 0.2 (lines). We estimate uncertainties in the synthetic catalog 2PCF by bootstrapping the data. For comparison, SDSS data from Yang et al. (2012) for galaxies at redshifts z ≈ 0.1 are shown with black points. The agreement between the simulation and data shows that the lightcone catalog matches the observed galaxy clustering at low redshifts.

5. Galaxy Morphologies

Galaxy morphologies offer key insights into galaxy evolution, and are important for understanding observational systematics. We model all galaxies as Sérsic (1968) profiles with an index, $n_s$. Additionally, all galaxies are assigned a projected size $R_{\mathrm{eff}}$, a projected axis ratio q = b/a, and a position angle (PA). We follow the morphological prescriptions from Williams et al. (2018), which are based on the empirical distributions measured in HST images, with the exception of galaxy sizes.

5.1. Size

Galaxy sizes are known to decrease with increasing redshift at a fixed stellar mass (e.g., van der Wel et al. 2014; Shibuya et al. 2015; Curtis-Lake et al. 2016), typically evolving as

$$R_e \propto (1+z)^{-\alpha} \qquad (13)$$

where $R_e$ is the half-light radius. Further, SFGs and QGs evolve differently, with SFGs having lower values of α (e.g., van der Wel et al. 2014; Ma et al. 2018). In addition to these observed trends, theoretical models commonly show a correlation between the size of a galaxy and that of its host dark matter halo (e.g., Mo et al. 1998; Kravtsov 2013; Jiang et al. 2019). Given that our synthetic catalog begins with the underlying dark matter structure, we assign galaxy sizes based on the dark matter halo sizes, and then test that the size–redshift–mass relation is consistent with current data.

Specifically, we characterize the size of each galaxy as the half-light radius in the semimajor axis, $R_{\mathrm{eff}}$. To assign $R_{\mathrm{eff}}$ to each galaxy, we use the relation between halo and galaxy size (e.g., Kravtsov 2013; Zanisi et al. 2020):

$$R_{\mathrm{eff}} = A\,R_{\mathrm{vir}} \qquad (14)$$

where $R_{\mathrm{vir}}$ is the radius of the halo in physical units. We assign $R_{\mathrm{eff}}$ values from this mean relation, with log-normal scatter, $\sigma_R$. We use the coefficients A and scatter $\sigma_R$ from Zanisi et al. (2020), as summarized in Table 1. Zanisi et al. (2020) used size distributions of central galaxies from the SDSS DR7 sample (Abazajian et al. 2009; Meert et al. 2016). While these coefficients capture the dependence of $R_{\mathrm{eff}}$ on mass and on whether a galaxy is star-forming or quiescent, they are calibrated for z = 0 galaxies.
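The size assignment can be sketched as a table lookup followed by a log-normal draw. The coefficients below are copied from Table 1; treating $\sigma_R$ as a scatter in dex (that is, in $\log_{10} R_{\mathrm{eff}}$) is an assumption of this example, and the function names are hypothetical.

```python
import math
import random

# (upper edge of the log10(Mgal/Msun) bin, A for SFGs, A for QGs, sigma_R), from Table 1.
SIZE_TABLE = [
    (9.5, 0.018, 0.006, 0.2),
    (10.0, 0.019, 0.007, 0.2),
    (10.5, 0.019, 0.010, 0.15),
    (11.0, 0.019, 0.011, 0.15),
    (11.5, 0.019, 0.015, 0.15),
    (math.inf, 0.024, 0.016, 0.1),
]


def size_coefficients(log_mgal, star_forming):
    """Return (A, sigma_R) for a galaxy of the given log10 stellar mass and type."""
    for upper, a_sf, a_q, sigma in SIZE_TABLE:
        if log_mgal <= upper:
            return (a_sf if star_forming else a_q), sigma


def draw_reff(r_vir, log_mgal, star_forming, rng):
    """Draw R_eff (same units as r_vir) from the mean relation with log-normal scatter."""
    a, sigma = size_coefficients(log_mgal, star_forming)
    # Assumption: sigma_R is a scatter in dex, i.e. in log10(R_eff).
    return a * r_vir * 10.0 ** rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    # A star-forming galaxy with log10(Mgal/Msun) = 9 in a halo with R_vir = 100 kpc.
    print(f"R_eff = {draw_reff(100.0, 9.0, True, rng):.2f} kpc")
```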

Table 1. Coefficients, A, for the $R_{\mathrm{eff}}$–$R_{\mathrm{vir}}$ Relation (Equation (14)) and the Scatter $\sigma_R$; Values Are from Zanisi et al. (2020)

| ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })$ | A (SFGs) | A (QGs) | $\sigma_R$ |
| --- | --- | --- | --- |
| ≤9.5 | 0.018 | 0.006 | 0.2 |
| (9.5,10] | 0.019 | 0.007 | 0.2 |
| (10,10.5] | 0.019 | 0.010 | 0.15 |
| (10.5,11] | 0.019 | 0.011 | 0.15 |
| (11,11.5] | 0.019 | 0.015 | 0.15 |
| >11.5 | 0.024 | 0.016 | 0.1 |

To test whether this relation extends reasonably to higher redshifts, we compare the median $R_{\mathrm{eff}}$–z relation from our catalog to data from van der Wel et al. (2014) and Shibuya et al. (2015), as shown in Figure 8. The data sets shown in Figure 8 consist of galaxies from the 3D-HST+CANDELS catalog. Redshifts and masses are from Skelton et al. (2014), and galaxies are classified as either star-forming or quiescent using UVJ diagram criteria. The van der Wel et al. (2014) $R_{\mathrm{eff}}$ measurements are slightly higher than the Shibuya et al. (2015) measurements (see Figure 3 in Shibuya et al. 2015), due to the differences in the method used to fit $R_{\mathrm{eff}}$. Our results are more consistent with Shibuya et al. (2015).

Figure 8. Median half-light semimajor radius, $R_{\mathrm{eff}}$, as a function of redshift from the synthetic galaxy catalog for SFGs (top) and QGs (bottom). Different columns show different mass bins. The shaded regions are the 1σ standard deviation in each bin for the synthetic galaxies. The points show the median observed $R_{\mathrm{eff}}$ from van der Wel et al. (2014) and Shibuya et al. (2015), with the 16th and 84th percentiles of the data point distribution. Our synthetic galaxies match observed trends of galaxy sizes with mass, redshift, and galaxy classification.

Figure 8 demonstrates that our simple assumption that Equation (14) holds for all redshifts agrees remarkably well with the data. We capture the decrease in Reff with both decreasing mass and increasing redshift. We also capture the effect that QGs are smaller than SFGs on average. Our ability to capture redshift trends, despite only using known relations between Reff and Rvir at z = 0, likely reflects the fact that virial radii evolve as Rvir ∝ 1/(1 + z), and therefore higher-redshift galaxies have smaller virial radii. Our finding that the redshift evolution in the mass–size relation can be captured through the redshift evolution of Rvir is consistent with Mowla et al. (2019), who show that the ratio r80/Rvir is constant for redshifts z < 3.

5.2. Axis Ratios, Sérsic Indices, and Position Angles

The distribution of galaxy axis ratios and Sérsic indices should differ between SFGs and QGs (e.g., Franx et al. 2008; Bell et al. 2012; Mortlock et al. 2013) and with redshift (e.g., van der Wel et al. 2011; Guo et al. 2015). To assign shape (defined as the projected axis ratio q = b/a, where a is the semimajor half-light size and b is the semiminor half-light size) and Sérsic indices, ns , we use the method directly from Williams et al. (2018). Specifically, using data compiled from van der Wel et al. (2012) and Skelton et al. (2014), Williams et al. (2018) found the distribution of q and ns as a function of redshift. Due to limited data, morphologies for QGs with z > 4 are drawn from the 3 ≤ z ≤ 4 distribution, and morphologies for SFGs with z > 6 are drawn from the 5 ≤ z ≤ 6 distributions. We use these resulting distributions to draw q and ns values to assign to each galaxy.

We assign PAs uniformly between 0 and 2π. Synthetic galaxy catalogs often assume that galaxies are oriented isotropically (e.g., Williams et al. 2018), and this assumption will not affect the galaxy clustering, number counts, or cosmic SFH of our synthetic catalog.

6. Galaxy SEDs

In this section we outline the methods we use to generate galaxy SEDs. The SED pipeline aims to reproduce observed SFHs and UVLFs, as these quantities dictate the number of ionizing photons produced by galaxies (see Section 7). The SEDs also have realistic SFG and QG properties, including their observed colors, ages, and metallicity (see Appendix B).

6.1. Overview

One of the main observables included in the synthetic catalog is the photometry in the Roman filters, which requires accurate modeling of the galaxy SEDs. We model the SEDs for each galaxy using the software FSPS (Conroy et al. 2009; Conroy & Gunn 2010). This section serves as an overview of the model used to generate the galaxy SEDs.

We use a Chabrier (2003) initial mass function (IMF), and include the IGM absorption model from Madau (1995). The SFH model is described in Section 6.2.1. Additionally, for SFGs, we include the FSPS nebular emission model (Byler et al. 2017), which is controlled by the gas ionization parameter and the gas metallicity. As in Williams et al. (2018), we approximate the stellar and ISM metallicities as a single metallicity value, Zmet.

The dust modeling includes dust emission and dust absorption. Dust emission follows Draine & Li (2007), which is a silicate-graphite-PAH grain model. For dust absorption, we use the Calzetti et al. (2000) attenuation curve, where dust attenuation is applied to all starlight equally, and therefore depends on one parameter, ${\bar{\tau }}_{v}$. This parameter gives the opacity at 5500 Å, and sets the normalization of the Calzetti dust attenuation curve. We assume dust attenuation is the same for emission lines and the continuum. We also include the asymptotic giant branch circumstellar dust models from Villaume et al. (2015). Circumstellar dust can contribute significantly to the IR emission from galaxies with little diffuse gas, and will need to be included in stellar modeling to accurately interpret data from upcoming IR facilities, including JWST.

Overall, this leaves seven free parameters to describe the galaxy SEDs:

  1. Mgal: the stellar mass of the galaxy.
  2. z: the redshift of the galaxy (or alternatively, tage, the age of the universe at that redshift).
  3. Zmet: the galaxy metallicity. The gas-phase and stellar metallicities are assumed to be equal.
  4. tstart: the age of the universe at the start of star formation.
  5. τ: the e-folding time for star formation.
  6. ${\bar{\tau }}_{v}$: the dust attenuation parameter, defined as the opacity at 5500 Å.
  7. US: the gas ionization parameter.

In addition to these seven parameters, other relevant quantities included in the synthetic catalog include the SFR, the UV magnitude, MUV, the slope of the UV continuum, β, and the rest-frame colors of the galaxies. Section 6.2 describes the calculation of each of these quantities.

Each galaxy begins with a redshift z (from the lightcone pipeline), a mass, Mgal (from abundance matching), and is labeled as either an SFG or a QG, as sampled from the SMF (see Section 3.2). Sections 6.4 and 6.5 below describe the SED pipeline for assigning the remaining FSPS parameters (Zmet, tstart, τ, ${\bar{\tau }}_{v}$, and US ) for SFGs and QGs, respectively. The pipeline is summarized in Figure 9.

Figure 9. Overview of the SED pipeline. We begin with a galaxy catalog consisting of redshift and masses for each galaxy, and parent catalogs for the SFGs and QGs. We propose parameters for every galaxy in the catalog and find the nearest neighbors in the parent catalogs. We then assign FSPS parameters based on a weighted average of the nearest neighbors.

In brief, we first generate "parent catalogs" for both SFGs and QGs, spanning a realistic range of parameters. These parent catalogs serve as a lookup table for realistic galaxy SEDs. Details on how the parent catalogs are calculated are given in Section 6.3. We propose parameters for each galaxy, and find the 10 nearest neighbors in the parent catalog for each set of galaxy parameters using k-d trees. 16 For SFGs, the proposed parameters are Mgal, z, MUV, and β while for QGs, the proposed parameters are Mgal and z. The distance metrics used to find the nearest neighbors are detailed in Sections 6.4 and 6.5.

We assign the five free FSPS parameters (Zmet, tstart, τ, ${\bar{\tau }}_{v}$, and US ) to each galaxy by taking the weighted average of the nearest neighbors. Given these five parameters, along with the mass and redshift of the galaxy, we use FSPS to generate an SED. We then calculate the UV properties, SFRs and rest-frame colors directly from the spectrum. These calculated properties will not be identical to the proposed parameters; however, we still retain the imposed scaling relations in the parent catalogs (see Appendix B).
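The neighbor lookup and weighted average can be illustrated with the following minimal sketch. It swaps the k-d tree for a brute-force distance search (which returns the same neighbors but scales poorly), and it assumes inverse-distance weights, since the text only states that the weights depend on distance. The function name `assign_from_neighbors` and its arguments are hypothetical.

```python
import numpy as np


def assign_from_neighbors(parent_keys, parent_params, proposed_keys, k=10, eps=1e-12):
    """Average the parameters of the k nearest parent-catalog rows.

    parent_keys: (N, d) coordinates of parent galaxies in the distance-metric space.
    parent_params: (N, p) FSPS parameters stored for those galaxies.
    proposed_keys: (M, d) proposed coordinates for the catalog galaxies.
    """
    parent_keys = np.asarray(parent_keys, dtype=float)
    parent_params = np.asarray(parent_params, dtype=float)
    proposed_keys = np.asarray(proposed_keys, dtype=float)
    # Brute-force Euclidean distances, shape (M, N); a k-d tree avoids this matrix.
    diff = proposed_keys[:, None, :] - parent_keys[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    nearest = np.argsort(dist, axis=1)[:, :k]
    near_dist = np.take_along_axis(dist, nearest, axis=1)
    # Assumption: inverse-distance weights, normalized to sum to one per galaxy.
    weights = 1.0 / (near_dist + eps)
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("mk,mkp->mp", weights, parent_params[nearest])
```

For a catalog of this size the distance matrix would not fit in memory, which is presumably why a tree-based search is used in practice; the sketch is only meant to show the averaging step.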

6.2. Calculated Quantities

For the SED pipeline outlined above, we need to accurately measure the SFR, rest-frame colors, and UV properties of the galaxy SEDs. This section details the methods we use to measure these quantities. Since FSPS normalizes the stellar modeling such that one solar mass is formed over the formation history, we scale all of the spectra. Specifically, we scale all fluxes and the SFR by Mgal/f, where f is the surviving mass fraction in the stellar population, not including stellar remnants.

6.2.1. Star Formation History

Galaxy SFRs dictate the CSFRD of the universe, which in turn constrains the amount of ionizing photons produced by galaxies in the EoR. Given the importance of SFRs to galaxy evolution and the sources of ionizing photons during the EoR, we need to ensure our synthetic galaxies have SFRs consistent with observations.

The SFR depends on the age, e-folding time, and mass of the galaxy. We model the SFH, ψ(t), using a "delayed-tau model," in which:

Equation (15)

where t is the time since the start of star formation, tstart. As discussed in Williams et al. (2018), this parameterization achieves the expectation from simulations that high-redshift galaxies have rising SFHs (Finlator et al. 2011), and accurately reproduces the colors and mass-to-light ratios of galaxies in smoothed particle hydrodynamics simulations (e.g., Simha et al. 2014).

To determine the SFR for each galaxy, we use the FSPS calculated SFR, and scale it by Mgal/f, as described above.
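As a rough illustration of this scaling, the sketch below evaluates a delayed-tau history analytically. It assumes the standard delayed-tau form ψ(t) ∝ t exp(−t/τ) for Equation (15) and a unit-mass normalization like the one described for FSPS; it does not call FSPS, and the function name `delayed_tau_sfr` is hypothetical.

```python
import numpy as np


def delayed_tau_sfr(t_gyr, tau_gyr, t_obs_gyr, m_gal, f_surv):
    """SFR in Msun/yr at time t_gyr after t_start for a delayed-tau history.

    The history psi(t) ~ t * exp(-t / tau) is first normalized so that one
    solar mass forms between t = 0 and t_obs_gyr, then scaled by m_gal / f_surv.
    """
    x = t_obs_gyr / tau_gyr
    # Analytic integral of t * exp(-t / tau) from 0 to t_obs (units of Gyr^2).
    norm = tau_gyr ** 2 * (1.0 - np.exp(-x) * (1.0 + x))
    psi_unit = t_gyr * np.exp(-t_gyr / tau_gyr) / norm  # per Gyr, unit mass formed
    return psi_unit * (m_gal / f_surv) / 1.0e9  # convert Gyr^-1 to yr^-1
```

Integrating the returned SFR over the full history recovers Mgal/f, the total mass formed, so that the surviving stellar mass equals Mgal.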

6.2.2. Rest-frame Colors

UVJ diagram galaxy classifications allow for the separation of QGs and dusty SFGs (e.g., Williams et al. 2009; Whitaker et al. 2013; Papovich et al. 2018). To examine the distribution of the synthetic galaxies in UVJ space, we use calculated U−V and V−J colors. Specifically, we use the Johnson U and V filters, and the WFC3 F125W J filter included in FSPS.

6.2.3. UV Properties

Roman will constrain the number of ionizing photons that are produced by galaxies in the EoR by precisely measuring the UVLF at high redshift. The UVLF depends on the SMF and the UV magnitude, MUV (see Equation (2)).

Following Robertson et al. (2013), we define the UV magnitude, MUV, as the average magnitude at rest-frame wavelength in a flat filter, in the range 1450–1550 Å. We measure MUV directly from the rest-frame galaxy SEDs by calculating the average flux density:

Equation (16)

with λ1 = 1450Å and λ2 = 1550Å.
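A measurement of this kind might look like the sketch below. Equation (16) is not reproduced here, so the sketch makes its own assumptions: the spectrum is given as a rest-frame luminosity density Lν in erg/s/Hz, the average is taken over wavelength in the flat window, and the magnitude is on the AB system at 10 pc. The function name `absolute_uv_magnitude` is hypothetical.

```python
import numpy as np

PC_CM = 3.0857e18  # one parsec in cm


def absolute_uv_magnitude(wave_aa, l_nu, lam1=1450.0, lam2=1550.0):
    """AB absolute magnitude from the mean L_nu (erg/s/Hz) in a flat rest-frame window."""
    wave_aa = np.asarray(wave_aa, dtype=float)
    l_nu = np.asarray(l_nu, dtype=float)
    sel = (wave_aa >= lam1) & (wave_aa <= lam2)
    w, y = wave_aa[sel], l_nu[sel]
    # Trapezoidal average over the sampled part of the window.
    mean_l_nu = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(w)) / (w[-1] - w[0])
    # Flux density the source would have at a distance of 10 pc.
    f_nu = mean_l_nu / (4.0 * np.pi * (10.0 * PC_CM) ** 2)
    return -2.5 * np.log10(f_nu) - 48.60
```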

In addition to the UV magnitude, the slope of the UV continuum, β, helps determine the role of galaxies in ionizing the universe. The slope β is defined by ${f}_{\lambda }\propto {\lambda }^{\beta }$ (e.g., Meurer et al. 1999). The presence of very blue UV continuum slopes in Roman imaging may provide a signpost of very high escape fractions in early galaxies. We use the method from Dunlop et al. (2012) to determine β. Specifically, we take the SEDs shifted to redshift z = 7, and calculate

Equation (17)

using the FSPS included filters F125W and F160W for WFC3.
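The color-based estimate can be sketched as follows, assuming that Equation (17) takes the form given by Dunlop et al. (2012), β = 4.43 (J125 − H160) − 2. That form is an assumption here, since the equation itself is not reproduced above. The band-averaged flux densities are taken as inputs, and the function names are hypothetical.

```python
import numpy as np


def ab_magnitude(f_nu):
    """AB magnitude for a band-averaged flux density in erg/s/cm^2/Hz."""
    return -2.5 * np.log10(f_nu) - 48.60


def beta_from_color(f_nu_j125, f_nu_h160):
    """UV slope from the J125 - H160 color of an SED redshifted to z = 7."""
    color = ab_magnitude(f_nu_j125) - ab_magnitude(f_nu_h160)
    return 4.43 * color - 2.0
```

A spectrum that is flat in fν has zero AB color and therefore β = −2, consistent with the definition of β in terms of fλ; sources that are brighter in F125W than in F160W come out bluer than β = −2.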

6.3. Parent Catalogs

We create SFG and QG parent catalogs that serve as a lookup table between the assigned FSPS parameters (Mgal, z, Zmet, tstart, τ, ${\bar{\tau }}_{v}$, and US ) and the derived spectral quantities (SFR, MUV, β, and rest-frame colors). We create the parent catalogs by sampling from known scaling relations. By only populating regions of parameter space corresponding to the desired galaxy population, we can fully sample the relevant space, while also limiting the size of the parent catalog. The distributions we use to construct the parent catalogs are summarized in Tables 2 and 3 for SFGs and QGs, respectively, and outlined in detail in this section.

Table 2. Parameters for SEDs of SFGs

Parameter | Proposed Parameter | Parent Catalog | Distance Metric | Assigned Parameter
Mgal | fixed | uniform in ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })\in [5,12]$ | ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/10{M}_{\odot })$ | fixed
z | fixed | uniform in a = 1/(1 + z) ∈ [0.07, 1] | a = 10/(1 + z) | fixed
Zmet | ... | FMR | ... | nearest neighbors
US | ... | US–Zmet relation | ... | nearest neighbors
${\bar{\tau }}_{V}$ | ... | ${ \mathcal N }(\mu =0,\sigma =0.5)\in [0,4]$ | ... | nearest neighbors
tstart | ... | uniform in [1 Myr, tage] | ... | nearest neighbors
τ | ... | uniform in [0.1, 100] Gyr | ... | nearest neighbors
ψ | ... | calculated; cut ${\mathrm{log}}_{10}(\psi /{M}_{\odot }{\mathrm{yr}}^{-1})\in [-5,4]$ | ... | calculated
MUV | MUV–Mgal | calculated; cut in [−25, −5] | MUV + 20 | calculated
β | β–MUV | calculated; cut in [−3, −1] | β | calculated
U−V | ... | calculated | ... | calculated
V−J | ... | calculated | ... | calculated

Note. The synthetic galaxies have fixed parameters Mgal and z and "proposed" parameters MUV and β. We identify the 10 closest neighbors in the parent catalog (defined by the distance metric), and assign the final catalog properties Zmet, US , ${\bar{\tau }}_{V}$, tstart, and τ from the nearest neighbors.

6.3.1. Star-forming Galaxy Parent Catalog

For the SFG parent catalog, we sample the seven FSPS parameters for $10^{6}$ galaxies. The mass, redshift, and SFH parameters tstart and τ are sampled from uniform distributions, as summarized in Table 2. We sample the dust parameter, ${\bar{\tau }}_{V}$, from a normal distribution centered on zero with a standard deviation of 0.5, truncated between 0 and 4.
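The sampling step for these parameters could be sketched as below, using the ranges listed in Table 2. The sketch omits tstart, which requires the age of the universe at each redshift from the adopted cosmology, and it uses simple rejection sampling for the truncated normal. The function names and the dictionary keys are hypothetical.

```python
import numpy as np


def truncated_normal(rng, mu, sigma, lo, hi, size):
    """Rejection-sample a normal distribution restricted to [lo, hi]."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(mu, sigma, size=size)
        keep = draw[(draw >= lo) & (draw <= hi)][: size - filled]
        out[filled:filled + keep.size] = keep
        filled += keep.size
    return out


def sample_sfg_parents(n, seed=0):
    """Draw the uniformly and normally distributed SFG parent parameters."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.07, 1.0, n)  # scale factor a = 1 / (1 + z)
    return {
        "log_mgal": rng.uniform(5.0, 12.0, n),  # log10(Mgal/Msun)
        "z": 1.0 / a - 1.0,
        "tau_gyr": rng.uniform(0.1, 100.0, n),  # e-folding time
        "tau_v": truncated_normal(rng, 0.0, 0.5, 0.0, 4.0, n),  # dust opacity
    }
```

Sampling uniformly in the scale factor rather than in redshift places relatively more parent galaxies at low redshift, where a given redshift interval spans more cosmic time.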

Since the metallicity of galaxies depends on the metal production from stars, SFR is related to metallicity. More massive galaxies have higher metallicities on average (the mass–metallicity relation; e.g., Lequeux et al. 1979; Tremonti et al. 2004), and the scatter in this relation depends strongly on SFR. The relation between Mgal, Zmet, and ψ is known as the fundamental metallicity relation (FMR; e.g., Mannucci et al. 2010; Hunt et al. 2016), and does not display any significant redshift evolution out to redshifts z > 3 (e.g., Cresci et al. 2019, and references therein). Given these observed trends, we first assign realistic SFRs for galaxies, before assigning metallicities from the FMR.

To assign realistic SFRs, we use the parameterization from Schreiber et al. (2017), constructed to match observations in the redshift range 0 < z < 7. For SFGs,

Equation (18)

with a log-normal scatter of 0.3 dex.

We then propose Zmet by drawing from the FMR presented in Williams et al. (2018; based on data from Hunt et al. 2016)

Equation (19)

with scatter drawn from a Student's t-distribution with three degrees of freedom and a standard deviation of 0.3 in ${\mathrm{log}}_{10}({Z}_{\mathrm{met}}/{Z}_{\odot })$, truncated to $-2.2\lt {\mathrm{log}}_{10}({Z}_{\mathrm{met}}/{Z}_{\odot })\lt 0.24$. A Student's t-distribution has heavier tails than a Gaussian, and agrees better with observational data. As discussed in Williams et al. (2018), imposing a maximum metallicity, ${\mathrm{log}}_{10}({Z}_{\mathrm{met}}/{Z}_{\odot })=0.24$, in the synthetic galaxies reproduces the observed turnover in the mass–metallicity relation, despite the fact that Equation (19) is linear in mass. We use the same FMR to assign metallicities to both SFGs and QGs, as we find that existing measurements of QGs are consistent with the FMR used to create this synthetic catalog (e.g., Peng et al. 2015; Leethochawalit et al. 2018).
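The heavy-tailed, truncated scatter might be drawn as in the sketch below. The mean relation (Equation (19)) is taken as an input rather than reproduced. The sketch assumes that the quoted standard deviation refers to the untruncated t-distribution, so the standard t variate is rescaled by its own standard deviation, and that truncation is implemented by redrawing out-of-range values. The function name `truncated_t_scatter` is hypothetical.

```python
import numpy as np


def truncated_t_scatter(rng, mean, std, lo, hi, dof=3):
    """Add Student's t scatter to `mean`, redrawing values outside [lo, hi]."""
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    # A standard t variate has variance dof / (dof - 2); rescale to the requested std.
    scale = std / np.sqrt(dof / (dof - 2.0))
    values = mean + scale * rng.standard_t(dof, size=mean.shape)
    bad = (values < lo) | (values > hi)
    while np.any(bad):
        # Redraw only the entries that fell outside the allowed range.
        values[bad] = mean[bad] + scale * rng.standard_t(dof, size=int(bad.sum()))
        bad = (values < lo) | (values > hi)
    return values
```

For the metallicities, the call would use `std=0.3, lo=-2.2, hi=0.24` in units of log10(Zmet/Z⊙).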

Gas-phase metallicity is also an indirect tracer of gas flows in galaxies. Metallicity and stellar mass both correlate with SFH, but gas flows will introduce scatter in the metallicity and stellar mass relation. Thus, the gas ionization parameter, US (which represents the ratio of the number density of ionizing photons to the number density of hydrogen atoms) correlates with metallicity. To assign the gas ionization parameter, we use the low-redshift relation from Carton et al. (2017):

Equation (20)

with a scatter of 0.3 dex sampled from a Student's t-distribution with three degrees of freedom, truncated between $-4\lt {\mathrm{log}}_{10}{U}_{S}\,\lt -1$. As in Williams et al. (2018), we assume the relation holds at all redshifts.

Given the seven FSPS parameters, we then calculate MUV, β, and ψ as outlined in Section 6.2. We discard galaxies that have unrealistic ψ, MUV, or β values, as summarized in Table 2. The resulting SFG parent catalog has ∼9 × $10^{5}$ galaxies.

6.3.2. Quiescent Galaxy Parent Catalog

We generate the parent catalog for the QGs in a similar way to the parent catalog for the SFGs. For $10^{7}$ galaxies, we assign galaxy masses, redshifts, and SFH parameters tstart and τ from uniform distributions, as summarized in Table 3. We sample the dust parameter, ${\bar{\tau }}_{V}$, from a normal distribution centered on zero with a standard deviation of 0.5, truncated between 0 and 4.

Table 3. Parameters for SEDs of QGs

Parameter | Proposed Parameter | Parent Catalog | Distance Metric | Assigned Parameter
Mgal | fixed | uniform in ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })\in [5,12]$ | ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/10{M}_{\odot })$ | fixed
z | fixed | uniform in [0, 13] | a = 10/(1 + z) | fixed
Zmet | ... | FMR | ... | nearest neighbors
${\bar{\tau }}_{V}$ | ... | ${ \mathcal N }(\mu =0,\sigma =0.5)\in [0,4]$ | ... | nearest neighbors
tstart | ... | uniform in [1 Myr, tage] | ... | nearest neighbors
τ | ... | uniform in [0.01, 10] Gyr | ... | nearest neighbors
ψ | ... | calculated; cut ${\mathrm{log}}_{10}(\psi /{M}_{\odot }\,{\mathrm{yr}}^{-1})\in [-4,1]$ | ... | calculated
U−V | ... | calculated; cut in UVJ diagram | ... | calculated
V−J | ... | calculated; cut in UVJ diagram | ... | calculated

Note. The synthetic galaxies have fixed parameters Mgal and z. We identify the 10 closest neighbors in the parent catalog (defined by the distance metric), and assign the final catalog properties Zmet, ${\bar{\tau }}_{V}$, tstart, and τ from the nearest neighbors.

We assume QG metallicities, Zmet, follow the same FMR as SFGs, as given in Equation (19), which is consistent with existing measurements of QGs (e.g., Peng et al. 2015; Leethochawalit et al. 2018). We propose SFRs from Schreiber et al. (2017) as before. For QGs,

Equation (21)

with a log-normal scatter of 0.45 dex.

As before, given the seven FSPS parameters, we then calculate ψ and the UVJ colors, as outlined in Section 6.2. We remove any galaxies in the QG parent catalog that do not fall in the proper space in the UVJ diagram, to avoid including any galaxies with unrealistic colors. Specifically, we use the criteria that a galaxy can be classified as SF if it satisfies any of the following conditions:

Equation (22)

(Williams et al. 2009). After cuts in SFR and UVJ, the resulting parent catalog has 5 × $10^{4}$ galaxies. Though SFR is not set explicitly for QGs, these cuts ensure that QGs have a low SFR, differentiating them from the SFG population (see Appendix B).

6.4. Star-forming Galaxies

This section outlines our procedure for assigning realistic SEDs for SFGs. To accurately capture the observed CSFRD and UVLFs, galaxies must have realistic SFRs and UV magnitudes. We also aim to produce realistic UV continuum slopes, β, the FMR, and the relationship between metallicity and gas ionization. Overall, we propose five parameters to define the SFG SEDs, ψ, MUV, β, Zmet, and US . Details of how these proposed parameters are generated, and how the FSPS parameters are determined for the SFGs are given in this section, and summarized in Table 2.

To reproduce observed UVLFs, the synthetic galaxies need to follow the same ${\bar{M}}_{\mathrm{UV}}$–Mgal relation that Williams et al. (2018) used to generate the high-redshift SMFs used in this work:

Equation (23)

Additionally, Williams et al. (2018) found that scatter observed in 3D-HST data (Skelton et al. 2014) is constant in both stellar mass and redshift, with an average value of σUV = 0.7. Therefore, we propose MUV values from Equation (23) with a Gaussian scatter of σUV = 0.7.

We also use the relations from Williams et al. (2018) to propose UV continuum slopes, β,

Equation (24)

Given the proposed galaxy properties, we next determine which FSPS parameters to assign. To generate SEDs that closely match the proposed parameters, we find the 10 nearest neighbors in the SFG parent catalog. Table 2 outlines the distance metric we use to find the nearest neighbors. We average the five free FSPS parameters (Zmet, tstart, τ, ${\bar{\tau }}_{v}$, and US ) of these neighbors, with a weight based on their distance to the proposed parameters. In cases where the assigned tstart value is greater than tage, we set tstart = tage. This method allows us to "fit" the target (proposed) parameters even for a very large number of galaxies, something that would not be computationally feasible to achieve iteratively.

Given the large degeneracies between different galaxy properties (e.g., age, metallicity, dust, SFH), we find that our solution depends sensitively on how realistically we populate the parent catalogs, and also on how we define the distance metric. In particular, we find that to ensure the final parameters are close to the proposed parameters, it is especially important to find neighbors that are close in redshift.

6.5. Quiescent Galaxies

In addition to SFGs, we include a QG population. We turn off the nebular emission model for QGs, as low-redshift QGs are known to have very low gas content. Though galaxy modeling often assumes QGs do not contain any gas or dust (e.g., Williams et al. 2018), recent results from Gobat et al. (2018) suggest that higher-redshift QGs should have significant dust content. We therefore include dust for the QGs. We assign QG properties in a manner similar to how we assign SFG properties. Given the QG parent catalog described in Section 6.3, we determine FSPS parameters using the 10 nearest neighbors in mass and redshift. We summarize the details of this procedure in Table 3.

6.6. Example SEDs

We present the resulting SEDs for two example galaxies at redshift z ∼ 8 in Figure 10. The Roman WFI filters include seven photometric filters: R062, Z087, Y106, J129, H158, F184, and F213 17 , a wide-field filter W146, a grism, and a prism. We plot the flux calculated in the seven photometric filters R062, Z087, Y106, J129, H158, F184, and F213. Both galaxies show a clear Lyman break in Y106.

Figure 10. Example SEDs for an SFG (top) and a QG (bottom) at redshift z ≈ 8. The effective areas of the Roman photometric filters (R062, Z087, Y106, J129, H158, F184, and F213) are over-plotted. The observed magnitudes in each band are shown as colored points. The SFG would be selected in this UDF as a Lyman-break galaxy (LBG) using Roman photometry.

The SFG (top panel) should be selected as a Lyman-break galaxy (LBG) in Roman using the common Lyman-break method. This example also displays strong Lyα emission. Lyα emitters (LAEs) are a common way to study the EoR, and Roman will detect many such galaxies. Quantifying the science returns of LAEs in a Roman UDF will be the subject of future work.

Due to the SMFs we used to construct our catalog, we model a small population of high-redshift QGs, such as the one shown in the bottom panel of Figure 10. It is unclear whether QGs exist at redshift z = 8, or what the mechanism of their formation would be. Because of the large area that a deep survey with Roman will cover, a Roman UDF could help identify rare high-redshift QGs if they exist. This will be discussed further in Section 9.2.

7. Cosmic Star Formation History

Since one of the main goals of a Roman UDF is to study the ionizing photon contribution from galaxies, we ensure we reproduce observed UVLFs (Section 7.1) and the CSFRD (Section 7.2). Additional galaxy properties are discussed in Appendix B to show the range of scientific questions our synthetic galaxy catalog can address.

7.1. UV Properties

Placing strong constraints on the faint end of the UVLF during the EoR is a main goal of a Roman UDF. To reproduce observed UVLFs, the synthetic galaxies must follow the correct distribution of MUV, as discussed in Section 6.2.3. On average, the UV magnitude, MUV, decreases (becomes brighter) with galaxy mass, Mgal (e.g., Stark et al. 2009). For galaxies with ${\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })\gt 10$, the relationship flattens (e.g., Spitler et al. 2014). Apart from this flattening at high masses, the observed MUV–Mgal relation has a constant slope, with the normalization evolving with redshift (e.g., Duncan et al. 2014; Stefanon et al. 2017). Similarly, the UV continuum slope, β, correlates with MUV. Though the exact relation between β and MUV is still being studied, β appears bluer (decreases) with increasing (fainter) MUV and increasing redshift (e.g., Bouwens et al. 2014).

We calculate the UV properties, MUV and β, of the synthetic SFGs as described in Section 6.2.3, and present them in Figure 11. We reproduce the trends for both MUV and β imposed from Williams et al. (2018). The MUV–mass relation enables us to reproduce observed UVLFs, as discussed below. We also closely reproduce the β–MUV relation for β > − 2.5. For β < − 2.5, our galaxy catalog agrees with studies that indicate that the β–MUV relationship flattens for MUV ≳ − 19 galaxies (e.g., Bouwens et al. 2014).

Figure 11. Average MUV–mass relation (top) and β–MUV relation (bottom) for SFGs in the synthetic catalog (solid lines) and the standard deviation in the bin (shaded region). We compare the synthetic catalog to the relations from Williams et al. (2018), calculated as the volume-weighted average over the redshift bin (dotted lines). The synthetic galaxy catalogs are constructed to match the underlying MUV–mass relations, to reproduce observed UVLFs. The synthetic galaxy catalog matches the underlying relations closely, except for a flattening in the β–MUV relation for faint galaxies. This flattening is consistent with current constraints, as described in Section 7.1.

We show the UVLFs of our synthetic SFGs in Figure 12, compared to the UVLFs that are expected from convolving the SMF with a normal distribution centered on MUV (Equation (2)). The galaxy catalog agrees very well with the imposed relation. Our catalog has slightly fewer galaxies at faint magnitudes (MUV > − 17) than expected because we do not perfectly recreate the MUV–Mgal relationship. These magnitudes are fainter than the detection limits of a 30 mAB survey, and our synthetic catalog UVLF is consistent with current observations, as discussed in Section 8.3.
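The convolution referred to here can be sketched numerically as follows. Equation (2) is not reproduced in this section, so the sketch assumes the usual form: the UVLF at a given MUV is the integral over mass of the SMF times a Gaussian in MUV centered on the mean magnitude at that mass. The SMF and the mean MUV–Mgal relation are supplied by the caller, and the function name `uvlf_from_smf` is hypothetical.

```python
import numpy as np


def uvlf_from_smf(log_m, smf, mean_muv, sigma_uv, muv_grid):
    """Convolve an SMF (per dex per volume) with Gaussian scatter in MUV.

    log_m: uniformly spaced grid of log10(Mgal/Msun).
    smf: number density per dex at each log_m.
    mean_muv: mean MUV at each log_m, from a MUV-Mgal relation chosen by the caller.
    Returns the number density per magnitude at each value of muv_grid.
    """
    log_m = np.asarray(log_m, dtype=float)
    dlogm = log_m[1] - log_m[0]
    z = (np.asarray(muv_grid)[:, None] - np.asarray(mean_muv)[None, :]) / sigma_uv
    # Gaussian probability density of MUV at fixed mass, shape (n_muv, n_mass).
    kernel = np.exp(-0.5 * z ** 2) / (sigma_uv * np.sqrt(2.0 * np.pi))
    # Rectangle-rule integral over log mass.
    return np.sum(kernel * np.asarray(smf)[None, :], axis=1) * dlogm
```

Because the Gaussian kernel integrates to one, the total number density is conserved: integrating the returned UVLF over magnitude gives the same value as integrating the SMF over mass.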

Figure 12. UV luminosity function of the synthetic galaxy catalog. The dotted lines show the expected UVLF from convolving the Williams et al. (2018) SMF with the MUV–Mgal relationship (Equation (2)), and then averaged over the redshift bin. Overall, the data match the underlying relations very well, and agree with current observations (see Figure 18).

7.2. Cosmic Star Formation Rate Density

As discussed in Section 6.2.1, the CSFRD describes the ionizing photon contribution of galaxies as a function of cosmic time (for a review, see, e.g., Madau & Dickinson 2014). We display the evolution of the CSFRD in the top panel of Figure 13. We calculate the CSFRD by taking the sum of galaxy SFRs, ψ (averaged over the last 100 Myr), in each bin divided by the co-moving volume of the bin. As in Williams et al. (2018), to compare to Madau & Dickinson (2014), we imposed a luminosity limit of 0.03 L*. 18
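The binning step of this calculation can be written compactly, as in the sketch below. It assumes that the SFRs passed in are already averaged over the last 100 Myr and already restricted by the luminosity limit, and that the comoving volume of each redshift bin for the survey area has been computed separately from the adopted cosmology. The function name `csfrd` is hypothetical.

```python
import numpy as np


def csfrd(z, sfr, z_edges, comoving_volumes):
    """Sum of SFRs in each redshift bin divided by the bin's comoving volume.

    comoving_volumes: volume of each bin (e.g., Mpc^3) for the survey area,
    computed elsewhere from the adopted cosmology.
    """
    total_sfr, _ = np.histogram(z, bins=z_edges, weights=sfr)
    return total_sfr / np.asarray(comoving_volumes, dtype=float)
```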

Figure 13. Evolution of star formation in the universe. Top: evolution of the CSFRD of synthetic galaxies with a luminosity limit of 0.03 L* (dark blue). Error bars are smaller than the width of the line. The dashed black line is the CSFR density of the universe compiled by Madau & Dickinson (2014), and colored points are measurements from the literature (Cucciati et al. 2012; Bouwens et al. 2015; Oesch et al. 2018). The data are converted from a Salpeter (1955) to a Chabrier (2003) IMF by dividing the CSFRD by a factor of 1.7. Bottom: the redshift evolution of the median specific star formation rate (sSFR) of $8.8\lt {\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })\lt 10$ synthetic galaxies (solid line) compared to measurements from the literature (colored points; Tasca et al. 2015). Overall, the cosmic SFH of the synthetic galaxy catalogs is in close agreement with observational data.

Similar to Williams et al. (2018), we find that the CSFRD agrees roughly with the model presented in Madau & Dickinson (2014), but more closely with data (e.g., Cucciati et al. 2012; Oesch et al. 2018). Most notably, our synthetic galaxies agree with high-redshift (z > 8) measurements of the CSFRD. As in Williams et al. (2018), our synthetic galaxies have a slight deficit at redshifts 1 < z < 3, which reflects the underlying SMFs. As discussed in Williams et al. (2018), the SMFs we used do not include the dusty SFGs currently missed in the UV-selected samples.

We show the evolution of the specific star formation rate (sSFR) in the bottom panel of Figure 13, for galaxies with $8.8\lt {\mathrm{log}}_{10}({M}_{\mathrm{gal}}/{M}_{\odot })\lt 10$. We calculate the median sSFR by dividing the SFR averaged over the last 100 Myr by the galaxy mass. For comparison, we plot data from Tasca et al. (2015). The synthetic galaxies and observations agree very well. The sSFR of galaxies at redshifts z > 4 is a matter of active research, but we find an increase in sSFR, consistent with current studies (Stark 2016). Given that our galaxy catalog is consistent with observations for both galaxy clustering and cosmic SFHs, we now turn our attention to making preliminary predictions for the science returns of a Roman UDF.

8. Predictions for a Roman Ultra-deep Field Survey

In this section, we demonstrate the capabilities of a 1 deg$^{2}$ Roman UDF to study the earliest galaxies at the EoR and beyond. We begin by discussing our method for determining whether a galaxy is selected (Section 8.1), and the expected galaxy number counts (Section 8.2), before presenting our projected constraints (Section 8.3). Throughout this section we compare our predictions for a Roman UDF to the best current constraints in the literature. However, by the time Roman launches, JWST will have provided an extraordinary amount of data, greatly enhancing our understanding of the universe. We discuss synergies between a Roman UDF and JWST in Section 9.4.

8.1. Selection Criteria

Understanding of galaxy formation increased immensely when techniques to select high-redshift (z > 5) galaxies were developed (for a review, see, e.g., Dunlop 2013). The most commonly used method is the Lyman-break technique, which selects LBGs based on a step in the blue UV continuum emission at λrest = 1216 Å. To determine whether a UDF with Roman would select each galaxy in the synthetic catalog, we use a magnitude limit based on the Lyman break, in the same manner as Williams et al. (2018). Specifically, for each filter, we assign a nonoverlapping wavelength range (the dividing point is taken to be halfway between the centers of two adjacent bands). Each galaxy is assigned a "dropout" filter, according to the wavelength that corresponds to its Lyman break, λ = 1216(1 + z) Å. If a galaxy is brighter than the magnitude limit in the band immediately redward of its dropout filter, we consider it to be detectable. We use the magnitude limit mAB = 30 for a 5σ detection.
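The dropout assignment and detectability test can be sketched as follows. The band centers are an assumption: they are approximate values in microns read off the filter names rather than measured effective wavelengths. The sketch also assumes that a galaxy whose break falls in the reddest filter has no redder band and is therefore counted as undetectable. The function names are hypothetical.

```python
import numpy as np

FILTERS = ["R062", "Z087", "Y106", "J129", "H158", "F184", "F213"]
# Assumption: approximate band centers in microns, read off the filter names.
CENTERS_UM = np.array([0.62, 0.87, 1.06, 1.29, 1.58, 1.84, 2.13])
# Dividing points halfway between adjacent band centers.
BOUNDARIES_UM = 0.5 * (CENTERS_UM[1:] + CENTERS_UM[:-1])


def dropout_index(z):
    """Index of the filter whose wavelength range contains the Lyman break."""
    lam_break_um = 1216.0 * (1.0 + np.asarray(z, dtype=float)) * 1.0e-4
    return np.searchsorted(BOUNDARIES_UM, lam_break_um)


def is_detectable(z, mags, mag_limit=30.0):
    """True if the band just redward of the dropout filter is brighter than the limit.

    mags: (N, 7) AB magnitudes in the order of FILTERS.
    """
    mags = np.asarray(mags, dtype=float)
    drop = np.atleast_1d(dropout_index(z))
    has_redder = drop < len(FILTERS) - 1
    # Clip so the lookup stays in range; has_redder masks the clipped cases.
    detect = np.minimum(drop + 1, len(FILTERS) - 1)
    band_mag = mags[np.arange(mags.shape[0]), detect]
    return has_redder & (band_mag < mag_limit)
```

With these approximate band centers, a galaxy at z = 8 has its break at about 1.09 μm and is assigned Y106 as its dropout filter, so its detectability is judged in J129.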

We show the fraction of detectable galaxies in Figure 14. The mass at which ∼50% of galaxies are detectable ranges from ∼$10^{6}\,{M}_{\odot }$ at z ≈ 1 to ∼$10^{7}\,{M}_{\odot }$ at z = 10. We find that nearly 100% of galaxies are detectable above masses of Mgal = $10^{8}\,{M}_{\odot }$, which indicates that there should be very tight constraints on galaxy properties (such as galaxy clustering and UVLFs) above this mass. Given this, we can measure the clustering of faint galaxies (MUV ≲ − 18) at the EoR. With completeness corrections, we can measure quantities such as the UVLFs for galaxies brighter than MUV ≈ − 17 out to redshifts z ≈ 10. Section 8.3 will explore these potential constraints in more detail.

Figure 14. Fraction of detectable galaxies (5σ) in the Roman synthetic catalog. Nearly all galaxies with masses Mgal > $10^{8}\,{M}_{\odot }$ (corresponding to MUV < − 18) are detectable out to redshifts z ≈ 10.

In the following sections, we assume that any detectable galaxy is selected. This approach assumes that there is no scatter in the photometric redshifts, and that no galaxies are obscured by foreground galaxies. Future work will examine Roman UDF constraints in more detail, taking these effects into account. The catalogs and images presented in this work will be instrumental in exploring these sources of systematic error.

8.2. Number Counts

Here we provide number counts of the detectable galaxies (5σ), to understand the science returns of a Roman UDF and measure the completeness of the survey. Figure 15 shows both the cumulative number counts (top), and the differential number counts (bottom). We find that a 1 deg$^{2}$ Roman UDF could detect more than $10^{4}$ galaxies at redshifts z > 7. For comparison, HST has detected on the order of $10^{3}$ galaxies at redshifts z ∼ 4–10 (Koekemoer et al. 2013; Lotz et al. 2017), and Cycle 1 JWST programs are expected to detect ∼5000 galaxies at redshifts z > 6 (Williams et al. 2018).


Figure 15. The cumulative (top) and differential (bottom) number counts of the detectable galaxies (5σ) in our Roman UDF synthetic catalog. A Roman UDF will contain over a million galaxies, with tens of thousands in the EoR.


Though a UDF catalog will consist largely of SFGs, there will also be a number of QGs, as demonstrated in Figure 16. These predictions indicate that a $1\,{\mathrm{deg}}^{2}$ UDF will contain a handful of detectable QGs above redshift z = 7, if they exist. In combination with, e.g., JWST spectroscopy, this could identify z > 7 QGs. Given that QGs have so far been detected only out to redshifts of z ∼ 4–5 (Merlin et al. 2019; Valentino et al. 2020), this will greatly enhance our understanding of the QG population. Determining the number and mass distribution of high-redshift QGs is fundamental to understanding galaxy evolution, and for testing different models for the emergence of QGs (see, e.g., Cortese et al. 2021, for a recent review on quenching).


Figure 16. Number counts of detectable galaxies (5σ) for SFGs (top) and QGs (bottom) in our Roman UDF synthetic catalog. A Roman UDF will detect QGs out to very high redshifts, possibly even detecting a few at the EoR.


Due to the nature of these kinds of predictions, the underlying galaxy properties in the catalog are extrapolated beyond our current knowledge (in particular, the number of low-mass objects is dependent on extrapolations of the SMFs). Therefore, the number counts presented in this section represent a reasonable estimate of what a UDF might see, but there is much to be learned in the future. Regardless, the large number of galaxies that Roman will detect will provide the best census to date of faint high-redshift galaxies, including those thought responsible for reionizing the universe.

8.3. The Science Returns of a Roman UDF

Some of the quantities that a Roman UDF will aim to constrain include the SMF, the UV luminosity function, galaxy clustering, and the galaxy–halo connection at high redshifts. In this section we provide preliminary projected constraints for a $1\,{\mathrm{deg}}^{2}$ Roman UDF, based on the detectable galaxies identified in Section 8.1. We have not included uncertainties from cosmic variance or from errors on photometric redshifts in our predictions, as that is beyond the scope of this paper. However, we note that our synthetic catalog provides the basis for rigorous studies of systematic errors in a Roman UDF, which will be the focus of future work.

8.3.1. Stellar Mass Functions

The evolution of the SMF of galaxies over cosmic time provides information on how galaxy populations have evolved, the SFH of the universe, and the galaxy–halo connection. Figure 17 shows the SMF for the detectable galaxies in the synthetic catalog, with the underlying SMFs shown as dotted lines for both SFGs (top) and QGs (bottom). We perform a completeness correction by dividing the number of detectable galaxies by the fraction of detectable galaxies at the corresponding mass and redshift. As shown in Figure 6, our synthetic catalog matches the SMF nearly exactly by construction.
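The completeness correction described above amounts to a per-bin division. The sketch below is a minimal illustration for a single redshift slice, binned in log stellar mass; the function names and the binning are assumptions made for this example, not the code used to produce Figure 17.

```python
import numpy as np


def detectable_fraction(log_mass, detectable, bin_edges):
    """Fraction of galaxies flagged detectable in each mass bin (NaN for empty bins)."""
    log_mass = np.asarray(log_mass, dtype=float)
    detectable = np.asarray(detectable, dtype=bool)
    n_all, _ = np.histogram(log_mass, bins=bin_edges)
    n_det, _ = np.histogram(log_mass[detectable], bins=bin_edges)
    frac = np.full(n_all.shape, np.nan)
    filled = n_all > 0
    frac[filled] = n_det[filled] / n_all[filled]
    return frac


def completeness_corrected(n_detected, fraction):
    """Divide detected counts by the detectable fraction, bin by bin."""
    n_detected = np.asarray(n_detected, dtype=float)
    fraction = np.asarray(fraction, dtype=float)
    corrected = np.full(n_detected.shape, np.nan)
    usable = np.isfinite(fraction) & (fraction > 0)  # skip empty or fully missed bins
    corrected[usable] = n_detected[usable] / fraction[usable]
    return corrected
```

When the fraction and the counts come from the same synthetic catalog, the corrected counts recover the input counts exactly, which is why the agreement with the underlying SMF holds by construction. Applied to real data, the fraction would come from a simulation and the counts from the observations.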


Figure 17. SMF of the detectable galaxies (5σ) in our synthetic galaxy catalog for SFGs (top) and QGs (bottom), with completeness corrections. The synthetic galaxies agree with the Williams et al. (2018) SMFs (dotted lines) by construction. A Roman UDF can potentially constrain the SMF beyond redshift z = 10 (Mgal > $10^{6.5}\,M_{\odot}$) for SFGs, and to redshift z ≈ 7 (Mgal > $10^{9.5}\,M_{\odot}$) for QGs.


For our catalog, the SMF for the detectable SFGs is measurable to masses of Mgal > $10^{6.5}\,M_{\odot}$ for all redshifts. For the detectable QGs, the SMF is measurable to redshifts z ∼ 7 (masses Mgal > $10^{9.5}\,M_{\odot}$). In comparison, observations have only recently placed any constraints on the SFG SMF at redshift z = 8–10. Further, these constraints only exist for galaxies with Mgal < $10^{8}\,M_{\odot}$, and have very large uncertainties (about 100%; Stefanon et al. 2021). Measurements of QG SMFs only extend to redshifts z ≈ 4 (e.g., Girelli et al. 2019).

The exact constraints on an SMF from Roman will depend on stellar mass measurements and the accuracy of completeness corrections. This work provides a first calculation of the completeness of a Roman UDF (Section 8.1), and highly realistic simulated data to enable future rigorous calculations. In practice, stellar mass estimates of high-redshift galaxies will require photometry at wavelengths beyond the reddest Roman filters. However, JWST will provide tight constraints on the MUV–Mgal relation, which can be used to estimate stellar masses of galaxies in a Roman UDF.

8.3.2. UV Luminosity Functions

To clearly answer whether there were enough galaxies during the EoR to reionize the universe, we need strong constraints on the faint end of the UVLF during the EoR. Bouwens et al. (2021) provide the most recent, comprehensive constraints on the UVLF, and are able to constrain the faint end (MUV < −17) out to redshift z ∼ 10. At redshift z ∼ 10, only eight sources are used to constrain the UVLF, so the uncertainties are very large (∼50%–100%). In comparison, a Roman UDF could detect ∼$10^{3}$ sources at z ∼ 10, and thus greatly reduce the uncertainties in the UVLF.

We show the UVLF of the detectable synthetic galaxies in Figure 18, compared to the data from Bouwens et al. (2021). As in the previous section, we perform a completeness correction by dividing the number of selected galaxies by the fraction of selected galaxies at the corresponding mass and redshift. We predict a Roman UDF will be able to constrain the UVLF beyond redshift z ≈ 10, with galaxies MUV < − 17. Due to the vast number of galaxies in a Roman UDF, it could constrain the faint-end UVLF to within ∼1 percent to redshifts z ≳ 10, far tighter than existing limits.
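To see how the sample size drives the statistical part of this improvement, it helps to recall that the fractional Poisson uncertainty on a count of N sources scales as 1/√N. The short sketch below evaluates only that term; it ignores completeness corrections, cosmic variance, and photometric-redshift errors, so it is not meant to reproduce the uncertainties quoted in the text.

```python
import math


def poisson_fractional_error(n_sources):
    """Fractional 1-sigma Poisson uncertainty, 1/sqrt(N), on a count of N sources."""
    if n_sources <= 0:
        raise ValueError("n_sources must be positive")
    return 1.0 / math.sqrt(n_sources)


if __name__ == "__main__":
    # Illustrative sample sizes only.
    for n in (8, 1_000, 10_000):
        print(f"N = {n:>6d}: fractional error = {poisson_fractional_error(n):.3f}")
```

Under this scaling, a percent-level Poisson uncertainty in a given bin requires of order $10^{4}$ sources in that bin.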


Figure 18. The UVLF of the detectable galaxies in our synthetic galaxy catalog (blue lines) and the 1σ Poisson noise (shaded region). We show the recent Bouwens et al. (2021) data for comparison (black points). We also indicate the number of sources, N, in each panel used to calculate the UVLF. A Roman UDF will provide remarkably tight constraints on the faint end of the UVLF at high redshifts.


We note that the bright end of the synthetic galaxy catalog UVLF is higher than the Bouwens et al. (2021) data at low redshifts. As shown in Figure 12, our galaxy catalog agrees with the underlying UVLF model we used at the bright end, and therefore this difference can be attributed to the SMFs used in this work. This excess of low-redshift (z < 4), bright galaxies is consistent with the possibility of inefficient mass quenching, low dust obscuration, or hidden AGN activity, as suggested by Harikane et al. (2021).

8.3.3. Galaxy Clustering

Roman's wide, contiguous FOV will allow for measurements of the 2PCF of faint galaxies at the time of reionization and thus their underlying dark matter halo masses. To date, halo masses have only been strongly constrained to redshifts z ∼ 6, and are limited to bright (MUV ∼ − 20) galaxies at redshifts z = 4–6 (e.g., Harikane et al. 2018). JWST will potentially provide the first measurement of the clustering of high-redshift galaxies (Endsley et al. 2020), but due to its smaller FOV, it will likely suffer from cosmic variance, and will not have the same capability as Roman to study clustering in different environments.

We measure the 2PCF of the detectable galaxies in the $1\,{\mathrm{deg}}^{2}$ survey, as outlined in Section 4.3. We include all galaxies brighter than MUV = −18. Figure 19 shows the measured 2PCF. The 2PCF for these faint galaxies is measured to very high accuracy (within 1%) out to redshifts z ≈ 7, with constraints out to z ≈ 10 (to within 10%).


Figure 19. The 2PCF of the MUV < − 18 detectable galaxies in our synthetic galaxy catalog (blue lines), with the uncertainty calculated from bootstrapping the data (shaded region). The 2PCF of the detectable galaxies is measured to within 1% for galaxies z ≲ 7, and to within 10% at z ≈ 10.


8.3.4. Galaxy–Halo Connection

In addition to constraining the number and spatial distribution of faint, high-redshift galaxies, Roman will also provide constraints on the galaxy–halo connection. Galaxy clustering can be used to infer dark matter halo mass, as the underlying dark matter will dictate the gravitational field (Mo & White 1996). Together with galaxy mass measurements, galaxy clustering gives a direct measure of the SHMR (which summarizes the connection between galaxy masses and their host dark matter halos).

As discussed above, halo masses have only been measured directly for redshifts z ≲ 6 (e.g., Harikane et al. 2018), while at high redshifts, SHMR constraints have been achieved with abundance matching between observed stellar masses and dark matter only simulations (e.g., Stefanon et al. 2021). There is a potential disagreement between these two techniques—the SHMR measured from abundance matching is typically three to four times higher than that measured from clustering (Stefanon et al. 2021). The difference between SHMRs derived from clustering and AM indicates that direct halo mass measurements are needed to understand the galaxy–halo connection at high redshifts. A Roman UDF will provide extraordinary data, which may elucidate the origin of this discrepancy.

We show the SHMR for our synthetic galaxy catalog in Figure 20, and recent constraints from Harikane et al. (2018) and Stefanon et al. (2021). The Stefanon et al. (2021) data were scaled down by a factor of 1.7 to convert from a Salpeter (1955) IMF to a Chabrier (2003) IMF. We use all of the 5σ Lyman-break selected galaxies in the catalog. Our synthetic catalog clustering measurements agree well with the Stefanon et al. (2021) data (within ∼1σ), and are slightly higher than the Harikane et al. (2018) data.


Figure 20. SHMR of the detectable galaxies in our synthetic galaxy catalog (blue solid line), and the 1σ spread (shaded blue region). We compare to data from Harikane et al. (2018; orange points; error bars show error in the mean) and Stefanon et al. (2021; red points; error bars denote the scatter). The Stefanon et al. (2021) data have been scaled by a factor of 1/1.7 to convert to a Chabrier (2003) IMF. Stefanon et al. (2021) derived halo masses from abundance matching, while Harikane et al. (2018) derived halo masses from galaxy clustering data.


In this work we do not attempt to predict how well a Roman UDF will be able to constrain the SHMR. As discussed in Section 8.3.1, stellar mass measurements for high-redshift galaxies likely require photometry at wavelengths longer than what Roman will provide, but they may be possible to estimate with scaling relations. Since our synthetic galaxy catalogs contain information regarding host dark matter halo properties, this work can form the basis of future analyses of the galaxy–halo connection in UDFs.

8.4. Synthetic Images

In addition to the galaxy catalog, we present synthetic images of a Roman UDF. These images are intended for developing analysis tools, and studying systematics (see Section 9.5). We include FITS files for each Roman filter, along with full-resolution versions of the RGB images presented in this section in our data release (see Appendix A for details).

We create the synthetic images using GalSim (Rowe et al. 2015). GalSim contains a module specifically for Roman observations (Troxel et al. 2021), which includes five of the Roman filters: Z087, Y106, J129, H158, and F184. We model each galaxy as a Sérsic profile, with an index $n_s$, axis ratio q, and PA, as described in Section 5. We truncate the distribution of Sérsic indices between 0.3 and 6.2, to avoid numerical inaccuracies at more extreme values. The scale radius of the Sérsic profile, $r_0$, is directly related to the assigned half-light radius, $R_{\rm eff}$:

$$r_0 = \frac{R_{\rm eff}}{b^{n_s}}, \qquad (25)$$

where $b \approx 2n_s - 1/3$ (Moriondo et al. 1998). Figure 21 shows a composite of the full synthetic galaxy catalog using Z087 (blue), Y106 (green), and H158 (red) filters. We note that we do not include simulated stars in any of the released images.
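As a small worked example of this conversion, the sketch below assumes the standard Sérsic form $I(r) \propto \exp[-(r/r_0)^{1/n_s}]$, for which the half-light radius satisfies $R_{\rm eff} = b^{n_s} r_0$, together with the approximation for b quoted above. The function names are illustrative and are not part of GalSim.

```python
def sersic_b(n_s):
    """Approximate Sersic b parameter, b ~ 2 n_s - 1/3."""
    return 2.0 * n_s - 1.0 / 3.0


def scale_radius(r_eff, n_s):
    """Scale radius r_0 = R_eff / b**n_s, in the same units as r_eff."""
    if not 0.3 <= n_s <= 6.2:
        # Same truncation range as adopted for the synthetic images.
        raise ValueError("Sersic index outside the range 0.3-6.2")
    return r_eff / sersic_b(n_s) ** n_s
```

For an exponential disk ($n_s = 1$) this gives $r_0 = 0.6\,R_{\rm eff}$, close to the exact value of about $0.596\,R_{\rm eff}$.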


Figure 21. Noise-free composite simulated image of the full $1\,{\mathrm{deg}}^{2}$ galaxy catalog using the Z087 (blue), Y106 (green), and H158 (red) filters. The image was created with the Roman module in GalSim. A native resolution version is available at https://www.nicoledrakos.com/dream.


To simulate a Roman 30 mAB survey, we convolve the image with the Roman point-spread function (PSF), and add noise to the image. We calculate the image noise assuming proposed exposure times from Koekemoer et al. (2019), to reach a 5σ limit of mAB ≈ 30 (see Table 4). We add noise to the image in the same manner as Troxel et al. (2021). First, we generate the sky background accounting for the stray light and thermal emission from the telescope, which is added to the image. Given this, we add errors associated with Poisson noise, reciprocity failure, dark current, the calibration of the Roman detectors, inter-pixel capacitance, and instrument read noise.

Table 4. Approximate Exposure Times for the Simulated Filters, as Calculated in Koekemoer et al. (2019) to Reach a 5σ Limit of mAB ≈ 30

Filter | Exposure Time (hr)
Z087 | 60
Y106 | 70
J129 | 90
H158 | 40
F184 | 60


We present an RGB visualization of the full Roman catalog, with the PSF and noise included, in Figure 22, with one Roman footprint overlaid. Each Roman pointing will have 18 detectors. This visualization shows the rich amount of structure that will be contained in a Roman UDF. As described in Koekemoer et al. (2019), a $1\,{\mathrm{deg}}^{2}$ UDF would consist of three Roman pointings. Though a UDF based on three tiled WFI pointings would not be perfectly square, and the exposure time would not be perfectly uniform, these synthetic images demonstrate the richness of a $1\,{\mathrm{deg}}^{2}$ UDF and will be useful in designing and preparing for wide, deep surveys.


Figure 22. Composite simulated image of a region of the galaxy catalog using the Z087 (blue), Y106 (green), and H158 (red) filters convolved with the Roman PSF. Noise is calculated assuming a depth of ∼30 mAB in each filter. The image was created with the Roman module in GalSim. The Roman footprint is shown on top, as are insets showing a zoomed-in region of the image. A native resolution version of this $1\,{\mathrm{deg}}^{2}$ visualization is available at https://www.nicoledrakos.com/dream.


We include FITS maps of the galaxy catalog for all five filters currently included in GalSim in our data release. The top row of Figure 23 shows the flux in one WFI detector (each detector has an area of ∼$7.3^{\prime} \times 7.3^{\prime}$). The bottom row shows an example z = 9.4 galaxy (with the background subtracted in each panel). This galaxy would be selected as a Y106 dropout. The dropout galaxy has a size of Reff = 0.38 kpc, and therefore is not resolved with the WFI (which has a pixel scale of $0.11^{\prime\prime}$ pixel$^{-1}$).
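The statement that this galaxy is unresolved can be checked with a rough angular-size estimate. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km s$^{-1}$ Mpc$^{-1}$ and Ωm = 0.3, which are illustrative values and not necessarily the cosmology adopted for the catalog, and it neglects radiation.

```python
import numpy as np

C_KM_S = 299792.458          # speed of light [km/s]
ARCSEC_PER_RAD = 206264.806  # arcseconds per radian


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=20001):
    """Angular diameter distance [Mpc] in flat LCDM (assumed parameters)."""
    zs = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    dz = zs[1] - zs[0]
    # Trapezoidal rule, written out by hand.
    integral = dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))
    comoving = (C_KM_S / h0) * integral
    return comoving / (1.0 + z)


def angular_size_arcsec(size_kpc, z, **cosmo):
    """Angle [arcsec] subtended by a physical size [kpc] at redshift z."""
    d_a = angular_diameter_distance_mpc(z, **cosmo)
    return (size_kpc / 1000.0) / d_a * ARCSEC_PER_RAD
```

Under these assumptions, `angular_size_arcsec(0.38, 9.4)` comes out slightly below 0.1 arcsec, which is smaller than a single 0.11 arcsec WFI pixel.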


Figure 23. Flux in one WFI detector (top row), and an example of a Y106 dropout galaxy (bottom row). Each column corresponds to a different photometric filter, as labeled. The WFI detector has 4k × 4k pixels, with a pixel scale of $0.11^{\prime\prime}$ pixel$^{-1}$. The example dropout galaxy is at redshift z = 9.4, and has a galaxy mass of Mgal = $10^{7.6}\,M_{\odot}$.


The synthetic images we release are at the native pixel scale of the WFI detector, $0.11^{\prime\prime}$ pixel$^{-1}$. An actual Roman UDF would likely be dithered, sampling the sky on a finer pixel scale and thereby improving the PSF sampling and angular resolution. Since this process would create correlations between adjacent pixels, we did not include it in our data release. Future work will build upon DREaM to examine dithering strategies of a Roman UDF.

9. Discussion

This work presents synthetic galaxy catalogs for a $1\,{\mathrm{deg}}^{2}$ UDF with Roman, created using DREaM, a model for deep, realistic realizations of galaxy populations. Our model successfully reproduces a number of well-known trends, including the size–mass relation, the fundamental metallicity relation, and UV luminosity functions. Additionally, we reproduce observed low-redshift galaxy clustering and SMFs. We have made the galaxy catalog and synthetic images public, to provide the community with tools to prepare for a potential UDF with Roman.

A UDF survey with Roman will address a number of science topics, including the EoR, the emergence of QGs, and the high-redshift galaxy–halo connection. Roman's large FOV will capture enormous sample sizes, over contiguous fields, imaging multiple ionization bubbles. This large area will decrease cosmic variance, and probe detailed environments around high-redshift galaxies.

9.1. The Epoch of Reionization

The EoR is the final frontier for galaxy surveys. Given the difficulty in measuring galaxies at high redshifts, this period in the universe's history is remarkably unconstrained. High-redshift low-mass galaxies were likely the major source of the ionizing photons in the EoR (e.g., Finkelstein et al. 2019), and observations indicate that reionization was a "patchy" process (e.g., Furlanetto & Oh 2005; Villasenor et al. 2021). To fully understand the EoR, we need a complete census of galaxies and their ionizing photon contribution.

To determine whether there are enough high-redshift, faint galaxies to cause reionization, we must accurately model the CSFRD, which depends on the UVLF. Though there are some constraints on the UVLF to z ∼ 10 (e.g., Bouwens et al. 2021), there is still much improvement to be made on constraining both the bright steep end of the UVLF and the faint end (e.g., Bowler et al. 2014, 2015) at high redshifts. JWST will make improvements on this front. For example, Kauffmann et al. (2020) predicted that the $100\,{\mathrm{arcmin}}^{2}$ Cosmic Evolution Early Release Science Survey will constrain the faint end of the UVLF to a precision of 0.25 at z ≥ 8, but that a survey would need to be at least $300\,{\mathrm{arcmin}}^{2}$ to constrain the bright end up to z = 8. We predict a Roman UDF will capture enough high-redshift galaxies to constrain the UVLF to within 1% at redshifts beyond z ∼ 10. This will either confirm that there are enough faint galaxies to account for the ionizing photons needed to cause reionization, or indicate that another source, such as AGNs (e.g., Madau & Haardt 2015), must contribute.

In addition to the abundance of galaxies, the total ionizing photon budget depends on the Lyman-continuum (LyC) production efficiency, ξion, and the LyC escape fraction, fesc. To account for reionization, galaxies need higher escape fractions at high redshifts than have been observed at low redshifts (Davies et al. 2021). Since the neutral IGM absorbs LyC photons, fesc is very difficult to constrain at the EoR. However, fesc could possibly be measured during the EoR using indirect methods (e.g., Leethochawalit et al. 2016; Zackrisson et al. 2017; Chisholm et al. 2018, 2020). A Roman UDF will detect tens of thousands of galaxies during the epoch of reionization, including rare, bright sources that are ideal for spectroscopic follow-up to measure fesc. Further, since Roman can map out the density around each galaxy, it will allow for the measurement of the escape fraction in different environments.

9.2. The Emergence of Quiescent Galaxies

There exists a clear bimodality in SFRs, indicating two distinct populations (star-forming and quiescent). The decline in the CSFRD since redshift z ≈ 2 was likely caused by the quenching of galaxies (e.g., Renzini 2016). However, the mechanisms that transform galaxies from star-forming to quiescent are not fully understood. Possible mechanisms that have been proposed include feedback from stars and AGNs, or the removal of gas through tidal or ram pressure stripping. Quenching is likely caused by a combination of these mechanisms, and the dominant processes may be redshift dependent (Kalita et al. 2021).

Understanding of the origin of QGs is being greatly improved by surveys such as the spectroscopic survey Gemini Observations of Galaxies in Rich Early ENvironments (Balogh et al. 2021), which targets QGs in clusters around redshift z ∼ 1. Roman has the potential to identify QG populations out to much higher redshifts. In particular, the synthetic catalog presented in this work predicts that a UDF with Roman will contain ∼$10^{5}$ detectable QGs, including QGs beyond redshift z = 7. Though the exact number of QGs relies heavily on the underlying assumed SMF, Roman will likely detect the highest-redshift QGs to date, if they exist, allowing for the study of quenching mechanisms at high redshifts.

9.3. The Galaxy–Halo Connection

A Roman UDF will likely provide the first strong clustering measurements of the faint, high-redshift galaxies responsible for reionization, which will enable a measurement of the underlying dark matter mass. In our preliminary predictions, we have shown that the coverage from a UDF will constrain the 2PCF for faint galaxies out to redshifts z = 10 (at the 10% level). These measurements will likely provide the first direct measurement of halo masses at z = 10, placing constraints on the galaxy–halo connection at high redshift. Since the galaxy–halo connection depends on both galaxy formation physics and the underlying cosmological model, this will provide important tests for current models of galaxy formation.

9.4. Synergies with JWST

Roman's extensive FOV will allow for the contiguous, deep imaging of galaxies, reducing cosmic variance and probing the environment around individual galaxies. The large number of galaxies detected by a Roman UDF will reduce Poisson noise, and increase the number of rare objects that will be detected. Since a single Roman pointing is wide enough to capture several reionization bubbles at the height of reionization, a Roman UDF will allow for the study of differences between galaxies in ionized and neutral regions. For instance, there may be variations in the faint-end slope of the UVLF with environment, which can be measured with a deep, wide galaxy survey.

Before the launch of Roman, JWST will begin to address questions regarding the EoR, galaxy formation, and the galaxy–halo connection. In particular, the JWST Advanced Deep Extragalactic Survey (JADES) is a planned $236\,{\mathrm{arcmin}}^{2}$ imaging and spectroscopy survey. Though JADES will provide revolutionary data, it will detect roughly 10 times fewer objects at all redshifts than a Roman UDF. In addition, cosmic variance may dominate over Poisson noise for future high-z surveys (Trapp & Furlanetto 2020), and JADES will cover a much smaller area.

Though Roman will cover a huge area on the sky, it will be difficult to obtain photometry at wavelengths greater than 2 μm. At high redshifts, accurate measurements of stellar masses and QG identification require supplementary imaging from another observing facility. JWST will probe very far into the infrared, which will potentially allow for Balmer-break selection of galaxies, improving redshift measurements and stellar mass estimates (though JWST is not sensitive to light blueward of 1 μm). Therefore, JWST may provide valuable spectroscopic follow-up to rare detections from a Roman UDF. For example, Roman will detect many galaxies on the bright end of the UVLF. Bright galaxies (e.g., starburst galaxies, with MUV < − 22) are ideal for spectroscopic follow-up with JWST, and can be used to observe nebular lines and estimate fesc.

9.5. Applications of Synthetic Images

In addition to the simulated galaxy catalog, we provide synthetic images of a Roman UDF. These images can be used to determine the impact of source blending, line confusion, and potential problems with SED fitting (e.g., Borlaff et al. 2019; Kauffmann et al. 2020; Massara et al. 2021). Having very deep synthetic images will be useful for studying WFI systematics, processing issues (e.g., low surface brightness issues), and secondary analysis (e.g., photo z studies). Quantifying these issues will also be beneficial for many other Roman surveys, such as the High Latitude Survey. For instance, in Section 8.1, we assumed that all galaxies that are detectable will be selected. However, since a UDF will be so richly populated with structure, a fraction of the high-redshift galaxies will be obscured by foreground galaxies. We intend to use the synthetic images and catalogs in this work to quantify this effect.

9.6. Grism Predictions

In this paper we have only included photometric predictions, but the realistically modeled galaxy SEDs can also be used to generate grism predictions. In particular, the MOSFIRE Deep Evolution Field (MOSDEF) survey (Kriek et al. 2015) measured the detailed rest-frame optical emission-line SEDs of galaxies at z = 2–3. More recent work with Keck/MOSFIRE has presented spectroscopic measurements out to z ∼ 8 (Topping et al. 2021). We can combine these observations with our synthetic catalog to guide the Roman grism data reduction pipeline at high redshifts.

The synthetic catalog can also be used to study LAEs. LAEs produce a large number of ionizing photons, and both the UVLFs and the clustering of LAEs are important for characterizing the EoR. The Lyα UVLF decreases toward the early stage of the EoR, since neutral hydrogen absorbs Lyα photons (e.g., Ouchi et al. 2018). The clustering of LAEs constrains the ionized fraction and topology of reionization.

10. Summary and Conclusions

This work presents DREaM, a Deep Realistic Extragalactic Model for creating synthetic galaxy catalogs. We use this model to understand the potential power of a $1\,{\mathrm{deg}}^{2}$ UDF with Roman, and provide publicly available realistic synthetic galaxy catalogs and images. The synthetic catalogs and images will aid the community in designing and interpreting a Roman UDF. A summary of our main predictions is given below.

A $1\,{\mathrm{deg}}^{2}$ Roman UDF will:

  1. contain more than $10^{6}$ detectable galaxies, with more than $10^{4}$ during the EoR (z > 7).
  2. contain ∼$10^{5}$ detectable QGs, including a few at redshifts beyond z ∼ 7, likely detecting the highest-redshift QG to date.
  3. help constrain SMFs for SFGs to redshifts beyond z ∼ 10, and for QGs past redshift z ∼ 7.
  4. provide tight constraints (within 1%) on the faint end (MUV < −17) of the UV luminosity function, out to redshifts z ∼ 10.
  5. provide high-redshift (z > 7) constraints on the clustering of the faint galaxies thought to be responsible for reionization.
  6. look for variations in the UVLF in different environments.

Overall, Roman's wide FOV offers a unique ability to create wide, deep surveys. A Roman UDF would enable a tremendous amount of science by detecting the largest census of high-redshift galaxies to date, and differentiating between galaxy populations in low- and high-density regions during the EoR.

The authors would like to thank the anonymous referee for useful comments. This work was supported by NASA contract NNG16PJ25C. The authors acknowledge use of the lux supercomputer at UC Santa Cruz, funded by NSF MRI grant AST 1828315, and the NASA supercomputers. In addition, the authors thank Christina Williams, Yifei Luo, David O. Jones, and Kevin Hainline for useful discussion.

Software: numpy (Harris et al. 2020), matplotlib (Hunter 2007), scipy (Virtanen et al. 2020), astropy (Astropy Collaboration et al. 2013, 2018), python-fsps (Foreman-Mackey et al. 2014), Colossus (Diemer 2018), Music (Hahn & Abel 2011), Gadget-2 (Springel 2005), Rockstar (Behroozi et al. 2013a), Consistent Trees (Behroozi et al. 2013b), FSPS (Conroy et al. 2009; Conroy & Gunn 2010), CorrFunc (Sinha & Garrison 2020).

Appendix A: Data Release

We have made a number of our data products public, as summarized in Table 5. We release a main galaxy catalog, a catalog containing intrinsic galaxy properties, a halo catalog, and the synthetic image in five filters (Z087, Y106, J129, H158, and F184). The galaxy properties that are included in the catalogs are summarized in Table 6. These data products, and an interactive online visualization of the synthetic images are available at https://www.nicoledrakos.com/dream. Full galaxy spectra can be generated using the FSPS intrinsic galaxy properties, or provided upon request.

Table 5. Data Products

Product | Filename | Description
Main Catalog | DREaM_main.fits | Contains galaxy masses, positions, morphologies, rest-frame properties, and photometry in Roman and JWST filters. See Table 6 for more information.
Intrinsic Properties | DREaM_intrinsic.fits | Contains the FSPS parameters used to generate galaxy SEDs. See Table 6 for more information.
Halo Properties | DREaM_halos.fits | Contains the host halo properties, such as the mass, shape, size, spin, and peculiar velocity of the host halos. See Table 6 for more information.
Images | DREaM_FXXX.fits | Synthetic images of the galaxy catalog in five Roman bands (FXXX = Z087, Y106, J129, H158, and F184), as described in Section 8.4.

Note. Available online at https://www.nicoledrakos.com/dream.


Table 6. Catalog Content

Variables | Description
ID | Galaxy ID. The same across all of the catalogs
RA, Dec | RA and Dec [degrees, −0.5 to 0.5]
redshift | Galaxy redshift
M_halo | Halo mass [$M_{\odot}/h$]
M_gal | Galaxy mass [$M_{\odot}$]
logpsi | Star formation rate [${\mathrm{log}}_{10}(\psi /({M}_{\odot }/\mathrm{yr}))$]
M_UV | Rest-frame UV magnitude
R_eff | Half-light radius in the semimajor axis [kpc, physical]
n_s | Sérsic index
q | Projected axis ratio: semiminor to semimajor half-light size
PA | Position angle [radians, 0 to 2π]
beta | Rest-frame UV continuum slope
U, V, J | Rest-frame magnitude in the U, V, and J bands
R062, Z087, Y106, J129, H158, F184, F213 | Apparent AB magnitude in Roman filters
F070W, F090W, F115W, F150W, F200W, F277W, F356W, F444W | Apparent AB magnitude in JWST filters
SF | Star-forming (True) or Quiescent (False)
t_start | Age of universe when galaxy started forming [Gyr]
tau | e-folding time for star formation [Gyr]
logZ | Metallicity parameter [${\mathrm{log}}_{10}(Z/{Z}_{\odot })$]
dust | Dust attenuation parameter
logUS | Gas ionization parameter [${\mathrm{log}}_{10}({U}_{s})$]
V_max | Halo maximum circular velocity [km s$^{-1}$, physical]
R_s | Halo scale radius [kpc/h, co-moving]
R_vir | Halo virial radius [kpc/h, co-moving]
M_peak | Halo peak mass over accretion history [$M_{\odot}/h$]
V_peak | Halo peak ${V}_{\max }$ over accretion history [km s$^{-1}$, physical]
b_a | Halo axis ratio, b/a
c_a | Halo axis ratio, c/a
haloID | ID of host halo
hostID | ID of least-massive host halo (−1 if distinct halo)
spin_B | Bullock et al. (2001) halo spin parameter
spin_P | Peebles (1971) halo spin parameter
x, y, z | Halo positions [Mpc/h, co-moving]
vx, vy, vz | Halo peculiar velocities [km s$^{-1}$, physical]

Note. The galaxy information is contained in either the main catalog (MC), internal properties catalog (IPC), or halo properties catalog (HPC).
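As an illustration of how the columns in Table 6 might be used, the sketch below counts bright EoR galaxies and splits them into SFGs and QGs. It operates on any mapping from column name to array, and the function name and default thresholds are choices made for this example. With astropy installed, a table could plausibly be loaded with `Table.read("DREaM_main.fits")`, assuming the file holds a single binary table with the column names listed above and that the SF column reads as a boolean.

```python
import numpy as np


def eor_summary(catalog, z_min=7.0, muv_max=-17.0):
    """Count galaxies with redshift > z_min and M_UV < muv_max, split by the SF flag.

    `catalog` is any mapping from column name to array-like
    (for example a dict of arrays or an astropy Table).
    """
    z = np.asarray(catalog["redshift"], dtype=float)
    m_uv = np.asarray(catalog["M_UV"], dtype=float)
    is_sf = np.asarray(catalog["SF"], dtype=bool)
    selected = (z > z_min) & (m_uv < muv_max)
    return {
        "n_selected": int(selected.sum()),
        "n_sfg": int((selected & is_sf).sum()),
        "n_qg": int((selected & ~is_sf).sum()),
    }
```

Note that this selects on intrinsic catalog properties only; applying the detectability criterion of Section 8.1 would additionally require the apparent magnitudes in the Roman filters.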


Appendix B: Synthetic Galaxy Properties

Our main goal was to study the ability of a Roman UDF to constrain the photoionizing contribution of high-redshift galaxies, and the environments around these galaxies. Therefore, we carefully constructed the DREaM galaxy catalogs to have realistic clustering properties and UV properties. In addition to these properties, our synthetic galaxies also capture a number of other observational trends.

In this section we examine some of the properties of the full galaxy catalog. We begin by looking at how the UVJ colors (Section B.1), SFRs (Section B.2), and ages (Section B.3) differ between SFGs and QGs. Additionally, we verify that we reproduce the well-known fundamental metallicity relation (Section B.4).

B.1. UVJ Colors

As discussed in Section 6.2.2, UVJ diagrams differentiate between star-forming and QG populations. We display the U−V and V−J colors for our two galaxy populations in Figure 24, along with the UVJ selection box from Williams et al. (2009). The two populations occupy distinct regions in this parameter space, demonstrating that the synthetic galaxy catalog captures the bimodal population. Observational constraints for QGs do not currently exist past redshifts z ≈ 4, but the UVJ distribution of the synthetic galaxies at low redshifts agrees with observations (e.g., Schreiber et al. 2015).
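A UVJ selection of this kind reduces to a simple color cut. The snippet below is a minimal sketch, not the code used to build the catalog: the function name is hypothetical, the thresholds are the Williams et al. (2009) values as commonly quoted (they should be checked against that paper before use), and reusing the 1 < z < 2 offset at all higher redshifts is an assumption made here for illustration.

```python
def is_uvj_quiescent(u_minus_v, v_minus_j, z):
    """Return True if rest-frame colors fall inside the quiescent UVJ box.

    Thresholds follow the commonly quoted Williams et al. (2009) cuts
    (assumed here, not taken from the DREaM paper).
    """
    # Redshift-dependent offset of the diagonal cut.
    if z < 0.5:
        offset = 0.69
    elif z < 1.0:
        offset = 0.59
    else:
        # Assumption: the 1 < z < 2 value is reused at all higher redshifts.
        offset = 0.49

    red_enough = u_minus_v > 1.3                         # horizontal cut
    not_dusty = v_minus_j < 1.6                          # vertical cut
    above_diagonal = u_minus_v > 0.88 * v_minus_j + offset
    return red_enough and not_dusty and above_diagonal


# Usage with the catalog's rest-frame U, V, J magnitudes (made-up values).
U, V, J = -18.2, -20.0, -21.0
print(is_uvj_quiescent(U - V, V - J, z=0.3))
```

With the catalog columns U, V, and J, the two colors are simply U − V and V − J, so the cut can be compared directly with the SF flag.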

Figure 24. UVJ diagram of synthetic SFGs (blue) and QGs (orange). We used the selection box from Williams et al. (2009; black lines). Our synthetic galaxies demonstrate a clear bimodal distribution, with QGs falling in the top-left corner of the UVJ diagram.

B.2. Star Formation Rate

We do not explicitly assign SFRs to the galaxies in the synthetic catalog, but we reproduce realistic trends in SFRs. By assigning realistic ${M}_{\mathrm{UV}}$ values to the SFGs, and constraining QGs to the appropriate region of the UVJ color diagram, we model a bimodal population in which SFGs have higher SFRs than QGs. Additionally, we reproduce the observed cosmic star formation rate density (CSFRD), as demonstrated in Section 7.2.

Figure 25 shows the SFR–mass relation of the synthetic catalog compared to the relations from Schreiber et al. (2017). We closely match the Schreiber et al. (2017) SFR–mass relation for SFGs. The SFRs for QGs are lower than the Schreiber et al. (2017) relation. However, the SFR–mass relation from Schreiber et al. (2017) assumed that all IR emission from QGs originates from residual star formation. Alternative explanations are that the IR emission originates from AGN torus emission or dust heating, or that some SFGs were incorrectly classified as QGs. The QG SFRs therefore remain consistent with current observational constraints.

Figure 25. The average SFR vs. galaxy mass for synthetic SFGs (top) and QGs (bottom). The SFRs increase with mass, and the SFGs have higher SFRs than QGs, as expected. For comparison, we plot the relation from Schreiber et al. (2017; dotted lines). The galaxy catalog agrees closely with the Schreiber et al. (2017) parameterization for the SFGs. The QGs have lower SFRs than the Schreiber et al. (2017) parameterization, but are still consistent with observations (see discussion in text).

B.3. Age

Galaxy ages are a measure of SFHs, and the SFHs of individual galaxies give a direct measurement of the evolution of different galaxy populations. Trends in galaxy ages have been well established; for instance, QGs are older than SFGs, and low-redshift galaxies are older than high-redshift galaxies (e.g., Webb et al. 2020). However, ages of individual galaxies are difficult to measure due to degeneracies with other parameters (metallicity, dust), sensitivity to priors in fitting (e.g., Leja et al. 2019), and the similarity between SEDs of galaxies older than ∼5 Gyr (e.g., Gallazzi et al. 2005). Our synthetic galaxy catalogs provide age estimates and realistic SEDs with which to examine this further.

We calculate the mass-weighted age of the synthetic galaxies as

Equation (B1)

For the delayed-tau model,

Equation (B2)

where ${t}_{\mathrm{tot}}={t}_{\mathrm{age}}-{t}_{\mathrm{start}}$ is the total time of star formation.
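Because the equations themselves are not reproduced in this text, the following is only an illustrative sketch of how a mass-weighted age can be evaluated for a delayed-tau SFH. It assumes the standard definition (the SFR-weighted mean lookback time to formation), an SFH of the form ψ(t′) ∝ t′ exp(−t′/τ) with t′ measured from t_start, and no correction for stellar mass loss. The function names are hypothetical, and the closed form may differ in detail from Equation (B2).

```python
import math


def mass_weighted_age_delayed_tau(t_tot, tau):
    """Mass-weighted age for psi(t') ∝ t' * exp(-t'/tau) on 0 <= t' <= t_tot.

    Assumes the SFR-weighted mean formation time and ignores stellar mass
    loss. Inputs and output share the same time unit (e.g., Gyr).
    """
    if t_tot <= 0 or tau <= 0:
        raise ValueError("t_tot and tau must be positive")
    x = t_tot / tau
    decay = math.exp(-x)
    # Integral of t' exp(-t'/tau) dt' from 0 to t_tot, in units of tau**2.
    mass_formed = 1.0 - decay * (1.0 + x)
    # Integral of t'**2 exp(-t'/tau) dt' from 0 to t_tot, in units of tau**3.
    first_moment = 2.0 - decay * (2.0 + 2.0 * x + x * x)
    mean_formation_time = tau * first_moment / mass_formed
    return t_tot - mean_formation_time


def mass_weighted_age_numeric(t_tot, tau, n=20000):
    """Midpoint-rule evaluation of the same quantity, used as a cross-check."""
    dt = t_tot / n
    weighted = 0.0
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        psi = t * math.exp(-t / tau)
        weighted += (t_tot - t) * psi
        total += psi
    return weighted / total


# Example: t_tot = t_age - t_start = 5 Gyr, tau = 1 Gyr (made-up values).
print(mass_weighted_age_delayed_tau(5.0, 1.0))
print(mass_weighted_age_numeric(5.0, 1.0))
```

Two limits give a quick sanity check under these assumptions: for τ much longer than t_tot the SFR rises linearly and the age tends to t_tot/3, while for τ much shorter than t_tot the age tends to t_tot − 2τ.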

Figure 26 shows the average galaxy ages as a function of redshift for both SFGs (blue) and QGs (red). QGs are older than SFGs at all redshifts, and age decreases with increasing redshift, as expected. At low redshifts, QGs are ≈ 3 Gyr older than SFGs.

Figure 26. Average mass-weighted age of synthetic galaxies for SFGs (blue points) and QGs (red points), as defined in Equation (B1). QGs are older than SFGs, and ages decrease with increasing redshift. Galaxy ages are less than the age of the universe (dashed black line).

B.4. Metallicity

Galaxy metallicity can greatly affect quantities such as galaxy color, and therefore must be accurately modeled. We assigned metallicities from the FMR (Equation (19)) to the parent catalogs used in the SED pipeline. Figure 27 shows the FMR for the synthetic galaxies. For comparison, we show the FMR used to generate the parent catalogs (dotted lines), where we have used the assumption $12+{\mathrm{log}}_{10}({\rm{O}}/{\rm{H}})\approx {\mathrm{log}}_{10}({Z}_{\mathrm{met}}/{Z}_{\mathrm{sol}})$.

Figure 27. Mass–metallicity relation of the synthetic galaxies (solid lines). The shaded regions are one standard deviation. Dotted lines show the FMR from Williams et al. (2018), which was used to generate the parent catalog SEDs. The synthetic galaxies have increased metallicity with mass and SFR, in agreement with the underlying FMR.

Our synthetic galaxies follow the expected FMR, where metallicity increases with mass, with a turnover at high masses, as seen in observations (see discussion in Section 6.3). The low-mass galaxies have lower metallicities than the high-mass galaxies as expected, but do not vary much with SFR. However, at these low masses, very few galaxies will have SFRs ψ > −1.

Footnotes

  • 11  

    Though there has been success with the Santa Cruz SAM (Somerville & Davé 2015) in making predictions for upcoming JWST observations (Yung et al. 2019a, 2019b, 2020a, 2020b).

  • 12  

    Though faint high-redshift galaxies can also be detected through galaxy lensing, we expect this number to be very low for low-mass galaxies. For example, Kikuchihara et al. (2020) used gravitational lensing techniques to detect very faint dropout galaxies in the HFFs to redshifts z ∼ 6, and detected galaxies with ${M}_{\mathrm{gal}}\gt {10}^{6}\,{M}_{\odot }$.

  • 13  

    Recentering the synthetic catalog introduces a slight inhomogeneity. For a 1 deg$^2$ survey, this difference is on the order of $10^{-5}$.

  • 14  

    However, Miller et al. (2019) recently showed that differences between size trends for SFGs and QGs may disappear when using ${r}_{80}$ (the radius enclosing 80% of stellar light) rather than ${R}_{e}$.

  • 15  

    ${R}_{\mathrm{eff}}$ is related to the commonly used circularized half-mass radius, ${R}_{\mathrm{eff},\mathrm{circ}}$, by ${R}_{\mathrm{eff},\mathrm{circ}}=\sqrt{b/a}{R}_{\mathrm{eff}}$, where b/a is the minor-to-major axis ratio.

  • 16  

    k-d trees are k-dimensional data structures arranged as a binary tree. This structure allows for rapid searches to find the nearest neighbors of a given point in the k-dimensional space.

  • 17  

    The F213 filter was recently added; the science justifications for this filter are outlined in Stauffer et al. (2018). The higher background in F213 will result in brighter flux limits compared to the bluer Roman bands.

  • 18  

    The equivalent limit in ${M}_{\mathrm{UV}}$ is ${M}_{\mathrm{UV}}$ ∼ −14.5 at z < 1 (Cucciati et al. 2012), ${M}_{\mathrm{UV}}$ ∼ −15.5 at 1 < z < 2 (Cucciati et al. 2012), ${M}_{\mathrm{UV}}$ ∼ −16.89 at 2 < z < 2.7 (Reddy & Steidel 2009), and ${M}_{\mathrm{UV}}$ ∼ −17 at z > 2.7 (Bouwens et al. 2015, 2016; Finkelstein et al. 2015; Oesch et al. 2018).

  • 19  

    Other JWST programs, such as Public Release IMaging for Extragalactic Research and the COSMOS-Webb survey, will cover larger areas than JADES and probe the EoR, but neither will go as deep or as wide as the 1 deg$^2$ Roman UDF.
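The k-d tree described in footnote 16 can be illustrated with a short, self-contained sketch. This is a toy pure-Python version written for clarity, with hypothetical function names; it is not the implementation used for the catalog, and in practice a library routine such as `scipy.spatial.cKDTree` would normally be preferred.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, left_subtree, right_subtree)."""
    if not points:
        return None
    axis = depth % len(points[0])              # cycle through the k axes
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                    # median point splits the set
    return (
        ordered[mid],
        build_kdtree(ordered[:mid], depth + 1),
        build_kdtree(ordered[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    if best is None or sq_dist(point, target) < sq_dist(best, target):
        best = point
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    # Search the side of the splitting plane that contains the target first.
    best = nearest(near, target, depth + 1, best)
    # Visit the other side only if the plane is closer than the current best.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, depth + 1, best)
    return best


# Usage: nearest neighbor of a query point in a small 3D point set.
points = [(0.1, 0.2, 0.3), (0.9, 0.8, 0.7), (0.5, 0.5, 0.5), (0.2, 0.9, 0.4)]
tree = build_kdtree(points)
print(nearest(tree, (0.45, 0.55, 0.5)))
```

The pruning step is what makes the search fast: whole subtrees are skipped whenever the splitting plane lies farther from the query point than the best match found so far.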
