Spectral Templates Optimal for Selecting Galaxies at z > 8 with the JWST

The selection of high-redshift galaxies often involves spectral energy distribution (SED) fitting to photometric data, an expectation for contamination levels, and measurement of sample completeness, all vetted through comparison to spectroscopic redshift measurements of a sub-sample. The first JWST data are now being taken over several extragalactic fields to different depths and across various areas, which will be ideal for the discovery and classification of galaxies out to distances previously uncharted. As spectroscopic redshift measurements for sources in this epoch will not be initially available to compare with the first photometric measurements of z > 8 galaxies, robust photometric redshifts are of the utmost importance. Galaxies at z > 8 are expected to have bluer rest-frame ultraviolet (UV) colors than typically used model SED templates, which could lead to catastrophic photometric redshift failures. We use a combination of BPASS and Cloudy models to create a supporting set of templates that match the predicted rest-UV colors of z > 8 simulated galaxies. We test these new templates by fitting simulated galaxies in a mock catalog (Yung et al.), which mimics the expected field depths and areas of the JWST Cosmic Evolution Early Release Science Survey (m_5σ ∼ 28.6 over ∼100 arcmin²). We use EAZY to highlight the improvements in redshift recovery with the inclusion of our new template set and suggest criteria for selecting galaxies at 8 < z < 10 with the JWST, providing an important test case for observers venturing into this new era of astronomy.

been opened up to discovery. Deep 1-5 μm JWST imaging paired with Hubble Space Telescope (HST) optical imaging allows galaxy selection via their Lyα breaks (below which intergalactic hydrogen absorbs the rest-frame ultraviolet [UV] light emitted from distant galaxies). The JWST coverage allows detected galaxies to have both multiple "dropout" bands (non-detections blue-ward of the break) and multiple detection bands (significant detections red-ward of the break), substantially improving the discovery of galaxies in the reionization epoch (z > 7). The biggest advances with JWST data lie at z > 9, where HST efforts could only see such galaxies in 1-2 filters at z ∼ 9-10, and not at all at z > 11. Unsurprisingly, within days of the data being released, several studies identified tens of candidate galaxies at z > 10 (e.g. Adams et al. 2022; Atek et al. 2022), with a few at z ≳ 12 (Finkelstein et al. 2022a; Harikane et al. 2022; Finkelstein et al. 2022b), and even z ∼ 17 (Donnan et al. 2022). The observed number density of galaxy candidates is exceeding predictions (Finkelstein et al. 2022b), and theoretical explanations are already emerging that range from dust-free stellar populations (Ferrara et al. 2022) to extremely efficient star formation (Mason et al. 2022; Mirocha & Furlanetto 2022).
The selection of high-redshift galaxies often involves spectral energy distribution (SED) fitting to photometric data, vetted through comparison to spectroscopic redshift measurements. The first JWST data are now being taken over a variety of fields, to different depths and across various areas, which will be ideal for the discovery and classification of galaxies out to distances previously unobtainable. As statistically significant spectroscopic redshift measurements for sources in this epoch will not be initially available to compare with the first photometric measurements of z > 8 galaxies, robust photometric redshifts are of the utmost importance. While photometric-redshift calculations at these high redshifts primarily measure the Lyman break, similar to color-color selection (e.g. Steidel & Hamilton 1993; Giavalisco et al. 2004; Bouwens et al. 2015; Bridge et al. 2019), SED fitting has the advantage that it simultaneously uses all available photometric information. This simplifies the selection process and results in a more inclusive sample of high-redshift candidates than color-color selection alone, as it includes objects that might fall just outside color selection windows (e.g. McLure et al. 2009; Finkelstein et al. 2010, 2015a; Bowler et al. 2012; Atek et al. 2015; Livermore et al. 2017; Bouwens et al. 2019).
To perform photometric redshift estimations many use photometric-redshift (photo-z) codes such as EAZY (Brammer et al. 2008). These codes use all available photometry and compare it to a series of SED templates, allowing non-negative linear combinations of any number of provided templates. While these templates have been optimized to best match well-studied spectroscopic redshifts, the bulk of these spectroscopic measurements are at z < 4. Thus the appropriateness of these templates for galaxies at higher redshifts, such as those of particular interest to JWST surveys, is less well known.
Galaxies at z > 8 are expected to have bluer rest-frame ultraviolet (UV) colors than typically-used model templates, which could lead to catastrophic photometric redshift failures. Finkelstein et al. (2022c) compared the native EAZY template set (github.com/gbrammer/eazy-photoz) to their sample of z = 6-8 galaxies (Finkelstein et al. 2015b), and found that these templates did not span the full color range of their comparison sample (Figure 5 in their paper). Specifically, while many of the included templates in EAZY were redder than these high-redshift galaxies, the bluest template was only as blue as their median high-redshift galaxy. It is expected that the z > 8 galaxies that will be studied in depth with JWST will have colors at least as blue as those z = 6-8 galaxies discussed by Finkelstein et al. (2022c). Due to expected young stellar populations at such early times in cosmic history, a decrease in metallicity at higher redshifts, and active star formation episodes, these high-redshift galaxies likely have increasingly bluer colors. It is imperative that we use appropriate models in our SED fits to ensure the accuracy of our photometric redshifts.
The selection of high-redshift galaxies is often made even more difficult by the high rates of contamination from lower-redshift galaxies that satisfy many of the same selection criteria. We must therefore find the best ways to reduce the contamination fraction in our candidate galaxy selection process. This is often done by using a number of spectroscopic redshift measurements to calibrate photometric redshift accuracy, which gives a measure of how disparate the actual and recovered redshifts of galaxies are on average. Unfortunately, during the first years of JWST data we will not have a significant number of spectroscopic redshifts available above z ∼ 8 with which to conduct this comparison. The time is now to build up samples of galaxies in the reionization era with upcoming galaxy legacy surveys, and to ensure accurate measurements of their redshifts. To enable these analyses, we use a catalog of simulated galaxies that are expected to be representative of those at z > 8 and perform an SED-fitting process to determine the accuracy and coverage of current SED templates. We explore how the creation of a new suite of blue galaxy templates can improve our fits to these simulated galaxies, and discuss the best criteria for selecting high-redshift galaxies with JWST.
In §2 we discuss the simulated galaxy catalog, which provides a robust sample of z = 8-10 galaxies with which to test our SED-fitting templates and methods. In §3 we address the color space that our simulated, and expected real, high-z galaxies occupy but which is not covered by existing galaxy templates in the EAZY software, and the templates we created to span this gap. We then test the improvements to our photometric redshift fits from EAZY that these new templates enable in §4. We also explore the robustness of our photometric redshift fits to these simulated galaxies when placed at depths equal to those predicted from one of the JWST Early Release Science surveys with public data access in the first months of observations: the Cosmic Evolution Early Release Science Survey (CEERS: m ∼ 28.6, 100 arcmin², PI Finkelstein, Bagley et al. 2022) in §5. We also provide some suggested criteria for selecting galaxies at z > 8 with JWST in §6, which minimize the contamination from low-redshift interlopers while maintaining a successful recovery rate of target high-z galaxies. We then present our conclusions in §7. For this paper we express all magnitudes in the AB system (Oke & Gunn 1983) unless otherwise noted. In this paper we assume the latest Planck flat ΛCDM cosmology with H_0 = 67.36 km s⁻¹ Mpc⁻¹, Ω_m = 0.3153, and Ω_Λ = 0.6847 (Planck Collaboration et al. 2020).

SIMULATING THE HIGH-REDSHIFT UNIVERSE
As the first JWST data are released and attention turns to the z > 8 Universe, we explore whether previously-used templates are blue enough to match the expected colors of z > 8 galaxies. We note that while HST did discover some galaxies at z = 9-10, most were fairly massive (e.g. Tacchella et al. 2022) and thus might not have colors indicative of the bulk of the lower-mass population which JWST will study. As data from JWST are only just starting to be taken, we explore the expected color space of these galaxies by using a simulated catalog.
In this work, we adopt three realizations of a modified version of simulated lightcones with footprints overlapping the observed EGS field as presented in Yung et al. (2022) and Somerville et al. (2021). Each of these lightcones spans 782 arcmin² with dimensions of 17 arcmin × 46 arcmin, containing galaxies in 0 ≲ z ≲ 10 and resolving galaxies down to M_* ∼ 10^7 M_⊙. The mock lightcone is constructed based on dark matter halos extracted from the Bolshoi-Planck N-body cosmological simulation (Klypin et al. 2016) using the lightcone package provided as part of the publicly available UniverseMachine code (Behroozi et al. 2019, 2020). These dark matter halos are processed with the Santa Cruz semi-analytic model (SAM) for galaxy formation (Somerville & Primack 1999; Somerville et al. 2015), with dark matter halo merger trees constructed on-the-fly using an extended Press-Schechter (EPS)-based algorithm (Somerville & Kolatt 1999). We refer the reader to Yung et al. (2022) for details regarding the construction of the simulated lightcones.
The Santa Cruz SAM tracks a wide variety of baryonic processes using prescriptions derived analytically, inferred from observations, or extracted from numerical simulations, and provides physically-backed predictions for galaxies across wide ranges of redshift and mass. This model has been shown to reproduce the observed evolution in distribution functions of rest-frame UV luminosity, stellar mass, and SFR from z ∼ 0 to the highest redshift where observational constraints are available (Somerville et al. 2015; Yung et al. 2019a,b). The model performance during the epoch of reionization has been extensively tested and shown to agree extremely well with the observed evolution in one-point distribution functions of many galaxy properties, scaling relations, IGM and CMB reionization constraints, and two-point correlation functions (Yung et al. 2019a,b, 2020a,b, 2021, 2022).
The physically-predicted properties and star-formation history (SFH) are assigned SEDs which are generated based on stellar population synthesis (SPS) models by Bruzual & Charlot (2003). In addition, we include nebular emission lines predicted by numerical models from Hirschmann et al. (2017, 2019). These models account for excitation from young stellar populations, feedback from accreting supermassive black holes, and post-AGB stars. The SEDs, with nebular emission lines included, are forward-modelled into a rest-frame UV luminosity and observed-frame JWST photometry, including ISM dust attenuation (Calzetti et al. 2000) and IGM extinction (Madau et al. 1996).
For this project we use the published CEERS Simulated Data Product V3 catalog, which includes the 0th realization of the SAM containing 1,472,791 total galaxies. The redshift distribution of the full SAM catalog is shown in Figure 1 (green) with a zoom in on the 6,578 z = 8-10 galaxies. As real observations with JWST are limited by our ability to detect objects in the images, we impose a S/N > 3 cut in F200W where the "noise" is set to the expected 1σ CEERS depth (m = 30.72; Finkelstein et al. 2017). The rest of this paper utilizes the 913,288 galaxies (3084 at z > 8) that meet this criterion (purple).
Figure 1. The redshift distribution of the full SAM catalog of 1,472,791 galaxies is shown in green with a zoom in on the 6,578 z = 8-10 galaxies from the CEERS Simulated Data Product V3 (ceers.github.io/releases; Yung et al. 2022). As real observations with JWST are limited by our ability to detect objects in the images we impose an initial S/N > 3 cut in F200W, where the noise is set to the 1σ expected CEERS depth (m_3σ = 29.5; Finkelstein et al. 2017). This paper utilizes the 913,288 galaxies (3084 at z > 8) that meet this initial criterion (purple).

Comparing Template Colors to Predicted z > 8 Galaxies
To perform our photometric redshift estimations we use the photometric-redshift (photo-z) code EAZY (Brammer et al. 2008), and the included templates. The latest EAZY template set, known as "tweak_fsps_QSF_v12_v3", is based on the Flexible Stellar Population Synthesis (FSPS) code (Conroy & Gunn 2010). This template set has further been corrected (or "tweaked") for systematic offsets observed between data and the models. Finkelstein et al. (2022c) found that the native EAZY FSPS templates were redder than their sample of observed z = 6-8 galaxies. To cover a larger color range which better represented their high-z sample, they added as an additional template the observed spectrum of the z = 2.3 galaxy BX418, which is young, low-mass, and blue (Erb et al. 2010). This galaxy's color is 0.12 mag bluer than the bluest EAZY FSPS template, and it is bluer than 85% of the known high-redshift galaxies at the time. Finkelstein et al. (2022c) add two versions of this template: one with the observed Lyα emission, and one where Lyα was removed, to account for blue galaxies whose Lyα has been absorbed by a potentially neutral IGM (e.g. Miralda-Escudé & Rees 1998; Malhotra & Rhoads 2006; Dijkstra 2014).
We first test whether the colors of the native EAZY FSPS templates, plus the single bluer Erb et al. (2010) template added by Finkelstein et al. (2022c), cover the full color space of our simulated galaxies. We redshifted these templates to z = 10 (a reasonable redshift of "first discovery" for JWST), and measured the JWST/NIRCam F200W − F277W color. We chose this color as it measures the rest-frame UV color around 2000 Å at this redshift, and it is fully red-ward of the Lyα break for z ≲ 13.5. Since we are only measuring a color between two filters we do not normalize the templates, and only multiply the wavelength by (1 + z).
The templates included with the EAZY software are in F_λ units, so we must first convert them into F_ν and then pass these templates through both the NIRCam F200W and F277W filters by interpolating the filter transmission curve onto the SED template wavelength array. We then set any values for the filter transmission that are negative after interpolation or are smaller than 0.001 to 0.0 and integrate the SED template through the filter using

$$\langle f_\nu \rangle = \frac{\int T_\nu \, f_\nu \, d\nu/\nu}{\int T_\nu \, d\nu/\nu}$$

where T_ν is the transmission curve for the filter, and f_ν is the flux of the SED template (Papovich et al. 2001). This gives the bandpass-averaged flux, ⟨f_ν⟩, in that filter band.

Table 1. SED template F200W − F277W colors used for high-redshift galaxy target selection. Each of the templates has been redshifted to z = 10 for this measurement. The "tweak FSPS" models are distributed with the EAZY software (Brammer et al. 2008). A template based on the observations of Erb et al. (2010) had been previously included by Finkelstein et al. (2022c) for high-z galaxies in order to include a bluer template that matched the colors of their z = 6-8 sample. We created BPASS and BPASS + Cloudy emission line templates to fully cover the color space of simulated high-redshift galaxies. We note that the nebular continuum emission included in the BPASS + Cloudy templates makes them redder in color than the BPASS-only models that do not include emission lines.
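The color measurement described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used for Table 1: it assumes the photon-weighted form of the bandpass average (integrating over dν/ν, which is equivalent to dλ/λ), and the function names and the top-hat stand-ins for the F200W and F277W curves are hypothetical.

```python
import numpy as np

C_ANGSTROM_PER_S = 2.99792458e18  # speed of light in Angstrom / s


def _trapz(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def bandpass_flux(wave_rest, f_lambda, z, filt_wave, filt_trans):
    """Bandpass-averaged f_nu of a template redshifted to z.

    Wavelengths are in Angstrom and ascending; fluxes are unnormalized.
    """
    wave_obs = wave_rest * (1.0 + z)                   # only the wavelength is shifted
    f_nu = f_lambda * wave_obs**2 / C_ANGSTROM_PER_S   # F_lambda -> F_nu
    # Interpolate the filter curve onto the template wavelength grid.
    trans = np.interp(wave_obs, filt_wave, filt_trans, left=0.0, right=0.0)
    trans[trans < 0.001] = 0.0                         # zero negative or negligible throughput
    # d(nu)/nu = -d(lambda)/lambda; the sign cancels in the ratio.
    num = _trapz(trans * f_nu / wave_obs, wave_obs)
    den = _trapz(trans / wave_obs, wave_obs)
    return num / den


def color(wave_rest, f_lambda, z, filt_blue, filt_red):
    """AB color (blue minus red) between two filters given as (wave, trans)."""
    f_blue = bandpass_flux(wave_rest, f_lambda, z, *filt_blue)
    f_red = bandpass_flux(wave_rest, f_lambda, z, *filt_red)
    return -2.5 * np.log10(f_blue / f_red)


# Hypothetical top-hat stand-ins for the real NIRCam curves (Angstrom).
f200w = (np.array([17400.0, 17500.0, 22500.0, 22600.0]), np.array([0.0, 1.0, 1.0, 0.0]))
f277w = (np.array([23900.0, 24000.0, 31000.0, 31100.0]), np.array([0.0, 1.0, 1.0, 0.0]))
wave = np.linspace(1000.0, 4000.0, 6001)
blue_template = wave**-3.0   # a power law bluer than flat in f_nu
print(color(wave, blue_template, 10.0, f200w, f277w))
```

A template that is flat in f_ν gives a color of zero by construction, so negative values indicate templates bluer than flat, which is the regime the new templates are meant to reach.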
We compared the color space spanned by these templates (vertical lines) to our simulated galaxies (black histogram) from Yung et al. (2022) in Figure 2. We find that the FSPS templates that are included with EAZY are all much redder than our simulated z > 8 galaxies. We also note that the template from Erb et al. (2010) that was added by Finkelstein et al. (2022c), while bluer than the FSPS templates, is still redder than a majority of our simulated high-redshift galaxies. This shows that bluer models are needed in our template set to better represent the expected colors of z > 8 galaxies and ensure accurate SED fits with JWST data.

Figure 2. [...] Erb et al. (2010) (green), but it is still redder than the majority of our simulated high-redshift galaxies. We created BPASS (purple) and BPASS + Cloudy emission line (blue) templates and note that the BPASS + Cloudy templates are redder in color than the BPASS-only models due to their nebular continuum emission. This full template set can now reproduce the colors of all high-redshift galaxies in our simulated sample.

CREATING BLUE GALAXY SED TEMPLATES
As none of the EAZY FSPS templates have colors blue enough to match our simulated high-redshift (z > 8) galaxies, we created new, bluer templates that would more accurately represent our target galaxies.

BPASS Templates
We created model SED templates using BPASS v2.2.1 (bpass.auckland.ac.nz; Eldridge et al. 2017; Stanway & Eldridge 2018) which contain low metallicities (as expected in the high-redshift Universe), young stellar populations (since not much time has passed since the Big Bang at z > 8), and which also include binary stars. We chose the templates that used the Chabrier (2003) stellar initial mass function (IMF) with a 100 M_⊙ upper mass limit, and note that when we looked at the 300 M_⊙ mass-limit IMF templates the colors at z = 10 did not change significantly. These BPASS templates do not include emission lines and all have a low metallicity, Z = 0.001 (5% Z_⊙). We created 3 templates named binc100z001age6, binc100z001age65, and binc100z001age7, which have log(age/yr) = 6, 6.5, and 7 (ages of 1, ∼3.2, and 10 Myr), respectively.
The BPASS templates have wavelengths in Å and fluxes in L_λ (in units of L_⊙), so we must first convert them into F_λ for EAZY. These BPASS models have high spectral resolution, thus we rebin them from Δλ = 1 Å to Δλ = 10 Å. We list the measured F200W − F277W colors of our new BPASS templates in Table 1 and plot them in purple in Figures 2 and 3. The addition of these templates extends the F200W − F277W colors to < −0.4, which is bluer than any of our z > 8 galaxies in the SAM. This ensures that we are fully covering the color space of our simulated galaxies, thus providing SED models which accurately match the data, resulting in more accurate photometric redshifts, as we show in §4.
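The rebinning step can be illustrated with a short sketch. It assumes a uniformly sampled 1 Å grid and simply averages each group of ten pixels, which preserves the mean flux density; the text does not state which rebinning scheme was used, so the function name and the averaging choice are assumptions.

```python
import numpy as np


def rebin_mean(wave, flux, factor=10):
    """Average consecutive groups of `factor` pixels (1 A -> 10 A for factor=10)."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    n = (len(wave) // factor) * factor       # drop any incomplete trailing bin
    wave_binned = wave[:n].reshape(-1, factor).mean(axis=1)
    flux_binned = flux[:n].reshape(-1, factor).mean(axis=1)
    return wave_binned, flux_binned


# Toy spectrum on a 1 Angstrom grid.
wave = np.arange(1.0, 101.0)
flux = 2.0 * wave
wave10, flux10 = rebin_mean(wave, flux)
print(len(wave10), wave10[0])
```

Because a color depends only on the shape of the spectrum, a constant unit conversion applied before or after this step does not change the measured F200W − F277W values.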

Cloudy Nebular Emission
While the BPASS templates do not include any emission lines in their spectra, the simulated (and real) galaxies do, thus we explore adding emission lines to these new BPASS templates. Similar studies focused upon high-redshift galaxies have found that models with higher ionization parameters (log U > −2.5) and lower metallicities (Z ≲ 0.3 Z_⊙) better reproduce the observed properties (e.g. Inoue et al. 2016; Jaskot & Ravindranath 2016; Stark et al. 2017; Hutchison et al. 2019; Topping et al. 2021). This effect has been seen with lower-redshift analogue samples as well (e.g. Sobral et al. 2018; Berg et al. 2016, 2018, 2019; Tang et al. 2019, 2021). In several instances (both in high-redshift and lower-redshift analogue sources), observations paired with photoionization modeling have suggested metallicities as low as Z ∼ 0.03-0.15 Z_⊙ (e.g. Erb et al. 2010; Stark et al. 2015a,b; Vanzella et al. 2016; Berg et al. 2018, 2021; Senchyna et al. 2021), low values which we anticipate may be increasingly common the higher in redshift, and further back in time, we probe.
Motivated by these and other studies, we model the emission line spectra using Cloudy v17.0 (Ferland et al. 2017) with an ionization parameter log U = −2 and with the gas-phase metallicity Z = 0.05 Z_⊙ (fixed to the stellar metallicity). In line with the prescription of other higher-redshift modeling (e.g. Jaskot & Ravindranath 2016; Steidel et al. 2016; Stark 2016; Stark et al. 2017; Hutchison et al. 2019), we set the hydrogen density of the gas to be 300 cm⁻³, assume a spherical geometry for the nebular gas, and set the covering factor of the gas to be 100%.

Figure 3. [...] templates that have high nebular-line EWs, as described in §3.4. As can be seen, these newly created templates are bluer than the standard set, better matching the expected colors of z > 8 galaxies. All templates are normalized to their flux density at 2.301 μm. We also include in this plot the template from Erb et al. (2010) used by Finkelstein et al. (2022c), which includes a high-EW Lyα emission line. This template was not used in our analysis as the new BPASS and Cloudy templates satisfied the same color range and our fits were not improved with its inclusion.
As the high-redshift galaxies we are specifically targeting are likely to have little to no detectable Lyα emission due to attenuation by the neutral IGM during this epoch, we removed this emission feature from the template spectra. We also note that the SAM galaxies do not include Lyα emission, thus by doing this we are choosing templates that more accurately represent our simulated data. To remove the Lyα emission feature, we cut out the array between 0.120 and 0.125 μm and interpolate a flat continuum line over that range.
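A minimal sketch of this line-removal step is given below. It interprets the "flat continuum line" as a straight line joining the continuum pixels on either side of the excised window, which is an assumption; the function name is hypothetical and the wavelength array is taken to be ascending and in microns.

```python
import numpy as np


def remove_lya(wave_um, flux, lo=0.120, hi=0.125):
    """Replace the flux inside [lo, hi] micron with a straight line drawn
    between the nearest pixels just outside the window."""
    wave_um = np.asarray(wave_um, dtype=float)
    cleaned = np.array(flux, dtype=float)          # work on a copy
    inside = (wave_um >= lo) & (wave_um <= hi)
    if inside.any():
        outside = ~inside
        # np.interp bridges the gap linearly using only the outside pixels.
        cleaned[inside] = np.interp(wave_um[inside], wave_um[outside], cleaned[outside])
    return cleaned


# Toy template: unit continuum plus a strong, narrow line at 0.1216 micron.
wave_um = np.linspace(0.10, 0.15, 501)
flux = 1.0 + 50.0 * np.exp(-0.5 * ((wave_um - 0.1216) / 0.0003) ** 2)
cleaned = remove_lya(wave_um, flux)
print(flux.max(), cleaned.max())
```

Scaling the excised line by a constant before adding it back would give reduced-strength variants, although that step is not shown here.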
We note that Cloudy creates both nebular line and nebular continuum emission; the latter results in a moderate reddening of the continuum slope. The BPASS+Cloudy models are redder in the F200W − F277W color than the BPASS-only models due to this effect. Our 3 additional BPASS+Cloudy templates are named binc100z001age6_cloudy, binc100z001age65_cloudy, and binc100z001age7_cloudy. We list the measured F200W − F277W colors of these new templates in Table 1 and plot them in blue in both Figures 2 and 3.

New Suite of Blue SED Templates for Use with EAZY
To ensure that we are covering the color space of our high-redshift galaxies we compare the F200W − F277W color distribution of the z > 8 galaxies in the SAM to the colors of our full template set in Figure 2. The distribution of SAM F200W − F277W galaxy colors is shown by the black histogram in Figure 2, while the color for each SED template is plotted as a vertical line. With the addition of our six new templates (three with and three without Cloudy nebular emission) we now have a set of SED templates that represent the full range of rest-UV colors of our simulated z > 8 galaxies. We report the colors for each template in Table 1. The full set of templates redshifted to z = 10, inclusive of our new BPASS and BPASS+Cloudy templates, is plotted in Figure 3. The plot is normalized to the flux at 2.3 μm as this is between the F200W and F277W filters and shows the slope (color) between them visually. We note that by including templates that have the Cloudy parameters detailed above, we are assuming an escape fraction f_esc = 0. This may not be true of our high-redshift (z > 8) galaxies, but since we include both the BPASS templates without nebular emission and the BPASS+Cloudy templates with it, EAZY's linear combination of templates can generate a composite SED for any level of escape fraction.
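The escape-fraction argument can be made concrete with a small sketch. It assumes the two templates share a wavelength grid and the same underlying stellar normalization, and that the nebular emission scales as (1 − f_esc); under those assumptions an intermediate escape fraction is a weighted sum of the two templates with non-negative weights. The function name is hypothetical and this is not EAZY's fitting code.

```python
import numpy as np


def composite_sed(stellar_only, stellar_plus_nebular, f_esc):
    """Composite SED for escape fraction f_esc from a stellar-only template and
    a stellar+nebular (f_esc = 0) template on a shared wavelength grid."""
    if not 0.0 <= f_esc <= 1.0:
        raise ValueError("f_esc must lie between 0 and 1")
    stellar_only = np.asarray(stellar_only, dtype=float)
    stellar_plus_nebular = np.asarray(stellar_plus_nebular, dtype=float)
    # stellar + (1 - f_esc) * nebular, rewritten as a two-template mixture.
    return f_esc * stellar_only + (1.0 - f_esc) * stellar_plus_nebular


# Toy arrays standing in for a BPASS and a BPASS+Cloudy template.
bpass = np.array([1.0, 1.0, 1.0])
bpass_cloudy = np.array([1.0, 3.0, 1.2])
print(composite_sed(bpass, bpass_cloudy, 0.5))
```

Both weights are non-negative for any f_esc between 0 and 1, so such a mixture is within reach of a fit that combines templates with non-negative coefficients.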
For this project we used the new template set described above where Lyα has been removed from the spectra, as the IGM attenuation at high redshifts (z > 8) impacts its transmission and it is not included in the SAM galaxies. We make these new SED templates public for the community to utilize and provide sets of them without Lyα (for high-redshift galaxies), with reduced Lyα emission (either 1/3 or 1/10 of that produced by Cloudy), and with full Lyα strength. The templates, corresponding EAZY parameter files, and descriptions can be found at ceers.github.io/LarsonSEDTemplates.

IMPROVEMENTS TO PHOTOMETRIC REDSHIFT FITS WITH THESE NEW TEMPLATES
With the new set of SED templates which spans the full color range of our simulated galaxies we tested if the addition of these templates improves our photometric redshift fits. After making the 3σ cut in F200W on the SAM as described in §2 we have 913,288 galaxies ranging from z = 0-10 that we run through EAZY, using the 7 NIRCam filters from the CEERS Survey (F115W, F150W, F200W, F277W, F356W, F410M, and F444W) plus 4 HST filters from CANDELS (ACS F606W, ACS F814W, WFC3 F125W, and WFC3 F160W) to fit photometric redshifts. We allowed the redshift to span 0.1 < z < 15, in steps of 0.01, and adopted a flat luminosity prior as we are just beginning to explore galaxies at early times. For our reported recovered redshift we use the output redshift value from EAZY.

Testing Redshift Recovery with Updated Template Set
To determine if our additional, bluer templates improve the redshift fits we compared the recovered redshift from EAZY to the input redshift from the SAM with and without the inclusion of our additional templates, setting the flux uncertainties as equivalent to a 5σ depth of m = 30 in each filter (i.e., a 1σ noise of 0.73 nJy), and without perturbing the SAM fluxes for this test. We did two runs of EAZY on the full catalog, first using only the original FSPS templates, and then again using the full set of SED templates (FSPS, BPASS, and BPASS+Cloudy). Figure 4 (left) shows the recovered redshifts from EAZY compared to the input redshifts from the SAM for both the run using the EAZY FSPS templates (red) and after including our new templates (blue). There is significant improvement in recovering the correct redshifts with the new templates, as with the old templates the fits chose z = 0-2 solutions for many of the z > 2 galaxies. We also show the difference between recovered and input redshift (Δz) for all of our photometric redshift fits as a function of input redshift in Figure 4 (middle), where the accuracy of our recovered redshifts using the new templates (blue) is much higher than with the original EAZY templates alone (red), especially at the high redshifts of interest (z > 8).
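The quoted noise level follows directly from the AB magnitude definition, as the short check below illustrates. The helper names are hypothetical; the only input is the standard AB zero point, for which 1 nJy corresponds to m_AB = 31.4.

```python
import math


def ab_mag_to_njy(mag):
    """AB magnitude -> flux density in nJy (1 nJy corresponds to m_AB = 31.4)."""
    return 10.0 ** (-0.4 * (mag - 31.4))


def one_sigma_noise_njy(depth_mag, n_sigma=5.0):
    """1-sigma flux noise implied by an n-sigma limiting magnitude."""
    return ab_mag_to_njy(depth_mag) / n_sigma


def limiting_mag(one_sigma_mag, n_sigma):
    """n-sigma limiting magnitude implied by a 1-sigma depth in magnitudes."""
    return one_sigma_mag - 2.5 * math.log10(n_sigma)


print(round(one_sigma_noise_njy(30.0), 2))   # 0.73
print(round(limiting_mag(30.72, 3.0), 1))    # 29.5
```

The second line reproduces the relation between the 1σ depth of m = 30.72 and the 3σ limit of 29.5 quoted for the F200W detection cut in §2.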
To measure how well we recover the input redshifts with EAZY we calculate the median and standard deviation of Δz/(1+z) in redshift bins of Δz = 0.2 and magnitude bins of Δm = 0.5, where we use a Normalized Median Absolute Deviation (NMAD) for the standard deviation calculations as it is more outlier-resistant. We also calculate the fraction of outliers in our fits, binned in magnitude, where we define outliers as those with Δz > 2. For each of these parameters we note that the faint galaxies (m > 27.5, purple lines) are the sources with the least accurate recovered redshifts, which is not unexpected as constraints on their colors are the poorest.

NMAD = median(|data − median(data)|) / sigma
Here sigma is √2 times the inverse error function of 0.5, or 0.67449 (assuming a Gaussian error distribution), so that the NMAD matches the standard deviation for Gaussian-distributed data. We show the median (solid line) and NMAD (dashed line) of our photometric redshift fits both with (blue) and without (red) the new templates in Figure 4. We also calculate an outlier fraction where we define outliers as those where Δz > 0.2 × z (dotted line). This highlights the improvement generated when including our new templates over the original EAZY template set. Across the redshift range of our SAM galaxies (z = 0-10), our fits using the new templates do a significantly better job at accurately recovering the redshift of our sources than when using only the original EAZY templates: the median Δz, NMAD, and outlier fraction are all lower.
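The three statistics can be computed as in the sketch below. It assumes that sigma divides the median absolute deviation, so that the NMAD equals the standard deviation for Gaussian scatter, and that the outlier test is applied to the absolute value of Δz; the function names are hypothetical.

```python
import numpy as np

SIGMA = 0.67449  # sqrt(2) * erfinv(0.5)


def nmad(data):
    """Normalized median absolute deviation of an array."""
    data = np.asarray(data, dtype=float)
    return np.median(np.abs(data - np.median(data))) / SIGMA


def redshift_stats(z_in, z_out, outlier_factor=0.2):
    """Median and NMAD of dz/(1+z), and the fraction with |dz| > factor * z."""
    z_in = np.asarray(z_in, dtype=float)
    z_out = np.asarray(z_out, dtype=float)
    dz = z_out - z_in
    scaled = dz / (1.0 + z_in)
    outliers = np.abs(dz) > outlier_factor * z_in
    return np.median(scaled), nmad(scaled), outliers.mean()


# Toy catalog: three perfect recoveries and one catastrophic failure.
med, scatter, f_out = redshift_stats([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0])
print(med, scatter, f_out)
```

In this toy case the single failure sets the outlier fraction to 0.25 while leaving the median and NMAD at zero, which is the outlier resistance that motivates using the NMAD rather than a standard deviation.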

CREATING A SIMULATED CATALOG AT CEERS DEPTHS
There are many upcoming surveys with JWST that will be searching for distant galaxies, one of which is the Cosmic Evolution Early Release Science (CEERS) Survey (Bagley et al. 2022), which covers 96.8 arcmin² to an expected 5σ depth of m ∼ 28.6 (Finkelstein et al. 2017). CEERS has published simulated catalogs of the field and created mock observations using the lightcones from Yung et al. (2022). Table 2 shows the expected 5σ depths in each of the CEERS filters (ceers.github.io/obs), and includes the current HST ACS and WFC3 depths over the same area from the CANDELS survey (Grogin et al. 2011) as measured by Finkelstein et al. (2022c). For all of the following tests we run EAZY using 4 HST filters (F606W, F814W, F125W, and F160W) and 7 NIRCam filters (F115W, F150W, F200W, F277W, F356W, F410M, and F444W). We use the 18-template set that includes the 12 tweak FSPS models, the 3 new BPASS templates described in §3.3, and the 3 new BPASS + Cloudy templates described in §3.4. For the error in each filter we use the expected 1σ CEERS depth (Finkelstein et al. 2017) for every galaxy and perturb the input fluxes of the sources to mimic expected errors in real data as described below.

Perturbing Fluxes by Realistic Errors
To best recreate realistic values for our simulated mock galaxies, we "observe" their simulated fluxes by randomly perturbing them by an amount proportional to the expected flux errors. We do this via the method described in Bagley et al. (2022), where the noise is modeled with a Voigt profile distribution (a Gaussian core with Lorentzian wings). We use the 1σ depth in each filter (Table 2) as the Gaussian σ for our perturbations. The following results use these perturbed fluxes for each of the simulated galaxies, with the 1σ depth as the error for each filter.
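One simple way to draw Voigt-distributed noise is sketched below. It relies on the fact that a Voigt profile is the convolution of a Gaussian and a Lorentzian, so the sum of an independent Gaussian draw and a Cauchy draw follows a Voigt distribution. The width of the Lorentzian wings is not given in the text, so the `gamma_frac` parameter and its default are purely illustrative assumptions, as is the function name; this is not the Bagley et al. (2022) implementation.

```python
import numpy as np


def perturb_fluxes(flux, sigma, gamma_frac=0.1, seed=None):
    """Add Voigt-distributed noise to fluxes.

    sigma      : Gaussian 1-sigma error (same units as flux), e.g. the 1-sigma depth
    gamma_frac : hypothetical Lorentzian half-width, as a fraction of sigma
    """
    rng = np.random.default_rng(seed)
    flux = np.asarray(flux, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), flux.shape)
    gauss = rng.normal(0.0, 1.0, size=flux.shape) * sigma                 # Gaussian core
    lorentz = rng.standard_cauchy(size=flux.shape) * gamma_frac * sigma   # Lorentzian wings
    return flux + gauss + lorentz


# Perturb three toy fluxes (nJy) using an illustrative 1-sigma error of 2 nJy.
observed = perturb_fluxes([10.0, 20.0, 30.0], sigma=2.0, seed=42)
print(observed)
```

Setting `gamma_frac=0` recovers purely Gaussian noise, which gives a convenient sanity check on the scaling.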


Measuring Accuracy of Photometric Redshifts with a Simulated JWST Catalog of Galaxies
We run EAZY in the same manner as described above (with both the HST and JWST filters covering the CEERS field) on the 913,288 galaxies in the SAM using the perturbed fluxes and the CEERS 1σ depth as the errors. For these runs we use the full template set that includes the original FSPS templates and our set of six new bluer ones. The goal is to determine how well we are able to recover the redshifts of our sources given our best approximation of true observing conditions.
To better quantify the accuracy of our redshift fits we show the median (left), NMAD (center), and outlier fraction (right) for our recovered redshifts from EAZY at the CEERS depths (Figure 5). Here we define outliers as those with Δz > 2, binned in magnitude. We separate our measurements into magnitude bins as our ability to accurately fit and recover redshifts for real galaxies is magnitude dependent. For our magnitude distribution we use the perturbed F200W magnitudes. This figure highlights that the recovered photometric redshifts are accurate across z = 0-10 for sources with m < 27.5 (≈10σ detections). The accuracy progressively worsens for fainter galaxies, which is not unexpected as they are the hardest to detect and measure accurately. However, as highlighted in the bottom row, even faint galaxies are fairly recoverable at z > 9 as the Lyman break passes out of the deep CEERS F115W band, providing another dropout detection.

Table 2. [...] (Grogin et al. 2011) and expected CEERS (Finkelstein et al. 2017) JWST/NIRCam 5σ depths and the 1σ errors used to perturb the SAM galaxies, shown in both magnitude and flux.

SELECTION CRITERIA FOR Z>8 GALAXIES WITH JWST
Through rigorous testing and analysis we detail below the selection criteria that best identify robust high-redshift (z > 8) galaxy samples in our simulated JWST catalog, while minimizing the level of low-redshift (z < 5) interlopers in our sample. As above, we use EAZY to calculate redshift probability distributions, P(z), and we focus on z > 8 as this epoch is made more accessible with JWST's infrared wavelength coverage. At this redshift, galaxies drop out of the ACS F814W band due to their Lyman break at a rest-frame wavelength of 1200 Å, leaving only two HST-band detections. The CEERS JWST filters reach further into the infrared, providing a wider range of photometric coverage for these galaxies and improving our ability to detect and characterize them, though we note that the HST F814W data are crucial to probe the Lyα break at z ≲ 9.

S/N F200W & S/N F277W > 5:
The first cut that we make on our catalog is in S/N in two of our filters, F200W and F277W. Requiring a significant detection in both bands mimics the detection bands in actual photometry: both filters are red-ward of the Lyα break in galaxies at these redshifts, so such galaxies should be detected by JWST at these wavelengths. We tested different S/N cuts in these bands, individually and combined, and find that a threshold of 5 in both removes a fair number of low-redshift (z < 5) sources from the sample while not reducing the number of actual z > 8 galaxies we recover.

∫ P(z > 7) > 0.85:
The second cut requires > 85% of the redshift P(z) to reside at z > 7 (integrated out to the maximum redshift we considered with EAZY, z = 15), allowing only < 15% to be present in a low-redshift solution. Making this cut removes any flat P(z), where the redshift is not well constrained by the SED fits, and any that have significant peaks at low redshift, creating a robust sample of galaxies that are expected to be at high redshift. We also tried cuts at z > 8, and ones that required higher (e.g., > 90%) or lower (e.g., > 75%) percentages of the P(z) above the redshift cut, but our adopted criterion maximized our recovery rate of high-redshift galaxies while minimizing the contamination by low-redshift (z < 5) galaxies.
χ² < 15:
The third cut requires that EAZY has found a good fit to the data, rejecting objects where even the best-fitting solution does not match the observed photometry. We also tested maximum allowed χ² values ranging from 15 to 35, but 15 was the best threshold we found for maximizing our recovery and minimizing our contamination by low-redshift galaxies. In the left-most panel of Figure 6 we show the χ² distribution of our sample: after our first two cuts the sample contains 4902 high-redshift candidates, of which 3303 are contaminants (red, z < 5) and 984 are actual z > 8 galaxies (blue). The χ² cut removes 576 sources in total, 490 of them low-redshift contaminants and only 46 of them high-redshift (z > 8) galaxies, leaving a sample of 4326 candidate galaxies.

Figure 6. Plots illustrating the details of the selection criteria cuts we made to select for high-redshift galaxies and the impact each cut has on our final sample. In each panel we plot the real z > 8 sources in our sample in blue and the contaminating z < 5 galaxies in red. Prior to the cuts shown here we required S/N > 5 in both F200W and F277W and ∫ P(z > 7) > 0.85 (see §6), which left us with a sample of 4902 potential high-redshift (z > 8) galaxies. We then make a cut in χ² of the EAZY fit (Left), which removes 576 sources, 490 of which were contaminating low-redshift galaxies. In §6.1 we discuss the need for additional selection criteria to remove contaminating low-redshift interlopers; the two color cuts F150W − F444W (Center) and F150W − F200W (Right) best distinguished between the two sets of galaxies remaining in our sample. After requiring F150W − F444W < 0.3 and F150W − F200W < 0.2 we remove an additional 1410 sources, 1394 of which are contaminants, while only losing 16 real z > 8 galaxies from the sample. We show the remaining 2670 sources in our sample and their distribution in true redshift and F200W magnitude in Figure 7, showing that a majority of the contaminants are at m > 28.
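The first three criteria can be sketched in a few lines. This is a minimal illustration under stated assumptions: the P(z) is supplied on a redshift grid, the integral is a simple trapezoid sum over grid points at z ≥ 7 normalized by the total area, and all function and argument names are hypothetical rather than part of EAZY's interface.

```python
import math

def trapezoid(x, y):
    """Trapezoid-rule integral of y over x (returns 0 for fewer than two points)."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def integrated_pz(z_grid, pz, z_min=7.0):
    """Fraction of the P(z) area lying at z >= z_min (grid points only)."""
    total = trapezoid(z_grid, pz)
    zs = [z for z in z_grid if z >= z_min]
    ps = [p for z, p in zip(z_grid, pz) if z >= z_min]
    return trapezoid(zs, ps) / total

def passes_first_three_cuts(snr_f200w, snr_f277w, z_grid, pz, chi2):
    """S/N > 5 in both detection bands, integral of P(z > 7) > 0.85, chi2 < 15."""
    return (
        snr_f200w > 5.0
        and snr_f277w > 5.0
        and integrated_pz(z_grid, pz, 7.0) > 0.85
        and chi2 < 15.0
    )

# Toy P(z) curves on a grid out to z = 15 (the maximum redshift considered).
z_grid = [i / 10 for i in range(151)]
peaked = [math.exp(-0.5 * ((z - 9.0) / 0.5) ** 2) for z in z_grid]
bimodal = [math.exp(-0.5 * ((z - 2.0) / 0.5) ** 2) + p for z, p in zip(z_grid, peaked)]

keep_peaked = passes_first_three_cuts(8.0, 7.0, z_grid, peaked, 6.0)
keep_bimodal = passes_first_three_cuts(8.0, 7.0, z_grid, bimodal, 6.0)
```

The single-peaked toy P(z) passes, while the bimodal one, with half of its probability in a low-redshift solution, is rejected by the integrated-probability cut.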

Color Cuts to Reduce Contamination
Our sample of 4326 sources after these first three selection cuts still contains many low-redshift (z < 5) galaxies (2813, 65% of our sample), and thus we explored additional selection criteria that could differentiate between these contaminating sources and our actual high-redshift (z > 8) galaxies in the SAM. Of the different criteria we explored, we found two color cuts that led to a direct distinction between the low- and high-redshift galaxies remaining in our sample.

F150W − F444W < 0.3:
The first color cut that we make requires F150W − F444W < 0.3, as this was the most distinct difference between the low- and high-redshift galaxies still remaining in our sample (see Figure 6, center panel). This cut removed 1081 galaxies from our sample of 4326, with only 6 of those being actual z > 8 galaxies. This dropped our contamination rate from 65% (where 2813/4326 galaxies were low-redshift) to 54%. This particular color spans a wide range of wavelengths for these galaxies, and in our SAM more of the low-redshift galaxies have a redder color, where F444W is brighter than F150W, while the high-redshift galaxies are bluer and thus have a smaller or negative value. This was also evidenced by our measurement of the galaxy colors and was the motivation for creating bluer templates for EAZY for this project (see §3.3).

F150W − F200W < 0.2:
The second color cut that we make is to require F150W − F200W < 0.2, as shown in the right panel of Figure 6. This cut removes an additional 329 galaxies from our sample, with 315 of those being contaminating low-redshift (z < 5) galaxies (red). This drops our contamination rate from 54% to 48% while still only removing 10 of our detectable input high-redshift (z > 8) galaxies (blue).
After the above five selection criteria we are left with a remaining sample of 2670 galaxies, with 897 of those being real high-redshift (z > 8) sources and 1294 being low-redshift galaxies, as shown in Figure 7. We note that many of our low-redshift contaminating sources are at z < 3, and we plot them both as a function of input redshift from the SAM and F200W perturbed magnitude in Figure 7. Many of these contaminating galaxies are faint, m_F200W > 28 (horizontal line), which is not unexpected as it is harder to measure accurate colors for faint galaxies whose fluxes are close to the detection limit. We also show the full distribution of input redshifts for our sample as histograms along the top axis of Figure 7, marking in red the galaxies we have been calling contaminants (z < 5) and in blue those that we have designated as actual z > 8 galaxies.
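The two color cuts amount to a simple predicate on AB magnitudes. The sketch below is illustrative only; the function name and the example magnitudes are hypothetical and are not drawn from the catalog.

```python
def passes_color_cuts(m_f150w, m_f200w, m_f444w):
    """Apply the two color criteria: F150W - F444W < 0.3 and F150W - F200W < 0.2."""
    return (m_f150w - m_f444w) < 0.3 and (m_f150w - m_f200w) < 0.2

# A blue source (F150W brighter than the redder bands) is kept ...
blue_source_kept = passes_color_cuts(27.0, 27.1, 27.3)
# ... while a red source (F444W much brighter than F150W) is rejected.
red_source_kept = passes_color_cuts(27.5, 27.0, 26.0)
```

Because a color is a difference of magnitudes, a source that is brighter in the redder band has a positive color and is the kind of object these cuts are designed to reject.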

Calculating Completeness
Here we define completeness as the number of detectable, real z > 8 sources that are recovered by EAZY as being at z > 8, compared to the known number of true z > 8 sources in the catalog. We define detectable sources as those in the lightcone that have input redshifts from the SAM above 8 and that also meet the S/N requirement of our selection criteria (here S/N > 5 in both F200W and F277W). In our SAM we have 6578 total galaxies at z > 8, 3084 of which meet our initial cut at S/N > 3 in F200W and are run through EAZY (see inset in Figure 1). Here we count as truly detectable high-redshift sources only those z > 8 galaxies that meet the first sample cut of S/N > 5 in both F200W and F277W, of which there are 1375. Our final sample, which meets all five selection criteria, recovers 897 of these 1375 sources, or 65.2%. In Figure 8 we show our completeness as a function of magnitude, with the different redshift bins shown as solid lines. We have higher completeness fractions at z > 9.5, as at this redshift we gain a full dropout band with JWST, F115W, significantly improving our SED fits as they have a distinctly detectable Lyα break. We also suffer low completeness at the faint end of our high-redshift sources, where we are also dominated by contamination (see §6.3).
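The completeness defined above is a ratio of two counts, which the following sketch makes explicit for both the quoted totals and a per-object flag list. The function names and the flag-based helper are hypothetical conveniences, not the paper's code.

```python
def completeness(n_recovered, n_detectable):
    """Fraction of detectable true z > 8 sources that survive the selection."""
    if n_detectable <= 0:
        raise ValueError("need at least one detectable source")
    return n_recovered / n_detectable

def completeness_from_flags(is_detectable_highz, is_selected):
    """Same quantity computed from per-object boolean flags."""
    detectable = [sel for det, sel in zip(is_detectable_highz, is_selected) if det]
    return completeness(sum(detectable), len(detectable))

# Totals quoted in the text: 897 recovered out of 1375 detectable z > 8 galaxies.
overall = completeness(897, 1375)
overall_percent = round(100.0 * overall, 1)
```

Evaluating the same function within bins of magnitude and redshift would give curves of the kind shown in Figure 8.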

Calculating Contamination
Contaminants are defined as those galaxies that have an input redshift of z < 5 but that meet all of our selection criteria for a high-redshift (z > 8) galaxy and so remain in our sample. The contamination fraction is then calculated as the total number of contaminants divided by the total number of sources in our final sample. In our final sample we have 2670 galaxies that met all of our selection criteria, and 1294 of them are actual low-redshift contaminants, giving a total contamination rate of 48.5%. In Figure 8 we show our contamination fractions in different redshift bins as a function of magnitude (dashed lines) and note that, just as shown in our final sample distribution (Figure 7), we become dominated by contamination at the fainter end, closer to our survey limit (at m > 28), but that contamination is minimal at m < 27.5. We note also that these specific contamination rates are dependent on the colors of low-redshift galaxies in these simulations. Finally, real galaxy surveys are sensitive to contamination from stellar sources; however, the SAM only includes galaxies, and as such we explore only the impact of contamination by other galaxies.
It is important to note that these contamination rates are likely overestimates. They are dependent upon the colors of galaxies in the mock catalog at all redshifts. Simulations in general struggle to produce redder galaxies at lower redshifts (e.g. Somerville & Davé 2015; Trayford et al. 2016); bluer overall colors for galaxies would lead to higher contamination rates in our sample. Additionally, Yung et al. (2019a) described the need to reduce dust attenuation at higher redshifts in the SAM we use for this paper. This primarily affects the simulated galaxies at z > 4, but our contaminating galaxies are typically at z < 3. Furthermore, the surface density of contaminants we measure for our sample is 1.65 arcmin⁻², which would imply ∼60 contaminating low-redshift interlopers in the 35 arcmin² of the first epoch of CEERS. This number is much greater than the total sample of candidate z = 8.5−10 galaxies observed in this field thus far (Donnan et al. 2022; Finkelstein et al. 2022b). Together, this implies our contamination rates are higher than we are likely to encounter in the real JWST data.
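The contamination arithmetic in this subsection can be checked directly. The sketch below only reproduces the quoted numbers (1294 of 2670 sources, and 1.65 arcmin⁻² over 35 arcmin²); the function names are hypothetical.

```python
def contamination_fraction(n_contaminants, n_selected):
    """Fraction of the final selected sample made up of z < 5 interlopers."""
    if n_selected <= 0:
        raise ValueError("selected sample is empty")
    return n_contaminants / n_selected

def expected_contaminants(surface_density_per_arcmin2, area_arcmin2):
    """Number of interlopers implied by a surface density over a survey area."""
    return surface_density_per_arcmin2 * area_arcmin2

# Values quoted in the text.
rate_percent = round(100.0 * contamination_fraction(1294, 2670), 1)
n_first_epoch = expected_contaminants(1.65, 35.0)
```

The first quantity evaluates to 48.5%, and the second to just under 58 sources, which the text rounds to ∼60.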

CONCLUSIONS
Galaxies at z > 8 are expected to have bluer rest-frame UV colors than traditional model SED templates, which can lead to catastrophic photometric redshift failures. We explored the recommended FSPS templates included with the EAZY photometric-redshift fitting software (Brammer et al. 2008) and found that they are all redder in the JWST bands than the simulated z > 8 galaxies from the CEERS mock catalogs (Yung et al. 2022). This is similar to what Finkelstein et al. (2022c) discovered for their observed z = 6 − 8 galaxies. To enable improved photometric redshift measurements, we created a supporting set of SED templates that match the predicted rest-UV colors of z > 8 simulated galaxies. We used EAZY to highlight the improvements in redshift recovery after the inclusion of our new template set, and suggested a set of criteria for selecting galaxies at z > 8 with JWST surveys.
We use the published simulated galaxy catalog for CEERS as detailed in Yung et al. (2022), which is based on the Santa Cruz semi-analytic model (SAM) for galaxy formation (Somerville & Primack 1999; Somerville & Davé 2015), in which galaxies with physically predicted properties and star formation histories are assigned SEDs generated from the SPS models of Bruzual & Charlot (2003). This catalog contains a total of 1,472,791 galaxies between z = 0 − 10, 6,578 of which are at z > 8, but as real observations with JWST are limited by our ability to detect objects in our data, we impose a S/N > 3 cut in F200W where the noise is set as the 1σ depth of the CEERS observations. This leaves us with 913,288 simulated galaxies (3,084 at z = 8 − 10) to use in determining the expected colors of high-redshift (z > 8) galaxies as measured by JWST.
Our new suite of SED templates for fitting high-redshift (z > 8) galaxies was designed to have properties expected of galaxies in the early Universe. We used the BPASS v2.2.1 (Eldridge et al. 2017; Stanway & Eldridge 2018) model templates and selected those that had low metallicity (5% Z⊙), young stellar populations (log(age/yr) = 6, 6.5, and 7), inclusive of binary stars, and with an upper mass limit of 100 M⊙ on a Chabrier IMF (Chabrier 2003). These templates do not include any emission lines, so we add another set of templates where we use Cloudy v17.0 (Ferland et al. 2017) to model appropriate emission line spectra. In line with our expectations of high-redshift (z > 8) galaxies, we use a high ionization parameter (log U = −2), a low gas-phase metallicity (Z = 5% Z⊙), and a hydrogen gas density of 300 cm⁻³ with a spherical covering fraction of 100%, and we remove Lyα emission as these galaxies are expected to be in a predominantly neutral IGM. These templates also include nebular continuum emission as well as emission lines, which produces redder colors for the templates with emission lines than for those without. With this new set of six SED templates we cover the full F200W − F277W color space of our simulated high-redshift (z > 8) galaxies (down to −0.43 mag), where the previous FSPS models only extended to −0.06 mag and where the inclusion of the young, blue, low-mass galaxy BX418 (from Erb et al. 2010) by previous studies such as Finkelstein et al. (2022c) had only reached a color of −0.2 mag. We make these templates publicly available for use at ceers.github.io/LarsonSEDTemplates.
We also use our new suite of templates and the simulated CEERS catalog of galaxies to determine how best to select high-redshift (z > 8) galaxies with JWST, in ways that maximize completeness and minimize contamination by low-redshift (z < 5) interlopers. What follows are the best criteria for identifying high-redshift candidates that we could determine for early JWST data, prior to having sufficient spectroscopic redshifts in this era to better calibrate our photometric redshift fits. We first require a significant detection in both the F200W and F277W bands (S/N > 5) to mimic detection bands in actual photometry. Then we require ∫ P(z > 7) > 0.85 from the EAZY redshift probability distribution, which removes any flat P(z) or those that have significant peaks at low redshift. We then place an upper limit of 15 on the χ² to ensure a reasonably good fit to the data. We find that these criteria still leave our sample dominated by low-redshift (z < 5) interlopers, so we impose two color cuts: F150W − F444W < 0.3 mag and F150W − F200W < 0.2 mag, dropping our overall contamination rate by >15% while sacrificing only a handful of real high-redshift (z > 8) sources.
After applying these cuts, we find that our overall recovery rate of sources in our final sample that have input redshifts z > 8 is over 65%, only suffering from significant incompleteness at the faint end (m > 28). This range is also where we encounter increased contamination fractions, though we expect the observed contamination rates to be lower than those predicted here. This is likely due to the simulated catalog not accurately reproducing the red colors of observed low-redshift galaxies.
We find that the above five selection criteria, combined with the inclusion of bluer SED templates such as the ones published here, are the best combination to ensure minimal contamination by low-redshift interlopers (z < 5) while maximizing the recovery of real high-redshift (z > 8) galaxies. These results provide an important road map for observers venturing into this new era of astronomy with JWST, while also highlighting the need for spectroscopic follow-up to confirm high-redshift galaxy candidates and measure accurate contamination rates.

Figure 2. The black histogram shows the distribution of F200W − F277W colors for the z > 8 galaxies from the SAM catalog of Yung et al. (2022). The solid vertical lines show the rest-UV color of the SED templates we used for photometric-redshift fitting, using the color calculated by integrating the templates through the JWST/NIRCam F200W and F277W filters after placing them at z = 10. The bluest EAZY FSPS template only reaches a rest-UV color of −0.1, while the majority of the comparison high-redshift sample have bluer (more negative) colors. Finkelstein et al. (2022c) added a bluer template from Erb et al. (2010) (green), but it is still redder than the majority of our simulated high-redshift galaxies. We created BPASS (purple) and BPASS + Cloudy emission line (blue) templates and note that the BPASS + Cloudy templates are redder in color than the BPASS-only models due to their nebular continuum emission. This full template set can now reproduce the colors of all high-redshift galaxies in our simulated sample.

Figure 3. The rest-frame ultraviolet region of the EAZY template set, redshifted to z = 10, used in our analysis to measure photometric redshifts. The red and orange lines show the latest standard EAZY template set (tweak_fsps_QSF_v12_v3), while the purple lines show the BPASS models we create here as described in §3.3. The blue lines show the BPASS + Cloudy templates that have high nebular-line EWs, as described in §3.4. As can be seen, these newly created templates are bluer than the standard set, better matching the expected colors of z > 8 galaxies. All templates are normalized to their flux density at 2.301 µm. We also include in this plot the template from Erb et al. (2010) used by Finkelstein et al. (2022c), which includes a high-EW emission line. This template was not used in our analysis, as the new BPASS and Cloudy templates covered the same color range and our fits were not improved with its inclusion.

Figure 4. Left: Comparison of our recovered redshifts from EAZY vs the input redshifts from the SAM. We ran our sample of 913,288 simulated galaxies from z = 0 − 10 through EAZY using only the included EAZY FSPS templates (red), and then again after adding our new SED template set (blue) as described in §3.3 and Figure 3. For this test we set the errors in each filter equivalent to a 5σ depth of m = 30. Center: Δz (input redshift − recovered redshift) vs input redshift of our SAM galaxies from both EAZY runs. Right: We show the median (dashed line) and median absolute deviation (solid line) of Δz for both EAZY runs, where the inclusion of the new templates provides significant improvement as both values are lower across the full redshift range. We calculate an outlier fraction, or fraction of catastrophic failures, as those where Δz > 0.2 × z (dotted line), which highlights the set of z > 4 galaxies that are fit at lower redshifts when using only the original templates, but whose redshifts are accurately recovered after the inclusion of our new set of SED templates.

Figure 5. Plots illustrating the accuracy of our recovered redshifts from EAZY versus input redshifts for our mock CEERS observations of the SAM galaxies. We show the full z = 0 − 10 set of 913,288 galaxies in the SAM (Top), and a zoom in on the z = 8 − 10 range of particular interest to JWST studies (Bottom). Left: The median Δz as a function of redshift, separated in bins of F200W magnitude. Center: The scatter (normalized median absolute deviation, NMAD) of our Δz in corresponding magnitude bins. Right: The fraction of outliers in our fits, binned in magnitude, where we define outliers as those with Δz > 2. For each of these parameters we note that the faint galaxies (m > 27.5, purple lines) are the sources with the least-accurate recovered redshifts, which is not unexpected as constraints on their colors are the poorest.

Figure 7. Our final sample of high-redshift galaxies after making cuts based upon the five selection criteria detailed in §6. Here we plot these sources by their input redshift from the SAM vs the F200W perturbed magnitude, with corresponding histograms for each axis. Our final sample includes 2670 sources, 1294 of which are contaminating z < 5 galaxies while 897 are actual z > 8 sources. As shown by the horizontal line, the sample is dominated by low-z contaminants at m ≳ 28, but brighter than m ∼ 27 all of our selected sources are high-redshift galaxies. This shows that, with the colors from sources in this simulated catalog, near our survey detection limits we struggle to distinguish these high-redshift sources from low-redshift interlopers, though we expect the true contamination rates in observations to be lower than those predicted here (§6.3). Details about our contamination and completeness fractions are shown in Figure 8.

Figure 8. Here we plot our completeness and contamination fractions as a function of magnitude in several distinct redshift bins. As illustrated in Figure 7, we suffer from high contamination rates at the faint end of our sample (m > 27.5) at all redshifts. It is also notable that the redshift range at which we recover the highest fraction of real z > 8 galaxies is above z = 9.5, where the Lyα break falls within the JWST filters, providing the SED-fitting process the clearest high-redshift feature. Overall, we maintain a high recovery (completeness) fraction for our galaxies, where we recover a total of 897 of 1375 real z > 8 sources in the SAM. The ones being missed by our selection criteria are predominantly at the faint end, close to our detection limits and where we are most dominated by contamination.
F277W color of the template as $m_{\rm F200W} - m_{\rm F277W} = -2.5\,\log_{10}\left(f_{\rm F200W}/f_{\rm F277W}\right)$, where redder colors have more positive values and bluer colors have more negative values. We list the F200W − F277W colors of the native EAZY FSPS template set, and of the additional Erb et al. (2010) template used by Finkelstein et al. (2022c), in Table 1, noting that at the JWST wavelengths the Erb et al. (2010) template is still bluer (more negative) than all of the EAZY FSPS templates.
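The color definition above can be written as a one-line function of the two band fluxes. This is a minimal sketch: the function name is hypothetical, and the fluxes are assumed to be band-averaged flux densities in the same units so that their ratio is dimensionless.

```python
import math

def ab_color(flux_blue_band, flux_red_band):
    """Color (blue band minus red band, in mag) from two flux densities in the same units."""
    return -2.5 * math.log10(flux_blue_band / flux_red_band)

# A source twice as bright in F200W as in F277W has a blue (negative) color ...
blue_color = ab_color(2.0, 1.0)
# ... and swapping the fluxes gives the equal and opposite red (positive) color.
red_color = ab_color(1.0, 2.0)
```

A factor of two in flux ratio corresponds to about 0.75 mag, so the −0.43 mag color of the bluest simulated galaxies implies F200W fluxes roughly 50% higher than the F277W fluxes.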

Table 2. The reported CANDELS HST