Rest-frame UV Colors for Faint Galaxies at z ∼ 9–16 with the JWST NGDEEP Survey

We present measurements of the rest-frame UV spectral slope, β, for a sample of 36 faint star-forming galaxies at z ∼ 9–16 discovered in one of the deepest JWST NIRCam surveys to date, the Next Generation Deep Extragalactic Exploratory Public Survey. We use robust photometric measurements for UV-faint galaxies (down to M UV ∼ −16), originally published in Leung et al., and measure values of the UV spectral slope via photometric power-law fitting to both the observed photometry and stellar population models obtained through spectral energy distribution (SED) fitting with Bagpipes. We obtain a median and 68% confidence interval for β from photometric power-law fitting of βPL=−2.7−0.5+0.5 and from SED fitting, βSED=−2.3−0.1+0.2 for the full sample. We show that when only two to three photometric detections are available, SED fitting has a lower scatter and reduced biases than photometric power-law fitting. We quantify this bias and find that after correction the median βSED,corr=−2.5−0.2+0.2 . We measure physical properties for our galaxies with Bagpipes and find that our faint ( MUV=−18.1−0.9+0.7 ) sample is low in mass ( log[M*/M⊙]=7.7−0.5+0.5 ), fairly dust-poor ( Av=0.1−0.1+0.2 mag), and modestly young ( log[age]=7.8−0.8+0.2 yr) with a median star formation rate of log(SFR)=−0.3−0.4+0.4M⊙yr−1 . We find no strong evidence for ultrablue UV spectral slopes (β ∼ −3) within our sample, as would be expected for exotically metal-poor (Z/Z ⊙ < 10−3) stellar populations with very high Lyman continuum escape fractions. Our observations are consistent with model predictions that galaxies of these stellar masses at z ∼ 9–16 should have only modestly low metallicities (Z/Z ⊙ ∼ 0.1–0.2).


INTRODUCTION
Analyzing galaxies in the rest-frame ultraviolet (UV) regime provides pivotal insights into the early universe.Massive stars within active star-forming galaxies emit UV radiation and, when observed, can offer information on various astrophysical properties.One key parameter of interest is the rest-frame UV spectral slope, β, a measure of the steepness of the UV continuum (where f λ ∝ λ β ; Calzetti et al. 1994;Meurer et al. 1999).This serves as a potent tool in investigating underlying physical processes within galaxies, including dust attenuation, stellar mass, star formation rate, and metallicity (Finkelstein et al. 2012;Wilkins et al. 2013;Bouwens et al. 2014;Rogers et al. 2014, e.g.).
Historically, studies investigating the rest-frame UV spectral slope have relied on data obtained from ground and space-based observatories, predominantly the Hubble Space Telescope (HST ).Such observations have significantly advanced our understanding of galaxy properties from z ∼ 2 − 10, revealing the intricate connection between the rest-frame UV spectral slope and other galaxy properties (Hathi et al. 2008;Wilkins et al. 2011;Finkelstein et al. 2012;Bouwens et al. 2012;Dunlop et al. 2012Dunlop et al. , 2013;;Hathi et al. 2013;Bouwens et al. 2014;Bhatawdekar & Conselice 2021;Tacchella et al. 2022).However, recent advancements in observational capabilities, particularly with the advent of JWST, have revolutionized our ability to explore the rest-frame UV spectral slope with unprecedented precision and sensitivity to higher redshifts.
JWST (Gardner et al. 2006(Gardner et al. , 2023)), equipped with its suite of cutting-edge imaging and spectroscopic instruments, such as the NIRCam (Rieke et al. 2023) instrument used for this work, is providing a remarkable leap forward in our understanding of the evolution of galaxy properties.With its improved sensitivity in the near-infrared and mid-infrared wavelength range, JWST enables the precise determination of the rest-frame UV spectral slope for a larger sample of galaxies further back in time than observed before (at z ≳ 10).
One enticing question for this early epoch is whether the first generations of stars can be observed.Such stellar populations are expected to be extremely metal-poor (Z ≤ 10 −3 Z ⊙ ; Schaerer 2003), if not completely metalfree (e.g., Population III), resulting in very blue values of β < −3.Prior to JWST, there were few convincing ultra-blue galaxies (with β measured via photometric power-law fitting) that had been found (Schaerer 2002; * NSF Graduate Research Fellow Bouwens et al. 2010;Labbé et al. 2010;Ono et al. 2010;Jiang et al. 2020).The direct detection of these exotic populations has been challenging due to several factors, including their expected brief lifetimes, and that when they explode and pollute the surrounding interstellar medium future generations of stars contain metals, leaving a narrow window in time to detect metal-free stars.Lastly, prior to JWST, we were limited technologically, unable to probe to z > 10 where such objects are expected to exist.With the advent of JWST, it is therefore imperative to explore what constraints can be placed on stellar populations at the highest redshifts.Here using the first JWST public deep field, the Next Generation Deep Extragalactic Exploratory Public (NGDEEP) Survey, we search for such objects, emphasizing exploring methods of accurately measuring the UV spectral slope.
This paper aims to investigate the range of UV spectral slopes observed in faint galaxies at z > 10, as well as explore potential correlations between β and various galaxy properties.UV-faint galaxies have already been characterized from z ∼ 6 − 7 with HST.These galaxies have an average UV slope β ∼ −2.4,indicative of lowmetallicity, low-dust populations (Wilkins et al. 2011;Finkelstein et al. 2012;Dunlop et al. 2013;Bouwens et al. 2014;Wilkins et al. 2016).Early studies with JWST (Topping et al. 2022;Cullen et al. 2023) have measured β ≲ −3.0 for a few galaxies, though the uncertainties are large thus, no convincing discovery of exotic stellar populations has been made.Here we aim to push these observations to lower luminosities at higher redshift while also presenting a methodology to reduce the scatter on our measurements of β.
This paper is structured as follows.In Section 2, we describe the NGDEEP survey and the data reduction process.In Section 3, we discuss our galaxy sample and our methods for photometric power-law and SED-fitting to derive β.In Section 4, we describe our findings from fitting observations and simulations to models and explore correlations between β and galaxy parameters.In Section 5, we discuss our results and present our conclusions in Section 6.We use the Planck Collaboration et al. (2020) cosmology of H 0 = 67.4km s −1 Mpc −1 , Ω m = 0.315 and Ω Λ = 0.685 and all magnitudes are given in the AB system (Oke & Gunn 1983).time (Bagley et al. 2023).NGDEEP's parallel ultradeep NIRCam imaging in the Hubble Ultra-Deep Field "Parallel-2" (HUDF-Par2) field should ultimately detect ≳ 100 galaxies at redshifts z = 9 ∼ 16 over 5 arcmin2 at 1 − 5µm on the deepest HST F814W imaging in the sky (m lim,F814W ∼ 30) from the HUDF12 campaign (Ellis et al. 2013;Koekemoer et al. 2013).NGDEEP should ultimately reach m ∼ 30.8, up to ∼ 0.5 mag deeper than Guaranteed Time Observation (GTO) NIRCam surveys in some filters, and will provide robust measurements for β and galaxy morphologies for UV-faint, low-mass galaxies.While the full program was planned for early 2023, only half of the program was obtained due to a temporary suspension of operations for NIRISS, causing the NGDEEP observations to be pushed to the edge of the visibility window.The next visibility window satisfying the PA requirement of the parallel observations will occur in early 2024 when the remaining observations are expected to be taken.In this study, we report results using NIRCam data from the first half of the NGDEEP program, which is among the deepest data yet obtained by JWST.
We use the NIRCam and HST /ACS F814W imaging mosaics and photometric catalogs presented in Leung et al. (2023).We refer the reader there for full details but summarize the reduction and cataloging process here.The NIRCam imaging data are reduced using the JWST pipeline1 (Bushouse et al. 2022) with custom modifications.We process the exposures through Stage 1 of the pipeline, applying custom procedures to remove snowballs, wisps, and 1/f noise.The images are then processed through Stage 2. Since only preflight flats are available for the short wavelength filters through the CRDS, we apply custom sky flats to these filters, and we use the recently updated CRDS flats for the long wavelength filters.Before combining the exposures into mosaics through Stage 3, a custom version of the TweakReg routine is used to align the NIRCam images with the ACS mosaics (Koekemoer et al. 2011(Koekemoer et al. , 2013)), whose astrometry is tied to Gaia DR3 2 .Finally, we perform a custom background subtraction procedure on all the mosaics.
Aperture photometry was performed using Source Extractor (Bertin & Arnouts 1996).Colors were measured using small elliptical Kron apertures, accounting for the varying PSF across 0.8-5 µm via matching the point-spread-function (PSF) of bands bluer than F277W to the F277W PSF.For F356W and F444W, source-by-source correction factors were derived to account for missing flux in the F277W-defined aperture.Total fluxes were derived first by scaling to the flux ratio in F277W between the flux in the small elliptical aperture to the default Source Extractor "MAG AUTO" Kron parameters, with a residual aperture correction (typically 5-10%) measured and applied following source injection simulations.Flux uncertainties were measured empirically from the data and assigned to each object for each filter based on the size of its elliptical aperture.

METHODOLOGY
In Section 3.1, we describe our sample of high redshift galaxies.In Section 3.2, we describe the process of deriving the UV spectral slope SED-fitting (and other galaxy properties).In Section 3.3, we describe the process of deriving the UV spectral slope from photometric powerlaw fitting to the observed photometry.In Section 3.4, we compare our results with other observations.

Galaxy Sample
We make use of the galaxy sample from Leung et al. (2023), summarizing their sample selection and completeness here.They incorporate a combination of flux detection signal-to-noise ratios (S/N) with quantities derived from photometric redshift fitting with EAZY (Brammer et al. 2010), using the full probability density functions of the photometric redshift, P (z), and the best-fit redshift, z a. Their final sample was then visually inspected to reject any spurious sources and to ensure a real dropout in all filter bands blueward of the Lyman break corresponding to the estimated redshift.We here make use of 36 of the full sample of 38 galaxies from Leung et al. (2023), which span photometric redshifts z ∼ 8.5 − 15.5 and UV magnitudes −21 ≤ M UV ≤ −16.The two galaxies we excluded are red sources whose fits with all photometric data points with Bagpipes (Carnall et al. 2021) yield unphysical results.As discussed in Leung et al. (2023), this may indicate that these sources have an AGN component contributing to their emission, similar to other sources identified at brighter magnitudes and lower redshifts (e.g.Kocevski et al. 2023;Labbe et al. 2023;Matthee et al. 2023;Barro et al. 2023).
To explore any potential sample bias against UV-red objects, we make use of the same source injection simulations described in Leung et al. (2023).They measured the completeness of their photometric selection by injecting and attempting to recover 50,000 mock objects with varied F277W magnitudes, colors, and surface brightness profiles across images, measuring completeness via the proportion recovered in both photometric and sample selection criteria within specified magnitude bins.Galaxy size, a potential influence on sample completeness, is incorporated as an estimation parameter.Any bias in size measurements, identified by comparing input and recovered half-light radii, is corrected during the completeness calculation to ensure accurate data evaluation and selection robustness.We explore the completeness as a function of β and find that our sample selection remains sensitive to significantly redder UV slopes than our reddest object, and we observe that the completeness is only ∼ 15% lower at β = −1.5 than β = −2.5.We conclude that our photometric selection criteria do not bias the observed β distribution.

Deriving the UV spectral slope from SED fitting
For each source in the sample, we run Bagpipes (Carnall et al. 2021), a Bayesian SED-fitting code where photometric data is provided as input along with a suite of user-defined priors, and the posterior distributions of galaxy properties and corresponding model spectra are returned.Our priors span a wide parameter space to ensure we are not omitting valuable information (see Table 1).We aim to measure β from these model spectra directly, following Finkelstein et al. (2012).
We made two modifications to the Bagpipes source code: (1) we ensure that the Bagpipes is consistent with the EAZY results used to select the galaxy sample by implementing an additional redshift prior that is the cumulative distribution function (CDF) of our EAZY P(z) while allowing the redshift range tested to be free from z = 0 − 20.(2) We also incorporate a new free parameter added to Bagpipes, the Lyman continuum escape fraction, f esc , to allow for the flexibility of measuring bluer colors within our fitting (f esc was otherwise by default set to zero, which can lead to nebular continuum dominating at young ages, and thus β > −3 even for dust-free, metal-poor populations).By incorporating this f esc component, alongside other tuneable priors, we are able to allow Bagpipes to generate a model reaching a minimum blue 'floor' of β SED = −3.25 when f esc = 1 (compared to β SED = −2.44 when f esc = 0).Ideally, this β floor would be bluer than the bluest observed value (which, as we have shown, is not possible if f esc is fixed to zero).This should result in the recovery of the correct median β without any systematic bias.For this work, we make use of the default BC03 stellar models (Bruzual &Charlot 2003), andCalzetti et al. (1994) dust models Bagpipes provides, though we note that future work using Population III star models may be able to reach bluer values.
We utilize the median (and the difference between the median and the 68% confidence bounds) for redshift, stellar mass, UV magnitude, dust attenuation, star formation rate, and mass-weighted age posteriors as our final values (and error bars) in this work.For each source, we set Bagpipes to fit model SEDs to the photometric data points and their errors, return the best-fit SED model, and return 1000 draws from the resulting posteriors for various galaxy properties as a function of these SEDs.
We measured the UV slope, β SED , from each of the 1000 posterior model spectra via a power-law fit to all model flux density data points from λ rest = 1500 − 3000 Å, while excluding a 10 Å region around wavelengths where we expect high-redshift emission lines to lie (λ rest = 1549, 1909, 2796, 2803 Å), resulting in an average of 131 data points per model SED for the UV slope calculation.We adopt the median and the difference from the 1σ bounds from these measurements as our final β SED value and uncertainties, where log(f λ ) = βlog(λ) + log(y int ).This approach, while deviating from the wavelength windows of Calzetti et al. (1994), similarly avoids strong emission lines.For our SED figures (Figures 2 and 8), we display the lowest χ 2 model SED, the central 68%, and the full range covered by the full model posterior.Here we describe our process of measuring the UV spectral slope via photometric power-law fitting to the observed photometry.This is a more commonly used method to measure the UV spectral slope that is not reliant on stellar population models, though is very sensitive to the number of photometric data points available.When measuring the UV spectral slope with photometric power-law fitting to the observed photometry, β PL , we redshift the rest-frame λ = 1500 − 3000 Å regime to the corresponding median redshift estimated from EAZY, z a, for each galaxy in the sample and keep the filters whose central wavelength within each filter curve falls fully within the redshifted wavelength range.The redshift range of our sample, z ∼ 9 − 16, results in typically 2-3 photometric data points being available to fit a line (namely F200W, F277W, F356W, and at the highest redshifts, F444W).Once the photometric data points within the wavelength range are determined, we fit the data points to a line (as defined at the end of Section 3.2) and run this process through Emcee (Foreman-Mackey et al. 2013).This procedure maximizes the likelihood that the model described by three free parameters matches the observed photometry for a given source: β, y int , and fractional error factor, log(f) where it is assumed that the likelihood function and its uncertainties are simply a Gaussian where the variance is underestimated by some fractional amount.Results are derived from the median and 68th percentile of the posterior distribution on these three parameters from a chain consisting of 5000 steps and a burn-in of 100 steps (i.e., the MCMC process is iterated over 5000 times to estimate these parameters).
UV spectral slopes for our sample from both measurement methods are shown in Figure 1.This plot shows both the original and corrected β values (corrections applied are discussed in Section 4.2).We show that the SED-fitting method yields smaller uncertainties on average (see Figure 2 for examples of this) -this is simply due to the use of more data points along the specified wavelength range mentioned in Section 3.2 given that the SED is a good fit to the data (future works with varying stellar models may help alleviate any limitations our current model may have, see §5).After the correction is applied, we find that β SED is in fairly good agreement with β PL for some objects, while the β PL measurements show significantly larger scatter, extending both to much bluer and much redder values than β SED .Specifically, while no objects reach β SED < −3, there are many whose β PL < −3.

Literature Comparison Samples
In comparing our work to other observations, we wanted to ensure we used datasets with selection strategies similar to the ones we implemented, and that the redshift range overlapped with this work.The observational datasets we used to compare values of β versus redshift, stellar mass, and UV magnitude are as follows: UV slope values and corresponding galaxy parameters at z = 9 − 11 by Tacchella et al. (2022) whose sample consists of 11 bright (H-band magnitude < 26.6) galaxy candidates selected in the HST CANDELS fields from Finkelstein et al. (2022) with photometry span-ning the UV/optical with HST and near-infrared with Spitzer /IRAC.The work of Topping et al. (2022) utilizes the initial NIRCam data release at z = 7 − 11 for 123 galaxies from the JWST Cosmic Evolution Early Release Science (CEERS, Finkelstein et al. (2017)) survey, this survey overlaps with that of the HST EGS field.At z = 8 − 16, we use the 61 galaxy observations in Cullen et al. (2023) from three JWST fields: SMACS J0723, GLASS, and CEERS (Pontoppidan et al. 2022) and the ground-based COSMOS/UltraVISTA survey (McCracken et al. 2012).Finally, we include a sample of 18 galaxies in the NGDEEP field at z = 8 − 16 from the works of Austin et al. (2023).

RESULTS
In this Section, we describe the evolution of the UV spectral slope and its ties to galaxy properties.In Section 4.1, we showcase six galaxies in our sample that span various extremes in properties to showcase our sample and β measurements.In Section 4.2, we discuss the potential for measurement biases, using a sample of simulated galaxies which we fit with Bagpipes, deriving a correction factor to our observed sample.In Section 4.3, we present our results for our corrected β values as a function of various galaxy parameters.

Highlighting Six Example Galaxies
In Figure 2, we show six galaxies from our sample of 36 galaxies in the NGDEEP field with their corresponding photometry and model SEDs.The figure shows SEDs for the highest redshift galaxy in our sample at z SED = 15.5 +0.3 −0.3 , whose photometric power-law and SED-fitted UV spectral slope are both very blue, −0.2 .This is consistent with a low-metallicity, young stellar population, though spectroscopic follow-up is needed both to confirm the redshift as well as the metallicity.We also obtained photometric data for a z = 8.5 +0.3 −0.4 galaxy whose UV magnitude is quite faint, M UV = −16.2+0.2 −0.2 and still exhibits very blue colors both with photometric power-law and SED-fitting, β SED = −2.3+0.2 −0.2 (−2.2 before correction) and β PL = −2.7 +0.5 −0.5 , highlighting the large scatter in β PL at faint measured fluxes.
We also show SEDs for one of the bluest sources in our sample at z = 8.6 +0.1 −0.1 with a β SED = −2.7 +0.1 −0.1 (−2.5 before correction) and one of the 'reddest' sources in our sample with a β SED = −1.5 +0.3 −0.2 (−1.3 before correction).Although when modeled with Bagpipes, the bluest galaxy is considered to be a fairly young galaxy, ∼ 10 7 years old (similar to the bulk of our sample), the lack of dust attenuation and relatively high specific star formation rate both play a role in the extremely blue slope.The redder galaxy, on the other hand, is slightly fainter, with a bit more dust attenuation contributing to our measurement.Lastly, we show two examples of galaxies whose β SED fits are either bluer or redder than the measured photometric power-law fits, β PL .However, in both instances, we show that our SED-fitting method yields smaller errors than photometric powerlaw fitting overall.As discussed, depending on the filter wavelength coverage and the redshift range, the powerlaw method can be limited to just two data points, yielding not only larger uncertainties but also larger scatter in the measured value.

Verifying our methodology with simulations
To test the accuracy and assess any bias in our SED and photometric power-law measurements of the UV spectral slope, we utilize source injection simulations.Full details of this process are described in Leung et al. (2023), but we summarize briefly here.The simulated sources are created with a range of F277W magnitudes, colors, and surface brightness profiles.Once added to a given image, photometry and resulting photometric redshifts are defined in the same manner as what is done on real images.Utilizing the same selection criteria as the observed galaxies, we randomly draw a sample of 1000 simulated galaxies that are defined to have 'true' input UV spectral slopes, stellar mass, and UV magnitudes.We ensure these galaxies span a wide enough parameter space for each property to fully align with the dynamic range of the observations.With this sample, we run Bagpipes and Emcee in the same fashion as the observed sample in order to obtain β, UV magnitude, and stellar mass.We then explore whether there exist any measurement biases by taking the difference between β SED , β PL , and the input β as functions of input β.As shown in the left two panels of Figure 3, we can see that SED-fitting yields a closer fit to the 'true' UV slope than with photometric power-law fitting, where anything past ∆β ± 0.5 is to be considered a strong outlier (for SED-fitting, this fraction is 75 out of 1000 galaxies and for photometric power-law fitting, this fraction is 274 out of 1000 galaxies).In particular, the recovered photometric power-law β values preferentially scatter towards bluer slopes.
Although our SED-fitting method yields tighter agreement between the input and recovered UV spectral slope, we still hit a "minimum β" plateau in what is returned from Bagpipes, as evidenced by the good agreement at β > −2, which begins to flatten at bluer input colors.Photometric power-law fitting, with its larger spread, essentially shows that more filters are needed past 2 − 3 at these high redshifts in order to tighten measurements.This, however, would require a series of medium-band filters in the rest-frame UV regime that could probe just as deep as the current filters being used, which would be very costly observationally.
Knowing that SED-fitting and photometric power-law fitting have limitations in their ability to precisely measure the UV spectral slope, we use these simulations to derive a bias-correction factor.As we cannot know the true value of β for real sources, we derive this correction as a function of observed rest-UV color.For each source, both simulated and observed, we measure a rest UV color by taking the difference, in magnitude, of the first two filters redward of the filter where the Lymanbreak is expected to lie.
To adjust for discrepancies in measured versus true UV slopes, we calculate a correction factor, ∆β, which is the difference between measured UV slopes (from SED or photometric fitting) and known input values.We calculate this quantity in UV color bins of 0.25, using only those with over 20 data points.To apply this correction to a given real object, we interpolate the ∆β curves to a given observed UV color value, assigning edge values to outliers.As illustrated in Figure 1, applying this correction even just to β SED aligns original and adjusted β SED values closer to β PL across various redshifts (detailed in Table 2).The Pearson R correlation coefficient analysis shows an improvement from ρ = 0.38 to ρ = 0.48 post-correction, indicating alleviation of bias in the SED-derived UV slopes, evidenced by a shift towards bluer β SED within the error margins of the sources.These corrected β SED values will be used as the baseline for all subsequent analyses within this study.
We note that this process was similarly done with photometric power-law fitting.However, we obtain an even stronger bias, given the small number of data points being fit.As such, we do not apply these corrections to the photometric power-law fitted values, though we do show the derived correction values for both methods in the right-hand panel of Figure 3. Note-a = Redshift determined from EAZY P(z) b = Redshift determined from Bagpipes with the EAZY redshift prior applied c = Photometric power-law fit of the UV spectral slope as determined with Emcee d = Adjusted βSED values with correction applied as discussed in Section 4.2

Analysis of β SED and other Bagpipes galaxy parameters
We compare the results of the Bagpipes posteriors for galaxy parameters versus the bias-corrected values of β SED , and discuss any correlations.Previous works have explored correlations mainly between β and stellar mass (Finkelstein et al. 2010(Finkelstein et al. , 2012)), UV absolute magnitude (Bouwens et al. 2010(Bouwens et al. , 2012)), and dust attenuation (Calzetti et al. 1994;Meurer et al. 1999).Here, we explore monotonic trends with these parameters as well The SED method has a tighter scatter, though exhibits a clear bias to redder measured values at true values of β ≲ −2.The photometric power-law method shows a significantly larger scatter, as well as a larger bias (in the blue direction) at all input values of β.We quantify these biases in the right panel, showing the difference in β as a function of the rest-UV color for our simulated sources.These colors are chosen as they can be measured for real galaxies (where the intrinsic β is not known).We show that the SED-fitting method results in a more accurate recovery of β than photometric power-law fitting.However, as there is still a significant bias, we use this measured ∆β value as a correction factor applied to βSED for the observed galaxies in our sample (see Figure 1).As the scatter and bias are both worse for βPL, and we use βSED for our analysis below, we do not apply these corrections to βPL.
as with star formation rate (SFR), mass-weighted age, and f esc using Spearman R correlation coefficients 3 , ρ (Spearman 1904).We perform a Monte Carlo resampling to estimate the Spearman correlation coefficient and the corresponding p-value between the corrected β SED and other galaxy parameters, accounting for their uncertainties.We repeatedly add normally distributed random noise to the data and recalculate the ρ and pvalue.The median ρ and p-value, along with their 68% and 95% confidence intervals, are noted in the legends of Figures 4, 5, and 6.We offer the median and difference from the 68th confidence bounds for each parameter in Table 3.For Figures 4 and 5, we plot our corrected β SED values versus redshift, UV magnitude, and stellar mass and offer comparisons with other works that have been done at high redshift (Tacchella et al. 2022;Topping et al. 2022;Cullen et al. 2023;Austin et al. 2023).

DISCUSSION
Here we discuss uncertainties that affect our SED modeling (Section 5.1), a comparison with previous work at this redshift range (Section 5.2), a comparison with model predictions (Section 5.3), and a discussion on the implications of our results (Section 5.4).

Modeling caveats
When utilizing Bagpipes to build our model SEDs, we set several priors that allow a large range of starforming galaxy SED models to be fit.While testing to see how blue our models could go with the set list of priors, stellar grids, and dust laws, we were able to reach a blue floor of β SED = −3.25.Both our sample of 36 galaxies and the simulated 1000 galaxies were shown to reach an artificial plateau at β SED ∼ −2.8 regardless of this floor (see Figures 1 and 3).Though we attempt to correct for this bias, future work can improve upon our methodology by adopting bluer stellar grid models, such as the Yggdrasil stellar models (Zackrisson et al. 2011) that vary in initial mass function (IMF), metallicity, and star formation history.
The works of Bouwens et al. (2010), Rogers et al. (2013), andDunlop et al. (2013) discuss the possibility of bias in the UV spectral slope, mainly for fainter galaxies at higher redshifts.This bias is possible because fainter red galaxies may not satisfy sample selection criteria, while fainter blue galaxies would, fostering a blue bias, particularly for UV-faint galaxies.
We also conducted tests to evaluate the impact of introducing a free f esc parameter, with a flat prior between zero and one, to allow SEDs without a strong nebular continuum.For our sample, we run Bagpipes with and without f esc varying, and measure β in both instances.We find that fixing f esc = 0 versus allowing it to be a free parameter imparts a minimal change in the recovered UV spectral slope.We quantify this via a nebular reddening estimate, ∆β neb = β SED,corr,fesc=0 − β SED,corr = −0.01 ± 0.1.This indicates that nebular emission does not significantly redden the slope or alter the overall SED shape for this sample.We acknowledge that our reported f esc values, even when this parameter is incorporated, are not tightly constrained and remain highly uncertain.Furthermore, although this incorporation allows β values to become bluer than when not considered, as previously mentioned, we are still limited by how blue the modeling can go with and without the prior added.

Comparison with previous work
In this work, we directly compare our results to the most recent works within the same redshift range as mentioned in Section 3.4.We first compare the results for β, M UV , and stellar mass for our sample with that of Austin et al. (2023) given their overlapping scope with respect to the field and redshift range of observations.We find that their median M UV ∼ −18.9 is slightly brighter than our sample with M UV ∼ −18.1 and our median stellar masses are comparable at ∼ 10 8 M ⊙ .Any differences in M UV could be due to their photometric data reduction (the Leung et al. 2023 reduction is significantly deeper) and/or the subsequent SED fitting to the observational data (for this analysis, we quote their values for their fits with LePhare, see their Table 2).Their average β ∼ −2.61 is bluer than ours and spans a wider range of values than our work, at β SED ∼ −2.45, but this may be a result of their use of photometric power-law fitting as their primary approach to measuring the UV spectral slope.
The work of Topping et al. (2022) studied 123 galaxies in the JWST CEERS field at z = 7 − 11.These galax- M UV ∼ −21.5 with a median β ∼ −1.8 when measured via SED-fitting, and others we compared to, but offered a bright and high mass-end to any trends in the UV spectral slope with M UV and stellar mass.With this addition to the study, we could primarily note that our lower mass galaxies have significantly bluer UV slopes than their work.Bouwens et al. (2012) found a luminosity dependence and thus a UV magnitude relationship with the UV spectral slope from z = 4 − 7. Specifically, they stated that broad M UV ranges (−21.5 < M UV < −16.5 for their work) would be needed to quantify a true relationship between β and UV magnitude.It is possible that these trends could depend on where M UV is defined.As shown Bouwens et al. (2012), if one defines M UV at λ rest ∼ 1500 Å, there is, on average, no correlation between M UV and β.However, if one defines this at slightly redder wavelengths, i.e. λ rest ∼ 2000 Å, there is evidence for a strong negative correlation as β gets bluer.The works we compare this analysis to have also defined M UV at the rest-wavelength 1500 Å regime, and as such, we have yet to see, at high redshift, strong trends in UV magnitude and the UV spectral slope.
Although we expect β to decrease as we move earlier in cosmic history, where on average, for the whole population, younger, smaller galaxies with just a few generations of stellar populations existed, the weak trends with redshift we see here could be due to the sample size for each redshift bin we have in our population.When we only look at our sample, there is no significant correlation between β and redshift, while one emerges when combined with other works.Future analyses with larger sample sizes that span both UV-bright and faint galaxies and wider ranges of redshift will be more capable of probing any evolution of the UV spectral slope, as well as minimizing the effects of cosmic variance.We also note that previous works, e.g., Topping et al. (2022); Austin et al. (2023); Cullen et al. (2023), estimate the UV spectral slope by applying a power-law fit to the photometry.While this is a reasonable approach, as we have discussed, this method can be less reliable when photometry is limited.While our preferred SED method has reduced scatter, it is not free of its own biases.Improvement can be made via higher spectral resolution data (e.g., medium-band imaging or prism spectroscopy) with JWST in the rest-frame UV-regime of these galaxy SEDs would need to be provided.

Comparison to model predictions
Here, we compare the results of our observations and the other observational datasets utilized for this work alongside three cosmological simulations: the Santa Cruz semi-analytic model (SC-SAM) (Yung et al. 2023a), the First Light And Reionisation Epoch Simulations (FLARES, Lovell et al. 2021;Vijayan et al. 2021;Wilkins et al. 2022), and the SIMBA-EoR simulations (Davé et al. 2019, Jones et al. in prep.).
We match the results from our observations with those found in other studies, using the predictions given by the Santa Cruz semi-analytic model (SC-SAM Yung et al. 2023a) at z = 9 (compared to galaxies in redshift bin z = 8.5 − 9.5) and z = 11 (compared to galaxies from z = 9.5 − 12, as binned in Leung et al. 2023).These predictions are made with halo merger trees extracted from the suite of dark matter-only simulations gureft, which provide robust halo merger histories with 170 snapshots stored between 40 ≳ z ≳ 6 (Yung et al. 2023b).The SC-SAM tracks the full star formation and chemical enrichment histories of predicted galaxies under the influence of a set of key physical processes (see Somerville et al. 2015;Yung et al. 2019, 2022 andreferences therein) and couples them with the stellar population synthesis (SPS) model Binary Population and Spectral Synthesis (BPASS; Eldridge et al. 2017;Byrne et al. 2022) to produce high-resolution SEDs for each galaxy.
We also compare observations to the predictions from the First Light And Reionisation Epoch Simulations (FLARES, Lovell et al. 2021;Vijayan et al. 2021;Wilkins et al. 2022) at z = 9.FLARES re-simulates 40 regions selected from a large (3.2 Gpc) 3 dark matteronly simulation using a variant of eagle (Schaye et al. 2015;Crain et al. 2015) physics model.By simulating a wide range of environments and statistically combining them flares is able to probe a larger dynamic range of galaxies than possible with comparable volume periodic simulations.
Finally, we compare all observational datasets to the predictions from the Simba-EoR simulations (Davé et al. 2019, Jones et al. in prep.).Simba-EoR builds on the Simba model (Davé et al. 2019), adding a sophisticated subgrid model for the interstellar medium that directly tracks H 2 and dust co-evolution via a fully coupled chemical network modulated by a local interstellar radiation field, and run at higher resolution.This results in significantly more early star formation versus Simba, leading to good agreement with the z > 9 UV luminosity function (UVLF) (Finkelstein et al. (2023)).
For each galaxy in the simulations, we make use of their UV spectral slope, stellar mass, and UV magnitude.In Figure 7, we compare our work and that of previous works utilized in earlier sections to the gureft, flares, and Simba-EoR models.We find that the measurements of β SED , M UV , and M * as derived from Bagpipes for our sample match well with predictions in both redshift bins in comparison to gureft predictions (the brighter, more massive galaxies in our z = 9, 11 samples are also in agreement with the flares and Simba-EoR simulations).Notably, we do not observe any ultra-blue UV spectral slopes (β < −3), which is in line with these simulations not predicting such extreme values for the galaxy redshifts and masses we examine.Although they are also in agreement with our sample, the gureft and flares simulations do not consider any Population III stellar grids, and our observations do not drive this inclusion for galaxies at the redshift and mass range considered here.Future works incorporating these Population III stellar grids, such as the Yggdrasil models (Zackrisson et al. 2011), will be necessary to explain any potential future observations of β < −3.

Implications
JWST, which, as of now, can detect candidate galaxies out to z ∼ 16 as shown in this work and others, may provide insights into the early universe and the existence of some of the earliest generations of stars.However, it remains significantly challenging to distinguish between Population III stars and low-metallicity Population II stars due to the absence of distinctive photometric features.While potential signatures might be discovered in future works through spectroscopic analysis, observing at substantial depths will be needed to acquire this detailed information.
Although direct observation of Population III stars remains challenging, the galaxies we observe at high redshift are not without their importance.Many of these galaxies are likely low-metallicity, as inferred from the relatively blue average values we have obtained for β, and by the evidence for low levels of dust attenuation and low stellar mass, providing a view of the universe when it was younger and less chemically enriched.For example, the local galaxy NGC1705 has β ∼ −2.5 (Calzetti et al. 1994), similar to our median value, with a measured gas-phase abundance of ∼20% Solar.We thus expect that the typical galaxy in our sample is composed of mainly Population II stars, like galaxies we observe at lower redshifts (Greif & Bromm 2006).However, their low metallicities mean they can serve as analogs for the conditions under which Population III stars might have formed.This contrasts slightly with findings by Pérez-González et al. (2023).They observed a heightened activity in the universe during its early phases, especially around z ≳ 11.Such elevated activity, as they state, could imply a notable presence of Population III stars, as these primordial stars might contribute significantly to the heightened UV photon production.However, our data indicates a more prevalent role for Population II stars.Together, these studies raise intriguing questions about the early universe's stellar populations and their contributions to its early luminosity.
Building on this, the characteristics of galaxies housing different stellar populations are anticipated to vary across several parameters.Here, we discuss the differences in what results are expected from Population III stars and what we are obtaining in this work.
Dust attenuation is expected to be zero in galaxies with Population III stars (Bromm et al. 1999).Given that these stars are thought to be the first generation and formed in a dust (and metal) free environment, the corresponding galaxies would have significantly less dust than those with Population II stars along the line of sight.Although with Bagpipes, we are obtaining A v values on the order of zero, based on our values of β, our galaxies are consistent with having low but not zero metallicity stellar populations.In such cases, follow-up spectroscopy is vital to verify such situations (e.g., measurements of the rest-frame UV spectral slope directly from the spectra; Balmer decrements are unlikely to be detected at these high redshifts currently).
In terms of age, galaxies with Population III stars are expected to be among the youngest in the universe (existing at high redshifts), as these stars are theorized to have formed within a few hundred million years following the Big Bang (Bromm et al. 1999;Greif & Bromm 2006).Deep imaging and spectroscopic surveys like NGDEEP are perhaps our best chance at catching a glimpse of these stellar populations, or at least a direct descendant of them (i.e., Population II stars enriched by Population III remnants).Our galaxy sample has ages that span on the order of 1 − 100 Myr, with the average age being log(Age) = 7.8 +0.2 −0.8 years old.The work of Zackrisson et al. (2011) studies the SED tracks of Population I, II, and III (III.1,III.2, and III (Kroupa IMF); see Section 2 of their work for more information on the different Pop III types).Defining stellar populations by Type A, B, and C (where Type A has an f esc = 0, Type B an 0 < f esc < 1, and Type C an f esc = 1), we compare the UV slopes of our galaxy sample versus what they predict different stellar populations have when binned by age at 1, 10, and 100 Myr (see their Table 2 for full list of UV slopes for instantaneous-burst models).Our galaxy sample is fully consistent with that of Population II tracks for both Type A and C, where Type A tracks range from −2.2 < β < −2.0 and Type C tracks range from −3.1 < β < −2.0 for 1 − 100 Myr.We find some evidence that our sample follows that of the less extreme Population III type cases when binned by age due to the uncertainty in f esc (these UV slopes span  We thus conclude that the galaxies we have observed with NGDEEP do not show evidence of harboring ultralow metallicity and/or Population III stars.While we lack current observational data for galaxies harboring Population III stars, theoretical predictions and simulations offer valuable insight for distinguishing potential differences when comparing them to galaxies inhabited by Population I and II stars.The forthcoming observations from JWST will play a pivotal role in validating these predictions, thus enhancing our comprehension of the early universe.Despite the fact that this study has not confirmed the presence of such exotic stellar populations, there remains a possibility that they exist at higher redshifts and/or perhaps exhibit fainter luminosities than what we can detect at this moment.

CONCLUSIONS
Using new data from the ultra-deep JWST NGDEEP survey, we analyze the evolution of the rest-frame UV spectral slope as a function of redshift from z ∼ 9 − 16.We measure the UV spectral slope both via SED-fitting with Bagpipes and via photometric power-law fitting to the observed photometry.We compare our observed UV slopes to the measured stellar mass, UV magnitude, redshift, age, SFR, dust attenuation, and f esc .With these comparisons, we reach the following conclusions: 1. We quantify the bias for both SED-fitting and power-law methods with source-injection simulations.We show that for both observed and simulated galaxies, the SED-fitting technique with Bagpipes yields more accurate (less scatter) UV spectral slope results than photometric power-law fitting.Both methods still exhibit biases -we quantify this bias for the SED fitting method and use it to correct our UV slope measurements.After this correction, we find a reasonable agreement between the SED fitting and power-law fitting methodologies (we note we do not apply any corrections to the photometric power-law measurements), though the latter exhibits significant scatter to both much redder and bluer colors.
2. When measuring β via the SED-fitting method for the entire sample, we obtain an average UV slope, β SED = −2.46+0.24 −0.19 (−2.30+0.19 −0.13 before bias correction), compared to β PL = −2.65 +0.52  −0.51 via photometric power-law fitting.These average values indicate that our sample of galaxies is predominantly low-metallicity, but we do not find evidence for any galaxies whose stellar populations could be considered exotic (i.e., enriched by Population III stars, β ≤ −3).

We find moderately positive correlations between
the UV spectral slope and dust attenuation, age, stellar mass, and star formation rate as derived from Bagpipes.We find weak-to-no correlations between the UV spectral slope and the escape fraction, UV magnitude, and redshift.Further observations, with future data reductions and the long-awaited second half of data from this survey, will increase our sample size and perhaps offer a more accurate representation of galaxies during this epoch.
4. We compare our observations with the results of the SC-SAM (Yung et al. 2023a), FLARES (Lovell et al. 2021;Vijayan et al. 2021;Wilkins et al. 2022), and the SIMBA-EoR simulations (Davé et al. 2019, Jones et al. in prep.) and find that both show no evidence of ultra-blue UV slopes (β < −3) at these redshifts and stellar masses.
Despite achieving solid measurements of the UV spectral slope, the precise mechanisms driving the observed changes remain a subject of discussion.Further clarity on this matter requires a larger sample of these stellar masses and redshifts.Currently, we hypothesize that variations in the UV spectral slope in the observed galaxies are predominantly influenced by dust attenuation, A v , and stellar metallicity.Incorporating bluer SED models in future studies with Bagpipes could potentially offer pivotal insights into these variations.A more in-depth analysis involving dust growth or perhaps varying dust attenuation curves (Salim et al. 2018), coupled with assessments of stellar mass, SFR, and age, is essential to outline detailed attributes of the sample more accurately.The second half of data collection for this survey, complemented by future deep-imaging surveys with JWST, seems promising in offering a detailed story on the evolutionary dynamics of the UV spectral slope during these early epochs.
7. ACKNOWLEDGEMENTS AMM thanks Michael Topping for access to Topping et al. (2022) data for the purpose of this analysis.We acknowledge that the location where this work took place, the University of Texas at Austin, which sits on indigenous land.The Tonkawa lived in central Texas, and the Comanche and Apache moved through this area.We pay our respects to all the American Indian and Indigenous Peoples and communities who have been or have become a part of these lands and territories in Texas, on this piece of Turtle Island.

Figure 1 .
Figure 1.UV β slope derived from photometric power-law fitting versus from SED-fitting color-coded by redshift.Original β values are shown as open circles, while the filled circles have their SED-fitting-based β corrected for measurement bias (discussed in Section 4.2).The blue dashed horizontal line is the bluest βSED can go with Bagpipes, and the grey dashed line is the one-to-one line when equating βSED to βPL.This relation shows that photometric power-law fitting, on average, yields bluer UV slopes and larger errors than SEDfitting.

Figure 2 .
Figure 2. Examples of Bagpipes SED fits to the observed photometry (orange circles) for six cases: (a) The highest redshift source in our sample at zSED = 15.5 +0.3 −0.3 , (b) one of the faintest sources in our sample at MUV = −16.2+0.3 −0.2 , (c) one of the bluest sources in our sample with βSED,corr = −2.7 +0.1 −0.1 , (d) one of the 'reddest' sources in our sample with βSED,corr = −1.5 +0.3 −0.2, (e) a source where βPL is redder than βSED, and (f) a source where βPL is bluer than βSED.The black line is the SED model with the lowest χ 2 from Bagpipes, the light grey shaded region represents the full range of SEDs being fit, and the darker grey shaded region is the 1σ bounds of the posterior sample of SED models.The blue (orange) shaded regions and lines represent the best-fit line to the photometry, where the UV spectral slopes and errors are obtained from when fit via the SED (photometric power-law) method.It is clear, especially in the last two cases, that scatter in a single photometric band can affect βPL significantly more than βSED.

Figure 3 .
Figure3.The results of our source-injection simulations.The first two panels show the input value of β versus the recovered SED-fitted (left) and photometric power-law (middle) β.The SED method has a tighter scatter, though exhibits a clear bias to redder measured values at true values of β ≲ −2.The photometric power-law method shows a significantly larger scatter, as well as a larger bias (in the blue direction) at all input values of β.We quantify these biases in the right panel, showing the difference in β as a function of the rest-UV color for our simulated sources.These colors are chosen as they can be measured for real galaxies (where the intrinsic β is not known).We show that the SED-fitting method results in a more accurate recovery of β than photometric power-law fitting.However, as there is still a significant bias, we use this measured ∆β value as a correction factor applied to βSED for the observed galaxies in our sample (see Figure1).As the scatter and bias are both worse for βPL, and we use βSED for our analysis below, we do not apply these corrections to βPL.

Figure 4 .
Figure 4. UV spectral slope, β, as a function of redshift (color-coded by rest-UV apparent magnitude, mAB, defined at λrest = 1500 Å) from this work are shown as circles.Squares are from Topping et al. (2022), stars are from Tacchella et al. (2022), and diamonds are from Austin et al. (2023).The black line and shaded region at βSED = −2.51± 0.07 is the baseline spectral slope for halos enriched by only Population III stars from Jaacks et al. (2018).This work is consistent with the works of others, however, no clear trends are shown with UV spectral slope as a function of redshift when all data is combined.The corresponding Spearman R correlation coefficient is listed for this work only in the lower right.Within our sample, we see no evidence for ultra-blue UV spectral slopes.

Figure 5 .
Figure 5. UV β slope as a function of UV magnitude and stellar mass (color-coded by redshift) from this work are shown as circles; all other symbols and the shaded line are the same as Figure4.Where our sample overlaps with previous works, our results are consistent, albeit with lower scatter.With our sample alone, we see a moderately strong correlation with stellar mass, but a negligible correlation with UV magnitude.When combining our sample with previous work, we see a weak trend between the UV spectral slope and UV magnitude, and a somewhat stronger trend with stellar mass, where brighter, more massive galaxies exhibit redder UV colors than faint, low-mass counterparts.The corresponding Spearman R correlation coefficients are listed for this work only on the lower right of each panel.

Figure 6 .
Figure 6.UV β slope as a function of dust attenuation, star formation rate, age, and escape fraction (color-coded by redshift).The black line and shaded region at βSED = −2.51± 0.07 is the baseline spectral slope for halos enriched by only Population III stars from Jaacks et al. (2018).SFR, dust attenuation, and age show a moderately positive correlation with β.While, finally, fesc exhibit a negligible correlation with β.The corresponding Spearman R correlation coefficients are listed for each parameter in the upper left of each panel.

Figure 8 .
Figure 8. Bagpipes SEDs to observed photometry for all 36 galaxies in our sample.The solid black line is the least χ 2 SED model, the dark grey-shaded regions indicate the 1σ bounds of all 1000 SEDs fit to the photometry, while the light grey shaded regions are the full (100th percentile) range of the SED models.The orange points are the observed photometry taken with JWST.The orange (and purple) lines and shaded regions are the median and 1σ confidence interval for the measured βPL (and βSED), as noted in the legends.

Table 1 .
Bagpipes priors defined for this work.

Table 2 .
Redshift and UV spectral slopes for this sample.

Table 3 .
Bagpipes median and 1σ posteriors for galaxy parameters.