Probing the 20 < M UV < 17 Luminosity Function at 8 < z < 13 with the NIRCam Parallel Field of the MIRI Deep Survey

We present the ultraviolet luminosity function and an estimate of the cosmic star formation rate density at 8 < z < 13 derived from deep NIRCam observations taken in parallel with the MIRI Deep Survey of the Hubble Ultra Deep Field ( HUDF ) , NIRCam covering the parallel ﬁ eld 2. Our deep ( 40 hr ) NIRCam observations reach an F277W magnitude of 30.8 ( 5 σ ) , more than 2 mag deeper than JWST public data sets already analyzed to ﬁ nd high-redshift galaxies. We select a sample of 44 z > 8 galaxy candidates based on their dropout nature in the F115W and / or F150W ﬁ lters, a high probability for their photometric redshifts, estimated with three different codes, being at z > 8, good ﬁ ts based on χ 2 calculations, and predominant solutions compared to z < 8 alternatives. We ﬁ nd mild evolution in the luminosity function from z ∼ 13 to z ∼ 8, i.e., only a small increase in the average number density of ∼ 0.2 dex, while the faint-end slope and absolute magnitude of the knee remain approximately constant, with values α = − 2.2 ± 0.1, and M * = − 20.8 ± 0.2 mag. Comparing our results with the predictions of state-of-the-art galaxy evolution models, we ﬁ nd two main results: ( 1 ) a slower increase with time in the cosmic star formation rate density compared to a steeper rise predicted by models; ( 2 ) nearly a factor of 10 higher star formation activity concentrated in scales around 2 kpc in galaxies with stellar masses ∼ 10 8 M e during the ﬁ rst 350 Myr of the universe, z ∼ 12, with models matching better the luminosity density observational estimations ∼ 150 Myr later, by z ∼ 9.

1. INTRODUCTION Some of the oldest stars discovered in the Milky Way indicate a very early onset of star formation in the Universe (Tumlinson 2010).Indeed, some inferred stellar ages are older than the Hubble time or very close (Cowan et al. 2002;Creevey et al. 2015;Schlaufman et al. 2018;Vagnozzi et al. 2022), within 100-200 Myr after (or even before!) the Big Bang.On a complementary approach, (the quite complex) stellar population modeling of stellar clusters and nearby galaxies also indicates very old formation ages for, e.g., the central regions or cores of some elliptical galaxies (Friel 1995;McDermid et al. 2015;Cappellari 2016).
From the point of view of cosmological evolution, one of the most fundamental questions is how fast the Universe was able to start forming stars and galaxies.The Hubble Space Telescope (HST) was able to reach redshifts up to z ∼ 11 (Oesch et al. 2016;Jiang et al. 2021), just 400 Myr after the Big Bang, thus providing statistical samples of high-redshift galaxy candidates from which luminosity functions at z = 8-10 could be built (see review Robertson 2022, and references therein, among them Bouwens et al. 2015;Finkelstein et al. 2015;McLeod et al. 2016;Yue et al. 2018;Ishigaki et al. 2018;Bhatawdekar et al. 2019;Rojas-Ruiz et al. 2020;Bowler et al. 2020;Bagley et al. 2022).
With the launch of JWST on Christmas Day 2021, and the start of scientific operations on July 2022, several thorough multi-filter searches for the highest redshift galaxies have been carried out with the NIRCam instrument, mainly with datasets integrating for less than 2 hours per band.Tens of high redshift galaxy candidates have been reported at z = 10-15, with a few at 15 < z < 18, and even a handful at z ∼ 20 (Donnan et al. 2023;Naidu et al. 2022a;Finkelstein et al. 2022Finkelstein et al. , 2023;;Castellano et al. 2022;Bouwens et al. 2022b;Robertson et al. 2022;Rodighiero et al. 2023;Harikane et al. 2022;Whitler et al. 2023;Yan et al. 2023).How-ever, different teams analyzing the same observations have only agreed on a few objects (Bouwens et al. 2022a).The spectroscopic confirmation of z > 10 candidates is starting to be carried out with JWST/NIRSpec, however it may require exposure times of tens or even hundreds of hours (Curtis-Lake et al. 2022;Arrabal Haro et al. 2023a,b, or taking advantage of lensing clusters, Roberts-Borsani et al. 2022;Williams et al. 2022), thus bringing into question the feasibility of obtaining robust samples composed of hundreds or even tens of z > 10 galaxies, with more promising results at 8 < z < 10 ( Isobe et al. 2023;Sanders et al. 2023;Tang et al. 2023;Fujimoto et al. 2023).
Much more important than the race to discover the highest redshift galaxy or the oldest galaxy ever formed is the exploration of the first hundred million years of galaxy evolution.Constraining the evolution of galaxies within this primordial epoch will provide constraints to simulations which still have significant uncertainties in a broad range of fundamental areas such as (super) massive black hole early growth or primordial black hole abundance, the nature and interplay between dark and baryonic matter in the Dark Ages (Lovell et al. 2014;Dayal et al. 2017), or the mechanism followed by star formation fueled by pristine or low metallicity gas (Zackrisson et al. 2011), affecting properties such as the Initial Mass Function (IMF; Bastian et al. 2010;Marks et al. 2012), and the feedback processes in primitive galaxies (Haslbauer et al. 2022).
In this Letter, we address the topic of identifying and characterizing the galaxy population at z > 8 using new ultra-deep NIRCam observations.We observed for a factor 7x longer than previous JWST observations (e.g., in the SMACS0723, GLASS, or CEERS fields), reaching 30-31 mag 5σ limiting magnitudes and enabling us to probe the high-redshift Universe nearly a factor of 10 fainter than the previously analyzed observations in JWST cycle 1 surveys.This allowed us to constrain, for the first time, the faint end of the ultraviolet (UV) luminosity function from z = 8 to z ∼ 13.This Letter is organized as follows.Section 2 presents our dataset, the NIRCam parallel of the MIRI Deep Survey.Section 3 describes our method to select galaxies at z > 8.The sample is used to construct UV luminosity functions and derive the early evolution of the luminosity density in the Universe, presented in Section 4. Section 5 summarizes our findings.

DESCRIPTION OF THE DATA
This Letter is based on the analysis of the NIRCam data taken in parallel with the Guaranteed Time Observations (GTO) of the Hubble Ultra Deep Field (HUDF) carried out in December 2022 by the MIRI European Consortium Team, Program ID 1283 (PIs Hans Ulrik Nørgaard-Nielsen R.I.P., and Göran Östlin).While MIRI was observing the HUDF with the F560W filter during 61 hours (including overheads) in our MIRI Deep Survey (MDS, Östlin et al., in prep.), parallel data were also being acquired by the NIRISS (20 hours) and NIR-Cam (40 hours) instruments.
The NIRCam observations were carried out with the F115W, F150W, F277W, and F356W filters.Nominal exposure times were 55187 s in each filter, achieved with 2 visits per filter using 20 integrations with NGROUP=7, DEEP8 readout pattern, and 10 dithering positions following the MIRI-optimized CYCLING medium-size pattern with different starting points in each visit.Unfortunately, one of the 4 visits scheduled for our NIRCam observations could not be executed due to several JWST safe-mode events in December 2022, and one more visit had a significant difference in the roll angle with respect to the first two completed observations (23 • vs. 31 • ).This resulted in a smaller area in common between the four filters: 8.0 arcmin 2 instead of the 9.7 arcmin 2 expected for a full NIRCam field of view (a further area cut is described in Section 3.1).The aborted observations also resulted in shorter exposure times for the data used in this paper: nominal exposure times of 55187 s for F115W, and F277W and 27594 s for F150W and F356W.
The typical 5σ depth of these observations (measured in a 0.3 -diameter circular aperture and corrected for the encircled energy fraction, ∼ 75%) are 30.3mag and 30.8 mag in the short and long wavelength channels.This is 2-3 magnitudes deeper than other surveys such as CEERS, GLASS, or SMACS0723 (see Bouwens et al. 2022a).
The NIRCam data were calibrated with the jwst pipeline version 1.8.4,reference files in pmap version 1023.Apart from the standard pipeline stages (including snowball and wisp correction), we also applied a background homogenization algorithm prior to obtaining the final mosaics, including 1/f -noise removal.The whole dataset was registered to the same World Coordinate System (WCS) reference frame using the Hubble Legacy Field (HLF) catalog in Whitaker et al. (2019), based on Gaia DR1.2 (Gaia Collaboration et al. 2016a, Gaia Collaboration et al. 2016b).All images were drizzled to the same plate scale, namely, 0.03 arcsec/pixel.The FWHM of the point spread function (PSF) is 0.07 , 0.07 , 0.12 , and 0.14 for F115W, F150W, F277W, and F356W, respectively.All images were PSF-matched to the reddest filter.
The MDS NIRCam-parallel observations target a region known as the HUDF-P2 region, a field where the deepest HST/Advanced Camera for Surveys (ACS) F814W data exist, obtained simultaneously while observing the HUDF with WFC3.The HUDF-P2 field is located to the south-east (SE) of the HUDF, at the edge of the CANDELS footprint (Grogin et al. 2011;Koekemoer et al. 2011).A varying fraction of the NIRCam field of view is covered by HST ACS and WFC3 observations, all counting with heterogeneous depths.We used the Hubble Legacy Field v2 images for all filters used by ACS and WFC3 (Whitaker et al. 2019).Table 1 presents depths and area coverage for all the datasets used in this paper.

SELECTION OF Z 8 GALAXY CANDIDATES
A robust selection of high-redshift galaxy candidates is hampered by three different systematic effects.The first one is the photometry, since we are dealing with intrinsically faint sources which should be almost or completely undetected in some bands (e.g., those blue wards of the Lyman break or the Lyman-α jump), thus complicating the spectral energy distribution (SED) analysis.The comparison of results obtained by analyzing photometry in different apertures can potentially help to get rid of noise or contamination by neighboring sources in the determination of the photometric redshifts, as we will explain in Section 3.1.The second difficulty is the degeneracy in the analysis of the SED for the candidates, given that large red colors consistent with the Lyman-α break can also be mimicked by the Balmer or 4000 Å breaks as well as by the presence of strong (i.e., high equivalent width) emission lines or dusty starbursts (Zavala et al. 2022;Naidu et al. 2022b;McKinney et al. 2022;Pérez-González et al. 2022).The third problem, directly related to the previous one, is the limited and/or biased set of templates that different programs estimating photometric redshifts use.This translates to methodological differences in how the SED degeneracy is addressed.In order to cope with the last two difficulties, we not only used several SEDs for a given source (measured in different apertures) but we also estimated the photometric redshifts with different codes and a priori assumptions (including the set of templates, but also how to treat low signal-to-noise data points).This is described in Section 3.2.

Photometric catalog
We selected galaxies in the F277W filter (the deepest) and measured photometry in all bands following a methodology especially developed to deal with ultradeep data and crowded fields.
First, we masked the four bright stars present in one of the modules and the very bright spikes coming from two other stars located to the East of our pointing (outside the field of view, but the bright diffraction spikes protrude our observations).This resulted in the loss of 9% of the field of view.The final MDS-NIRCam-par survey area is 7.3 arcmin 2 .
We then detected sources in several passes, starting from the most extended ones and progressing to smaller and fainter galaxies in a number of levels.The galaxies detected in a given level were masked completely down to the isophote corresponding to 5σ of the sky (∼ 26.3 mag arcsec −2 in F277W), building a first segmentation map of galaxy cores.Afterwards, the segmentation map for each galaxy was extended to the fainter outskirts by fitting isophotes constructed by dilation of the galaxy cores.The outskirts were fitted down to 1σ of the sky (around 28.0 mag arcsec −2 in F277W) or the Kron (1980) aperture, whichever was reached first.The Kron factor was varied as we were progressing to smaller/fainter galaxies, starting at 2.5 for the brightest galaxies and reaching 1.2 for the faintest sources (see Finkelstein et al. 2023).These numbers were calibrated with the sky surface brightness limit mentioned above.The method allows us to select faint galaxies (or globular clusters) in the outskirts of bright extended objects and improve the background determination in the whole field.Once all galaxies in a given level were fitted, the residual image was constructed by removing the core and the outskirts fitted with the isophotal analysis, and the new frame was fed to the next level of detection.We implemented a total of 10 levels for our final catalog in this dataset, starting from galaxies covering more than 10 000 pixels (10 arcsec 2 ) in the first pass to the faintest objects with just 30 pixels (∼0.03 arcsec 2 ).
Once the procedure was completed for the selection band, the photometry was measured in all other filters by fixing centers, shapes, and sizes.The final catalog is composed of 40 526 sources, of which 33 905 were detected at the 5σ level or higher in F277W.The photometric catalog included color measurements in the Kron (1980) aperture as well as fixed-diameter circular apertures with sizes 0.2 , 0.3 , 0.4 , and 0.5 .The SEDs corresponding to these fixed apertures were scaled to the Kron aperture to obtain total integrated magnitudes by applying a constant offset to all bands in a galaxy-bygalaxy basis.
Photometric errors were estimated by measuring the background noise locally around each source with a procedure similar to that presented in Pérez-González et al. (2008), devised to take into account correlated noise.We remark that estimating realistic uncertainties not affected by drizzling effects (i.e., not underestimated; see, e.g., Labbé et al. 2003, Quadri et al. 2007, Pérez-González et al. 2008or Finkelstein et al. 2023) and consistently between the SW and LW channels (which are drizzled in a different way given their distinct nominal pixel sizes, see below) is of utmost importance for our selection and analysis of dropout sources, especially given that the observations for the bluest channels just provide upper limits for the flux.
We built artificial apertures composed of randomlyselected non-contiguous pixels, adding up the number of pixels of the photometric aperture whose uncertainty we were trying to estimate.We also imposed that each chosen pixel should be more than three pixels away from any other pixel entering the analysis, this distance being a good approximation of the area that contributes to the flux of each pixel in the final mosaics for any band (after drizzling).With this additional procedure added to a random selection strategy, we ensured that all the pixels included in the noise calculation are independent, thus minimizing the effects of correlated noise.We remark that in the selection of high-redshift galaxy candidates, we analyzed five different SEDs for each source using the fluxes and noise calculations in the mentioned apertures.
We also calculated photometric errors using randomly distributed circular apertures (as in Harikane et al. 2022, for example).With this method, we find values of the rms of the sky that are 60% and 3% smaller for the long and short wavelength channels, respectively, compared to our method based on randomly-selected noncontiguous pixels.We interpret this difference as the effect of correlated noise, which biases the noise calculation in randomly distributed circular apertures since they include contiguous pixels whose signal comes from the same original pixel (i.e., with nominal size).Our random-selection method avoids contiguous pixels, so the effect of correlated noise should be minimized.We note that the F277W and F356W images are drizzled to half the nominal pixel size (65 mas/pixel), while the nominal pixel size for the F115W and F150W filters (32 mas/pixel) is very similar to the value used in our final mosaics (30 mas).Therefore, we should expect a larger effect of correlated noise in the rms calculations with circular apertures for the long wavelength bands, in agreement with the results of our test.

Photometric redshift estimation
The SEDs for each source were fitted with three different codes and techniques, each providing a redshift probability distribution function (zPDF).
Our fiducial photometric redshift estimation came from running the eazy code (Brammer et al. 2008) using the default FSPS templates (Conroy & Gunn 2010), plus a dusty galaxy template (Muzzin et al. 2013).We added the templates presented by Larson et al. (2022) to optimize the analysis of high-redshift galaxies, which include emission-line galaxies with high equivalent widths (known to contaminate JWST high-redshift samples, see Zavala et al. 2022 andNaidu et al. 2022b) and a variety of UV slopes.We used a flat prior in F277W magnitude and no template error, allowing the redshift to take values between z = 0 and z = 20.Data points with low (< 3) signal-to-noise ratio (SNR) were treated in two different ways.We either used directly measured fluxes (even if negative) or we also changed those values to 5σ upper limits and used them in a modified version of eazy.The modification consisted of not allowing any fit to provide brighter fluxes (achieved by penalizing the χ 2 calculation) than those limits and excluding the band in the χ 2 calculation for templates providing lower fluxes (Pérez-González et al. 2022, Mérida et al. 2023, submitted).
We also used the prospector SED-fitting algorithm (Johnson et al. 2021) to further verify the estimated photometric redshifts of our candidates.We adopted a setup similar to that used in Langeroodi et al. (2022) for fitting the stellar populations of spectroscopicallyconfirmed z ∼ 8 galaxies.In brief, the free parameters include the total formed stellar mass, the stellar metallicity, the nebular metallicity, the nebular ionization parameter, the dust attenuation (modelled with three free parameters, adopting the dust model of Kriek & Conroy 2013), and the optical depth of intergalactic medium (see Langeroodi et al. 2022, for details).Unlike Langeroodi et al. (2022), where the redshift was fixed to the spectroscopically measured value, here we treated redshift as a free parameter with a flat prior in the range z = 0 -20.The star formation history (SFH) was modelled non-parametrically in five temporal bins, with the last bin spanning 0-10 Myr in lookback time and the rest evenly spaced in log(lookback time) up to z = 35.We sampled the parameter space using the dynesty sampler (Speagle 2020; Koposov et al. 2022), a dynamic nested sampler based on Higson et al. (2019).
Finally, we used the template-fitting code LePhare (Arnouts et al. 2002;Ilbert et al. 2006) to help with the selection of our candidates.We adopted two configurations of this program.For the first one, we followed Rinaldi et al. (2022Rinaldi et al. ( , 2023)).Briefly, we made use of galaxy templates with the following set of SFHs: a standard exponentially declining (SF R(t) ∝ exp −(t−t0)/τ ) and an instantaneous burst by adopting a simple stellar population (SSP) model (SF R(t) ∝ δ(t)).In particular, for the standard exponentially declining models we used the following e-folding timescales (τ ) in Gyr: 0.01, 0.1, 0.3, 1, 3, 5, 10, 15.We employed the stellar population synthesis (SPS) models of Bruzual & Charlot (2003) based on a Chabrier (2003) IMF, considering two different values of the metallicity: solar metallicity and one-fifth of solar metallicity.To account for the effects of internal dust extinction, we convolved the model templates with a modified version of the reddening law of Calzetti et al. (2000), where we adopted the extrapolation provided by Leitherer et al. (2002) at shorter wavelengths.For that purpose, we allowed the color excess E(B − V ) to range from 0 to 1.5 mag (with steps of 0.1), reaching A V 6 mag.A second configuration was explored, this time following the one described in Ilbert et al. (2015) and Kauffmann et al. (2022).Since our aim is to select rare sources at z > 8, we explore a large range of dust-attenuation in order to identify and reject possible contaminants.We consider two dust attenuation curves as a free parameter (Calzetti et al. 2000;Arnouts et al. 2013) with E(B − V ) varying from 0 to 2 mag (reaching A V = 8 mag).In this execution, we also adopted the recipes of Saito et al. (2020) to include the emission lines, allowing for an additional free rescaling by a factor two of all line fluxes.We note that LePhare's fits are performed based on the fluxes and not the magnitudes, which do not require the need of introducing upper limits, considering that the flux uncertainties at low signal-to-noise are still meaningful (following Laigle et al. 2016).We rejected sources with χ 2 > 100 which corresponds to an unreliable fit and sources which are best fitted by a brown dwarf stellar template.
Given that we had two executions of eazy and two of LePhare, in order to avoid a bias towards any of these codes, we considered the highest redshift among the two executions for each code to select galaxies as explained in the next subsection.
Summarizing, we searched for high-redshift galaxy candidates using five different SEDs and three photometric redshift codes and techniques.

Selection of candidates
We constructed a sample of z 8 galaxy candidates by searching for F115W and F150W dropouts in our NIRCam data and applying a similar methodology to that presented in Finkelstein et al. (2023).Distinctively, we explored the effect of the photometric aperture and the peculiarities of different photometric redshift estimation codes.Indeed, for each galaxy we used five SEDs derived from different aperture sizes and three redshift estimation codes, as outlined in the previous section, working with weighted averages for all relevant quantities used in the selection.We applied twice the weight of any other SED to that corresponding to 0.3 diameter aperture (typically used in this type of studies, see Bouwens et al. 2022a, Harikane et al. 2022, Donnan et al. 2023, Naidu et al. 2022a, Adams et al. 2023), which maximized the SNR of the measurements while keeping PSF-related aperture correction below 25%.
First, we selected all galaxies whose median F277W and F356W flux, averaged across all five SEDs, had a SNR > 5.This cut resulted in a sample of 16 133 sources.Then, we restricted the catalog to all sources with a median SNR < 3 in F115W and/or F150W (in both the PSF-matched and original images), resulting in a sample of 3883 1.15 µm dropouts and 2547 1.50 µm dropouts.
We then selected the galaxies whose most probable (as found by integrating the zPDF) and peak (that providing the smallest χ 2 value) redshifts were above z > 8 for any of the 3 codes (1942 sources).We further cut the sample to keep only those galaxies with a > 50% cumulative probability of lying at z > 8 (1831 sources).The final cut was in χ 2 values, ensuring that the difference between the minimum value of the zPDF at z < 8 and z > 8, ∆ log χ 2 was larger than 0.4 dex (as in Finkelstein et al. 2023, but not as large as in Harikane et al. 2022), leaving 160 sources.The additional cut used in Finkelstein et al. (2023) based on the probability in ∆z intervals being higher at z > 8 than in any other unity- redshift interval did not add anything to our method.All these cuts were based on our fiducial eazy run.
We then inspected each of the 160 sources visually in the NIRCam, WFC3 and ACS bands, as well as the SED Figure 2. Examples of F115W and F150W dropout galaxies in our sample, in this case for sources fainter than F277W = 30 mag (distinctively probed by our dataset).Same information as in Figure 1 given.fits from the different codes.We vetted galaxies which were not affected by artifacts such as (surviving) spikes and/or contamination by nearby objects.We only kept in our final sample the sources for which at least two out of the three photometric redshift codes and three of the five photometric apertures provided a peak redshift above z > 8.
We estimated the fraction of low redshift interlopers and completeness of our selection method by simulating galaxies with the SEDs provided by the JAGUAR mock catalog (Williams et al. 2018).We added noise to the catalog based on the depths of our survey and applied the selection technique (based on aggregated probabilities, goodness of fits at low and high redshift, and photo-z estimation with 3 codes) to the resulting list of sources.This analysis provided a completeness of 80% down to F277W magnitude 31, and a contamination of 3%, all interlopers and missed sources presenting magnitudes fainter than 30.5 mag.No correction from interlopers is applied in the luminosity function calculations presented in the following section.
The final sample of z > 8 galaxy candidates consists of 44 objects.We show examples in Figures 1 and 2, four dropout sources in the F115W and F150W filters, with the first figure centered on brighter galaxies (F277W < 30 mag) and the second one showing fainter examples (F277W > 30 mag).
We compared our selected z > 8 candidate galaxies with those reported in Austin et al. (2023) for the common area with the NGDEEP survey (Bagley et al. 2023).Only 2 sources in Austin et al. (2023) lie within our field-of-view, another 3 (2) are located in the edges (gap between detectors) of our imaging data and only covered by 2 (0) bands (i.e., they could not arrive to our selection in any case).Of the 2 sources in common, our MDS011049 (z ∼ 9.5 in this letter) is NGD-z11a (z ∼ 11.1 for Austin et al. (2023)).We assign it with a lower redshift mostly because it is well detected in F150W (although near our image edge).The other common source is NGD-z11b, MDS007481 in our parent catalog; it did not enter our final selection since we detected two redshift solutions, one at z ∼ 6.5 ± 0.2 and another one at z = 11.1 ± 0.5, the goodness of fit difference between them lied below our cut, and the lowz solution was favored by the photometry measured in some of our aperture.
Figure 3 presents the distribution of F277W (Kronaperture) magnitudes and photometric redshifts of the final sample, compared to general catalogs and other similar compilations of z > 8 candidates and confirmed galaxies in the first surveys carried out by JWST in 2022.The figure shows that our MDS-NIRCam-par data reach 2 magnitudes fainter than the surveys of SMACS0723, GLASS, and CEERS, with a histogram peaking at F277W ∼ 30 mag, compared to 28 mag for the first datasets.The 2022 shallow surveys add up ∼ 8 times the area of our observations, but we increase by almost a factor of 10 the number of detected sources per unit magnitude bin down to 31 mag.
Our deeper data allow us to find z > 8 galaxy candidates at fainter magnitudes, probing lower luminosities than previous works (see Section 4).Our sample, plotted in red in Figure 3, is compared with other photometrically selected candidates and the four first spectroscopic confirmations at z > 10 provided by JWST (Curtis-Lake et al. 2022; Arrabal Haro et al. 2023a,b) as well as faint lensed galaxies above z ∼ 7 confirmed with NIRSpec spectroscopy (Carnall et al. 2023;Williams et al. 2022;Langeroodi et al. 2022;Roberts-Borsani et al. 2022).We have seven candidates in the magnitude regime probed in the 2022 shallow surveys due to our limited area, with the bulk of our sample concentrating around 30 mag (median and quartiles F277W = 30.2 30.7  29.8 mag).Our sample doubles the number of z ∼ 9 galaxy candidates and covers an unexplored magnitude regime up to z ∼ 13.
We note that our data are well suited to get z ∼ 9 galaxy candidates with well covered SEDs to identify the break and obtain photometric redshift with relatively small (formal) uncertainties.However, at z 13 we cannot constrain the position of the break since we do not have the F200W filter.This translates also to a higher uncertainty in the photo-z estimates at z 10, when the breaks start to enter the F150W filter and we only have the F277W−F356W color to constrain the redshift further.This is the explanation for the larger error bars at those redshifts (especially for the z ∼ 19 candidate, shown in Figure 2) and the gap at z ∼ 10, even when the galaxies are not significantly fainter.This was tested and confirmed with JAGUAR simulations similar to those mentioned in Section 3. We compared results using our NIRCam filter set with those achieved when adding F200W fluxes (with depths similar to the F115W filter).The inclusion of the F200W filter decreased the typical photometric redshift errors by a factor of > 3 and also provided larger completeness levels, reaching values beyond 90% (compared to 80% without that filter); the interloper fraction decreased from 3% to 1% when adding F200W to the simulations.

UV luminosity functions
We divided the sample in 3 redshift bins, namely, 8 < z < 10, 10 < z < 11.5, and 11.5 < z < 13, which correspond to Universe ages ∼ 540 Myr, ∼ 420 Myr, and ∼ 350 Myr.We calculated absolute UV magnitudes (M UV , averaged in a 0.01 µm window around 0.15 µm) from SED fitting, obtaining uncertainties by repeating the stellar population modeling after varying the photometry according to their errors and considering the redshift probability distribution functions.
The luminosity functions were constructed using a V max formalism and the stepwise maximum likelihood method (Efstathiou et al. 1988).Completeness was estimated by inserting artificial sources extracted from the real data in the detection image, covering the magnitude range between 26 and 32 mag, and then repeating the multi-layer source detection presented in Section 3.1.We find that our catalog is 80% complete at F277W = 30.4mag, 50% at 30.6 mag and 10% at 31.0 mag.
Using the implementation to determine cosmic variance effects presented by Trapp & Furlanetto (2020), consistent with the predictions from Bhowmick et al. (2020) and Ucci et al. (2021), we estimate a ∼25% (18%) uncertainty in our luminosity function estimation at the bright (faint) end for z ∼ 9, the values increasing to 34% (28%) at the highest redshifts.These uncertainties, similar to those presented in Trenti & Stiavelli (2008), are included in the following figures and discussion.
The luminosity functions are presented in Figure 4 for the three redshift bins mentioned above, comparing with previous estimates in the literature.Our results are given in Table 2.The typical systematic offsets and scatter of previous calculations are as high as 0.4 dex, with only a minor degradation as we move to higher redshifts, and some larger differences (0.6 dex level) at the bright end, M UV < −21 mag, at z ∼ 9 and z ∼ 12.
We fit all these estimates (our data and other data points) with Schechter (1976) functions without any prior using an MCMC method.The results are given in Table 3.We note that several of the luminosity function data points are based on the same datasets, although not on the same sources and the methodologies to select candidates are different.This might affect the uncertainties in the fits.
To test this possibility, we repeated the fits at z ∼ 9 only including our data points and those in Donnan et al. (2023) and Harikane et al. (2022), which cover all the public JWST fields (including Stephan's Quintet, not present in Bouwens et al. 2022a) as well as the COS-MOS field for the bright end.We found completely consistent results and slightly larger uncertainties: (α, M * , φ * ) = (−2.30+0.16  −0.24 , −21.00 +0.34 −0.45 , 1.8 +1.1 −1.7 ) (normalization in units of 10 −5 Mpc −3 mag −1 ) for the whole dataset, compared to (−2.36 +0.17 Overall, we find little evolution between z ∼ 13 and z ∼ 8, the fits run roughly in parallel.The faint-end slope is consistent across all redshifts down to M UV ∼ −17 mag, with a value α = −2.2± 0.1 and no indications of a steepening (in contrast with previous findings, e.g., Bouwens et al. 2022a).We also obtain very similar knee absolute magnitudes, consistent within uncertainties, also considering the degeneracy with the other two Schechter parameters.The main difference between the three redshift bins is a small average number density evolution, increasing by 0.2-0.3dex from z ∼ 13 to z ∼ 8 (between the last and first bins considered in this Letter), i.e., in ∼ 200 Myr, although this difference is of the same order as the current uncertainties for individual luminosity function estimations.

The cosmic UV luminosity and SFR density
We integrated the luminosity functions presented in Figure 4 down to the same absolute magnitude for all redshift bins, M UV ∼ −17 mag, obtaining the (directly observed) UV luminosity densities at 8 < z < 13.In Figure 5, we compare our results with estimates from the literature.This figure includes pre-JWST era measurements (Oesch et al. 2018), as well as more recent estimates obtained with JWST data (Bouwens et al. 2022a;Donnan et al. 2023;Harikane et al. 2022).
Figure 5 also shows theoretical predictions from a range of state-of-the-art models in the literature covering a variety of approaches.The simulation compilation includes the UniverseMachine semi-empirical model (Behroozi et al. 2020), the Santa Cruz semi-analytical models (Yung et al. 2019(Yung et al. , 2020)), and the fiducial CDM model in Maio & Viel (2022).We also compare with the hydrodynamical simulations Illustris/TNG (Vogelsberger et al. 2014;Nelson et al. 2015;Springel et al. 2018;Naiman et al. 2018;Marinacci et al. 2018;Nelson et al. 2018;Pillepich et al. 2018Pillepich et al. , 2019;;Nelson et al. 2019)  Table 3. Results for the Schechter (1976) parametrization fits to the luminosity functions presented in Fig. 4. The last row shows the integrated luminosity for absolute magnitudes MUV < −17 mag.
2015), and SIMBA (Davé et al. 2019).We also compare with predictions from re-simulations of multiple regions, such as THESAN (two realizations are shown, THESAN-1 and THESAN-sdao-2, both based on Illustris TNG galaxy formation model; Kannan et al. 2022) and the First Light And Reionisation Epoch Simulations (FLARES, which adopt the AGNdT9 variant of EAGLE; Lovell et al. 2021;Vijayan et al. 2021).All these simulations probe a large box size, spanning from 50 cMpc in TNG50 to 250 cMpc in UniverseMachine and 500 cMpc for FLARES.These models have been validated to match observations at lower redshifts than the epochs probed in this Letter.
For this comparison with simulations, we directly took UV luminosity functions from the mentioned papers or added the SFRs (typically averaged over 100 Myr periods) of all or certain subsamples (see below) of simulated galaxies in given simulation snapshot, which can be converted to a luminosity density assuming a constant star formation event following a given IMF, Chabrier (2003) in our case.
We note that most of these simulations provide properties integrated in galaxy regions which are larger than the actual measurements from our NIRCam observations.Indeed, the median and quartile values of the Kron apertures for our (slightly resolved) sample are 0.46 0.58 0.36 arcsec (in diameter), i.e., observations typically refer to regions of 0.5 , which translates to ∼ 2 kpc physical size (2.4 kpc for z = 8, 1.7 kpc for z = 13).This means that some of the vertical offsets seen between our and literature's measurements and the model predictions can be interpreted in terms of aperture corrections.
We exemplify this issue for the TNG100 simulation, shown with blue large-dot lines for whole-galaxy SFRs.Galaxies in the TNG100 simulation at these redshifts present typical half-mass radii around 3 kpc, 6 kpc diameter, 3 times the typical photometric aperture sizes we use in this work.In Figure 5, we also show the predictions for 0.5 apertures (small dots).While the former curve (large dots, whole galaxy) runs closer to the observational data points, the latter (small dots, limited aperture), which is more directly comparable to the data, is consistent with observations at z ∼ 8 but predicts 0.7 dex smaller luminosity densities at z ∼ 13.This means that the star formation is occurring in more compact regions than what these simulations predict.The importance of galaxy size and morphology in JWST surveys is indeed paramount to understanding galaxy evolution in the early Universe (Costantin et al. 2022).
Another point to consider is the mass range covered by observations and simulations.The typical stellar masses (assuming standard IMF and SFHs recipes) for our sample are log(M /M ) = 7.8 8.6 7.2 (median and quar- ).We also plot predictions from several galaxy formation simulations, listed in the following lines.The Illustris-1, TNG50, TNG100, TNG300 results are plotted with blue continuous, dashed, dotted, and dash-dotted lines, respectively, all referring to SFRs measured in 0.5 diameter apertures for M > 10 7 M galaxies, except for TNG300, cut at M > 10 8 M ).The thick dotted line refers to TNG100 results using whole galaxy measurements Vogelsberger et al. 2014;Nelson et al. 2015;Springel et al. 2018;Naiman et al. 2018;Marinacci et al. 2018;Nelson et al. 2018;Pillepich et al. 2018Pillepich et al. , 2019;;Nelson et al. 2019).The EAGLE simulations (Schaye et al. 2015;Crain et al. 2015) are plotted in green for galaxy samples cut in stellar masses M > 10 6,7,8 M in continuous, dotted, dashed lines.The simba predictions are shown with a brown continuous line for all galaxies, dashed line for F277W < 31 mag sources (Davé et al. 2019).The thesan-1 and thesan-sdao-2 simulations (Kannan et al. 2022) are shown in magenta with continuous and dotted lines, respectively (see text for differences).The FLARES predictions are shown in cyan (Lovell et al. 2021;Vijayan et al. 2021).The UniverseMachine semi-empirical model (Behroozi et al. 2020) is shown in orange with the same mass cuts and line styles mentioned for EAGLE.Finally, the models in Maio & Viel (2022) are shown in yellow and the predictions from the Santa Cruz semi-analytic models (Yung et al. 2019(Yung et al. , 2020) ) are depicted in orchid.
tiles, Pérez-González et al., in prep.).For some models, such as EAGLE (in green), UniverseMachine (in orange) and Illustris/TNG (comparing TNG300 vs other resolutions), we are able to distinguish between masses.These 3 sets of models show differences of 0.2-0.6 dex when considering a cut in M > 10 6 M instead of M > 10 7 M , and they predict 0.4-0.8dex smaller densities when cutting at M > 10 8 M , these curves being more comparable to our sample.
Disregarding the difficulty in comparing with models due to aperture and mass effects (not always traceable, since some simulations do not count with the adequate resolution and/or information), most simulations predict an increase of luminosity density of almost two orders of magnitude in the 300 Myr from z ∼ 13 to z ∼ 8.The observations differ from this increase rate.We note that our JWST-based results at these redshifts present a striking agreement considering the different approaches used to identify high-redshift galaxies.All these estimates together, jointly with the first results of the UV luminosity density beyond z = 13 shown in Figure 5, suggest a shallower increase or a more constant behaviour of the earliest phases of galaxy formation from z = 18 until z ∼ 9, followed by a more pronounced increase of star formation activity at z 9. Quantitatively, the luminosity density evolution follows a (1+z) −4.5±1.0 law according to the data at 8 < z < 14, (1 + z) −2.8±0.9 at 8 < z < 18, while most models predict steeper evolution, with an average slope around −10, ranging from −20 for simba or −15 for eagle to −8 for Illustris/TNG.
Overall, most simulations present tension with these first measurements provided by JWST, as they typically underpredict the abundance of high-redshift luminous galaxies and/or their SFHs, stellar masses, and compactness.Among the simulations, the results of THE-SAN seem in agreement with observed log(ρ SFR ) values in the redshift range 10 < z < 13 (although with no information about the spatial extension of the stellar cores we are probing with JWST), but with faster evolution (compare the steeper magenta line compared to red region).In this case, however, there is a significant difference depending on the reionization histories.In thesan-1 (continuous magenta line), the duration of reionization is short, where low mass halos have a significant contribution.In contrast, the duration is longer in THESAN-SDAO-2 (dotted magenta line, closer to the data), since it assumes non-standard dark matter models, with high mass halos being the primary drivers of reionization.Furthermore, at z > 8 the total SFR density in THESAN-1 is dominated by the lowest mass halos (10 8 < M halo /M < 10 9 ), which are usually unresolved in medium-resolution simulations, thus reducing the SFR density by almost an order of magnitude.
From the observational perspective, the lack of spectroscopic confirmation (still quite demanding/unfeasible for a statistically significant sample), still hampers our physical interpretation of the early stage in galaxy formation (see Harikane et al. 2023).The spectroscopy is not only needed to confirm candidates but also to train our photometric redshift algorithms, necessary to probe the faint end of the luminosity function and to improve statistics and control cosmic variance effects.On the theoretical side, various possibilities are on the table, like a variable (temperature-dependant) stellar initial mass function (Steinhardt et al. 2022a,b), early dark energy models (Smith et al. 2022), dark matter properties (Dayal et al. 2017), or star formation efficiencies ∼ 15 − 30% higher than expected (Inayoshi et al. 2022).

SUMMARY AND CONCLUSIONS
We present a sample of 44 z > 8 galaxy candidates identified in the NIRCam parallel observations of the MIRI Deep Survey (centered in the Hubble Ultra Deep Field and parallel pointing 2, respectively).These data reach around 2 magnitudes deeper than previous surveys carried out during Early Release Observations, Early Release Science, and Cycle 1 programs (e.g., the SMACS0723 ERO survey, CEERS, and GLASS), reaching 5σ depths around 30.8 mag.The median and quartiles of the magnitude distribution of our sample is F 277W = 30.2 30.7  29.8 mag.Aiming to reduce the effects of photometric uncertainties and a priori assumptions in the determination of redshifts, the selection of our z > 8 galaxy candidates is based on the analysis of spectral energy distributions measured in a variety of apertures and with different photometric redshift codes.We also apply restringent cuts in the goodness of fit and the relevance of highcompared to low-redshift solutions, following standard procedures in the literature.
The sample probes the absolute magnitude regime −19.5 < M UV < −16.5 mag of the luminosity function at 8 < z < 13, corresponding to Universe ages between 350 and 540 Myr.We constrain the faint end, obtaining a constant slope of −2.2 ± 0.1 in the full redshift range.Jointly with previous results based on shallower largerarea surveys probing the bright-end of the luminosity function, we find that the regime we explore accounts for nearly 50% of the total luminosity density (integrated through all magnitudes brighter than −17 mag).
Our estimates of the integrated ultraviolet luminosity density, which is a good proxy for the cosmic star formation rate density (via assumptions in relevant star formation properties such as the initial mass function, the metallicity, binary star fraction, or the star formation history or burstiness behavior) are compared with a wide range of predictions from state-of-the-art galaxy evolution models.The two main results are: (1) we find a shallower slope in the early stages in galaxy formation at 8 < z < 13, where many models predict nearly a factor of 10 less star formation activity; (2) although some models predict similar values of the cosmic star formation rate density to those measured in this paper and in recent publications based on JWST data, when taking into account the sizes (as measured by the aperture diameters used in the photometric extractions) and the stellar masses corresponding to the apparent magnitudes (assuming typical stellar mass calculation recipes and a standard universal initial mass function), the simulations typically identify the location of star formation in larger, less massive galaxies.For the models where the sizes and masses can be constrained to compare more fairly with observations, systematic differences with our results about the cosmic ultraviolet luminosity (or SFR) density are found at the ×4 − 10 level (depending on the specific model), indicating that the observations reveal a much more active Universe in the production of photons (through star formation or nuclear activity) the first 500 Myr on ∼ 2 kpc scales, especially at z 11 (i.e., first 350 Myr of the Universe).4. Sample of z > 8 galaxy candidates presented in this letter.We provide coordinates, integrated magnitudes in the F 277W filter and photometric redshifts estimated with eazy.The comment gives information about how many of the 3 photometric redshift codes used in this letter agree in the z > 8 determination.

Figure 1 .
Figure 1.Examples of F115W and F150W dropout galaxies in our sample.For each source, in the top left, we show the SED (for the 0.3 diameter aperture; filled dots) and template fits to low and high redshift solutions.Arrows indicate 1σ upper limits.The top right panel shows the photometric redshift probability distribution function for the different codes used in this work and the derived redshift and uncertainties.The bottom of the figure for each source presents 1.5 × 1.5 postage stamps in ACS, WFC3 and NIRCam bands, with the source marked with a 0.15 radius red circle.

Figure 3 .
Figure3.Magnitude in the NIRCam F277W filter vs. photometric redshift for z > 8 galaxies.General galaxy samples extracted from the catalogs released by G. Brammer for the SMACS0723, GLASS, and CEERS datasets gathered in 2022 are plotted with blue small dots to illustrate the magnitude limits of the surveys and the number of high-redshift galaxies in unsupervised and unvetted photometric catalogs.Our parent galaxy sample is plotted with red small dots, and the final sample of z > 8 galaxy candidates with red hexagons.We depict 1σ uncertainties plotted in both axes and the data point referring to the zPDF peak redshift.We also plot the photometric high-redshift galaxy candidates of Donnan et al. (2023), Naidu et al. (2022a), Bouwens et al. (2022a), Harikane et al. (2022), and Finkelstein et al. (2023) with large black circles (we do not remove repeated sources with different measurements), as well as spectroscopically confirmed galaxies from JADES (Curtis-Lake et al. 2022; Robertson et al. 2022), lensing clusters(Carnall et al. 2023;Williams et al. 2022;Langeroodi et al. 2022;Roberts-Borsani et al. 2022, not corrected for magnification), and CEERS sources(Arrabal Haro et al. 2023a,b), all in green.The top panel shows the histogram of redshifts for these samples of high-redshift galaxy candidates.The right panel shows the magnitude distribution of the different datasets mentioned before, with the three 2022 surveys plotted with transparency.Our deeper data (peaking around 2 magnitudes fainter than previous surveys) probe the 29 to 31 mag regime, providing candidates at the previously unexplored faint-end of the luminosity functions.

Figure 4 .
Figure 4. From left to right, UV luminosity functions at z ∼ 9, z ∼ 11 and z ∼ 12.We plot our results with red hexagons (uncertainties smaller than data points in some cases, see Table 2), and compare with literature estimations from Donnan et al. (2023); Naidu et al. (2022a); Finkelstein et al. (2023); Bouwens et al. (2022a), Harikane et al. (2022), and Castellano et al.(2022).We fit all points (ours and those in the literature) to a Schechter (1976) function, which is plotted with red continuous lines in each panel, with the fits for other redshifts shown with dashed lines.

Figure 5 .
Figure 5. Evolution of the UV luminosity density, transformed into SFR density on the right vertical axis (assuming a Chabrier 2003 IMF).Our results are plotted with a red shaded region.Estimates from the literature are plotted with gray symbols(Oesch et al. 2018; Bouwens et al. 2022a,b;Donnan et al. 2023;Harikane et al. 2022).The densities at z > 16 are shown in light gray since they were mostly based on a z ∼ 5 redshift interloper(Arrabal Haro et al. 2023a;Harikane et al. 2023).We also plot predictions from several galaxy formation simulations, listed in the following lines.The Illustris-1, TNG50, TNG100, TNG300 results are plotted with blue continuous, dashed, dotted, and dash-dotted lines, respectively, all referring to SFRs measured in 0.5 diameter apertures for M > 10 7 M galaxies, except for TNG300, cut at M > 10 8 M ).The thick dotted line refers to TNG100 results using whole galaxy measurementsVogelsberger et al. 2014;Nelson et al. 2015;Springel et al. 2018;Naiman et al. 2018;Marinacci et al. 2018;Nelson et al. 2018;Pillepich et al. 2018Pillepich et al. , 2019;;Nelson et al. 2019).The EAGLE simulations(Schaye et al. 2015;Crain et al. 2015) are plotted in green for galaxy samples cut in stellar masses M > 10 6,7,8 M in continuous, dotted, dashed lines.The simba predictions are shown with a brown continuous line for all galaxies, dashed line for F277W < 31 mag sources(Davé et al. 2019).The thesan-1 and thesan-sdao-2 simulations(Kannan et al. 2022) are shown in magenta with continuous and dotted lines, respectively (see text for differences).The FLARES predictions are shown in cyan(Lovell et al. 2021;Vijayan et al. 2021).The UniverseMachine semi-empirical model(Behroozi et al. 2020) is shown in orange with the same mass cuts and line styles mentioned for EAGLE.Finally, the models inMaio & Viel (2022) are shown in yellow and the predictions from the Santa Cruz semi-analytic models(Yung et al. 2019(Yung et al. , 2020) ) are depicted in orchid.

Table 1 .
Table with information for all filters used in this work.We show the area in common with the MIRI European Consortium Guaranteed Time Observations (Schaye et al. 2015;Crain et al. et al.