The Structure and Morphology of Galaxies during the Epoch of Reionization Revealed by JWST

We analyze 347 galaxies at redshift 4 < z < 9.5 using JWST observations from the Cosmic Evolution Early Release Science (CEERS) program by simultaneously fitting a two-dimensional parametric model to the seven-filter Near Infrared Camera images to measure the overall structural parameters and quantify the global properties of the galaxies in the rest-frame optical band. Particular attention is devoted to deriving robust uncertainties that include, among other factors, the influence of cosmological surface brightness dimming and resolution effects. Using the global Sérsic index (n < 1.5) and observed axial ratio (q < 0.6) as a guide, we place a conservative lower limit of ∼45% on the incidence of galactic disks. Galaxies follow a relation between the rest-frame optical luminosity and effective radius in the redshift range 4 < z < 9.5, as well as separately over the intervals 4 < z < 5 and 5 ≤ z < 9.5, with a very similar slope but a marginally lower zero-point in the higher-redshift bin (R e = 0.69 ± 0.05 kpc) compared to the lower-redshift bin (R e = 0.91 ± 0.04 kpc). Within the limitations of the current sample size, we find no significant redshift evolution of n or R e at these early epochs.


INTRODUCTION
The structure and morphology of galaxies encode important clues about their formation mechanism and evolutionary history.After their initial formation, the continued accumulation of gas fuels star formation in the outer regions of galaxies (e.g., Whitney et al. 2021).Subsequent mergers and tidal interactions facilitate the transformation of galaxy morphology, size, and concentration (e.g., van Dokkum & Franx 2001;Naab et al. 2009;Delgado-Serrano et al. 2010;Bluck et al. 2012;McLure et al. 2013;Conselice 2014;Stark 2016).Multiple pathways can lead to the formation of galactic bulges, including the early, rapid collapse of centrally concentrated gas, inward migration of large, starforming clumps initially born in an extended disk, and galaxy-galaxy mergers (e.g., Conselice 2003;Bournaud et al. 2014;Conselice 2014).
sunwen@stu.pku.edu.cnStudies of the fraction of galaxies of various morphological types show that the Hubble sequence was not yet fully established at z ≳ 2: disk galaxies and ellipticals become as common as peculiar galaxies only at z ≲ 1.5 (e.g., Abraham et al. 1996;Conselice et al. 2005Conselice et al. , 2008;;Conselice & Arnold 2009;Whitney et al. 2021).Using the Sérsic (1968) function to parameterize the global radial light distribution, an approximate proxy to describe galaxy morphology, it is apparent that starforming galaxies at z ≈ 4 − 6 are late-type systems with typical Sérsic indices n ≈ 1 − 1.5 (Shibuya et al. 2015).Sérsic n increases systematically toward lower redshifts, such that by z ≈ 1 early-type galaxies, characterized by large n, become the predominant morphological class of massive galaxies (e.g., Buitrago et al. 2013).
Galaxy size grows over time through major and minor mergers, inside-out star formation, and gas accretion.Change in size with redshift is now well characterized at z ≲ 4: distant galaxies are more compact than local galaxies of the same mass or luminosity (e.g., Daddi et al. 2005;Trujillo et al. 2007;Buitrago et al. 2008;Cassata et al. 2013;Whitney et al. 2019).It has been argued that galaxy growth, especially for the most massive members of the population, principally involved dry, minor mergers (e.g., Naab et al. 2009;Bluck et al. 2012;Furlong et al. 2017), although some contend that the role of major mergers cannot be overlooked (e.g., Davari et al. 2017).Observation of the rest-frame ultraviolet (UV) emission of galaxies at z > 4 suggests that at fixed luminosity or mass, galaxy sizes slowly decrease toward higher redshift (e.g., Bouwens et al. 2004;Ferguson et al. 2004;Oesch et al. 2010a;Shibuya et al. 2015).These results resonate with expectations from the scenario proposed by Fall & Efstathiou (1980), which posits that galactic disks form within dark matter halos that acquire angular momentum from tidal torques.Baryons initially share the same specific angular momentum as the dark matter, and angular momentum is conserved as the baryons collapse and cool to form a disk.In this scenario, the expected size scales as R e ∝ H(z) −1 at a fixed halo circular velocity or R e ∝ H(z) − 2 3 at a fixed halo mass (Fall & Efstathiou 1980;Mo et al. 1998;Ferguson et al. 2004), where H(z) is the Hubble parameter at redshift z.
Whether and how galaxy size evolves at high redshift remain controversial.For instance, while Whitney et al. (2019) show that mergers contribute to the increase in galaxy size from z = 7 to z = 1, Curtis-Lake et al. (2016) find little evidence that galaxy size in the rest-frame UV evolves at high redshift.Most studies of galaxy structure evolution have been limited to redshifts less than ∼ 4 (e.g., Driver et al. 1995;Abraham et al. 1996;Stanford et al. 2004;Papovich et al. 2005;Conselice et al. 2008;Wuyts et al. 2011;Lee et al. 2013;Mortlock et al. 2013;van der Wel et al. 2014;Tacchella et al. 2015;Zhang et al. 2019;Whitney et al. 2021;Costantin et al. 2022).At higher redshifts, the detailed properties of galaxies are challenging to resolve because the galaxies are faint, small, and impacted by surface brightness dimming.Equally seriously, the reddest band accessible with the Hubble Space Telescope (HST), ∼ 1.6 µm, can no longer capture the rest-frame optical emission of galaxies that better traces their overall stellar population.Thus, although deep near-infrared HST imaging has greatly advanced our understanding of the restframe UV properties of galaxies at z > 4 (e.g., Oesch et al. 2010b;Grogin et al. 2011;Trenti et al. 2011;Lotz et al. 2017), we still critically lack robust measurements of the morphologies and structural parameters of galaxies during their formative years.
The James Webb Space Telescope (JWST: McElwain et al. 2023;Rigby et al. 2023) has ushered in a new era for probing many aspects of the early epoch of galaxy evolution, including their morphology and struc-ture (e.g., Ferreira et al. 2022;Kartaltepe et al. 2023;Tacchella et al. 2023).With a 6.5 m primary mirror, JWST has 7 times the light-gathering power of HST.Equally importantly, JWST significantly extends the long-wavelength sensitivity of HST while maintaining excellent image quality, a crucial combination for resolving the internal stellar properties of high-redshift galaxies.For example, the reddest wide filter (F444W) of the Near Infrared Camera (NIRCam; Rieke et al. 2023) enables the study of the rest-frame optical light of galaxies up to z ≈ 9.5.Contrary to a preconception established by previous HST studies, recent results from JWST indicate that regular disks are common in galaxies at z ≳ 2 (e.g., Ferreira et al. 2022;Jacobs et al. 2023;Nelson et al. 2023;Robertson et al. 2023), suggestive of an early emergence of the Hubble sequence, although the fraction of purely irregular galaxies, showing no signs of a disk or spheroidal component, rises toward higher redshift (Kartaltepe et al. 2023).These studies also discuss a variety of systematic trends involving the Sérsic index, with no clear consensus yet emerging from the initial explorations of JWST data.
In this study, we investigate the structural parameters of a sample of galaxies at z = 4 − 9.5 selected from the Cosmic Evolution Early Release Science (CEERS; Finkelstein et al. 2023) survey, which covers ∼ 100 arcmin 2 of the Extended Groth Strip (EGS; Davis et al. 2007) with JWST imaging and spectroscopy using NIRCam.We use the sample, which traverses into the epoch of reionization, to investigate the incidence of galactic disks, the luminosity-size relation of galaxies, and the possible evolution of size and morphology with redshift.Although the CEERS data already have been the target of recent studies with broadly similar science goals, as mentioned above, our approach differs in several key aspects.Despite the impressive capabilities of JWST, securing accurate structural parameters with robust uncertainties for high-redshift galaxies is still a non-trivial challenge.Some high-redshift sources remain barely resolved even with JWST.We examine the potential impact of the choice of image pixel scale used in the data reduction process, and we systematically investigate different treatments of the point-spread function (PSF) to arrive at an optimal solution.As the appearance of a galaxy changes with redshift as a result of cosmological surface brightness dimming and image resolution (e.g., Giavalisco et al. 1996;Hibbard & Vacca 1997;Conselice 2003;Barden et al. 2008;Vika et al. 2013;Davari et al. 2016), we design realistic mock experiments to quantify the contribution of these effects to the final error budget.Lastly, galaxy morphology and structure depend on wavelength, as a consequence of internal variations in stellar population and dust attenuation (Windhorst et al. 2002;Taylor-Mager et al. 2007;Kelvin et al. 2012;Häußler et al. 2013;Vika et al. 2013).It is important to take this into consideration when measuring the structural parameters of galaxies across different bands, as is the case for the seven-band images of CEERS.We adopt GALFITM (Häußler et al. 2013;Vika et al. 2013) to construct a wavelength-dependent model to fit the multiband images simultaneously, an approach shown to deliver more accurate photometric and structural parameters, especially in the regime of low signalto-noise ratio (Häußler et al. 2022).This technique has been put into practice in a variety of applications (e.g., Zhuang & Ho 2022), including CEERS itself (Kartaltepe et al. 2023).
The paper is organized as follows.Section 2 describes the observations used in this work, the procedure for data reduction, the strategy to construct the PSF, and sample selection.Section 3 introduces our method of image analysis, measurement results, mock simulations to quantify redshift effects, and error analysis.Section 4 discusses the main implications for the incidence of galactic disks, the luminosity-size relation, and the possible evolution of the galaxy structural parameters.A summary appears in Section 5. We assume the latest Planck flat ΛCDM cosmology with H 0 = 67.36km s −1 Mpc −1 , Ω m = 0.3153, and Ω Λ = 0.6847 (Planck Collaboration et al. 2020).All magnitudes are in the absolute bolometric (AB) system (Oke & Gunn 1983).

CEERS Data
We analyze the first four of the 10 CEERS NIRCam pointings (CEERS1, CEERS2, CEERS3, and CEERS6) obtained on 21 June 2022, which cover 34.5 arcmin2 .Each pointing uses the filters F115W, F150W, and F200W in the short-wavelength (SW) channel and the filters F277W, F356W, F410M, and F444W in the longwavelength (LW) channel.The total exposure time per filter is typically 2835 s for pixels observed in all three dithers, except for F115W, which has 2 times longer exposure time to increase the depth of the wavelength range bluer than the Lyman-break at z > 10.The CEERS survey is optimized to study the abundance and physical nature of galaxies in the early Universe (z > 10) and the processes of galaxy assembly and black hole growth at z ≈ 1 − 10. et al. (2023) describe the data reduction of the first public release (Data Release 0.5) of the CEERS NIRCam imaging 1 , which includes custom procedures beyond those of the standard JWST pipeline for removal of 1/f noise, wisps, and snowballs from the countrate maps, astrometric calibration, image coaddition and mosaicing, and background subtraction.The final images are drizzled to a pixel scale of 0. ′′ 030.

Bagley
In view of the large distances and intrinsically compact sizes of high-redshift galaxies (e.g., Damjanov et al. 2009;Cimatti et al. 2012) we place special emphasis on achieving the highest possible resolution in order to obtain the most reliable measurements of source structure and morphology.The possibility that some galaxies may host an active galactic nucleus (e.g., Harikane et al. 2023) further underscores the need to resolve accurately their internal substructure.While the default pixel size of 0. ′′ 030 provided by CEERS Data Release 0.5 Nyquist samples the PSF of the bands in the LW channel, which have full-width at half-maximum (FWHM) ∼ 0. ′′ 119 − 0. ′′ 160, it undersamples the three SW channel bands, which have a PSF of FWHM = 0. ′′ 0605, 0. ′′ 0647, and 0. ′′ 0750 for F115W, F150W, and F220W, respectively (Zhuang & Shen 2023).To investigate the potential effects of the output pixel scale after drizzling on our measured results, we perform a custom set of data reduction to generate images with a finer output pixel scale for the bands in the SW channel.Our calibration procedures are similar to those of Bagley et al. (2023).Starting with the Stage 2 data products acquired from the STScI MAST Portal 2 , we process them using version 1.7.2 of the JWST Calibration Pipeline with the mapping file jwst 0965.pmap.Prior to removing 1/f noise, we use Photutils to mask sources in four iterations, dilating them in between iterations by 45, 35, 29, and 25 pixels.We adopt larger dilation sizes than Finkelstein et al. (2022) to better mask extended objects.In the resampling step of the Stage 3 pipeline, we drizzle the individual images to 0. ′′ 015 pixel −1 for the SW channel and 0. ′′ 030 pixel −1 for the LW channel.As our image analysis method (Section 4.1) requires an identical pixel scale across all filters, prior to model fitting we reproject the mosaics in the three SW filters to the same final pixel scale of the images of the LW filters (0. ′′ 030 pixel −1 ).We use the Gaia DR3 source catalog3 (Gaia Collaboration et al. 2016) for astrometric calibration.After masking bad pixels and sources, we use Photutils.Background2D to estimate and subtract any residual background.We do not remove "snow- balls" (circular defects) from the NIRCam mosaics (e.g., Finkelstein et al. 2022;Merlin et al. 2022;Bagley et al. 2023;Rigby et al. 2023), but we carefully inspect all the images to confirm that our sources of interest are not contaminated by these artifacts.The finer drizzle pixel scale has a minimal impact on the source magnitude and size, but the effect on the Sérsic index can be substantial (Section 3.4).
The flux calibration procedure of the JWST pipeline is still evolving.Boyer et al. (2022) evaluate the flux calibration of NIRCam using globular cluster data from the Resolved Stellar Populations Early Release Science program, concluding that the zero point for the F150W filter derived with their two-dimensional kernel density estimator technique is slightly better than those from Brammer (2022) and much better than others.However, a similar analysis is not available for the other NIRCam filters.We adopt the zero point for F150W from Boyer et al. (2022), and for the other six filters, we use the zero points from Brammer (2022).

Point-spread Function
Despite the excellent spatial resolution of JWST, some high-redshift galaxies remain barely resolved.Under these circumstances, the accuracy of the PSF model is of paramount importance for obtaining reliable source structural parameters.We detail our procedure for constructing the PSF for the CEERS fields and compare different PSF models in Appendix A. For each NIR-Cam filter, we construct a PSF by stacking a number of isolated, unsaturated stars, which are identified as point-like sources in our source catalog (Section 2.4) that have FWHM < 6 pixels, as determined by fitting a twodimensional Gaussian profile to an image cutout of size 7 times the Kron (1980) radius of the source that is uncontaminated by neighboring sources.To separate stars from other compact sources, such as quasars and dwarf galaxies, we perform a least-squares fit between the observed SED of the candidate point-like source and stellar spectral templates from the ESO Library of Stellar Spectrum4 .The observed SED comprises 10 photometric points, three (F115W, F150W, and F200W) from the SW channel in combination with seven additional bands covering λ ≈ 0.4 − 1.6 µm (Stefanon et al. 2017).We exclude the four LW bands because the stellar spectral templates do not extend beyond ∼ 2 µm.To avoid potential confusion with quasars, we do not consider templates of type O, B, and A stars, which, in any case, are expected to be rare because of their short main-sequence lifetimes and because they are faint in the near-infrared.
Figure 1 shows two examples of our final sample of 18 stars, which are either G-type or K-type giants, identified in the four CEERS pointings.Because of the relatively small number of stars available, we do not pro-duce a separate PSF for each pointing but instead combine the stars in all four pointings to produce a master, stacked PSF of high signal-to-noise ratio for each filter.PSF variations across different pointings should be small because they utilize the same dither pattern and were taken close in time (Finkelstein et al. 2023).We extract a 81 × 81 pixel cutout for each star, and all the star cutouts are 4 times oversampled to align their centers.We construct the final PSF by mean-combining the individual stars and resampling the oversampled images back to their original resolution.See Appendix A for details.

Sample Definition
We employ custom Photutils scripts for source detection and photometry.We use the LW channel data instead of the SW bands for source detection to avoid the complications of substructures arising from star-forming clumps, which, more prominent in the rest-frame UV, can split single galaxies into multiple sources.Stacking the mosaics of the four LW bands weighted by their inverse variance produces a detection image of a high signal-to-noise ratio.Sources are defined as contiguous regions of the segmentation map that contain five or more connected pixels with values larger than 1σ above the background.A local peak must have at least 0.001 of the total source flux to be deblended as a separate object.We visually inspect all sources to verify that they have been properly deblended.
This study focuses on the subset of galaxies in CEERS with redshifts 4 < z < 9.5 that cross-matches within 1. ′′ 0 with the multiwavelength catalog of the CANDELS EGS field of Stefanon et al. (2017), who curated 22 bands of photometry from 0.4 to 8 µm, including HST WFC3 and ACS data from CANDELS.Although there are other catalogs of photometric redshifts available (e.g., Duncan et al. 2019;Whitney et al. 2021;Kodra et al. 2023), we choose Stefanon et al.'s catalog for convenience because it provides both photometric redshifts and stellar masses.Other recent CEERS studies also follow this strategy (e.g., Guo et al. 2023;Robertson et al. 2023).Whenever possible, preference is given to spectroscopic redshifts.A total of 389 galaxies match our selection criteria.Since there are overlaps between different CEERS NIRCam pointings, five galaxies have multiple images; we retain the best image and discard the redundant ones.After removing galaxies that lie at the edge of the field and do not have complete imaging, and those that are located on the spikes of foreground stars or are contaminated by a nearby bright source, we are left with a final sample of 347 galaxies (Table 1).Figure 2 shows the redshift and stellar mass distribution of the sample.The median uncertainty of the sample's stellar masses is σ [log (M * /M ⊙ )] = 0.327 dex and that of the photometric redshifts is σ(z)/(1 + z) = 0.407.

Methodology
Owing to internal variations in stellar population and dust attenuation, galaxy morphology and structure depend on wavelength.To account for the variation of galaxy structure with wavelength, we analyze the images of all seven NIRCam bands using GALFITM (Häußler et al. 2013;Vika et al. 2013), a multiband version of the widely used two-dimensional image fitting code GALFIT (Peng et al. 2002(Peng et al. , 2010)).The program fits the pixelregistered images of multiple filters to produce a consistent, wavelength-dependent model of the galaxy, with the aid of a user-specified polynomial function to constrain the wavelength dependence of the structural parameters of the model components5 .The free parameters of the model are fitted to the multiband data simultaneously by minimizing a single likelihood function.Häußler et al. (2022) show, through a series of simulations, that this approach significantly reduces deviations from true parameter values, allows component sizes and Sérsic indices to be measured more accurately, and constrains the band-to-band parameter variations  to more physical values.GALFITM delivers better performance especially in the regime of low signal-to-noise ratio (Häußler et al. 2013;Vika et al. 2013;Nadolny et al. 2021;Häußler et al. 2022).
After some experimentation, we find that a cutout size of 7 times the Kron (1980) radius of the source on the detection image (Section 2.4) can capture the galaxy outskirts while including adequate background.The center of the galaxy needs to be aligned when creating the cutouts to correct for residual offsets of up to several pixels that exist between the images from the SW and LW channels.The segmentation map is used to mask objects excluded from the fit.As the segmentation map often misses the fainter outer regions of bright sources, we dilate the segmentation image of each object by calculating a "growth radius" (Ho et al. 2011;Li et al. 2011;Huang et al. 2013b).From trial and error, we find that the optimal growth radius for the CEERS images can be approximated by R = min[8, 0.8 N/π], where N is the number of pixels contained in the original segmentation image of the object.After masking all sources in each cutout, the local background is estimated through sigma-clipping and then removed.We confirm that the dilated mask of nearby large objects does not influence the object of interest.Similar to Finkelstein et al. (2023) and Kartaltepe et al. (2023), we use the cutout of the error array (ERR extension) from the JWST pipeline as the input sigma image, which includes not only Poisson noise from the source but also instrumental noise.The sigma image is scaled typically by a factor ∼ 1.3 − 2.5, depending on the filter, such that its background pixel values roughly equal the standard deviation of the background pixel values of the science image (SCI extension).
With the background-subtracted cutout, mask, and PSF of each band in hand, we simultaneously fit the seven bands with a two-dimensional surface brightness model represented by the Sérsic function where R e is the effective radius of the galaxy that contains half of the total flux, Σ e is the surface brightness at R e , the Sérsic index n specifies the shape of the light profile, and κ is related to n by the incomplete gamma function, Γ (2n) = 2γ (2n, κ) (Ciotti 1991).The special case of n = 1 corresponds to the exponential profile often used to describe galactic disks (Freeman 1970), and n = 4 is the classic canonical profile of an elliptical galaxy (de Vaucouleurs 1948).Initial guesses of the magnitudes come from the Kron aperture photometry, and for the other parameters (source location, R e , position angle Θ, and axial ratio q) they are available from the source detection procedure.As in Häußler et al. (2013), Θ and q are held constant with wavelength, while the magnitude of the galaxy is free to vary; this is realized in GALFITM by setting the Chebyshev polynomial to a maximum order of 6.The global profile and size of both early-type and late-type galaxies change smoothly and systematically with wavelength, with n increasing and R e decreasing systematically from the UV to the near-infrared, a consequence of gradients in dust attenuation, stellar population, and metallicity (Kelvin et al. 2012).The wavelength dependence can be described largely by a linear function, although in some cases a second-order function is needed to account for a mild curvature (Häußler et al. 2013).In view of the broad spectral coverage of NIRCam, we allow n and R e to vary quadratically with wavelength.This should suffice.Treu et al. (2023) show that the morphology of Lyman-break galaxies at z > 7 does not change significantly from the rest-frame UV to the optical.

Results
Figure 3 shows the simultaneous, multiband fitting results of a sample galaxy.The smooth residuals and χ 2 ν ≈ 1 indicate a successful fit. Figure 4 gives additional examples of model fits for the band closest to the rest-frame optical for representative galaxies with disky morphologies (n = 0.8 − 1.4), while Figure 5 highlights what might be deemed more spheroid-dominated systems (n = 2.3 − 7.9).Again, the fits are largely satisfactory, as evidenced by the distribution of χ 2 ν for the entire sample, which is strongly clustered near 1 (Figure 6).Twenty-three sources exhibit notably higher levels of substructure in their residual maps.These sources, which typically have χ 2 ν > 1.9, are flagged as "quality 2" in Table 1.The more pronounced residuals in these outliers arise from various causes, most commonly because of the presence of complex, multiple components apparently associated with mergers, tidal features, or internal clumpy substructure (Figure 7).A few have an unusually large Sérsic n but small R e , suggestive of the presence of an exceptionally compact central component, possibly associated with an active nucleus (C.H. Chen et al., in preparation).The following discussion (Section 4) uses all the galaxies with a quality flag = 1 (χ 2 ν ≤ 1.9), representing 93% of the total sample.The summary of the fitting results in Table 1 pertains to measurements made in the filter closest to the rest-frame optical (∼ 3800 − 7800 Å) for each galaxy.

Effects of Redshift on the Determination of Galaxy Structure
The appearance of a galaxy depends on the observing conditions of the telescope and redshift owing to cosmological dimming, resolution effects, and sensitivity (e.g., Giavalisco et al. 1996;Hibbard & Vacca 1997;Conselice 2003;Barden et al. 2008;Vika et al. 2013;Davari et al. 2016).We investigate how these observational effects influence our galaxy structure measurements by creating mock images that simulate the same galaxy located at different redshifts.By analyzing the simulated images in exactly the same manner as the real images, we can quantify the differences between the measured and input values of the structural parameters, thereby obtaining a quantitative estimate of the systematic bias and uncertainty contributing to the final error budget of our measurements (Appendix B).
We begin by using GALFIT to generate Sérsic model images with known input parameters covering those measured for our real galaxy sample.To convert the model image of a galaxy of a given physical size from low to high redshift, we modify its angular size with redshift as where d L is the luminosity distance.For a given absolute magnitude, the observed surface brightness I changes with redshift as We calculate the image rebinning factor and flux scaling factor following Section 3.1 of Barden et al. (2008).
To convert model images of galaxies at lower redshift to higher redshifts to mimic the CEERS observations, we create mock images with an output pixel scale of 0. ′′ 030, convolve them with the PSF for the corresponding filter (Section 2.3), introduce realistic noise levels, and add background.As in Barden et al. (2008), we compute a convolution kernel to produce the NIRCam the galaxy flux, and (2) correlated noise and intrinsic variations of the background, which we simulate using blank regions from actual NIRCam images.
Starting at z = 0.1, we generate a series of mock galaxies at increasing redshifts, focusing on the interval 4 < z < 9 for which we divide into 11 discrete redshift bins separated by ∆z = 0.5.The increase of angular size with redshift (Equation 2) and the concomitant dimming of surface brightness (Equation 3), coupled with the observational characteristics of NIRCam, underscore the main trends visible in the simulated images (Figure 8).Our immediate interest primarily concerns the rest-frame optical band appropriate to each of the sources of our sample (see Appendix B for details).

Influence of the Pixel Scale
To investigate the potential impact of the output pixel scale after drizzling on our measured results, we repeat the model fits with the images reduced by ourselves using a finer pixel scale.As GALFITM requires a common pixel scale across all filters, we reproject the mosaics of the three SW filters to the pixel scale of the LW filters (0. ′′ 030) prior to fitting.Judging by the median difference and standard deviation of the two sets of measurements (Figure 9), we conclude that the effect is minimal on m opt (−0.007±0.093mag) and R e (0.006±0.068 kpc), but the scatter for n is nonnegligible (−0.030 ± 0.312).The galaxies that exhibit the worst scatter in n tend to have lower signal-to-noise ratios.The final error budget of each measured parameter (Table 1) incorporates the uncertainty introduced by the choice of pixel scale used in the drizzling process.

Error Budget
The rich structural complexities of galaxies are difficult to capture with simple parametric models.Parameters returned by codes such as GALFIT or GALFITM may suffer from various degrees of systematic bias and degeneracy that can underestimate the real uncertainties, even if the formal statistical errors are small and the fitting residuals look acceptable (see, e.g., Häußler et al. 2007;Kim et al. 2008;Vika et al. 2013;Gao & Ho 2017;Zhao et al. 2021;Zhuang & Ho 2022).We design realistic input-output experiments to determine the real parameter uncertainties using a set of mock galaxies generated from the best-fit parameters of each galaxy in our sample.Following Zhuang & Ho (2022), we use the best-fit parameters derived for each galaxy to construct mock images that exactly mimic the object-specific parameters of the actual observation.For each galaxy in each band, we generate 100 realizations of mock observations that account for the Poisson noise associated with the source, background Gaussian noise, and the properties of the specific image (gain, exposure time, and background variation).Then, we use GALFITM to repeat the model fitting and adopt the median value and standard deviation of the 100 results as the measurement and its error.
As discussed in Section 2.2 and documented in Section 3.4, our final choice of drizzling the images to a common scale of ∼ 0. ′′ 030 pixel −1 has little impact on m and R e , but the effect is not negligible for n.We take the difference between the measurements based on images reduced with two different pixel scales as another source of uncertainty.This, along with the contribution from redshift effects and the standard statistical uncertainty returned by GALFITM, constitute the quadrature contributions to the final error budget of each parameter.

Incidence of Galactic Disks
The distribution of Sérsic indices for the full sample at 4 < z < 9.5 strongly peaks at n ≈ 1, with a long tail that extends to values as large as n ≈ 8 (Figure 10).In total, ∼ 55% of the sample are well fit with n < 1.5, an ofteninvoked criterion to designate disk-dominated systems.The residuals of the fits are relatively clean (Figure 4), indicating that a simple, single-component Sérsic model suffices to describe the overall light distribution.The choice of pixel scale for drizzling (Section 3.4) and redshift effects (Appendix B) can increase the uncertainty of individual measurements of n, but these factors do not bias the overall distribution.The axial ratio is more challenging to interpret for a randomly oriented mixture of galaxies of different morphological types.Diskdominated systems exhibit axial ratios skewed toward Normalized number Full sample q < 0.6 q ≥ 0.6

Disk
Figure 10.The normalized distribution of axial ratio q and Sérsic index n measured in the rest-frame optical.Typical uncertainties are shown in the upper-right corner of the main panel.The histograms highlight in green objects that have either q < 0.6 or n < 1.5, and in red those that have either q ≥ 0.6 or n ≥ 1.5.Sources characterized by both q < 0.6 and n < 1.5 constitute the most conservative candidates for disk-dominated galaxies.
lower values compared to ellipticals, which tend to be rounder with axial ratios closer to 1 (e.g., Padilla & Strauss 2008).If we designate galaxies with q < 0.6 as having a substantial disk component, the disk fraction is ∼ 75%.If, on the other hand, we conservatively require that the galaxy has both n < 1.5 and q < 0.6, then we can place a firm lower limit of ∼ 45% on the incidence of galactic disks in our sample.In agreement with other recent analyses of high-redshift galaxies based on CEERS (Ferreira et al. 2022(Ferreira et al. , 2023;;Jacobs et al. 2023;Nelson et al. 2023;Kartaltepe et al. 2023;Robertson et al. 2023), the incidence of disky galaxies at this early cosmic epoch is much higher than expected based on previous HST studies (e.g., Kartaltepe et al. 2015).Differences in the depth and wavelength coverage between JWST and HST may contribute to this apparent discrepancy in disk fraction detected from the two missions.
The deeper images of JWST can detect more easily the faint disk component previously missed by HST.Conversely, a galaxy may be mistaken as a pure spheroid if HST is only sensitive to its compact central component.An intrinsically disky galaxy with UV-bright starforming clumps may also be misclassified as irregular if HST misses the underlying disk component.
We caution that without kinematical information, image-based analyses, including ours, can overestimate the fraction of disk-dominated systems.Vega-Ferrero et al. (2023) apply self-supervised machine learning to explore the morphological diversity of galaxies at z ≥ 3.By comparing traditional morphological measures with the physical parameters obtained from cosmological simulations, they find that approximately 50% of the galaxies visually classified as disks based on the imaging data are intrinsically prolate or spheroidal objects.

Luminosity-size Relation
The correlation between the luminosity of a galaxy and its spatial distribution within the galaxy, described by the luminosity-size relation, offers important insights into the evolution and assembly history of the galaxy.Figure 11 shows the distribution of rest-frame optical luminosity versus effective radius, for the subset of 84 galaxies with m F160W < 26 mag, which is the 90% detection completeness limit for extended sources in the catalog of Stefanon et al. (2017, see their Section 5.1), as well as for the entire sample of 347 galaxies.We further divide each sample into two redshift bins to see whether there is any evolution between 4 < z < 5 and 5 ≤ z < 9.5.Correlations are evident for all the samples considered.
To quantify the observed empirical relations, we assume that the size distribution at fixed L follows a log-normal function (Shen et al. 2003), a prescription motivated by the disk formation theory of Fall & Efstathiou (1980).Describing the luminosity-size relation by a power law (e.g., Huang et al. 2013a), the probability density function of the L − R e pair can be expressed as with where R e is the peak of the size distribution, σ is the log-normal dispersion of log R e , L 0 is the characteristic luminosity corresponding to an absolute magnitude M = −21.0,R 0 = R e at L 0 , and β is the slope of the relation.Table 2 lists the best-fit parameters.Within the considerable scatter of ∼ 0.5 − 0.6 dex, the fits of both redshift ranges are statistically consis- Full sample 4 < z < 5 5 z < 9.5 4 < z < 9.5 χ 2 ν 1.9 χ 2 ν > 1.9 Figure 11.Luminosity-size relation for galaxies at 4 < z < 5 (blue points, blue solid line), 5 ≤ z < 9.5 (red points, red dotted line), and 4 < z < 9.5 (green dashed line), for (a) the subsample of 84 galaxies with mF160W < 26 mag and (b) the full sample of 347 galaxies; the shaded region corresponds to the log-normal dispersion of log Re (Table 2).Filled and open points correspond to fits with χ 2 ν ≤ 1.9 and χ 2 ν > 1.9, respectively.The absolute magnitudes and effective radii pertain to the filter that most closely approximates the rest-frame optical band.
tent with each other, and they are not strongly dependent on whether we apply the completeness cut of Stefanon et al. (2017).Our results for the redshift bins 4 < z < 5 and 5 ≤ z < 9.5 agree with those of studies at similar redshifts based on the rest-frame optical/UV (e.g., Shibuya et al. 2015;Yang et al. 2022), indicating little evidence for wavelength dependence of the luminosity-size relation at 4 < z < 9.5.The higher redshift bin may have a marginally lower zero point: for the m F160W < 26 mag sample, R 0 = 0.85 ± 0.08 kpc at 5 ≤ z < 9.5, to be compared with R 0 = 1.02 ± 0.09 kpc for 4 < z < 5; the corresponding values for the full sample are R 0 = 0.69±0.05kpc and R 0 = 0.91±0.04kpc, respectively.These results are consistent trends reported by Shibuya et al. (2015) for the rest-frame UV band at 0 < z < 8.

Evolution of Sérsic Index and Effective Radius
The evolutionary pathway of a galaxy, and hence its structural parameters, may depend on mass.To discern the possible evolution of galaxy structure with redshift, we focus on two stellar mass bins using sources that overlap in the redshift range 4 < z < 9.5: M * = 10 8 − 10 9.4 M ⊙ (233 galaxies) and M * = 10 9.4 − 10 10.5 M ⊙ (73 galaxies).To warrant against sample incompleteness, we further isolate the subset of galaxies with m F160W < 26 mag that meet the 90% detection completeness limit of Stefanon et al. (2017), at the expense of drastically reducing the sample to merely 46 galaxies for the lowmass bin and 38 galaxies for the high-mass bin.
Focusing first on the subset of galaxies with m F160W < 26 mag, Figure 12a (top panel) suggests that n drops systematically from z ≳ 6 to z ≈ 4. If we adopt a fiducial criterion of n < 1.5 to designate a disky morphology, the fraction of disk-dominated galaxies, for both mass bins combined, roughly doubles from ∼ 30% at z = 6 − 9.5 to ∼ 60% at z = 4 − 6.Our results qualitatively agree with those of Ferreira et al. (2023) and Kartaltepe et al. (2023) based on visual classifications.However, for the low-mass and high-mass bin, the difference between the median values of n at z ≳ 6 and z = 4 − 6 is only 0.55σ and 1.45σ, respectively (the xσ difference is calculated as , where m 1 and m 2 denote the median value of the distribution, and σ 1 and σ 2 give the standard deviation of the distribution).According to the two-sample Kolmogorov-Smirnov test, we cannot reject the null hypothesis that the distributions of n at z ≳ 6 and z = 4 − 6 are similar, with p−value of 0.1 for both the low-mass and high-mass bins, suggesting that apparent redshift evolution of n is insignificant.
Galaxy effective radius also seems to exhibit a mild but systematic increase toward lower redshift (Figure 12a,bottom panel), from a median R e ≈ 0.57 kpc and 0.81 kpc for the low-mass and high-mass bin at z = 6 − 9.5, to corresponding values of R e ≈ 1.02 kpc and 1.46 kpc at z = 4 − 6.However, for the low-mass and high-mass bin, the difference between the median values of R e at z ≳ 6 and z = 4 − 6 is only 0.47σ and 0.49σ, respectively.The two-sample Kolmogorov-Smirnov test cannot reject the null hypothesis that the distributions of R e of the two redshift intervals are similar, with p−value of 0.52 and 0.34 for the low-mass and high-mass bin, respectively.As mentioned in Section 1, the expected size scaling from semi-analytical models is R e ∝ H(z) −1 at a fixed halo circular velocity or R e ∝ H(z) − 2 3 at a fixed halo mass (Fall & Efstathiou 1980;Mo et al. 1998), which, for H(z) ∼ (1 + z) 3 2 at z > 2, translate to R e ∝ (1 + z) − 3 2 and R e ∝ (1 + z) −1 , respectively.As observations track the evolution of R e at fixed M * (or L), we expect to find a trend that falls between these two functional forms (Ferguson et al. 2004).Parameter-izing the size evolution as R e ∝ (1 + z) −α , the best-fit relation yields a power-law slope of α = 1.39 ± 0.33 for the low-mass bin and α = 1.27 ± 0.60 for the high-mass bin, which are statistically indistinguishable given their large uncertainties but broadly agree with theoretical expectations.These results are consistent with the predictions for the size evolution of galaxies at 3 ≤ z ≤ 6 made by Costantin et al. (2023) using synthetic images generated from cosmological simulations tailored for JWST observations.
When the entire sample is considered in aggregate (Figure 12b), neither n nor R e exhibits any significant trend with redshift over the range z = 4−9.5.This holds for both the high-mass and low-mass objects.For both mass bins combined, the difference between the median values of n or R e at z ≳ 6 and z = 4 − 6 is less than 0.3σ.These conclusions are further confirmed through the two-sample Kolmogorov-Smirnov test.We are unsure how to interpret these results, except to note that any trends based on small-number statistics at this early stage of the JWST mission should be regarded with extreme caution.

SUMMARY
We study the structural parameters of 347 galaxies at 4 < z < 9.5 using ∼ 34.5 arcmin 2 of JWST NIR-Cam data from the CEERS program covering seven bands from ∼ 1 to 4.4 µm.With the aid of GALFITM, we perform two-dimensional, simultaneous, multiband model fitting to derive robust seven-band photometry and global structural parameters.After evaluating different methods of PSF construction, we finally derive PSFs from stacking isolated, bright stars.The final error budget of the structural parameters takes into consideration the influence of the pixel scale after drizzling and the effects of cosmological redshift based on realistic mock simulations.
Our main results are as follows: 1. We detect a significant population of diskdominated galaxies.The distribution of global Sérsic indices in the rest-frame optical band peaks at n ≈ 1, with ∼ 55% of the sample having n < 1.5.If, in addition, we require that disky galaxies have an axial ratio q < 0.6, we place a conservative lower limit of ∼ 45% on the incidence of galactic disks.
2. Galaxies follow a relation between rest-frame optical luminosity and size, over the entire redshift range of 4 < z < 9.5 and separately over the intervals 4 < z < 5 and 5 ≤ z < 9.5.Galaxies in the higher redshift bin are marginally more compact (R e = 0.69 ± 0.05 kpc) than those in the lower redshift bin (R e = 0.91 ± 0.04 kpc).These results are qualitatively consistent with previous findings in the rest-frame UV band.
3. Within the limitations of the current sample size, we find no significant redshift evolution of n or R e at these early epochs.
Future work can be improved in several directions.Without spectroscopic confirmation, it remains possible that some of the objects in our sample may have inaccurate photometric redshifts.For convenience and consistency with other recent work in the literature, we made use of available photometric redshifts and stellar masses from the catalog of Stefanon et al. (2017), which was based on pre-JWST observations.A consistent set of updated photometric redshifts and stellar masses should be derived by incorporating the JWST data.The currently limited sample should be expanded using the remaining CEERS fields, as well as other NIR-Cam imaging surveys, such as COSMOS-Web (?), despite having shallower depth and less extensive bandpass coverage.Lastly, it is clear that the NIRCam images are beginning to reveal nascent internal substructures, even for galaxies in the epoch of reionization.More sophisticated analysis is needed to characterize the main structural components in order to elucidate the birth of the Hubble sequence.
We thank the referee for helpful comments and suggestions.

B. INFLUENCE OF REDSHIFT
As detailed in Section 3.3, we generate images simulating the same galaxy located at different redshifts.We fit the simulated images with GALFITM to investigate the influence of redshift on our results.Figures B1, B2, and B3 quantify the differences between the measured and input values of magnitude m, Sérsic index n, and effective radius R e , respectively, for representative simulated galaxies at z = 9 with m = 23.0,23.8, 24.7, 25.6, and 27.8 mag, n = 0.7, 1.1, 1.7, 2.5, and 3.9, and R e = 0.4, 1.5, and 2.8 kpc.These input parameters span representative values observed in our sample.For simplicity, the mock images used in the analysis of redshift effects are all based on single-component Sérsic models.We recognize that irregular features or substructures within the galaxy may lead to additional systematics not fully captured in our tests.However, this study only focuses on securing robust measurements of the integrated magnitude and the global structural parameters n and R e , which are not strongly influenced by the existence of substructures (e.g., Peng et al. 2010;Meert et al. 2013;Davari et al. 2014Davari et al. , 2016)).While including the effects of irregular features and substructures is beyond the scope of the current simulations, we suspect that they do not substantially alter our main conclusions.
The decrease of surface brightness with increasing redshift induces larger fluctuations at higher redshift for the differences between the recovered and input parameters.For galaxies with the same m and R e , the fluctuations of ∆m, ∆ log n, and especially ∆ log R e become more pronounced with increasing n.This may be due to the difficulty of accurately modeling the central region of concentrated galaxies with large n.At a given m and R e , the fluctuations between the output and input parameter values, particularly for n and R e , rise toward larger R e .This is because at fixed m and n the surface brightness of a galaxy decreases with increasing R e .The deviations can be quite large at z ≈ 8 − 9, especially for faint simulated galaxies with large R e , whose low surface brightness renders them barely distinguishable from background fluctuations (see the black lines in the middle and bottom rows of Figures B1-B3).
In general, m tends to be overestimated for galaxies in our sample, possibly because we miss the faint, outer regions of these galaxies at z > 4 even with JWST.We tend to underestimate n, especially for concentrated galaxies with large n (right column in Figure B2).We overestimate R e , particularly for small galaxies (upper row in Figure B3).These results are reasonable because cosmological surface brightness dimming impacts the inner, brighter regions of a galaxy more than its outer, fainter regions (see Equation 3), which will lead to n measurements smaller and R e measurements larger than true values at high redshift.The cosmological resolution effect presents a challenge to accurate modeling of the central regions of concentrated galaxies with either large n or small R e .
To determine the systematic bias and uncertainty induced by redshift effects on the photometric and structural measurements of our sample, we follow the method of ? to select simulated galaxies that are similar to our targets in the rest-frame optical band, as follows: • Select mock galaxy images at the redshift closest to that of each galaxy.
• Select mock galaxies with brightness and structures similar to that of each galaxy, based on the error-weighted difference of m, R e , and n: |mock − observed| / σ(mock) 2 + σ(observed) 2 < 1.
• Ensure that the axial ratios of the mock and observed galaxy do not differ by more than q = 0.15.
At least 30 mock galaxies satisfy the above criteria for each member of our sample.For each galaxy in our sample, we select the 30 simulated galaxies closest to its rest-frame optical magnitude to calculate the differences between their output and input parameter values (∆m, ∆ log n, ∆ log R e ).The median and standard deviation of the 30 results represents the systematic bias and uncertainty due to redshift effects, which are incorporated into the final error budget reported in Table 1.

Figure 1 .
Figure 1.Two example stars in the CEERS1 field.From left to right are their images in the F115W, F150W, and F200W bands, and their photometric data points overplotted on the best-fit stellar spectrum.

Figure 2 .
Figure 2. The distribution of redshift and stellar mass of the sample.

Figure 3 .
Figure 3. Simultaneous, multiband fitting of the galaxy EGS−31079 (Stefanon et al. 2017) at z = 5.2.Rows from top to bottom are the results for the seven JWST/NIRCam filters F115W, F150W, F200W, F277W, F356W, F410M, and F444W.The upper panel of the left column shows the radial surface brightness profile of the galaxy (open circles with error bars), the best-fitting model (red dashed line), and the PSF model (blue dotted line).The χ 2 ν from GALFITM for each band is given in the lower-left corner, while that for all seven bands is given in the upper-right corner of the first panel in the bottom row.The lower subpanel gives the residuals between the data and the best-fit model (data − model).The images show, from left to right, the original data, model, and residuals.The best-fit Sérsic index n and effective radius Re are given in the third column.

Figure 4 .
Figure 4. Simultaneous, multiband fitting in the rest-frame optical band for example galaxies with different M * and relatively disky morphologies at various redshifts.The upper panel of the left column shows the radial surface brightness profile of the galaxy (open circles with error bars), the best-fitting model (red dashed line), and the PSF model (blue dotted line).The χ 2 ν from GALFITM for all seven bands is given in the lower-left corner.The lower subpanel gives the residuals between the data and the best-fit model (data − model).The images show, from left to right, the original data, model, and residuals.The best-fit Sérsic index n and effective radius Re are given in the third column.

Figure 5 .
Figure 5.As with Figure 4, but for galaxies that are more spheroidal in morphology.The upper panel of the left column shows the radial surface brightness profile of the galaxy (open circles with error bars), the best-fitting model (red dashed line), and the PSF model (blue dotted line).The χ 2 ν from GALFITM for all seven bands is given in the lower-left corner.The lower subpanel gives the residuals between the data and the best-fit model (data − model).The images show, from left to right, the original data, model, and residuals.The best-fit Sérsic index n and effective radius Re are given in the third column.

F277WFigure 7 .Figure 8 .
Figure 7.As with Figure 4, but for galaxies whose fits have large residuals.PSF from the input PSF used for generating the model image.Because the FWHM and geometric shrinking of the input PSF vary according to both the input and target redshift, the convolution kernel is calculated for each input galaxy and output redshift.We convolve the image from the last step with the PSF-matching kernel so that the PSF of the final image matches that of the desired NIRCam band.Noise in the output image mainly comes from (1) random Poisson noise associated with

Figure 9 .
Figure 9.Comparison between the results derived using images from the first public release (Data Release 0.5) of CEERS and from those reduced by us, which have a finer pixel scale after drizzling (0. ′′ 015 pixel −1 instead of 0. ′′ 030 pixel −1 ), for the (a) magnitude, (b) effective radius, and (c) Sérsic index derived from the filter that most closely approximates the rest-frame optical band.Median differences (y-axis − x-axis) and standard deviations are given in the lower-right corner of each panel, which also shows the typical uncertainties.The dashed line indicates the 1:1 relation.Points are color-coded by the signal-to-noise ratio in the filter that most closely approximates the rest-frame optical band.

Figure 12 .
Figure12.Evolution with the redshift of Sérsic index n (top) and effective radius Re (bottom) for two stellar mass bins (red: M * = 10 8 − 10 9.4 M⊙; blue: M * = 10 9.4 − 10 10.5 M⊙), shown separately for (a) the subsample of 84 galaxies with mF160W < 26 mag and (b) the full sample.The error bars define the 15% and 85% percentile of the distribution for each redshift bin, whose violin-style shaded region displays the probability density distribution smoothed by a Gaussian kernel and normalized by the number of galaxies in each bin; the relative width of the violin shape in each bin corresponds to the fraction of galaxies with that value of n or Re.

Figure B2 .
Figure B2.As in Figure B1, but for the difference between measured and input values of Sérsic index (∆ log n = log nout − log nin).

Figure B3 .
Figure B3.As in Figure B1, but for the difference between measured and input values of effective radius (∆ log Re = log Re,out − log Re,in).

Table 1 .
Measurements of Magnitudes and Structural Parameters

Table 2 .
Best-fit Parameters of the Luminosity-size Relation