JWST Census for the Mass–Metallicity Star Formation Relations at z = 4–10 with Self-consistent Flux Calibration and Proper Metallicity Calibrators

We present the evolution of the mass–metallicity (MZ) relation at z = 4–10 derived with 135 galaxies identified in JWST/NIRSpec data taken from the three major public spectroscopy programs of ERO, GLASS, and CEERS. Because there are many discrepancies between the flux measurements reported by the early ERO studies, we first establish our NIRSpec data reduction procedure for reliable emission-line flux measurements and errors, successfully explaining Balmer decrements with no statistical tensions thorough comparisons with the early ERO studies. Applying the reduction procedure to the 135 galaxies, we obtain emission-line fluxes for physical property measurements. We confirm that 10 out of the 135 galaxies with [O iii] λ4363 lines have electron temperatures of ≃(1.1–2.3) × 104 K, similar to lower-z star-forming galaxies, which can be explained by heating by young massive stars. We derive the metallicities of the 10 galaxies by a direct method and the rest of the galaxies with strong lines using the metallicity calibrations of Nakajima et al. applicable for these low-mass metal-poor galaxies, anchoring the metallicities with the direct-method measurements. We thus obtain the MZ relation and star formation rate (SFR)–MZ relation over z = 4–10. We find that there is a small evolution of the MZ relation from z ∼ 2–3 to z = 4–10, while interestingly the SFR–MZ relation shows no evolution up to z ∼ 8 but a significant decrease at z > 8 beyond the errors This SFR–MZ relation decrease at z > 8 may suggest a break of the metallicity equilibrium state via star formation, inflow, and outflow, while further statistical and local-baseline studies are needed for a conclusion.


Introduction
The James Webb Space Telescope (JWST) has dramatically advanced the exploration of the high-redshift Universe.One important advancement for the high-redshift community is provided by the high sensitivity of the Near Infrared Spectrograph (NIRSpec; Jakobsen et al. 2022) at λ ; 2-5 μm, allowing us to address directly the key questions relating to the physical conditions of the interstellar medium (ISM) and the nature of the ionizing spectrum for galaxies in the early Universe.The properties of the hot ISM, especially the gasphase metallicity determined by the oxygen abundance ( + 12 log O H ( )), can be accurately constrained using restframe optical emission lines.These emission lines have been well calibrated and understood over decades, as discussed in a recent review by Maiolino & Mannucci (2019).Gas-phase metallicities provide a crucial experimental tool for studying the early chemical enrichment and feedback processes in galaxy evolution.These measurements can be used in conjunction with cosmological simulations (e.g., Finlator & Davé 2008;Ma et al. 2016;Torrey et al. 2019;Langan et al. 2020;Ucci et al. 2023) and chemical evolution models (e.g., Davé et al. 2012;Lilly et al. 2013;Dayal & Ferrara 2018) to gain insights into galaxy formation and evolution.Moreover, the efficiency of ionizing photon production, as quantified by ξ ion (the production rate of hydrogen ionizing photons per unit luminosity in the UV continuum), can be used to study the nature of the ionizing spectrum.Hydrogen Balmer emission via recombination physics provides the best constraints on ξ ion , while highionization lines such as He II offer key spectroscopic tools to investigate the hardness of the ionizing spectrum.These approaches are crucial for understanding the stellar populations in early galaxies, diagnosing the presence of accreting black holes in the system, and detecting galaxies hosting the very first generation of stars in the early Universe (e.g., Schaerer 2003;Inoue 2011;Kewley et al. 2013;Bouwens et al. 2016;Nakajima & Maiolino 2022;Trussler et al. 2022;Katz et al. 2023a).
Determination of these properties ideally requires key optical emission lines such as [O III] λλ5007, 4959, [O III] λ4363, [O II] λλ3726, 3729, 7 and the Balmer lines.In particular, [O III] λ4363 is crucial for gas-phase metallicity studies based on the electron temperature (T e ).However, these lines are limited in their applicability to galaxies up to z  3 in the pre-JWST era, with the detection of [O III] λ4363 from sources at z  1 still challenging due to faintness (e.g., Christensen et al. 2012;Jones et al. 2015;Sanders et al. 2020Sanders et al. , 2021)).These lines are beyond the reach of ground-based telescopes at higher redshift.Now JWST/NIRSpec has come online to directly examine these key emission lines in detail for high-redshift sources.
Using the Early Release Observations (ERO) of NIRSpec within one month after the data release, several studies report the detection of [O III] λ4363 from three objects at z = 7.6-8.5, which are magnified by the strong lensing galaxy cluster SMACS J0723.3-7327 (Arellano-Córdova et al. 2022;Schaerer et al. 2022;Curti et al. 2023a;Trump et al. 2023;Rhoads et al. 2023;Brinchmann 2023).These objects have provided the first reliable metallicity determinations at high redshift based on direct T e measurements.The metallicity values from the different teams are overall consistent to each other for each source, resulting in a mass-metallicity (MZ) relation which is scattered around the extrapolation of the z ; 2-3 relation toward the lower mass end (M å = 10 7.5 -10 9 M e ).Curti et al. (2023a) also examine the MZ-star formation rate (SFR) relation, indicated to be a fundamental relation from z = 0 to z  2-3 (Mannucci et al. 2010;Maiolino & Mannucci 2019;Sanders et al. 2021, and the references therein), suggesting that no clear evolution is seen from z = 0-3 to z = 7.6-8.5,particularly excepting for the z = 8.5 object (see below).However, the sample size is obviously too small to conclude anything regarding the evolution on the MZ relation and its SFR dependence.
Moreover, several pieces of evidence in the previous ERO studies imply that the early release of the NIRSpec products contain some issues.At the early stage of the ERO data release, the reductions including the flux calibration are partly based on predicted preflight data, such as the throughputs of the spectrograph, the optical telescope element, and the NIRSpec FORE optics.A sensitivity function, whose shape is strongly dependent on the wavelength, is necessary for appropriate flux measurements as adopted in Curti et al. (2023a) using standard stars taken during commissioning.Even after the flux calibration, there remains some tensions for faint emission lines.A notable issue is seen in the Balmer decrements such as Hγ/Hβ and Hδ/Hβ, which are (not always but sometimes) unexplainable in a consistent way within the uncertainties according to Case B recombination.Such tensions, especially seen in the faint emission lines, may be caused by background residuals and/or hot pixels that persist in the released final spectra.Interestingly, the z = 8.5 object (ID 04590) is suggested to present a high [O III] λλ4363/5007 line ratio, and accordingly a very high T e  25,000 K, which is hard to be explained by heating by young massive stars alone (Katz et al. 2023b).The low metallicity indicates it falls significantly below the MZ-SFR relation (Curti et al. 2023a) and can be in an early stage of galaxy evolution.Such a high electron temperature and low metallicity need to be confirmed by a more carefully reduced spectrum free from any tensions.The NIRSpec data reduction procedure will be revisited as one of the key parts of this paper for reliable emission-line flux measurements and errors.
Despite these possible uncertainties, recent NIRSpec observations are now providing an increasing number of spectra from high-redshift objects, and the reports of the identifications of strong emission lines such as [O III] λλ5007, 4959 and Hβ from objects at z > 7-8 follows.Notable results have come from the Early Release Science (ERS) observations of GLASS (Treu et al. 2022;Morishita et al. 2023;Mascia et al. 2023) and CEERS (Finkelstein et al. 2023;Bagley et al. 2023;Sanders et al. 2023;Tang et al. 2023;Fujimoto et al. 2023;Shapley et al. 2023), two Director's Discretionary Time (DDT) programs (Heintz et al. 2022;Wang et al. 2022;Langeroodi et al. 2022;Williams et al. 2023), and the Guaranteed Time Observations of JADES (Bunker et al. 2023;Cameron et al. 2023b;Curtis-Lake et al. 2023;Robertson et al. 2023;Saxena et al. 2023;Curti et al. 2023b).Regarding the ISM properties, Sanders et al. (2023) examine the ionization properties of ∼160 galaxies at z = 2-9 from CEERS by using high-to-low ionization emission-line ratios such as [O III] λ5007/[O II] λ3727, and suggest that galaxies tend to present hard ionizing spectra at high redshift.The metallicities at z > 6.5 are suggested to be subsolar on average (see also Tang et al. 2023;Shapley et al. 2023).A similarly high ionization state and modest metallicity is also suggested in 29 galaxies at z = 4.8-8 from GLASS (Mascia et al. 2023) and in 26 galaxies at z = 5.5-9.5 from JADES (Cameron et al. 2023b; see also Saxena et al. 2023).Fujimoto et al. (2023) explicitly derive metallicities of ∼10, z = 8-9 CEERS galaxies using a metallicity indicator of [O III] λ5007/Hβ, suggesting that these high-redshift galaxies have a lower metallicity than z = 0-3 galaxies for a given stellar mass.More extremely, Williams et al. (2023) report a z = 9.5 galaxy with [O III]+Hβ, revealing a very high [O III]/[O II] ratio and a modestly low metallicity, + 12 log O H ( )= 7.48 ± 0.08, falling below the z = 0 MZ relation but still consistent within 2σ.Including the z = 9.5 object, Heintz et al. (2022) analyzed five objects at z > 7.8 from the DDT programs and suggest a systematic decrease of metallicity at z > 7.8 on the MZ-SFR relation found at lower redshift, as seen in the ERO z = 8.5 object.A similar conclusion is also derived by Langeroodi et al. (2022).Boyett et al. (2023) report another z = 9 galaxy from GLASS whose [Ne III] λ3869/[O II] ratio implies a subsolar metallicity.Finally, Bunker et al. (2023) report the astonishing spectrum of GN-z11 at z = 10.6 using the deep JADES observations.The author use several metal lines such as [Ne III], [O II], as well as the UV lines of NIV] λ1486, NIII] λ1748, and CIII] λ1909 and imply an unusually high nitrogen-to-oxygen abundance ratio for its modest (∼subsolar) oxygen abundance (see also Cameron et al. 2023a).We note that these metallicities are based on strong emission-line ratios, which are empirical indicators and calibrated mostly in the local Universe (e.g., Maiolino & Mannucci 2019).The applicability of these methods at high redshift need to be carefully confirmed (e.g., Curti et al. 2023a), particularly given the indication of different degrees of ionization in the ISMs of galaxies between z = 0 and high redshift for a fixed metallicity (Sanders et al. 2023).
In this paper, we provide a summary of three major public NIRSpec observation programs of ERO, GLASS, and CEERS, aimed at characterizing the MZ relation at z = 4-10 and examining its evolution over cosmic time.We establish a reliable data reduction procedure for emission-line flux measurements and errors using NIRSpec data, which is applied to construct a large sample of galaxies at z = 4-10 as detailed in Section 2. In Section 3, we utilize key emission-line ratios to derive metallicities for the JWST objects with improved NIRSpec spectra.We then investigate the MZ relation and its dependence on SFR at high redshift.We begin with the 10 objects in which [O III] λ4363 is identified, and metallicity is determined using the direct T e method.The empirical metallicity indicators validated with these 10 objects are then applied to estimate metallicities for the remaining JWST objects.We analyze the MZ and MZ-SFR relations at z = 4-10 using the metallicity measurements obtained for the JWST objects.The results and implications of these metallicity relationships are discussed in Section 4, and we provide a summary of our conclusions.Throughout the paper we adopt a standard ΛCDM cosmology with Ω Λ = 0.7, Ω m = 0.3, and H 0 = 70 km s −1 Mpc −1 .

Data
We assemble the publicly available NIRSpec data of ERO and ERS, rereduce and analyze the spectra, and identify highredshift galaxies with rest-frame optical emission lines to discuss their nebular properties.In the following subsections, we detail our improved reduction procedures, particularly for the ERO data to demonstrate the improvements by comparing with the early results.

NIRSpec Observations and Data Reduction
The JWST/NIRSpec ERO observations were undertaken on UT 2022 June 30, targeting the SMACS 0723 lensing cluster field (Pontoppidan et al. 2022;Proposal ID: 2736).Multislit spectroscopy was taken using the microshutter assembly (MSA), with the medium-resolution (R ∼ 1000) gratings/filters of G235M/F170LP and G395M/F290LP sampling the wavelength ranges of 1.7-3.1 and 2.9-5.1 μm, respectively.Two observing blocks were carried out, o007 and o008, each of which consisted of three exposures of 2918 s in G395M and F235M each, with a three-shutter slitlet nod pattern, i.e., slightly nodded positions, nod1, nod2, and nod3, in different shutters within the slitlets.The total integration time was 4.86 hr in G235M and G395M each.
By visually checking all of the 1D and 2D composite spectra that were publicly available, we identified five objects whose [O III] λλ5007, 4959+Hβ were clearly detected and that were useful for the investigation of the nebular properties at high redshift (z > 5), with IDs 04590, 05144, 06322, 08140, and 10612.Although there were additional two sources in the ERO sample that were tentatively reported at z = 5-6 (Brinchmann 2023;Mahler et al. 2023), they were not included in the following analysis because of their lines' poorer confidence levels.
For the five sources, data reduction was reperformed for a more careful analysis and flux calibration.Starting with the Level 1 products provided by the observatory, we used the Spec2 and Spec3 pipelines per nod and per observing block using the Python library for JWST observations by STScI (v1.8.5).We used the latest reference files stored in a pmap file of either 1028 or 1027, where notably the flat files were created from in-flight data for all MOS data taken after the launch, resolving the MOS flux calibration issues reported in the early report of the NIRSpec ERO data (see below for more details about the flux calibration).According to the original reduction procedure for NIRSpec MSA data provided by the CEERS team, 8 when using the Spec2 pipeline, a single science image for a particular nodding position was background subtracted by using the other nod images.However, we realized some hot pixels including uncleaned cosmic rays in one of the background images fell on the spatial position of the science objects and badly influenced the final 1D spectrum.To avoid such additional noise, we carefully checked the background-subtracted science images and removed one of the other nod images, if necessary, for the background images and reran the Spec2 pipeline in an iterative manner.Furthermore, we subtracted the local residual background at each spectral pixel for each science image by masking the spatial position of the object and taking the median value of the pixels in the range [−10 pixel:+10 pixel] along the spectral direction. 9We applied the Spec2 pipeline assuming that all of the five sources were point sources, as they all looked compact in the NIRCam images with the half-light radii smaller than the point-spread function (PSF; e.g., Harikane et al. 2023a;Tacchella et al. 2023; see also Ono et al. 2023 for the size evolution at high redshift).This assumption affected how the path-loss corrections, which accounted for various types of signal loss in the NIRSpec data, were calculated and applied.The correction depended on the position of the source within the configured MSA shutter, the aperture used for the observations, the wavelength, as well as the type of the object.We used the original MSA metadata files to refer to the relative placement of the sources within the MSA shutters assuming the pointing uncertainties during the observations were negligible, but manually changed the values in the column STELLARITY to force the objects to be treated as point sources.Figure 1 summarizes the six 2D outcomes (three nods for two observing blocks) for each of the ERO sources.
Checking the six single exposures in both 1D and 2D for each of the sources, we confirmed no signal was detected in the nod2/ o007 observing block for 04590, as already pointed out by the earlier studies (e.g., Curti et al. 2023a;Arellano-Córdova et al. 2022).This exposure obviously needed to be removed from the final composition.Moreover, we noticed the spectra of nod2/o008 of 05144 were badly affected especially around the Hγ+[O III] λ4363 wavelength regime.To be conservative, we decided to remove the visit for 05144 when making the composite.In short, the integration time we used for the composite spectra of 04590 and 05144 was 4.05 hr, and 4.86 hr for 06355, 08140, and 10612.
The composite was made by median stacking the available five to six 2D spectra.The individual 2D spectra were residual background subtracted, and shifted to have common spatial and spectral ranges for the composite.All the shifts were integer movements.We adopted a 2D median-stacking procedure to alleviate further the possible impacts of hot pixels that persisted in the final spectra, which were produced by the standard procedure by the Spec3 pipeline.Moreover, the 2D medianstacking method allowed us to rescue the frames where the spectrum was partly out of the slit.Examples were found in the two nod2 positions for 10612 whose flux losses were >3σ if extracted in individual frames and compared with the other nod 1D spectra.Our method treated the outer regions of the slit as NaN and calculated the median value without NaN at each pixel in 2D.Using the 2D composite, a 1D spectrum was produced for each source and grating via the summation of 3 pixels (= 0 3 ; 2 × FWHM) along the spatial direction centered on the spatial peak position.We adopted a narrow extraction aperture as compared to the previous studies to minimize the effects coming from noisy regions close to the edge, particularly for 04590, 08140, and 10612 that were observed with a short MSA slitlet.
The flux calibration was one key procedure to be improved for the early NIRSpec studies, as some of the reference files needed for the successful flux calibration, such as the flat frames for the spectrograph and the FORE optics, were based on the predicted preflight throughputs.As for the pmap file of 1022, these flat files have been updated and created from in-flight data for all MOS data.The MOS flux calibration accuracy is now quite accurate, with the uncertainties at approximately 5% or less over the entire wavelength range. 10We therefore used the flux solutions provided by the Spec2 pipeline together with the latest reference files.Finally, we scaled the spectrum to match the broadband photometry of JWST/NIRCam to combine the spectroscopic and photometric measures based on a common aperture to derive quantities such as the equivalent widths (EWs) of the optical emission lines and ξ ion .We used the NIRCam broadband data that covered the [O III]+Hβ emission for the scaling.The final composite 1D spectrum is presented in the bottom panel for each of the sources in Figure 1.

Notes.
a Observed flux ratios relative to Hβ (×100).Upper-limit values at the 1σ level.
b The presence of a high-ionization line of [Ne IV] is seen (Brinchmann 2023).c The presence of broad Hα is indicated (Harikane et al. 2023b).
(This table is available in machine-readable form.) To evaluate the noise level, we used the 2D noise frames of read-out noise (VARinferior RNOISE) and Poisson noise (VARinferior POISSON) outputted in the Spec2 procedure for each object and nod.In addition, we added the standard deviation of the residual background in each spectral bin in quadrature.We then averaged the noise frames of the available nods and extracted the 1D noise spectra in quadrature by using the same spectral and spatial ranges as adopted for the science spectra.Finally the scaling was performed in the same manner as done for the science spectra.

Emission-line Flux Measurements
We measured the key optical emission lines of hydrogen Balmer lines, , and [O III] λ5007, 4959 by performing a Gaussian profile fitting to each line, and used the noise spectrum to weight the fit.In the fitting procedure for the relatively faint objects of 04590, 05144, and 08140, the redshift of [O III] λ5007 (i.e., the strongest emission line) was adopted for the Gaussians of the other lines.We perform all wavelength measurements on the vacuum wavelength scale, although we follow a long-standing convention of specifying the emission lines by their wavelength in air.The error of each flux measurement was calculated by adding the noise levels of the spectral bins in quadrature within the wavelength range of ±FWHM centered on the Gaussian peak.Using the flux measurements and the error evaluations, we confirm that four of the ERO objects, 04590, 05144, 06355, and 10612, present a >3σ detection of [O III] λ4363.We also derived the EWs of the optical emission lines by combining the fluxes with the continuum level provided from a best-fit spectral energy distribution (SED) to the appropriate rest-frame optical broadband photometry (Section 2.1.3).We summarize the key emission-line measurements in Table 1 for the [O III] λ4363-detected sources to discuss the metallicity indicators (Section 3.2).For the full sample, we list the metallicity diagnostic emission-line ratios in Table D1 in Appendix D together with the other physical quantities.
To test our new flux measurements with the improved reduction and calibration of the NIRSpec spectra, we show in Figure 2 the Balmer decrements of Hγ/Hβ and Hδ/Hβ for the four ERO objects with a significant detection of the three lines, and compare them with the prediction of Case B recombination.Observed Balmer line ratios should follow the sequence of the expected Balmer decrements as a function of dust attenuation, as shown with the curves adopting different reddening laws (Cardelli et al. 1989;Calzetti et al. 2000;Gordon et al. 2003) and electron temperatures (T e = 10,000 and 25,000 K).The four ERO objects all follow the sequence and present physically reasonable Balmer decrements within the uncertainties.The 3σ upper limits on Hγ/Hβ and Hδ/Hβ, along with the observed Hα/Hβ ratio, are consistent with the sequence of the expected Balmer decrements for 08140.In contrast, previous studies claim some tensions of the observed Balmer decrements based on the early release products, which show inconsistencies with the expected line ratios, as referenced with the open and the cross symbols in Figure 2. We speculate such inconsistencies have been improved by implementing the residual background subtractions and the two-dimensional stacking procedure, where we can ease the potential impacts of hot pixels that could have persisted in the final spectra in the early release.We note that the error bars of our measurements are slightly larger than those in the previous studies (Arellano-Córdova et al. 2022;Curti et al. 2023a), mostly due to the addition of the uncertainty of the subtracted residual background into the noise levels.
Prior to quantitative analysis, it is necessary to consider corrections for dust reddening.We use the observed Balmer decrement of Hγ/Hβ for the four objects to estimate the dust attenuation assuming the Calzetti et al. (2000) attenuation curve and Case B recombination with T e = 17,500 K. Adding the lower signal-to-noise ratio emission of Hδ does not change the estimates, as indicated in Figure 2.For 08140, we use the Hα/Hβ ratio instead.We adopt E(B−V ) = 0.22 for 04590, 0.03 for 05144, 0.08 for 10612, and 0.00 for 08140 and 06355, and do not propagate the uncertainty in E(B−V ) in the following analysis.

NIRCam Photometry and SED Fit
All five ERO objects have been observed with JWST/NIRCam in the F090W, F150W, F200W, F277W, F356W, and F444W filters.These photometric data are crucial for deriving stellar  Curti et al. 2023a;Rhoads et al. 2023).For the measurements of Arellano-Córdova et al. (2022), we adopt the average values based on the two spectra (o007 and o008) for 06355 and 10612, but refer only to the o008 value for 04590.For those with line ratios outside the ranges of the plot, we place them close to the edge of the plot with an arrow to indicate the direction toward the actually reported line ratio.The lines show the expected Balmer decrements as a function of E(B−V ) for the three different attenuation curves (Cardelli et al. 1989 with dashed;Calzetti et al. 2000 with dotted;and SMC from Gordon et al. 2003 with dotted-dashed) and the two different electron temperatures (T e = 10,000 K in gray and 25,000 K in magenta), although the different attenuation curves and T e values have a small impact on this diagram.The deviation of the Hδ/Hβ ratio from the theoretically expected value based on the Hγ/Hβ ratio (ΔHδ/Hβ = ((Hδ/ Hβ) obs − (Hδ/Hβ) expected ) normalized by (Hδ/Hβ) expected ) is highlighted in the upper panel, and that of the Hγ/Hβ ratio based on the Hδ/Hβ ratio in the left panel, for each symbol as in the main panel.The corresponding E(B−V ) values are printed in these subpanels, where the Calzetti et al. (2000) attenuation curve and T e =17,500 K are adopted for reference.Our measurements all follow the sequence of the expected Balmer decrements within the uncertainties, while quite a few of the measurements from the earlier studies show the ratios that are not physically and/or consistently explainable.
properties such as stellar mass, and for evaluating the continuum level to correct for slit loss in the NIRSpec observations.The derived corrections are important for obtaining key properties, such as the total SFR via recombination physics, ξ ion , and the EWs of optical emission lines such as EW(Hβ).
We refer to the catalog provided by Harikane et al. (2023a) to extract the necessary photometric data.We adopt the total magnitudes that are estimated from the 0 3 diameter aperture magnitudes with the aperture corrections.The uncertainties of the magnitudes include the photometric errors from the images as well as the 10% error floor to account for systematic uncertainties arising from, e.g., the zero-point corrections.
Using the total magnitudes and the errors of NIRCam, we perform the Bayesian SED fitting with Prospector (Johnson et al. 2021) to derive the stellar population properties.The procedure of the SED fitting is the same as that of Harikane et al. (2023a), except for the fixed redshifts based on the NIRSpec emission-line measurements.We use the stellar population synthesis package, Flexible Stellar Population Synthesis (FSPS; Conroy et al. 2009;Conroy & Gunn 2010) for the stellar SEDs, and include nebular emission from the photoionization models of Cloudy (Byler et al. 2017).We assume the stellar initial mass function (IMF) of Chabrier (2003), the intergalactic medium (IGM) attenuation model of Madau (1995), the Calzetti et al. (2000) dust attenuation law, and a fixed metallicity of 0.2 Z e .We choose a flexible star formation history as adopted in Harikane et al. (2023a).
The stellar masses we obtain from the SED fitting are the apparent values, i.e., after experiencing the strong lensing magnifications for the ERO objects.To correct for the lensing effects, we refer to the mass model of SMACS J0723 constructed with glafic (Oguri 2010(Oguri , 2021) ) updated with the new JWST ERO data, as adopted in Harikane et al. (2023a).The adopted magnification factors as well as the stellar masses after the lensing magnification corrections are summarized in Table D1.The errors of the best-fit parameters from the SED fitting correspond to the 1σ confidence interval (Δχ 2 < 1) for each parameter.

ERS GLASS Data
The ERS GLASS NIRSpec MSA observations were carried out on UT 2022 November 10-11 for objects behind the galaxy cluster Abell 2744 (Proposal ID: 1324;Treu et al. 2022).They adopted the R ∼ 2700 high-resolution gratings/filters of G140H/F100LP, G235H/F170LP, and G395H/F290LP, sampling the wavelength ranges of 1.0-1.9,1.7-3.2, and 2.8-5.3 μm, respectively.Twelve exposures of 1473 s were conducted in one pointing of NIRSpec MSA for each spectral configuration with the three-shutter slitlet nod pattern, totaling an on-source integration of 4.9 hr in each of G140H, G235H, and G395H.
Through the initial screening of all of the public 1D and 2D composite spectra, we identified 15 objects at z > 4 with [O III] +Hβ. 11The sample contains the members of the protocluster at z = 7.89 as presented by Morishita et al. (2023).We rereduced the spectra of the 15 objects with our custom procedure as described in Section 2.1.1.We carefully checked all of the 12 nod spectra for each object to determine the extraction aperture and any nods to be removed from the composition.For 10005, the light from another member of the z = 7.89 protocluster partly came into the same slit at a spatially offset position.We changed the background frames during the spec2 procedure so that the signal from 10005 was not oversubtracted by the neighboring object in the different nods.
The emission-line flux measurements were performed in the same manner as explained in Section 2.1.2.Among the 15 objects, we identified five with [O III] λ4363.One of the [O III] λ4363detected source, 40066, lacked [O II] λ3727 emission which fell in the detector gap.The other four objects, whose flux measurements are given in Table 1, are thus used for the metallicity measurements with the direct T e method (Section 3.2).The nebular dust attenuation was evaluated using Hα, Hβ, and/or Hγ.
The NIRCam multiband photometric catalog was generated in the same manner as in Section 2.1.3based on the images publicly available from the GLASS and UNCOVER (Proposal ID: 2561) programs.We use the data of F115W, F150W, F200W, F277W, F356W, and F444W for the subsequent SED fitting analysis.Note that one of 15 objects (160122) is not covered by NIRCam, and its stellar mass is not determined.
All of the GLASS objects are strongly lensed.To correct for the magnification factors, we use the glafic (Oguri 2010(Oguri , 2021) ) mass model of Kawamata et al. (2018) with a minor update based on Multi Unit Spectroscopic Explorer (MUSE) follow-up spectroscopy (Bergamini et al. 2023).The updated mass model uses 45 multiple image systems, 27 of which have spectroscopic redshifts.This model reproduces the observed multiple image positions with an accuracy of 0 42.The derived magnification factors, the stellar masses after the lensing magnification correction, as well as the other key quantities for the entire GLASS sample are summarized in Table D1 in Appendix D.

ERS CEERS Data
The ERS CEERS NIRSpec MSA observations were carried out on UT 2022 December 20-22 and 24 in the blank field of EGS (Finkelstein et al. 2023;Fujimoto et al. 2023;Proposal ID: 1345).They used both prism (R ∼ 100) and mediumresolution (R ∼ 1000) gratings to cover from 1.0 to 5 μm in six pointings (P4, P5, P7, P8, P9, and P10) with some overlaps of objects.Note that two of the prism pointings, P9 and P10, were not fully completed.To compensate the two prism pointings, additional observations were performed on UT 2023 February 9-10 (P11 and P12; see also Arrabal Haro et al. 2023).These additional data were also analyzed and presented in this paper to provide a complete spectroscopic sample of CEERS to the community.Three exposures of 1036 s were conducted for each spectral configuration with the three-shutter slitlet nod pattern, totaling an on-source integration of 0.9 hr in each prism, G140M, G235M, and G395M observation, except for P9 and P11 where another set of three exposures of 1036 s were repeated in the prism observations.At redshifts z > 4, we detected a total of 147 prism spectra and 77 medium grating spectra with [O III]+Hβ emission lines.Among these, three objects (IDs 00618, 00717, and 01027) were observed with two different prism and medium grating pointings each, resulting in four duplicate observations.Additionally, one object (ID 00603) was observed with two different prism pointings and one medium grating pointing, resulting in three duplicate observations.Furthermore, 50 11 There were six other objects whose [O III]+Hβ appeared at λ = 2.5-5.2μm in the spectrum, with IDs 340899, 341905, 340055, 341715, 341712, and 80121.After our careful investigations, they were concluded as fake detections and hence not high-redshift objects, since they were located along the same spatial (cross-dispersion) position of the other high-redshift objects on the MSA mask, and their spectra were contaminated by the others.Indeed, the fake [O III]+Hβ detections showed inconsistent redshifts, indicating the spectrum was coming from a different slit.
objects were observed with both the prism and medium grating, resulting in two duplicate observations with one pointing each.In summary, a total of 163 objects were identified through the ERS CEERS NIRSpec observations, as summarized in Table D1 in Appendix D. For the 163 objects, we performed our rereduction procedure as described in Section 2.1.1.
The same emission-line flux measurements were performed as detailed in Section 2.1.2.We identified two out of the 163 objects with [O III] λ4363 among the CEERS sample.Their flux measurements are added in Table 1.The nebular dust attenuation was evaluated using Hα, Hβ, and/or Hγ depending on the detections and the redshift.
The CEERS NIRCam observations consists of 10 pointings, four of which were taken in 2022 June and the rest in 2022 December (Bagley et al. 2023).The early observations of four pointings are presented in Harikane et al. (2023a).We reduce the latter observations' data in the same manner, and generate the NIRCam full photometry catalog, including the seven bands of F115W, F150W, F200W, F277W, F356W, F410M, and F444W (Y.Harikane et al. 2023, in preparation).One caveat is that not all of the NIRSpec CEERS objects are covered by NIRCam.We find 91 of the 163 CEERS objects have NIRCam coverage.For the 91 objects, we perform SED fitting as in Section 2.1.3to derive the stellar masses, and scale the NIRSpec spectra to be consistent with the NIRCam total magnitude photometry (i.e., slit-loss correction).For the others except for a single object (ID 01115), we rely on Hubble Space Telescope (HST) WFC3 photometry (Stefanon et al. 2017), as detailed in Isobe et al. (2023).Briefly, we derive the UV absolute magnitudes (M UV ) from the HST/WFC3 photometry, use the empirical relationships between stellar mass and M UV at z = 4-8 (Song et al. 2016) to estimate the stellar masses, and rescale them to the Chabrier (2003) IMF.We exclude the single object with ID 01115 from the following analysis as its HST photometry is not given.For those not detected in the HST bands, we translate the corresponding 5σ limiting magnitudes to 5σ upper limits of stellar mass.While acknowledging that the empirical method utilizing M UV for estimating stellar mass may be less reliable compared to the method involving an SED fit to NIRCam photometry, we demonstrate in the subsequent sections (Sections 3.3 and 3.4) that our main findings remain unchanged even when excluding objects without NIRCam photometry.This is partly because the fraction of such objects is small (∼28%) in deriving the average relationships.Furthermore, we have found reasonable consistency in the stellar mass estimates between the two methods by using objects with both NIRCam and HST photometry.Details are described in Appendix A. Note that we cannot correct for the slit loss of the NIRSpec spectra for objects without NIRCam photometry.This will not affect the metallicity measurements as we only use the line ratios.However, that does prevent us from deriving the EWs of the optical emission lines such as EW(Hβ), and the SFR based on the total Hβ luminosity.
In summary, we construct a JWST sample of 182 objects at z = 3.8-8.9.The redshift distribution of the sample is shown in red in Figure 3.

Emission-line Ratios
To examine the nebular properties of JWST objects at z = 4-9 and compare with those of lower-redshift galaxies, we plot them in a line ratio diagnostic diagram of 4 (see also Mascia et al. 2023;Tang et al. 2023;Sanders et al. 2023).This kind of diagram has been widely used to investigate the metallicity and ionization state in the local Universe and up to z = 2-3 (e.g., Kewley & Dopita 2002;Maiolino et al. 2008;Nakajima & Ouchi 2014;Troncoso et al. 2014;Sanders et al. 2016;Onodera et al. 2016;Strom et al. 2017;Nakajima et al. 2020).When compared to local Sloan Digital Sky Survey (SDSS) galaxies, which are free from AGN contamination as determined by the BPT diagram (e.g., Baldwin et al. 1981;Kauffmann et al. 2003; see below for more details), we observe that the z = 2-3 galaxies exhibit a higher O32 line ratio, indicating an overall increase in ionization parameter at higher redshift.Figure 4 confirms the trend, and the z = 4-9 JWST objects show a further high O32 line ratio on average than typically seen in z = 2-3 continuumselected galaxies.Remarkably, one of the ERO objects, ERO_04590 at z = 8.5, presents a stringent lower limit of O32 as >14.8 (3σ), which corresponds to an ionization parameter of > -U log 2.21 according to the prescription of Kewley & Dopita (2002; see also Fujimoto et al. 2022). 12The lower limit is a factor of ∼10 higher value than typically seen in SDSS galaxies.
At z = 2-3, low-mass LAEs are suggested to have a remarkably high O32 line ratio, which would be achieved by a hard ionizing radiation field and/or low metallicity (Nakajima et al. 2016;Trainor et al. 2016;Erb et al. 2016).Its implication regarding the escape of ionizing photons (and that of Lyα photons) are also discussed (e.g., Nakajima & Ouchi 2014;Izotov et al. 2016Izotov et al. , 2018;;Verhamme et al. 2017;Steidel et al. 2018;Fletcher et al. 2019;Erb et al. 2019;Nakajima et al. 2020;Katz et al. 2020;Flury et al. 2022).The extreme O32 line ratios seen in the z = 4-9 JWST objects are comparable to those in z = 2-3 low-mass LAEs and z ∼ 0 Lyman-continuum (LyC) leaking objects, suggesting that highredshift sources have a nebular and ionization condition of gas that is similar to those typically found in lower-redshift lowmass galaxies having strong Lyα and/or LyC escape with a hard ionizing spectrum.Similar findings are also discussed by Mascia et al. (2023) and Sanders et al. (2023), and notably by Tang et al. (2023), where the strong LAE galaxies at z > 7 are directly examined with JWST and suggested to present the largest O32 line ratios.
One uncertainty in interpreting O32 arises as it also depends on the metallicity.In the next section, we present our method to derive the metallicities for the JWST objects using several indicators including R23, particularly by referring to the objects whose [O III] λ4363 is detected and metallicity is precisely determined with electron temperature.
Before moving to the metallicity results, we also examine our objects using another popular emission-line diagram, the [N II] BPT diagram, which plots the ratios of [O III] λ5007/Hβ and [N II] λ6584/Hα (Baldwin et al. 1981).This diagram is commonly used to diagnose the presence of an AGN and is applicable to our JWST objects up to a redshift of z < 6.9, where the Hα+[N II] emission is covered by the NIRSpec spectra.Figure 5 shows the [N II] BPT diagram for our JWST objects at z < 6.9, along with two curves that discriminate between sources dominated by AGNs and stars (Kewley et al. 2001;Kauffmann et al. 2003).Sources located above the curves are classified as AGN dominated.
Figure 5 shows that most JWST sources have an upper limit of [N II].The [N II]-detected objects fall either below or on the demarcation curves within the measurement uncertainties.None of the JWST objects are thus classified as obvious AGNs.However, it is worth noting that low-metallicity AGNs with Z  0.5 Z e may contaminate the star-forming galaxy region on the diagram (e.g., Kewley et al. 2013;Nakajima & Maiolino 2022).We cannot fully rule out the presence of such low-metallicity AGNs in our sample based on the [N II] BPT result, but we suggest an absence of evolved AGNs as far as we explore (z < 6.9).Accordingly, we do not remove any objects from the sample for the following results based on the [N II] BPT diagram.Although there are some objects whose [N II] upper limits are too weak to conclude their BPT diagnostics, we mention in the following sections (Sections 3.3 and 3.4) that our main results are not changed by excluding these unclear objects.
As a complementary approach to our method using the [N II] BPT diagram, a companion study by Harikane et al. (2023b) conducts a search for faint AGNs in our full JWST spectroscopic sample by examining the broad component (FWHM >1000 km s −1 ) around the Hα emission line.The authors identify 10 objects with signatures of AGNs at redshifts z = 4.0-6.9(see also Kocevski et al. 2023;Übler et al. 2023).
To avoid potential biases in our MZ relation, we remove these The gray open symbols represent the averages of lower-redshift samples of continuum-selected galaxies (Nakajima & Ouchi 2014;Troncoso et al. 2014;Sanders et al. 2016;Onodera et al. 2016;Strom et al. 2017) and Lyα emitting galaxies (LAEs; Nakajima & Ouchi 2014;Erb et al. 2016;Nakajima et al. 2020), as indicated in the legend.Arrows indicate 3σ lower limits.Gray shading illustrates the equivalent distribution for nearby SDSS galaxies.10 objects from our sample when we examine the MZ relation.These objects are flagged in Table D1.

Direct T e Method
Ten objects (four from ERO, four from GLASS, and two from CEERS) present a significant detection of [O III] λ4363 allowing us to determine the oxygen abundance reliably with the direct T e method at z = 4.0-8.5 and use it as a proxy for the gas-phase metallicity.We follow the procedure as summarized in Nakajima et al. (2022) to derive T e -based metallicities.Briefly, we first estimate the electron temperature of the O 2+ zone (T e ([O III])) using the reddening-corrected [O III] λλ4363/ 5007 ratio and an assumed electron number density of 100 cm −3 with the task getTemDen from PyNeb (Luridiana et al. 2015).Different densities such as 10-1000 cm −3 do not significantly change the results (see Isobe et al. 2023 for the typical electron density in high-redshift galaxies; see also Fujimoto et al. 2022) ) using the PyNeb package getIonAbundance.We do not take into account a higher ionization abundance of O 3+ /H + , as we follow the approximation given by Izotov et al. (2006) and confirm no clear presence of the He II λ4686 emission line in the spectra.In the spectrum of ERO_04590, the [O II] doublet is not detected at the 3σ level.We therefore count the O 2+ component alone for its oxygen abundance.We have checked that the 3σ upper limit of [O II] would increase the O/H value by 0.05 dex, which is small enough with respect to the measurement error (;0.15 dex).The measured oxygen abundances as well as T e ([O III]) are summarized in Table 1.
For the ERO objects, several earlier studies exist that report metallicities with electron temperatures (Schaerer et al. 2022;Curti et al. 2023a;Trump et al. 2023;Rhoads et al. 2023;Arellano-Córdova et al. 2022;Brinchmann 2023).It is practically useful to compare our measurements with theirs.Figure 6 compares the T e ([O III]) values based on our measurements with the earlier studies for the four [O III] λ4363-detected ERO objects.Different colors correspond to different objects.In some previous studies, the z = 8.5 object (ERO_04590; red marks in the plot) is claimed to present an exceptionally high electron temperature at T e > 2.5 × 10 4 K, which is not usually observed in the local Universe and may require some additional explanations (e.g., Katz et al. 2023b;Rhoads et al. 2023).Our remeasurement confirms a high, but not extremely high, electron temperature for ERO_04590, T e = (2.08 ± 0.26) × 10 4 K, which is explainable by heating by young massive stars without any special mechanisms.It is thus important to reduce and combine the spectra carefully, as well as to extract the 1D spectra to discuss the nebular properties reliably with the measurements of faint emission lines.For the other objects, their electron temperatures are modest, T e = (1.1-2.3)× 10 4 K, fairly consistent with the values in the earlier literature and similar to the electron temperatures seen in lower-redshift star-forming galaxies.We will also discuss comparisons of metallicity determinations later on the MZ relationship.
The four ERO objects as well as the six newly identified GLASS and CEERS objects with [O III] λ4363 provide the opportunity to test the empirical metallicity indicators at such high redshift of z > 4 (Curti et al. 2023a).In Figure 7, we examine the following five popular metallicity indicators: Ne3O2), as well as R23 and O32 by comparing with the empirical indicators proposed by Maiolino et al. (2008; see also Nagao et al. 2006), Curti et al. (2017Curti et al. ( , 2020)), Bian et al. (2018), andNakajima et al. (2022).We draw each empirical relationship only in the calibrated metallicity range without showing any extrapolation.For the relationships of Nakajima et al. (2022), two curves are depicted in each panel, showing the dependence of EW(Hβ) on the indicators.The authors use EW(Hβ) as a proxy to correct for the degree of ionization state of the gas in each galaxy, because EW(Hβ) is sensitive to the current massive-star formation efficiency and known to be well correlated with the ionization state as probed by, e.g., O32 (e.g., Nakajima & Ouchi 2014;Mingozzi et al. 2020;Nakajima et al. 2022).To examine the dependencies of EW(Hβ) on the indicators, the JWST objects are color coded by EW(Hβ) except for CEERS_01536, whose EW(Hβ) is not derived due to the lack of NIRCam images (Section 2.3).
The JWST objects tend to present high-ionization emission lines such as   2022), we show the average values based on the two spectra (i.e., two observing blocks: o007 and o008) for 06355 and 10612, but refer only to the o008 value for 04590.For each object, the symbols are slightly offset in the horizontal direction for display purposes.Some of the measurements in the literature do not present uncertainties and are plotted without error bars in the vertical direction.and Nakajima et al. (2022) for the large EW(Hβ) objects.Because the relationships of Bian et al. (2018) are based on highly ionized objects as typically found at z ∼ 2-3 on the [N II] BPT diagram (e.g., Baldwin et al. 1981;Steidel et al. 2014;Shapley et al. 2015), those consistencies suggest the JWST objects at z = 4-8.5 are highly ionized systems.This trend is consistent with the analysis by Sanders et al. (2023).One caveat is that there is one JWST object whose EW(Hβ) is smaller than 100 Å, GLASS_10021, falling close to the large EW(Hβ) relationships of Nakajima et al. (2022) rather than the small EW ones like the other JWST objects despite the relatively small EW(Hβ).This suggests we need to test whether the prescriptions work for low-ionization sources as well with a larger sample at high redshift.
In summary, although we cannot fully test the indicators for low-ionization sources, the current results suggest that the degree of ionization significantly influences the resulting metallicity value if one relies on an empirical metallicity indicator, and that no strong redshift evolution is seen in the strong line ratios as a function of metallicity at least for highly ionized galaxies.Ionization corrections, such as those proposed in Nakajima et al. (2022), are thus crucial for metallicity estimations with the strong line methods.In particular, we find R23, R3, and R2 show good agreement with the observations and the empirical relationships, and confirm that they are sensitive to a small change in metallicity.The indicators of O32 and Ne3O2 look highly dependent on the ionization state and their plateau-like behaviors against metallicity prevent us from deriving a stable metallicity solution at the metal-poor regime.
Figure 8 shows another metallicity indicator proposed by Izotov et al. (2019Izotov et al. ( , 2021)), which combines R23 and O32 to correct for the ionization state to improve the accuracy of metallicity in the low-metallicity regime.Unfortunately there is only one object, ERO_04590, whose metallicity is low enough to be fairly compared with the method, and whose nondetection of [O II] prevents us from confirming the reliability and accuracy of the method.This is exactly the situation Nakajima et al. (2022) have anticipated, i.e., not all of the lines are spectroscopically available at high redshift, and it is practically useful to use EW(Hβ) as a probe of the ionization state.Still, many of the JWST objects are located along the simple extrapolation of the relationship, indicating the method can be useful even up to and found that this object is slightly above the limit that can be explained by star-forming galaxies alone.This indicates the possibility of a hidden highenergy ionizing source.In addition, Brinchmann (2023) tentatively suggest the presence of the high-ionization lines [Ne IV] λλ2422, 2424, indicating a hard ionizing spectrum up to ∼60 eV, which cannot be fully explained by a conventional stellar population.These pieces of evidence suggest that the anomalous nature of this object may be explained by the presence of a high-energy ionizing source in the system, which may account for its deviation from the empirical metallicity relationships.Another note is for GLASS_160133 and GLASS_150029 that have a broad component associated with their Hα emission line, indicating the presence of faint AGNs in the systems.However, they still present metallicity diagnostic line ratios and EW(Hβ) values that agree reasonably well with the relationships for star-forming galaxies, implying a minor contribution from the AGN to the total emission strengths as well as to the optical continuum for these two objects.In any case, one would need higher ionization lines (e.g., [Ne V]; Cleri et al. 2023) to discuss further the presence of hard radiation components in these systems.

Empirical Method
For the other JWST objects without a T e -based metallicity (N o = 182 -10 = 172), we adopt the empirical metallicity indicators which have been tested and confirmed to work at high redshift (Section 3.2.1) to estimate metallicities.Prior to estimating metallicities empirically, we require that the objects have coverage of the [O III]+Hβ emission lines in their spectra, and that the [O III] λλ5007/4959 doublet line ratio is consistent with the theoretical value (2.98; Storey & Zeippen 2000) at a 2.5σ significance level.We only work with objects that meet these criteria and have appropriate flux measurements.As a result, 11 objects were removed from the subsequent analysis.
For the remaining 161 objects, we use the R23 index as a primary indicator, following the overall consistency of the metallicity indicator at high redshift as shown in Section 3.2.1.One caveat in using R23 is that two solutions are derived for a given R23 value.We use the prescription of Nakajima et al. (2022) with EW(Hβ) for the low-metallicity solutions ( + 12 log O H ( )  8.0), and the average relationship (without any ionization correction) for the high-metallicity ones (8.0).The latter is almost consistent with the indicators presented in Curti et al. (2017Curti et al. ( , 2020)).If the two metallicity solutions are not significantly separated, i.e., the observed R23 spans the peak value that appears around + 12 log O H ( )= 8.0 within the uncertainties, we adopt the 1σ lower limit of the low-metallicity solution as the lower limit, and the upper limit of the highmetallicity solution as the upper limit.If there are two distinct metallicity solutions, we rely on the O32 line ratios to distinguish between them.It should be noted that O32 is used only for the purpose of distinguishing between the two branches, and not for calculating the actual metallicity value.Specifically, we choose the R23-based metallicity solution whose expected O32 line ratio according to the O32 metallicity indicator of Nakajima et al. (2022) is more close to the observed O32 than the other.If [O II] is not detected and the resulting lower limit on O32 is higher than the expected O32 at the high-metallicity branch at the >3σ level, we choose the low-metallicity solution.If the lower limit on O32 is not high enough to distinguish, we translate EW(Hβ) to O32 using the average relationship found in Nakajima et al. (2022): log EW (Hβ) = 0.64 × log O32 + 1.68, and choose the solution.Finally, the systematic uncertainty of the empirical method is added in quadrature to the metallicity errors.For the other cases, we cannot reliably distinguish between the two solutions and thus leave its metallicity unconstrained based on R23.
We are unable to derive a R23-based metallicity for objects that lack [O II] coverage and/or appropriate dust correction due to the absence of multiple Balmer emission lines.For such objects, we estimate the metallicity using the R3 indicator, as it does not require a reddening correction.We follow a similar procedure as for R23 to account for the two-branch nature of the R3 index when estimating the metallicities.As O32 is not applicable for these objects, we rely on EW(Hβ) to correct for the ionization state and attempt to differentiate between the two solutions.
In summary, our sample consists of 86 objects with an R23based metallicity, 49 objects with an R3-based metallicity, and 26 objects without a metallicity constraint.We note that for the objects in the CEERS sample that have multiple spectra taken with different gratings and/or pointings, we estimate the metallicity for each spectrum and calculate the average value and its uncertainty by combining the individual metallicity measurements.While we understand that some of the spectra (P11 and P12) were obtained with different position angles compared to the others, we assume that the objects are compact and the different observations targeting the same objects at the center of the slit probe the same gas properties.After excluding 10 objects with spectroscopic signatures of AGNs Figure 8. Empirical metallicity indicator using both R23 and O32 as proposed by Izotov et al. (2019Izotov et al. ( , 2021) ) in the low-metallicity regime (solid) and its extrapolation (dotted).The red symbols show the JWST objects as in Figure 7.
(Section 3.1), comprising eight objects with an R23-based metallicity and two objects with a T e -based metallicity, we use 127 objects with an R23-or R3-based metallicity, along with eight objects with a direct T e method metallicity, for the subsequent analysis and discussion of the MZ relationship.

Other High-redshift Galaxies from the Literature
In addition to our JWST sample of 135 objects constructed with the ERO, GLASS, and CEERS programs, we have compiled reports of metallicity at high redshift from the literature to discuss with a larger sample.Our compilation includes three objects at z = 8.1-9.5 identified with the two DDT programs using the prism grating of NIRSpec (Proposal IDs: DD-2756 and DD-2767; Heintz et al. 2022;Langeroodi et al. 2022;Wang et al. 2022;Williams et al. 2023), one stacked point of 117 EIGER objects at z = 5.3-6.9 whose spectra are taken with the NIRCam slitless spectroscopy (Proposal ID: 1243; Matthee et al. 2023;Kashino et al. 2023), four objects at z = 6.1-6.4 with commissioning data of NIRCam slitless spectroscopy (Sun et al. 2022a(Sun et al. , 2022b)), and four Atacama Large Millimeter/submillimeter Array (ALMA) objects at z = 7.2-9.1 whose metallicities are measured with [O III] 88 μm (Jones et al. 2020).Their redshift distributions are illustrated in blue in Figure 3.Note that the EIGER stacked object is treated as a single object in the histogram as well as in the following figures.
For a fair comparison of metallicity in the following analysis, it is important to have metallicities that are estimated in a consistent manner, as done for our sample.All of the measurements compiled above are based on empirical indicators using strong emission lines that are calibrated with the direct T e method.For the NIRSpec DDT objects, five objects at z = 7.9-9.5 have reported metallicity measurements in several studies (Heintz et al. 2022;Langeroodi et al. 2022;Wang et al. 2022;Williams et al. 2023).Despite initially adopting the O32 index as a metallicity indicator in the original manuscript, Heintz et al. (2022) have updated their results, now primarily using the R3 index of Nakajima et al. (2022) with correction for the ionization state for large EW(Hβ) (200 Å) objects.This approach is supported by Figure 7 in this paper and has been adopted for some of our JWST objects.It should be noted, however, that it would be more appropriate to adopt the EW(Hβ) dependence of the indicator following the variation of EW(Hβ) as observed in the DDT sample, instead of fixing the relation for large EW(Hβ) (Heintz et al. 2022).Large EW(Hβ) values of ∼180-250Å are confirmed for two of the five objects (Wang et al. 2022;Williams et al. 2023).Among the five objects reported in Heintz et al. (2022), we use three objects, RXJ-z9500, RXJ-z8152, and RXJ-z8149, in this paper.Their metallicities are estimated to be + 12 log O H ( )= 7.56 (+0.16/−0.17),7.68 (+0.18/−0.19),and 7.29 (+0.22/ −0.28), respectively.One of the objects not included in our compilation, Abell-z7878, was originally suggested to have an [O III] λλ5007/4959 doublet ratio that is significantly smaller than the theoretical value.We remove this object from our compilation following the same approach used for our JWST sample.The remaining object, Abell-z7885, still relies on the O32 index for estimating its metallicity.We have decided not to include it in our compilation due to several uncertainties associated with using the O32 index for metallicity estimations, such as the strong dependence of O32 on ionization state (Figure 7) and the accuracy of the dust reddening corrections, as we were careful regarding the results presented in the original manuscript of Heintz et al. (2022).
The metallicity values for RXJ-z9500, RXJ-z8152, and RXJ-z8149 in Heintz et al. (2022) are consistent with those reported in other studies.The metallicity of RXJ-z9500 is first reported by Williams et al. (2023), where it was estimated to be + 12 log O H ( )= 7.48 ± 0.08 using the R23+O32 method of Izotov et al. (2019Izotov et al. ( , 2021; Figure 8).An ionization correction is thus taken into account for the metallicity measurement there.Similarly, Langeroodi et al. (2022) estimate metallicities for RXJ-z8152 and RXJ-z8149 (RX2129-ID11002 and RX2129-ID11022 in Langeroodi et al. 2022) using the R23 +O32 method (Izotov et al. 2019(Izotov et al. , 2021)) For the other compilation, we adopt the metallicity values as derived in the original papers.The stacked EIGER object has an already derived metallicity fairly consistent with the value based on our method using R23.The four NIRCam objects are all located in the high-metallicity branch based on the [N II] λ6564 detection, and the indicator of Bian et al. (2018) is used.We also note that the stellar masses and SFRs are all corrected for different IMFs to have the same Chabrier (2003) IMF using the conversion factors shown in Madau & Dickinson (2014).
In summary, our total sample consists of 147 galaxies (135 from our reduction of ERO, GLASS, and CEERS, and 12 from the compilation) with metallicity measurements at z = 4-10.This sample is used in the following analysis of the metallicity relationship.

MZ Relation
The main focus of this paper is to discuss the evolution of metallicity in star-forming galaxies.In this section, we specifically present the stellar MZ relation at high redshift using the JWST observations, as well as data compiled from relevant literature.
Before presenting the full results we first focus on the four ERO objects and compare in Figure 9  The red symbols denote the galaxies at z = 4-10, including our 135 ERO, GLASS, and CEERS galaxies, the three NIRSpec DDT objects, as well as the massive galaxies provided by NIRCam slitless and ALMA spectroscopy, and the low-mass stacked object as presented in Section 3.2.3.We note again that the 10 objects with spectroscopic signatures of AGNs (Harikane et al. 2023b) have been excluded here to discuss conservatively the results free from any AGN biases (Section 3.1).The objects analyzed using the direct T e method are marked with red open circles, suggesting a positive correlation between mass and metallicity in the redshift range z = 5-8.5.Furthermore, we divide our full sample of ERO, GLASS, and CEERS galaxies into three subsamples according to stellar mass: M å = 10 7 -10 8 , 10 8 -10 9 , and 10 9 -10 10 M e , and obtain the average MZ relation as shown with the large open stars in Figure 10 and as given in Table 2.Note that the compiled objects, as well as the CEERS objects with only an upper limit on M å , are not used in deriving the average relation.Following the single power-law form of Sanders et al. (2021), the average MZ relation of the z = 4-10 galaxies can be approximated as  with the best-fit parameters of Z 10 = 8.24 ± 0.05 and γ = 0.25 ± 0.03 in the mass range of M å ∼10 7.5 -10 9.5 M e , as shown with a gray long-dashed line.The parameter γ corresponds to the slope of the MZ relation, and its best-fit value and uncertainty confirm an increasing trend of metallicity with stellar mass, as tentatively seen with the three ERO objects (e.g., Schaerer et al. 2022;Curti et al. 2023a;Trump et al. 2023), and as widely known at low redshift.
Compared to the z = 0 MZ relation, these high-redshift galaxies clearly present a metallicity lower than typical galaxies at z = 0 for a given stellar mass.The decrease is typically ∼0.5 dex around M å ∼10 9 M e , but it becomes smaller at the low-mass end (∼0.3dex).Interestingly, a similar offset of ∼0.3 dex is observed between z = 0 and z ∼ 2-3, suggesting that the evolution of the MZ relation is small from z ∼ 2-3 to z = 4-10.Although there may be a decrease of ∼0.2 dex in the typical metallicity at the high-mass end of M å ∼10 9.5 M e from z ∼ 2-3 to z = 4-10, no strong evolution is found beyond the errors.The same conclusion can be drawn from comparisons with the MZ relation at z  4, where the metallicities are empirically estimated with the strong line indicators, as presented in Appendix B.
Figure 10.The relationship between stellar mass and metallicity.The red points represent galaxies at z = 4-10, including the filled circles, pentagons, and diamonds, which respectively represent the ERO, GLASS, and CEERS objects analyzed in this paper.Red circles denote galaxies whose metallicities are determined using the direct T e method.The large red stars represent the average relationship for the ERO, GLASS, and CEERS objects in three equally separated mass ranges (M å = 10 7 -10 8 , 10 8 -10 9 , and 10 9 -10 10 M e ), along with the best-fit function shown as a thick long-dashed line (Equation ( 1 Figure 10 also includes a comparison with the latest results from the JADES observations.Curti et al. (2023b) very recently reported on the metallicity measurements of galaxies at z = 3-10 based on deep JADES spectroscopic observations (see also Cameron et al. 2023b).In the figure, their MZ relation is shown using emerald green open pentagons, which represent the combination of the low-mass JADES sample with the highmass CEERS sample, the latter of which is provided in this paper.We find that their MZ relation, despite covering slightly different redshift ranges, agrees well with ours (Equation ( 1)).It is not surprising to see a good agreement between our results and those reported by Curti et al. (2023b) in the high-mass end (M å  10 8 M e ), since this regime is dominated by the CEERS objects presented in this paper.The JADES objects confirm that the MZ relation we obtained in Equation (1) continues down to M å ; 10 7 M e on average.
In Appendix A, we investigate our MZ relation using only galaxies for which the stellar mass is well determined through SED fitting to the JWST/NIRCam photometry, after excluding CEERS objects whose masses are estimated empirically from M UV (as discussed in Section 2.3).Because we have already excluded CEERS objects with only an upper limit on M å , all of which are M UV based, in deriving the average relation, the fraction of objects without NIRCam photometry is small (∼28%).Indeed, our conclusions remain unchanged when considering only objects with reliable measures of M å , as demonstrated in the figures in Appendix A. Moreover, we mention in the [N II] BPT diagram in Section 3.1 that some of the JWST objects have [N II] upper limits that are too weak to conclude the ionization nature (stars versus AGNs).By removing these unclear objects and using only galaxies that are surely diagnosed as star-forming galaxies in the [N II] BPT diagram (i.e., those with an (upper limit on) [N II]/Hα falling below the demarcation curve), we obtain a fully consistent MZ relation as found in the full sample (Figure 10).This implies a negligible contribution of AGNs in our sample.
To examine further any redshift evolution among our sample from z = 4 to 10, we plot in Figure 11 the MZ relations for the three different redshift bins, z = 4-6, 6-8, and 8-10.In each panel, the subsample is further divided into two groups based on their masses, and their average values are shown with large stars as summarized in Table 2.Although there are large error bars, these average points suggest that the slope and normalization of the MZ relation in the different redshift bins are consistent with those based on the full sample at z = 4-10 (Equation ( 1)). Figure 11 indicates that there is no significant evolution in the MZ relation among our z = 4-10 sample.We note that there may be a weak trend toward lower metallicity in the highest-redshift bin, albeit with a small sample size.Curti et al. (2023b) also tentatively suggest a similar evolution.That will be further explored in the discussion.
The no/weak evolution is consistent with the predictions of some cosmological simulations, as presented and compared in Figure 11.We compile the hydrodynamic, N-body, and/or seminumerical simulations showing the MZ relation at high redshift: FIRE by Ma et al. (2016) Wilkins et al. (2023) in magenta (see also Lovell et al. 2021).We plot the theoretical predictions at z = 5, 7, and 9 for the redshift bins of z = 4-6, 6-8, and 8-10, respectively.For the FIRE curves, we extrapolate the result at z = 6 to z = 7 and 9 using their redshift evolution function, although the evolution is tiny (D =log O H 0.025 ( ) dex from z = 6 to 7, and −0.05 dex from z = 6 to 9).Similarly, the ILLUSTRISTNG predictions at z = 5, z = 7, and z = 9 are extrapolated from the results at z = 4, z = 6, and z = 6, respectively, using their redshift evolution function (D =log O H 0.08 ( ) dex from z = 4 to 5, −0.075 dex from z = 6 to 7, and −0.1 dex from z = 6 to 9).For the FIRSTLIGHT results, we refer to Langan et al. (2020) and Nakazato et al. (2023) to prove the low-mass and the massive regimes, respectively.We assume the z = 8 relation of Langan et al. (2020) in panel (c), and the z = 6 relation of Nakazato et al. (2023) in panel (a), assuming no strong evolution at z = 8-9 and z = 5-6, respectively.For the FLARES results, we adopt the stellar metallicities of only young (<10 Myr) star particles and assume the stellar and gas-phase metallicities in the region of massive-star formation are comparable.Likewise, the metallicities for the ILLUSTRISTNG and FIRSTLIGHT models are SFR weighted and mass weighted by young (<100 Myr) star particles, respectively, allowing a fair comparison with our metallicity measurements based on the nebular emission lines.On the other hand, we note that the FIRE model adopts the mass-weighted metallicity of all gas particle that belong to the ISM, and the ASTRAEUS model counts the oxygen mass in the halo without any weighting, which could result in an inequitable comparison with the observations.
Comparing the observations with the simulations in Figure 11, we find the observed MZ relation and its weak evolution over z = 4-10 are in good agreement particularly with ILLUSTRISTNG, FIRSTLIGHT, and FLARES in the mass range of M å = 10 8 -10 9.5 M e .Notably, the slope of the MZ relation found in the ILLUSTRISTNG results likely coincides with the observations.On the other hand, the simulations of FIRE and ASTRAEUS are generally suggested to underpredict metallicities except for some metal-poor galaxies such as ERO_04590 found in the highest-redshift bin at z > 8.That is probably due to their implementation of feedback, i.e., their galaxies eject too much metal and/or accreting gas is too efficient in lowering the metal content in the galaxies (Ma et al. 2016;Ucci et al. 2023).Finally, we note that the average metallicity measured for low-mass galaxies below M å < 10 8 M e can be slightly higher than the existing predictions by ∼0.2 dex.It is thus necessary to increase the sample size to determine the MZ relation in the low-mass end as well as to extend the theoretical predictions such as ILLUSTRISTNG toward lower mass to conclude the consistency and to understand better the early chemical enrichment in the lowmass systems at high redshift.

SFR-MZ Relation
The next important aspect is the SFR dependence of the MZ relation to discuss the chemical evolution, given the claim of a redshift-invariant fundamental relation between mass, metallicity, and SFR (SFR-MZ relation) out to z ∼ 2-3 (Mannucci et al. 2010;Sanders et al. 2021).There are several expressions to describe the mutual dependencies between the three quantities (Mannucci et al. 2010;Lara-López et al. 2010;Andrews & Martini 2013;Sanders et al. 2017;Curti et al. 2020;Sanders et al. 2021).In this paper, we primarily use the variable μ α originally proposed by Mannucci et al. (2010): ), with α = 0.66 being adopted here that minimizes the scatter of the local low-metallicity galaxies with a direct T e -based metallicity in the μ α -metallicity plane ( To estimate the SFRs for the JWST objects, we adopt the total, reddening-corrected Hβ luminosity for each object, as it is the best indicator for ongoing (10 Myr) star formation activity.The SFR relation of Kennicutt (1998) is adopted for the Balmer lines with a correction of the IMF to Chabrier (2003) using the conversion factor of Madau & Dickinson (2014).Accordingly, the SFR-MZ relation is examined only for the JWST objects whose reddening corrections are successfully applied and whose spectra are slit-loss corrected.The latter constraint depends on whether the object has NIRCam coverage (only in the CEERS field; Section 2.3).Among the CEERS objects lacking a slit-loss correction, we rescue 42 objects whose UV stellar emission is constrained with HST.For these objects, we translate M UV into SFR(Hβ) assuming a typical ionizing photon production efficiency as indicated at high redshift ( x =  log 25.6 0.2; ion e.g., Endsley et al. 2023;Matthee et al. 2023) and a factor of 0.44 difference between the reddening E(B−V ) for the nebular and stellar emission (Calzetti et al. 2000).The assumption of ξ ion will be revisited elsewhere (K.Nakajima et al. 2023, in preparation) using the JWST sample presented in this paper.The objects lacking a proper dust reddening correction or having just upper limits on both M å and SFR are excluded.In short, 96 out of the 147 objects are used for the SFR-MZ relation at high redshift.
The second half of Table 2 presents the average SFR values for different masses and redshifts.
In Figure 12, we show the individual and average distributions of the JWST objects on the stellar mass-SFR plane using the SFR values as derived above (i.e., mainly from Hβ).The high-redshift objects are distributed along the sequence of specific SFR (sSFR) = 10 −8 -10 −7 yr −1 .This is overall consistent with the star formation main sequence of galaxies at z = 4-7, where an sSFR of 10 −8.5 -10 −7.5 yr −1 is typically suggested (e.g., Stark et al. 2013;de Barros et al. 2014;Santini et al. 2017;Popesso et al. 2023).We note that some of our galaxies are above sSFR 10 −7.5 yr −1 , in particular in the low-mass regime.This can be because the current spectroscopically confirmed sample is partly biased toward actively star-forming systems, and/or because the sample contains higher-redshift objects at z > 7.
Figure 13(a) shows the SFR-MZ relation of the z = 4-10 galaxies and their average points as shown in Figure 10 on the μ 0.66 -metallicity plane, together with the z = 0 average relation (Equation ( 2)) and the data points of z ; 2.3 and 3.3 stacked galaxies (Sanders et al. 2021).Adopting the best-fit parameter α = 0.66 found in low-redshift metal-poor star-forming galaxies (Andrews & Martini 2013), the z = 4-10 objects interestingly fall on the same SFR-MZ relationship.The JADES+CEERS sample, notably including lower-mass galaxies than ours, is also illustrated to show a consistent SFR-MZ relation on average.
We adopt the SFR-MZ relation of Andrews & Martini (2013) at z = 0 that is compared with the relation at z = 4-10 because the sample of Andrews & Martini (2013) contains comparably low-mass, actively star-forming galaxies.Figure 14(a) clarifies the parameter space that is explored by Andrews & Martini (2013), demonstrating that the majority (83%) of the JWST objects have μ 0.66 > 7.5 and can be directly compared with the relation of Andrews & Martini (2013).Note that local extremely metal-poor galaxies with M å ∼ 10 5 -10 7 M e and + 12 log O H ( )= 7.0-7.7 are suggested to be reproduced by the chemical evolution models (Lilly et al. 2013) with the same parameters as found for the Andrews & Martini (2013) galaxies (Nishigaki et al. 2023).This evidence supports the simple extrapolation of the relation toward lower masses.On the other hand in Figure 14(b), we show another form of the SFR-MZ relation derived by Curti et al. (2020).The authors find another best-fit α = 0.55 as the best 2D projection of the SFR-MZ relation on the μ α -metallicity plane by using the global sample of stacked SDSS galaxies.The relation is determined over μ 0.55 ∼ 8.5-11.0.Because ∼85% of the JWST sample have μ 0.55 below 8.5, most of the z = 4-10 galaxies occupy the parameter space that is not explored by Curti et al. (2020).Moreover, Curti et al. (2020) clarify a different degree of SFR dependence of the MZ relation for the low-and high-sSFR subsamples in the local Universe.For the low-sSFR subsample (sSFR < 10 −9.5 yr −1 ), a weaker SFR dependence is indicated (α = 0.22).On the other hand, the high-sSFR subsample with sSFR > 10 −9.5 yr −1 presents a stronger SFR dependence with α = 0.65, in close agreement with the best-fit value found by Andrews & Martini (2013).Given the high sSFRs for the JWST objects (sSFR ∼ 10 −7 -10 −8 yr −1 ; see Figure 12) and also for the z = 2-3 galaxies (sSFR ∼ 10 −8.5 yr −1 , we conclude the strong SFR dependence of α = 0.66 as found by Andrews & Martini (2013) is preferred and adopted in this paper to be compared with the high-redshift galaxies.We discuss the evolution of the Curti et al. (2020) SFR-MZ relation later in Section 4.
Figure 13(b) clarifies the redshift evolution of the SFR-MZ relation of Andrews & Martini (2013) by showing the residual metallicity for a given μ 0.66 with respect to the z = 0 relation: .66 (Equation ( 2)) for each galaxy's redshift.In this panel, the average points are derived for the objects found in the different redshift bins, at z = 4-6, 6-8, and 8-10.
Two interesting results arise in Figure 13(b).One is that the SFR-MZ relation shows no evolution up to z ∼ 8 within Δlog O/H ; 0.3 dex.Another is that a significant decrease of metallicity is found beyond the errors at z > 8.The decrease at the highest-redshift bin is not visible in panel (a) due to the small sample size, but hinted by the MZ relation at z = 8-10 (Figure 11(c)).To examine further the evolution of the SFR-MZ relation, we plot in Figure 15 the residual metallicity as a function of stellar mass for each of the three redshift bins.According to the two average points probing the different mass ranges, no significant dependency of the residual metallicity on stellar mass is indicated in each redshift bin with the current sample.For the highest-redshift bin, the decrease of metallicity is suggested for individual galaxies regardless of their stellar masses.Although the JADES+CEERS subsample at z = 6-10 appears to exhibit a mass dependence in the residual metallicity, we suggest this would be caused by a bias introduced by the dominance of lower-redshift galaxies (i.e., z = 6-8) in the lower-mass regime.This is consistent with the MZ plane in Figure 11, where the lowest-mass point of the JADES+CEERS subsample at z = 6-10 is more consistent with our z = 6-8 relation rather than the z = 8-10 one.These results suggest that chemical properties of star-forming galaxies up to z ∼ 8 are very similar to those of local and z = 2-3 counterparts, while a break of the metallicity equilibrium state may be in place beyond z ∼ 8.More discussions follow in Section 4.
Finally, we have conducted several tests to demonstrate the robustness of the results presented in this section.In Appendix A, we examine the SFR-MZ relation using only the JWST galaxies whose stellar masses are well determined with SED fits to the NIRCam photometry and whose SFRs are obtained with the slit loss-corrected Hβ.Our conclusions remain unchanged even with the smaller sample that has good measurements of M å and SFR, although the scatter of individual data points, as seen in Figure 15 for example, becomes less prominent.In Appendix C, we further investigate the SFR-MZ relation by using only the JWST galaxies with μ 0.66 > 7.5, where the relation is explored by Andrews & Martini (2013), allowing for a more robust comparison with high-redshift galaxies to discuss their evolution.We confirm that almost the same SFR-MZ relation and its redshift evolution are obtained based on this subsample, suggesting that the extrapolation of the relation at μ 0.66 ∼ 7-7.5 for the low-mass end of galaxies in our sample does not introduce biases.Furthermore, we note that a consistent view of the SFR-MZ relation is also supported by using only galaxies that are surely diagnosed as star formation-dominated in the [N II] BPT diagram (see also Section 3.3).

Discussion and Summary
We have conducted a (re)analysis of the JWST/NIRSpec ERO data, as well as the ERS data of GLASS and CEERS, to investigate the chemical properties of galaxies at redshifts ranging from z = 4 to 10.Our analysis includes the use of 135 JWST objects based on our improved reduction and calibration of the NIRSpec data, as well as the compilation of 12 objects from the other recent JWST observations and previous literature.We confirm that our new emission-line flux measurements and errors successfully address the issues reported in the literature related to Balmer decrements and electron temperatures.The estimated electron temperatures for the four ERO objects, along with the six GLASS + CEERS objects with [O III] λ4363 at z = 4-8.5 range from T e = 1.1 × 10 4 to 2.3 × 10 4 K.These temperatures are similar to those found in lower-redshift star-forming galaxies and can be fully explained by heating by young massive stars, without the need for additional ionizing mechanisms.This is particularly evident for the z = 8.5 object, whose T e has been updated from 25,000 K to (2.08 ± 0.26) × 10 4 K.
We have determined the MZ relation for galaxies at z = 4-10 and find no significant evolution compared to z ∼ 2-3 when extrapolating to the low-mass regime.This result is consistent with the early JWST study results reported shortly after the ERO data release, except for the z = 8.5 object whose deviation from the others on the MZ relation becomes small after remeasurement of T e .Furthermore, our compilation of results does not show any significant evolution among the z = 4-10 sample.Theoretical simulations also predict a similar trend of small redshift evolution in the MZ relation at z ∼ 5-9 Ma et al. 2016;Langan et al. 2020;Ucci et al. 2023;Nakazato et al. 2023), with some simulations suggesting a potentially observable decrease in metallicity with redshift for a given mass (∼0.26 dex from z = 5 to 9; Torrey et al. 2019).Although there may be a weak evolution toward lower metallicity at z > 8, the current sample has relatively large uncertainties in metallicity (D log (O/H) ∼ 0.3 dex), which makes it difficult to distinguish fully between the different predictions of redshift evolution at high redshift (z > 4), especially for low-mass galaxies with M å < 10 8 M e .Further detailed observational and theoretical studies are needed to address the apparent discrepancy and understand the early chemical enrichment in low-mass systems at high redshift.
We have also investigated the SFR dependence of the MZ relation (SFR-MZ relation) at high redshift, finding (i) no evolution from z = 0 to z = 4-8 within D = log O H 0.3 ( ) dex, and (ii) a significant decrease of metallicity at z > 8.The former finding supports the idea that the SFR-MZ relation, also known as the fundamental metallicity relation (FMR), can indeed describe the properties of galaxies with no redshift evolution.This suggests the existence of a metallicity equilibrium state via mechanisms such as star formation, metal-poor gas inflow, and outflow (e.g., Lilly et al. 2013) that persist up to z ∼ 8.This sounds inconsistent with the potential deviation from the FMR at z  2.5 as originally indicated by Mannucci et al. (2010).The authors have suggested, based on metallicities estimated using local empirical relations, that the same SFR-MZ relation shows no redshift evolution up to z ∼ 2.5.They claimed that galaxies at z ∼ 3 fall below the relation by approximately ∼0.6 dex, with a combination of M å and SFR of α = 0.32 in μ α .However, when using T e -based metallicities from Sanders et al. (2021) and adopting a different parameterization of α = 0.66 in μ α from Andrews & Martini (2013), no clear evolution from z = 0 up to z ; 3.3 is found.Our findings, which utilize the parameterization of α = 0.66, support the claim of no average evolution of the SFR-MZ relation up to z ∼ 8.This apparent inconsistency is likely attributed to the differences in metallicity estimations.High-redshift galaxies are known to have a higher ionization state of gas, parameterized by the ionization parameter, compared to local galaxies (Figure 4; see also, e.g., Nakajima & Ouchi 2014;Sanders et al. 2023).Consequently, local empirical relations that assume a relatively low ionization parameter may result in biased metallicity estimates for high-redshift galaxies, particularly at z  2.5 where the Hα+[N II] λ6584 lines are not available in the pre-JWST era.In fact, Figure 7 demonstrates that JWST objects with a large EW(Hβ) exhibit a systematic offset from the empirical relations of Maiolino et al. (2008), which is used for metallicity estimations in Mannucci et al. (2010).Therefore, accurate metallicity estimations using the reliable direct T e method, or accounting for the ionization state evolution as prescribed in Nakajima et al. (2022) and Izotov et al. (2019Izotov et al. ( , 2021)), are essential for future metallicity studies of the high-redshift Universe with JWST.
The latter finding of the potential evolution of the SFR-MZ relation beyond z ∼ 8 is intriguing.A similar result is supported by Heintz et al. (2022) as independently illustrated with the three DDT objects in our figures (red open right-pointing triangles), and also by the recent study of Curti et al. (2023b) based on the deep JADES observations.However, in contrast to our results using the SFR-MZ relation of Andrews & Martini (2013), Curti et al. (2023b) argue that galaxies begin to deviate at z ∼ 6, if they adopt the SFR-MZ relation of Curti et al. (2020).Because we confirm that our results and the JADES study yield similar MZ and SFR-MZ relations, as evidenced by the consistency between the red and emerald green symbols in Figures 10, 11, and 13, we attribute the redshift differences where deviations occur to the different functional forms of the SFR-MZ relation defined at z = 0.This is also supported by the evolution of the SFR-MZ relation based on the Curti et al. (2020) formalism for our JWST objects, as shown in Figure 16, which indicates a consistent evolution with deviations starting at z ∼ 6 as reported by Curti et al. (2023b).While the use of a 3D SFR-MZ relation can account for the SFR dependence of the MZ relation's slope and capture the metallicity variations more accurately than a simple 2D projection using μ α as assumed in the Andrews & Martini (2013) relation (e.g., Curti et al. 2020), this is probably not an essential issue in this study.This is because the JWST objects analyzed so far present a narrow distribution of sSFRs (Figure 12; see also Figure 4 of Curti et al. 2023b).We acknowledge that extended local-baseline studies that cover low-mass and high-sSFR galaxies at z = 0 are necessary to discuss the chemical evolution, including the fundamental SFR-MZ relation, and determine when and if high-redshift galaxies begin to deviate from the relation.In this work, however, we regard that the best estimates are the results based on the SFR-MZ relation by Andrews & Martini (2013) as it provides a more appropriate comparison by covering the lowmass regime where most of the JWST objects are found (Section 3.4 and Figure 14).
If we assume that the first star formation began to occur at z = 16-27 (Harikane et al. 2022; see also, e.g., Abel et al. 2002;Bromm et al. 2002;Dayal & Ferrara 2018), galaxies at z = 8-10 would be at most ∼200-500 Myr old.During such a short  10, including the average points, which are based on the three mass ranges that are equally separated.The coefficient of 0.66 for the combination of stellar mass and SFR is as indicated by Andrews & Martini (2013) in the local Universe, down to M å ∼ 10 7.4 M e .Stacked galaxies at z = 2-3 with T e -based metallicities are also suggested to fall on the z ∼ 0 relation in this parameter space (Sanders et al. 2021).Notably, the JWST objects analyzed in this study at z = 4-10 (red) also fall on the same SFR-MZ relation on average.A consistent view is obtained with the JADES+CEERS average points at z = 3-10 (emerald green; Curti et al. 2023b).Right: (b) metallicity difference from the SFR-MZ relation of Andrews & Martini (2013; D log (O/H) = observed − predicted metallicity) for different redshift sources.The symbols are consistent with panel (a), except for the average points (large red stars), which are recalculated in this panel to represent the average D log (O/H) values for the objects found in three different redshift bins, z = 4-6, 6-8, and 8-10.For the JADES+CEERS sample, average points of three mass bins are plotted for the two subsamples at z = 3-6 and 6-10, as found in Figure 11.This plot illustrates that no evolution is seen up to z ∼ 8, but a significant decrease in metallicity is observed beyond z > 8, albeit with a small sample size in this highest-redshift bin.
Figure 14.Left: (a) similar to the left panel of Figure 13, but with the regions of μ 0.66 explored and unexplored by the original paper of Andrews & Martini (2013) highlighted.The blue solid curve represents the best-fit 2D projection of the SFR-MZ relation by Andrews & Martini (2013), determined down to μ 0.66 ∼ 7.5, with extrapolation toward lower mass shown by the dotted curve.Among the JWST objects, 83% have μ 0.66 > 7.5 and can be directly compared with the Andrews & Martini (2013) relation.Right: (b) similar to the left panel, but with a different coefficient of 0.55 adopted for the combination of mass and SFR on the abscissa axis, as presented by Curti et al. (2020).The blue solid curve represents the best-fit 2D projection of Curti et al. (2020)ʼs SFR-MZ relation, determined down to μ 0.55 ∼ 8.5, and extrapolated toward lower mass with the dotted curve.Only approximately 15% of the JWST objects have μ 0.55 > 8.5, and as such, the vast majority of the sample occupies a parameter space that is not explored by Curti et al. (2020).
timescale after the Big Bang, there could be a higher probability for galaxies at higher redshift to be in the early stages of formation, where they have not yet reached the metallicity equilibrium state via processes such as star formation, inflow, and outflow.An alternative explanation could be varying degrees of feedback processes, including inflow and outflow, in the early Universe.As evident from the comparison of cosmological simulation results in Figure 11, galaxies could exhibit lower metallicities if metals are efficiently ejected from the galaxies or diluted by gas accretion from inflow of pristine gas, as observed in simulations such as FIRE and ASTRAEUS.Furthermore, other events such as galaxy mergers and AGN activities may also play a role in modulating star formation, causing metal redistribution, and potentially influencing the SFR-MZ relationship at high redshift (e.g., Springel et al. 2005;Torrey et al. 2012;Weinberger et al. 2018).Due to the limited sample size and associated metallicity errors, particularly at z > 8, it is challenging to make definitive conclusions from the current study.Further statistical investigations of metallicity are crucial to confirm the suggested lack of evolution in the SFR-MZ relation up to z ∼ 8, followed by a decrease at higher redshift.These findings need to be rigorously compared with theoretical studies to understand better the underlying physics that govern early galaxy evolution.13, but employing the SFR-MZ relation of Curti et al. (2020).As similarly found in Figure 13, our analysis (red) and the JADES+CEERS study by Curti et al. (2023b;emerald green) suggest a consistent evolution of the SFR-MZ relation from z = 4-10 on average.If we adopt the formalism of Curti et al. (2020), which is defined at z = 0, our JWST objects begin to deviate from the SFR-MZ relation at z ∼ 6.This finding contrasts with the use of the z = 0 SFR-MZ relation of Andrews & Martini (2013), which suggests deviations starting at z ∼ 8 (Figure 13(b)).It is worth noting, however, that a simple extrapolation of the relation toward lower masses and higher SFRs is assumed for the majority of the JWST objects (see Figure 14(b)).
#2736 (ERO).The authors acknowledge the teams of JWST commissioning, ERO, GLASS, UNCOVER, and CEERS for developing their observing programs with a zero-exclusiveaccess period.Moreover, this work is based in part on observations taken by the CANDELS Multi-Cycle Treasury Program with the NASA/ESA HST, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555.This paper is supported by World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan, as well as the joint research program of the Institute of Cosmic Ray Research (ICRR), the University of Tokyo.This work is supported by KAKENHI (JP19H00697, JP20H00180, and JP21H04467) Grant-in-Aid for Scientific Research (A) through the Japan Society for the Promotion of Science.In addition, K.N. acknowledges support from JSPS KAKENHI Grant JP20K22373.Y.I. is supported by JSPS KAKENHI Grant JP21J20785, and also acknowledges funding from the Hayakawa Satio Fund awarded by the Astronomical Society of Japan.Y.H. acknowledges support from JSPS KAKENHI Grant JP21K13953.

Appendix A MZ and SFR-MZ Relations Based Only on Galaxy
Properties Derived from SED Fitting to NIRCam Photometry In the main text, we present the MZ and the SFR-MZ relations for the full JWST sample with the NIRSpec spectra.Out of the 135 JWST objects with metallicity measurements, 81 objects have NIRCam imaging data, and their stellar masses M å are derived based on SED fitting to the NIRCam photometry.The remaining 54 objects, all of which are in the CEERS field, are not covered by NIRCam, and their M å values are empirically estimated using UV luminosities from HST photometry (Section 2.3).Additionally, their SFRs are estimated from M UV with some assumptions for comparison with SFR(Hβ) (Section 3.4).We note that 23 out of the 54 objects are not detected by HST and have only upper limits on M å and SFR.Since objects with only an upper limit on M å are not considered when deriving the average MZ relations and best fit, there are practically 31 objects (i.e., = 54 -23), accounting for ∼28% of the sample, which may introduce biases due to different methods used for M å and SFR estimation.In this appendix, we present the main figures with only the JWST objects having NIRCam photometry and properties derived from SED fitting, in order to examine how the main results can be affected by excluding the 31 objects with less certain estimations of M å and SFR.
Figure A1 presents the MZ relation using only the JWST objects with M å based on NIRCam photometry.Their average points are shown with the inverted purple stars.The red stars and the gray long-dashed line represent the average relations based on the full sample of the JWST objects as shown in Figure 10.We confirm that the difference between the two is negligibly small, and the MZ relation based only on the JWST objects with reliable M å estimates is fully consistent with that based on the full sample, within the uncertainties.
Likewise, Figures A2, A3, and A4 are regenerated plots of Figures 11, 13, and 15 from the main text, respectively, using only the JWST objects with NIRCam photometry.These figures confirm that our results remain unchanged when adopting the subsample, indicating that no clear biases are introduced by using the 31 objects with M å and SFR estimated in a different manner.We have also checked the consistency of M å between the two methods by comparing the estimates from the 81 CEERS objects with both measurements, and found that ∼90% of them have consistent M å values at the 2σ level.The remaining eight objects show overestimated M UV -based M å values compared to the NIRCam-based M å values at >2σ level.Although such a small fraction of outliers may exist among the objects lacking NIRCam photometry and introduce additional scatter in the MZ and Figure 11(a) versus Figure A2(a) at the high-mass region), the impact is minor as demonstrated in this appendix.

Appendix B Comparing with Empirically Estimated Metallicity Relations at Low Redshift Using Strong Line Indicators
In the main text we refer to Andrews & Martini (2013) and Sanders et al. (2021) to compare our results with the MZ relations at lower redshift.This is because their metallicities are reliably determined with the direct T e method.Moreover, the z = 0 relation of Andrews & Martini (2013) is explored down to M å ∼ 10 7.4 M e and can be directly compared with most of the JWST objects.In this appendix, we also present comparisons with the other MZ relations at low redshift whose metallicities are empirically estimated with the strong line indicators.
Figure B1 shows the strong line-based MZ relation at z  3 in addition to our results (the average points and the best fit) and the T e -based MZ relation at z = 0-3 (Andrews & Martini 2013;Sanders et al. 2021) as in Figure 10.The strong line-based MZ relations include those for z ∼ 0.7 galaxies (Savaglio et al. 2005), z ∼ 1-2 (Papovich et al. 2022), z ∼ 2-3 (Erb et al. 2006;Steidel et al. 2014), and z ∼ 3-4 (Troncoso et al. 2014; see also Maiolino et al. 2008;Mannucci et al. 2009;    Onodera et al. 2016).The results by Savaglio et al. (2005) and Erb et al. (2006) are revisited by Maiolino et al. (2008) and their metallicities are remeasured.We also apply corrections to the mass scale to have the same Chabrier (2003) IMF.For the z = 2-3 galaxies the [N II]/Hα index is mainly used for the metallicity estimations, while for the other redshift studies combinations of [O II], [O III], and Hβ (i.e., R23 index and O32 index in practice) are utilized, in addition to [Ne III] when available.Although there are some differences in the MZ relations between the different studies and methods for a similar redshift, several factors account for the scatter, such as different SFR activities for the different samples and variations of the metallicity indicators.Despite the slight differences, these overall confirm a redshift evolution of MZ relation, i.e., a decreasing trend of metallicity in the mass range of 10 9 -10 11 M e .Comparing with these strong line-based MZ relations, our main conclusion does not change that the JWST objects at z = 4-10 are distributed along the simple extrapolation of the MZ relation at z ∼ 2-3 toward lower mass.

Appendix C Evolution of the SFR-MZ Relation without the Low-mass
End of Galaxies Figure 14(a) highlights that the majority of the JWST sample (83%) has μ 0.66 > 7.5 and is directly compared with the lowerredshift galaxies in the SFR-MZ relation of Andrews & Martini (2013).Still, we rely on the simple extrapolation of the relation toward lower μ 0.66 for the remaining galaxies and discuss the evolution of the SFR-MZ relation together with the direct comparison of galaxies with μ 0.66 > 7.5.In this appendix, we present how the low-mass end of galaxies with μ 0.66 < 7.5 can impact the view of the evolution of the SFR-MZ relation.
Figure C1 shows the redshift evolution of the SFR-MZ relation by only using the JWST objects having μ 0.66 > 7.5.Their averages are represented by the inverted blue stars.As a comparison, the average relations based on the full sample are shown with the red stars, as presented in Figure 13, which are positioned almost behind the blue stars.This confirms that the same results arise from the subsample as discussed in the main text, i.e., (i) the same Andrews & Martini (2013) relation between mass, metallicity, and SFR can explain the properties of galaxies up to z ∼ 8, and (ii) a deficit in metallicity is indicated at z > 8.We therefore conclude that there is no critical impact of the low-mass end of galaxies and the extrapolation of the relation at least in the range μ 0.66 ∼ 7-7.5 for the discussion of the evolution of the SFR-MZ relation.

Appendix D Summary of the Properties of the JWST Objects in a
Tabular Format In Table D1, we have compiled the essential properties of the complete sample of JWST objects presented in this paper, based on our improved reduction and calibration of the NIRSpec data from ERO, ERS GLASS, and ERS CEERS.

Figure 1 .
Figure1.Observed NIRSpec spectra for the five z = 5-8.5 ERO objects.For each object: (top) 2D spectra for the six individual exposures (three nods each in o007 and o008); (bottom) 1D composite spectrum.We generate the 1D composite spectrum by first median stacking the individual 2D spectra and then extracting the 1D spectrum with a small aperture (see text).This procedure allows us to minimize the possible impacts of hot pixels and to rescue some individual 2D frames where the spectrum is partly out of the slit (e.g., two nod2 frames of 10612).

Figure 2 .
Figure 2. Balmer emission-line ratios of Hγ/Hβ and Hδ/Hβ for the ERO objects (04590 in red, 05144 in emerald green, 06355 in orange, and 10612 in green) are shown in the main panel.The colored filled circles represent the new emission-line flux measurements and errors with our improved reduction, while the other symbols show the measurements based on the early data release, as shown in the legend (Arellano-Córdova et al. 2022; Curti et al. 2023a; Rhoads et al. 2023).For the measurements of Arellano-Córdova et al. (2022), we adopt the average values based on the two spectra (o007 and o008) for 06355 and 10612, but refer only to the o008 value for 04590.For those with line ratios outside the ranges of the plot, we place them close to the edge of the plot with an arrow to indicate the direction toward the actually reported line ratio.The lines show the expected Balmer decrements as a function of E(B−V ) for the three different attenuation curves (Cardelli et al. 1989 with dashed; Calzetti et al. 2000 with dotted; and SMC from Gordon et al. 2003 with dotted-dashed) and the two different electron temperatures (T e = 10,000 K in gray and 25,000 K in magenta), although the different attenuation curves and T e values have a small impact on this diagram.The deviation of the Hδ/Hβ ratio from the theoretically expected value based on the Hγ/Hβ ratio (ΔHδ/Hβ = ((Hδ/ Hβ) obs − (Hδ/Hβ) expected ) normalized by (Hδ/Hβ) expected ) is highlighted in the upper panel, and that of the Hγ/Hβ ratio based on the Hδ/Hβ ratio in the left panel, for each symbol as in the main panel.The corresponding E(B−V ) values are printed in these subpanels, where the Calzetti et al. (2000) attenuation curveand T e =17,500 K are adopted for reference.Our measurements all follow the sequence of the expected Balmer decrements within the uncertainties, while quite a few of the measurements from the earlier studies show the ratios that are not physically and/or consistently explainable.

Figure 3 .
Figure 3. Redshift distribution of the high-redshift galaxy samples used in this work.The red histogram presents the JWST sample constructed in this study using our analyzed ERO, GLASS, and CEERS spectra.The blue histogram shows the samples compiled from the literature (see Section 3.2.3).The sum of these samples, represented by the gray histogram, is used to investigate the MZ relationship at z = 4-10 in this work.

Figure 4 .
Figure 4. O32 vs. R23 diagram comparing the JWST sample at z = 4-9, analyzed in this work, with lower-redshift samples at z = 2-3.The JWST objects are shown as red filled symbols, with circles representing ERO data, pentagons representing GLASS data, and diamonds representing CEERS data.Objects with metallicity measurements obtained using the direct T e method are indicated with red open circles.The gray open symbols represent the averages of lower-redshift samples of continuum-selected galaxies(Nakajima & Ouchi 2014;Troncoso et al. 2014;Sanders et al. 2016;Onodera et al. 2016;Strom et al. 2017) and Lyα emitting galaxies (LAEs;Nakajima & Ouchi 2014;Erb et al. 2016;Nakajima et al. 2020), as indicated in the legend.Arrows indicate 3σ lower limits.Gray shading illustrates the equivalent distribution for nearby SDSS galaxies.

Figure 5 .
Figure 5.The [N II] BPT diagram.Red symbols represent the JWST objects, as shown in Figure 4, whose Hα+[N II] lines are covered in the NIRSpec spectra.Arrows indicate 3σ limits.Two popular demarcation curves between AGNs and star-forming galaxies, from Kewley et al. (2001; dot long dashed; "Kew01") and Kauffmann et al. (2003; short dash long dashed; "Kau03"), are also shown.None of the objects fall above the demarcation curves significantly beyond the measurement uncertainties.
[O III]  and [Ne III] stronger (and low-ionization lines such as [O II] weaker) than expected from the local empirical relationships as defined byMaiolino et al. (2008) andCurti et al. (2017Curti et al. ( , 2020)).Rather, they present a fairly good agreement with the relationships proposed byBian et al. (2018)

Figure 6 .
Figure 6.Comparison of literature determinations of electron temperature (T e ) for four ERO galaxies (04590 in red, 06355 in orange, 10612 in green, and 05144 in emerald green) where [O III] λ4363 is identified.Different symbols indicate different publications, as shown in the legend.For the measurements of Arellano-Córdova et al. (2022), we show the average values based on the two spectra (i.e., two observing blocks: o007 and o008) for 06355 and 10612, but refer only to the o008 value for 04590.For each object, the symbols are slightly offset in the horizontal direction for display purposes.Some of the measurements in the literature do not present uncertainties and are plotted without error bars in the vertical direction.

Figure 7 .
Figure7.Comparison of the metallicity relationships with strong line ratios (R23, R3, R2, O32, and Ne3O2 from top left to bottom right) for ERO (circles), GLASS (pentagons), and CEERS (diamonds) galaxies with metallicity measurements based on the direct T e method.Each data point is color coded by its EW(Hβ) value, if available, as shown in the legend.The black dashed, long dashed, and dotted-dashed curves represent the metallicity relationships fromMaiolino et al. (2008),Curti et al. (2017Curti et al. ( , 2020)), andBian et al. (2018), respectively.The blue and orange curves show the functions derived inNakajima et al. (2022)  for low-metallicity galaxies, with blue representing high-ionization galaxies (EW(Hβ) > 200 Å) and orange representing low-ionization galaxies (EW(Hβ) < 100 Å).Each curve is shown only within the metallicity range explored in the original paper.
, and report + 12 log O H ( )= 7.65 ± 0.07 and <7.51 (1σ), respectively.Wang et al. (2022) independently suggest a consistent metallicity for RXJ-z8152, although the ionization correction is not fully taken into account.We use the properties, including metallicities, that are calculated by averaging the values from Williams et al. (2023), Heintz et al. (2022), and Langeroodi et al. (2022) for the three DDT objects in our compilation.

Figure 9 .
Figure 9.Comparison between the MZ relation obtained in this study for ERO objects at z = 6.3-8.5 using T e -based metallicities (colored circles) and those reported in the literature.Different symbol colors indicate different objects, as shown in Figure 6, and different symbols represent different publications, as indicated in the legend.For the metallicities from Arellano-Córdova et al. (2022), we show the average values based on the two spectra (i.e., two observing blocks: o007 and o008) for 06355 and 10612, but refer only to the o008 value for 04590.For the relationships of Schaerer et al. (2022) and Curti et al. (2023a), the stellar masses are adopted as originally estimated in each paper.For the others where stellar masses are referenced from other studies, we show the two stellar masses from Carnall et al. (2023) and Tacchella et al. (2023) for each object (only Carnall et al. 2023 for 05144).The comparison confirms the good agreement between our measurements and earlier results on the MZ relation.
our MZ relation with those presented in the earlier studies(Arellano-Córdova et al. 2022;Schaerer et al. 2022;Curti et al. 2023a;Trump et al. 2023;Rhoads et al. 2023;Brinchmann 2023).The metallicities are all derived based on the direct T e method as summarized in Figure6.The stellar masses are corrected for different IMFs to have the sameChabrier (2003) IMF.Schaerer et al. (2022) andCurti et al. (2023a) derive the masses in their own papers, while the others refer to eitherCarnall et al. (2023) orTacchella et al. (2023) and hence both values are shown in the plot.We confirm that our MZ relation for the four ERO objects overall shows good agreement with the earlier results.One note is that we confirm the metallicity of ERO_04590 is not extremely low, + 12 log O H ( )= 7.26 (+0.15/−0.13)for its stellar mass.Figure10illustrates the relation between stellar mass and metallicity (MZ) with the full sample at z = 4-10.The MZ relation determined at z = 0, ;2.3, and ;3.3 with the direct T e method is also plotted(Andrews & Martini 2013;Sanders et al. 2021; see alsoNishigaki et al. 2023 that the T e -based MZ relation at z = 0 continues decreasing down to M å ∼ 10 5 M e ).
Figure10.The relationship between stellar mass and metallicity.The red points represent galaxies at z = 4-10, including the filled circles, pentagons, and diamonds, which respectively represent the ERO, GLASS, and CEERS objects analyzed in this paper.Red circles denote galaxies whose metallicities are determined using the direct T e method.The large red stars represent the average relationship for the ERO, GLASS, and CEERS objects in three equally separated mass ranges (M å = 10 7 -10 8 , 10 8 -10 9 , and 10 9 -10 10 M e ), along with the best-fit function shown as a thick long-dashed line (Equation (1)).Emerald green open pentagons show the average relation at z = 3-10 based on the JADES+CEERS sample(Curti et al. 2023b).Other red symbols represent high-redshift objects compiled from the literature, including open right-pointing triangles for NIRSpec DDT objects(Williams et al. 2023;Heintz et al. 2022;Langeroodi et al. 2022), left-pointing triangles for four NIRCam objects(Sun et al. 2022a(Sun et al. , 2022b)), an open hexagram for the stacked EIGER object(Matthee et al. 2023), and crosses for ALMA objects(Jones et al. 2020).Additionally, the relationship at lower redshift is displayed, including SDSS stacked galaxies in the local Universe shown as a gray dashed curve(Andrews & Martini 2013), and those at z ∼ 2.3 with gray open triangles and ∼3.3 with gray open squares (best fits shown as gray dotted-dashed curves;Sanders et al. 2021).The curves are displayed in the mass ranges explored in the original papers.These low-redshift metallicities are based on the direct T e method.

Figure 11 .
Figure 11.The MZ relation in three different redshift bins: (a) z = 4-6, (b) z = 6-8, and (c) z = 8-10.The red symbols and the best-fit function (Equation (1)) are as shown in Figure 10.The large stars represent the average MZ relation rederived in each redshift bin, by splitting the sample into two groups based on stellar mass to have equal numbers of galaxies.For the JADES +CEERS relation (Curti et al. 2023b in emerald green), we plot their z = 3-6 subsample relation in panel (a), while adopt the z = 6-10 relation in panels (b)and (c).In addition, the cosmological simulation results at z = 5, 7, and 9 are displayed in panels (a), (b), and (c), respectively: FIRE in black(Ma et al. 2016), ILLUSTRISTNG in green(Torrey et al. 2019), FIRSTLIGHT in blue(Langan et al. 2020 in the low-mass regime with solid curves; andNakazato et al. 2023 in the high-mass regime with shades), ASTRAEUS in yellow(Ucci et al. 2023), and FLARES in magenta(Wilkins et al. 2023).Some extrapolations are applied, as detailed in the text.

Figure 12 .
Figure 12.The relationship between stellar mass and SFR, with the SFRs derived based on the total Hβ luminosity (as described in the text) for use in the SFR-MZ relation.The symbols used are consistent with those in Figure 10.The background lines represent sSFR values ranging from 10 −9 to 10 −6 yr −1 , from bottom to top.The JWST objects are distributed along the sequence of sSFR values around 10 −8 -10 −7 yr −1 .

Figure 13 .
Figure 13.Left: (a) SFR dependence of the MZ (SFR-MZ) relation.The symbols and curves are consistent with those in Figure 10, including the average points, which are based on the three mass ranges that are equally separated.The coefficient of 0.66 for the combination of stellar mass and SFR is as indicated by Andrews & Martini (2013) in the local Universe, down to M å ∼ 10 7.4 M e .Stacked galaxies at z = 2-3 with T e -based metallicities are also suggested to fall on the z ∼ 0 relation in this parameter space (Sanders et al. 2021).Notably, the JWST objects analyzed in this study at z = 4-10 (red) also fall on the same SFR-MZ relation on average.A consistent view is obtained with the JADES+CEERS average points at z = 3-10 (emerald green; Curti et al. 2023b).Right: (b) metallicity difference from the SFR-MZ relation of Andrews & Martini (2013; D log (O/H) = observed − predicted metallicity) for different redshift sources.The symbols are consistent with panel (a), except for the average points (large red stars), which are recalculated in this panel to represent the average D log (O/H) values for the objects found in three different redshift bins, z =4-6, 6-8, and 8-10.For the JADES+CEERS sample, average points of three mass bins are plotted for the two subsamples at z = 3-6 and 6-10, as found in Figure11.This plot illustrates that no evolution is seen up to z ∼ 8, but a significant decrease in metallicity is observed beyond z > 8, albeit with a small sample size in this highest-redshift bin.

Figure 15 .
Figure 15.Metallicity difference from the SFR-MZ relation of Andrews & Martini (2013) as a function of stellar mass in three different redshift bins: (a) z = 4-6, (b) z = 6-8, and (c) z = 8-10.The symbols are consistent with those in Figure 11.No clear dependence of D log (O/H) on stellar mass is evident beyond the statistical errors.

Figure 16 .
Figure16.Similar to the right panel of Figure13, but employing the SFR-MZ relation ofCurti et al. (2020).As similarly found in Figure13, our analysis (red) and the JADES+CEERS study byCurti et al. (2023b; emerald green)   suggest a consistent evolution of the SFR-MZ relation from z = 4-10 on average.If we adopt the formalism ofCurti et al. (2020), which is defined at z = 0, our JWST objects begin to deviate from the SFR-MZ relation at z ∼ 6.This finding contrasts with the use of the z = 0 SFR-MZ relation ofAndrews & Martini (2013), which suggests deviations starting at z ∼ 8 (Figure13(b)).It is worth noting, however, that a simple extrapolation of the relation toward lower masses and higher SFRs is assumed for the majority of the JWST objects (see Figure14(b)).

Figure A1 .
Figure A1.Similar to Figure 10, but excluding the individual JWST objects whose stellar masses are empirically derived, so all the plotted data points have stellar masses that are derived based on SED fitting to NIRCam photometry.The average of the JWST objects plotted on this figure are represented by purple inverted stars.As a comparison, the average relations based on the full sample are shown with the red stars and the gray long-dashed line, as presented in Figure 10, which are positioned almost behind the purple inverted stars.This confirms that our results remain unchanged when adopting the subsample with certain estimations of M å .

Figure A3 .
Figure A3.Similar to Figure 13, but excluding the individual JWST objects without JWST/NIRCam photometry.The average relations of the JWST objects in each panel are represented by purple inverted stars.

Figure A2 .
Figure A2.Similar to Figure 11, but excluding the individual JWST objects without JWST/NIRCam photometry.The average relations of the JWST objects in each panel are represented by purple inverted stars.

Figure A4 .
Figure A4.Similar to Figure 15, but excluding the individual JWST objects without JWST/NIRCam photometry.The average relations of the JWST objects in each panel are represented by purple inverted stars.In the z = 8-10 panel, only a single average point is displayed due to the limited sample size.

Figure B1 .
Figure B1.Similar to Figure 10, but excluding the individual JWST objects, and instead focusing on comparisons with low-redshift MZ relations based on empirically estimated metallicities using strong line indicators, as indicated in the legend.Each MZ relation is depicted in the mass range explored in the original paper.

Figure C1 .
Figure C1.Similar to the right panel of Figure13, but excluding the individual JWST objects with μ 0.66 < 7.5.The average relations of the subsample are denoted with the blue inverted stars.As a comparison, the average relations based on the full sample are shown with the red stars, as presented in Figure13, which are positioned almost behind the blue inverted stars.This confirms that our results on the redshift evolution of the SFR-MZ relation remains unchanged when considering only objects with μ 0.66 > 7.5.

Table 1
Summary of JWST Objects Studied with the Direct T e Method . The temperature of the O + zone, T e ([O II]), is extrapolated from T e ([O III]) employing the prescription of Izotov et al. (2006; see Brinchmann 2023 for a different assumption).We derive the abundance of O 2+ /H + with the fluxes of [O III] λλ4959, 5007 to Hβ and T e ([O III]), and O + /H + with the [O II] to Hβ reddening-corrected ratio and T e ([O II] in black, ILLUSTRISTNG by Torrey et al. (2019) in green, FIRSTLIGHT by Langan et al. (2020) and Nakazato et al. (2023) in blue (see also Ceverino et al. 2017), ASTRAEUS by Ucci et al. (2023) in yellow, and FLARES by

Table 2
Average Values of the Stellar Mass, Metallicity, and SFR for our JWST Sample Note.The first half of this table displays the average masses and metallicities of the samples used for the the MZ relation, while the second half shows the samples used for the SFR-MZ relation.The second group of samples excludes objects whose SFRs are not measured or constrained (Section 3.4), resulting in a smaller sample size compared to the first half.

Table D1
Summary of the Full JWST Sample Analyzed in this Paper