Early Results from GLASS-JWST. XXIV. The Mass–Metallicity Relation in Lensed Field Galaxies at Cosmic Noon with NIRISS*

We present a measurement of the mass – metallicity relation ( MZR ) at Cosmic Noon, using the JWST near-infrared wide-ﬁ eld slitless spectroscopy obtained by the GLASS-JWST Early Release Science program. By combining the power of JWST and the lensing magni ﬁ cation by the foreground cluster A2744, we extend the measurements of the MZR to the dwarf mass regime at high redshifts. A sample of 50 galaxies with several emission lines is identi ﬁ ed across two wide redshift ranges of z = 1.8 – 2.3 and 2.6 – 3.4 in the stellar mass range of  * Î ( ) [ ] M M log 6.9, 10.0 . The observed slope of MZR is 0.223 ± 0.017 and 0.294 ± 0.010 at these two redshift ranges, respectively, consistent with the slopes measured in ﬁ eld galaxies with higher masses. In addition, we assess the impact of the morphological broadening on emission line measurement by comparing two methods of using 2D forward modeling and line pro ﬁ le ﬁ tting to 1D extracted spectra. We show that ignoring the morphological broadening effect when deriving line ﬂ uxes from grism spectra results in a systematic reduction of ﬂ ux by ∼ 30% on average. This discrepancy appears to affect all the lines and thus does not lead to signi ﬁ cant changes in ﬂ ux ratio and metallicity measurements. This assessment of the morphological broadening effect using JWST data presents, for the ﬁ rst time, an important guideline for future work deriving galaxy line ﬂ uxes from wide-ﬁ eld slitless spectroscopy, such as Euclid, Roman, and the Chinese Space Station Telescope. Uni ﬁ ed Astronomy Thesaurus concepts: Strong gravitational lensing ( 1643 ) ; Galaxy photometry ( 611 ) ; Galaxy spectroscopy ( 2171 ) ; Dwarf galaxies ( 416 ) ; High-redshift galaxies ( 734 ) ; Abell clusters ( 9 ) ; Metallicity ( 1031 )


INTRODUCTION
Nearly all elements heavier than Helium (referred to as metals in astronomy) are synthesized by stellar nuclear reactions, making them a good tracer of star formation activity across cosmic time.Star formation rate (SFR) and metal enrichment peak at the "Cosmic Noon" epoch z ∼ 2 (Madau & Dickinson 2014, Fig.9), confirmed by a census of deep surveys with Hubble Space Telescope (HST), the Sloan Digital Sky Survey (SDSS), and other facilities.Metals are thought to be expelled into the interstellar/intergalactic medium (ISM/IGM) by stellar explosions such as supernovae and stellar winds.The cumulative history of the baryonic mass assembly, e.g., star formation, gas accretion, mergers, feedback, and galactic winds, altogether governs the total amount of metals remaining in gas (Finlator & Davé 2008;Davé et al. 2012;Lilly et al. 2013;Dekel & Mandelker 2014;Peng & Maiolino 2014).Therefore, the elemental abundances provide a crucial diagnostic of the past history of star formation and complex gas movements driven by galactic feedback and tidal interactions (Lilly et al. 2013;Maiolino & Mannucci 2019).Since detailed abundances are not directly measurable at extragalactic distances, the relative oxygen abundance (number density) compared to hydrogen in ionized gaseous nebulae (reported as 12 + log(O/H)), is often chosen as the observational proxy of metallicity for simplicity.
Several scaling relations have been established, characterizing the tight correlations between various physical properties of star-forming galaxies, e.g., stellar mass (M * ), metallicity Z, SFR, luminosity, size, and morphology (see Kewley et al. 2019;Maiolino & Mannucci 2019, for recent reviews).Metallicity abundance evolution was found to exhibit a strong correlation with mass during galaxy evolution history (Davé et al. 2011;Lu et al. 2015b).The mass-metallicity relation (MZR), has been quantitatively established in the past two decades in both the local (Tremonti et al. 2004;Zahid et al. 2012;Andrews & Martini 2013, mainly from SDSS), and the distant universe out to z ∼ 3 (Erb et al. 2006;Maiolino et al. 2008;Zahid et al. 2011;Henry et al. 2013bHenry et al. , 2021;;Sanders et al. 2015Sanders et al. , 2021)).Recently, the launch of JWST has enabled the measurement of the MZR out to z ∼ 8 (e.g., Arellano-Córdova et al. 2022;Schaerer et al. 2022;Trump et al. 2023;Rhoads et al. 2023;Curti et al. 2023a,b;Nakajima et al. 2023;Sanders et al. 2023; Matthee † ARC DECRA Fellow et al. 2023).The slope of the MZR is sensitive to the properties of outflows (e.g., mass loading factor, gas outflow velocity), which are a crucial ingredient to galaxy evolution models (see Davé et al. 2012;Lu et al. 2015a;Henry et al. 2021).The MZR slope has also been used to reveal trends in how the star formation efficiency and galaxy gas mass fraction depend on stellar mass (Baldry et al. 2008;Zahid et al. 2014).Mannucci et al. (2010) first suggested a so-called fundamental metallicity relation (FMR), which aims to explain the scatter and redshift evolution of the MZR by introducing the SFR as an additional variable, creating a 3-parameter scaling relation.The FMR has a small intrinsic scatter of ∼ 0.05 dex in metallicity, making it possible to trace the metal production rates in stellar within cosmological time (Finlator & Davé 2008).Moreover, spatially resolved chemical information encoded by the metallicity radial gradients (Jones et al. 2015b;Wang et al. 2017Wang et al. , 2019Wang et al. , 2020;;Wang et al. 2022a;Franchetto et al. 2021), is a sensitive probes of baryonic assembly and the complex gas flows driven by both galactic feedback and tidal interactions.
The Near-infrared Imager and Slitless Spectrograph (NIRISS; Willott et al. 2022) onboard the JWST now enables a tremendous leap forward with its superior sensitivity, angular resolution, and longer wavelength coverage compared to HST/WFC3.This allows metallicity measurements with better precision in galaxies with lower stellar mass at the cosmic noon epoch 1 < z < 3. Similar measurements have been done using data from NIRSpec gratings (e.g., Shapley et al. 2023;Curti et al. 2023b), NIRSpec prism (Langeroodi et al. 2023), NIRCam WFSS (Matthee et al. 2023), and NIRISS (Li et al. 2022).This paper takes advantage of the deep NIRISS spectroscopy acquired by the Early Release Science (ERS) program GLASS-JWST (ID ERS-13241 ; Treu et al. 2022) in the field of the galaxy cluster Abell 2744 (A2744).By exploiting the gravitational lensing magnification produced by the foreground A2744 cluster, we are able to extend the measurement of the MZR down to 10 7 solar mass M ⊙ .
In this paper, we present a measurement of the MZR using the NIRISS and NIRCam data from a sample of 50 lensed field galaxies in a low mass range at z ∼ 2 − 3.In Sect.2, we describe the data acquisition and galaxy sample analyzed in this work.In Sect.3, we demonstrate our method to extract metallicity and stellar mass for both individual galaxies and their stacked spectrum.The main goal of this work is to present our MZR measurements in Fig. 5.We discuss the results in Sect. 4 and summarize the main conclusions in Sect. 5.The AB magnitude system, the standard concordance cosmology (Ω m = 0.3, Ω Λ = 0.7, H 0 = 70 km s −1 Mpc −1 ), and the Chabrier (2003) Initial Mass Function (IMF) are adopted.The metallic lines are denoted in the following manner, if presented without wavelength:

OBSERVATION DATA
We use the joint JWST NIRISS and NIRCam data targeting the A2744 lensing field cluster.The NIRISS data are used to estimate the metallicity through modeling of emission line flux ratios, while the NIRCam data are used to calculate the stellar mass through Spectral Energy Distribution (SED) Fitting.
The spectroscopy data from JWST/NIRISS of GLASS-ERS (program DD-ERS-1324, PI: T. Treu), with the observing strategy described by Treu et al. (2022), is reduced in Paper I (Roberts-Borsani et al. 2022a).Briefly, the core of the A2744 cluster (130" × 130") was observed for ∼ 18.1 hr with NIRISS wide-field slitless spectroscopy and direct imaging for ∼ 2.36 hr in three filters (F115W, F150W, and F200W)2 on June 28-29, 2022 and July 07, 2023.The total exposure times for the majority of sources in each of these three bands amount to 5.4, 5.7, 2.9 hours (as detailed in Fig. 1).This provides low-resolution R := λ/∆λ ∼ 150 spectra of all objects in the field of view with continuous wavelength coverage from λ ∈ [1.0, 2.2] µm.This includes the strong restframe optical emission lines .8, 3.4], and Hα, [S ii] at z ∈ [1.8, 2.3]3 .Spectra are taken at two orthogonal dispersion angles (using the GR150C and GR150R grism elements), which helps to minimize the effects of contamination by overlapping spectral traces.

MEASUREMENTS
In this section, we present the measurements of the physical properties derived from spectroscopy and photometry, with the result of 50 individual galaxies shown in Tab.A1.
Quantities (e.g., the stellar mass M * and SFR) that are derived from a single flux must be corrected for the modest gravitational lensing magnification by the foreground A2744 cluster.But properties that are derived from flux ratio (e.g., metallicity Z) or other observed quantities, are independent of lensing magnification.We adopt our latest high-precision, JWST-based lensing model (Bergamini et al. 2023a,b) to estimate the lensing magnification µ.We do not consider the uncertainty of µ because the relative error is only ∼ 2.3%.The median estimate of µ is consistent but more precise with the calculation derived from the public Hubble Frontier Fields (HFF) lensing tool5 (Lotz et al. 2017) using Sharon & Johnson version (Johnson et al. 2014) and the CATS version (Jauzac et al. 2015) computed by Lenstool software6 (Petri 2016).

Grism Redshift and Emission-line Flux
We utilize the Grism Redshift and Line Analysis software Grizli (Brammer 2023) to reduce NIRISS data using the standard JWST pipeline (version 1.11.1) and the latest reference file (under the jwst_1100.pmapcontext).The detailed procedures are largely described in Roberts-Borsani et al. (2022b).Briefly, Grizli analyzes the paired direct imaging and grism exposures through forward modeling, and yields contamination subtracted 1D & 2D grism spectra, along with the best-fit spectroscopic redshifts.
For each source, the one dimensional (1D) spectrum is constructed using a linear superposition of a spectra from a library consisting of four sets of empirical continuum spectra covering a range of stellar population ages (Brammer et al. 2008;Erb et al. 2010;Muzzin et al. 2013;Conroy & van Dokkum 2012) and Gaussian-shaped nebular emission lines at the observed wavelengths given by the source redshift.The intrinsic 1D spectrum and the spatial distribution of flux mea- sured in the paired direct image are utilized to generate a 2D model spectrum based on the grism sensitivity and dispersion function, similar to the "fluxcube" model produced by the aXe software (Kümmel et al. 2009).This 2D forwardmodeled spectrum is then compared to the observation by Grizli and a global χ 2 calculation is performed to determine the best-fit superposition coefficients for both the continuum templates and Gaussian amplitudes, the latter of which correspond to the best-fit emission line fluxes.In this way, our 2D forward modeling practice not only determines the source redshift, but also measures the emission line fluxes, taking into account the morphological broadening effect.We refer the interested readers to Appendix A of Wang et al. (2019), for the full descriptions of the redshift fitting procedure.
We obtain a parent sample of 4756 sources with F150W apparent magnitudes between [18,32] ABmag (the 5σ depth is 28.7 according to Treu et al. (2022)), on which our Grizli analyses result in meaningful redshift constraints.Several goodness-of-fit criteria are implemented to ensure the reliability of our redshift fit: a reduced chi-square close to 1 (χ 2 < 2.2), a sharply peaked posterior of the redshift (∆z) posterior /(1 + z peak ) < 0.002, high evidence of Bayesian information criterion compared to polynomials (BIC > 100).As a result, there are 348 sources in the redshift range z ∈ [0.05, 10], with secure grism redshift measurements according to the above joint selection criteria.A total of 86 sources with secure grism redshifts are at redshifts z ∈ ) are used for selection, to avoid potential metallicity bias.Then we visually inspect the 1D spectra of each galaxy individually, excluding 7 of those that are heavily contaminated.The 50 galaxies showing prominent nebular emission features, with 0 possible AGNs exclusion in Sect.3.4, will make up the final sample presented in Tab.A1.A 'textbook case' of our samples (ID: 05184 in Tab.A1) has been carefully studied through spatial mapping in our recent work (Wang et al. 2022b).We show as an example 1D/2D spectra for six galaxies in our sample in Fig. 1, annotated with their exposure times, best-fit grism redshifts, and stellar masses (which will be discussed in Sect.3.3).
Since the 1D grism spectra are extracted by Grizli simultaneously, it allows us to directly fit it using several 1D Gaussian profiles to obtain line fluxes and errors, as detailed in Sect.3.5.But we still use the previous 2D flux other than 1D as our default result for subsequent calculations.The comparison of the line flux measurements between this 1D line profile fitting and the 2D Grizli forward-modeling procedure, is discussed in Sect.4.2.

Gas-phase metallicity and Star Formation Rate
We use these observed line flux ( f o i , σ o i ) to simultaneously estimate 3 parameters: jointly metallicity, nebular dust extinction, and de-reddened Hβ line flux (12 + log(O/H), A v , f Hβ ).We follow the previous series of work (Jones et al. 2015b;Wang et al. 2017Wang et al. , 2019Wang et al. , 2020;;Wang et al. 2022a), by constructing a Bayesian inference method that uses multiple calibration relations to jointly constrain metallicity 12 + log(O/H), and (A v , f Hβ ) simultaneously.Our method is more reliable than the conventional way of turning line flux ratios into metallicities, since it takes into account the intrinsic scatter in strong-line O/H calibrations (σ R i in Eq. 1 ).And it combines multiple line flux measurements and properly marginalizes over the dust extinction correction.It also emphasizes bright lines (e.g., [O ii], [O iii]) with high signal-tonoise ratios (SNRs) and marginalizes faint lines (e.g., Hβ) or even non-detection lines with low SNRs quantitatively, (i.e., by assigning weights to each line according to its SNR in the likelihood function).
The Markov Chain Monte Carlo (MCMC) sampler Emcee software (Foreman-Mackey et al. 2013) is employed to sample the likelihood profile L ∝ exp (−χ 2 /2) with: Here the summation i includes all emission lines, with their intrinsic scatters σ R i := σ cal i • R i • ln 10.The inherent flux and uncertainty ( f i , σ i ) for each line, are corrected from observation ( f o i , σ o i ) for dust attenuation by parameter A v using the Calzetti et al. (2000) extinction law.R i refers to the line flux ratio, which is empirically calibrated by a polynomial as a function of metallicity: log R = n j=0 c j • (x) j , x := 12 + log(O/H), where (x) j means jth power of x, with the coefficients summarized in Tab. 1.For flux ratio calibrations that do not use Hβ as the denominator (e.g., [Ne iii]/[O iii]), the terms f Hβ in Eq.1 need to be replaced by the corresponding lines (e.g., f [O iii] ).And one more term of uncertainty (e.g., σ 2 O3 • R 2 Ne3 ) needs to be added to the denominator of χ 2 .
A wide range of strong line calibrations between line flux ratio and metallicity has been established (see Appendix C in Wang et al. 2019, for a summary) (also see Maiolino & Mannucci 2019;Kewley et al. 2019, for recent reviews).Different choices can result in offsets as high as 0.7 dex (see e.g., Kewley & Ellison 2008).In this work, we adopt mainly the diagnostics group "O 3 − O 2 " of calibrations prescribed by Bian et al. (2018, hereafter B18), for comparison with Sanders et al. (2021); Wang et al. (2022a).The purely empirical calibrations in Bian et al. (2018, B18) are based on a sample of local analogs of high-z galaxies according to the location on the BPT diagram, with the notations and coefficients summarized in Tab. 1.
These calibrations are recommended for the metallicity range of 7.8 < 12 + log(O/H), which is appropriate for our sample that does not reach metallicities as low as those found at higher redshift Curti et al. (2023a); Heintz et al. (2023).As a sanity check, we computed metallicities using the calibrations from Sanders et al. (2023), and indeed we do not find galaxies with metallicities significantly lower than 7.8.In order to make complete use of emission lines of spectra, we also collect Ne 3 O 3 , S 2 diagnostics at the same time, even though the corresponding line fluxes are not so strong for our sample.We have tested that if they are removed, they do not  significantly affect the metallicity estimation, which is dominated by the first 2 diagnostics O 3 , O 2 in B18 and 2 Balmer decrements.We adopt the intrinsic Balmer decrement flux ratios assuming Case B recombination with T e ∼ 10, 000K.We neglect the line-blending effect, since they are likely small in most cases (see Fig. 4 and Append.C in Henry et al. 2021, for more information).This Bayesian method is used to derive properties (12 + log(O/H), A v , f Hβ ) of galaxies both from our individual spectra sample here and from the stacked spectra presented in Sect.3.5.
From the de-reddened Hβ flux f Hβ , we estimate the instantaneous SFR of our sample galaxies, based on Balmer line luminosities.This approach provides a valuable proxy of the ongoing star formation on a time scale of ∼10Myr, highly relevant for galaxies displaying strong nebular emission lines.Assuming the Kennicutt (1998) calibration and the Balmer decrement ratio of Hα/Hβ = 2.86 from the case B recombination for typical Hii regions, we calculate: suitable for the Chabrier (2003) initial mass function.The total luminosity L(Hβ) = 4πD 2 L (z) • f Hβ is corrected for lensing magnification according to Bergamini et al. (2023a).The corrected SFR values are given in Tab.A1.

Stellar mass and Lensing magnification
In this section, we fit broad-band photometry to obtain stellar mass M * of target galaxies through SED fitting.We  directly use the combined photometric catalog7 released by the GLASS-JWST team (Paris et al. 2023).The photometric fluxes measured within 2× PSF FWHM apertures of all 16 bands are included if available.We match 2983/4756 galaxies of our NIRISS spectroscopy catalog in Sect.3.1 to the 24389 galaxies of the NIRCam photometric catalog with on-sky distances (d2d) lower than 0.7 arcsecs (5× FWHM in the F444W band, conservatively).As done in Sect.3.1, the final selected sample of 50 galaxies yields accurate d2d match (< 0.14 arcsecs, around the angular resolution of JWST/NIRISS), and visually cross-matching with the NIR-Cam image further validates our sources.
To estimate the stellar masses M * of our sample galaxies, we use the Bagpipes software (Carnall et al. 2018) to fit the BC03 (Bruzual & Charlot 2003) models of SEDs to the photometric measurements derived above.We assume the Chabrier (2003) initial mass function, a metallicity range of Z/Z ⊙ ∈ (0, 2.5), the Calzetti et al. (2000) extinction law with A v in the range of (0, 3).We use the Double Power Law (DPL) model other than simple exponentially declining form to capture the complex Star Formation History (SFH) of our galaxies at cosmic noon (rather than local universe), following Carnall et al. (2019).The nebular emission component is also added into the SED during the fit, since our galaxies are exclusively strong line emitters by selection.The redshifts of our galaxies are fixed to their best-fit grism values, with a conservative uncertainty of z σ = 0.003.Note that we have obtained the entire redshift posterior from Grizli in Sec:3.1, and set a criterion of (∆z) posterior /(1 + z peak ) < 0.002 for se-cure redshift measurements.But here we still set a Gaussian prior centered on z peak with z σ = 0.003 for simplicity in SED fitting, following Momcheva et al. (2016).Actually, the minimum, median, and maximum values of ∆z/(1 + z) for our sample are 1.4 × 10 −4 , 2.8 × 10 −4 , 1.5 × 10 −3 , respectively.
Our mass estimates are in agreement with Santini et al. (2023), even though we stress that our results are more robust, because we use spectroscopic redshifts.After correcting magnification according to our recent lensing model (Bergamini et al. 2023a), we are allowed to take a glimpse of the loci of our galaxies in the SFR-M* diagram as in Fig. 2. We show the star-forming main sequence fitted by Speagle et al. (2014), which is extrapolated from log(M * /M ⊙ ) ∈ [9.7, 11.1] to the mass range of our sample with ±0.2 dex scatters.Sanders et al. (2021) gives stacked results of field galaxies fairly close to their extrapolated best fit out to log(M * /M ⊙ ) = 9.Our sample generally scatters around the main sequence at higher M * .But at lower M * high SFR galaxies are dominant, especially for z ∼ 3 at M * /M ⊙ ≲ 3 × 10 8 .It might account for the low metallicity at the low mass region when assuming the FMR (Mannucci et al. 2010), which will be discussed in Sect.4.1.

AGN contamination
The metallicity diagnostics used in this work are strictly for star-forming regions/galaxies, and the results will be incorrect if there is Active Galactic Nucleus (AGN) emission.So the last step is to exclude the AGN contamination from purely star-forming galaxies, by using the mass-excitation (MEx) diagram as shown in Fig. 3.
AGNs leave strong signatures on nebular line ratios such as [O iii] λ5007/Hβ and/or [N ii] λ6584/Hα, which form the most traditional version of the BPT diagram (Baldwin et al. 1981).Due to the limited spectral resolution of JWST/NIRISS slitless spectroscopy (R ∼ 150), [N ii] is entirely blended with Hα, which precludes us from using the BPT diagram to remove AGN contamination.
Fortunately, Juneau et al. (2014) proposed an effective approach coined the mass-excitation (MEx) diagram, using M * as a proxy for [N ii]/Hα, which functions well at z ∼ 0 (i.e.SDSS DR7).Coil et al. (2015) further modified the MEx demarcation by horizontally shifting these curves to high-M * by 0.75 dex, which is shown to be more applicable to the MOS-DEF sample (Sanders et al. 2021) at z ∼ 2.3.We thus rely on this modified MEx to prune AGN contamination from our galaxy sample.As shown in Fig. 3, the green and red curves mark the steep gradient of P(AGN) ∼ 0.3 and P(AGN) ∼ 0.8 respectively, which represent the probability that the galaxy hosts an AGN.
Most of the sources are clearly un-likely AGN, and some scattered around the critical line are ambiguous.There are only two galaxies slightly above the upper demarcation within 1σ.Because our analysis is based on stacking, a small minority of contaminating AGN will have a negligible impact.Given the limited sample size, we tend to retain more applicable data, and consequently, no possible AGN is eliminated and we preserve all 50 galaxies.

Stacking spectra
Robust emission lines are required to estimate metallicity for MZR measurement.So we need composite spectra ob-tained by stacking procedure to achieve higher SNR from low-resolution grism spectra.In the previous subsection, we have selected 50 spectroscopically confirmed galaxies in the A2744 lensed field that are undergoing active star formation.Then they are divided into 2 redshift bins (z ∈ [1.8, 2.3] and z ∈ [2.6, 3.4]), and 3 mass bins respectively as in Tab. 2. Our choice of binning aims to have a reasonable number of galaxies per bin.We tested that changing the mass bins does not significantly affect our conclusions.Approximately each mass bin contains ∼ 7 individual galaxies, and the SNR will be increased roughly by a factor of √ 7 = 2.6.The 1D/2D spectra of representative galaxies in each of the 6 bins are shown in Fig. 1.
Then we adopt the following stacking procedures, similar to those utilized by Henry et al. (2021); Wang et al. (2022a): 1. Subtract continuum models from the extracted grism spectra.The continua are constructed by Grizli combining two orients.We apply a multiplicative factor to the continuum models to make sure there is no offset between the modeled and observed continuum levels around emission lines, to avoid continuum oversubtraction.
2. Normalize the continuum-subtracted spectrum of each object using its measured [O iii] flux, to avoid excessive weighting toward objects with stronger line fluxes.
Here the [O iii] fluxes we used are the results of 1D line profile fitting instead of 2D forward modeling by Grizli, for a more straightforward normalization.
3. De-redshift each normalized spectrum to its rest frame, and resample on the same wavelength grid using Spec-tRes8 with the integrated flux preservation.
4. Take the median and the variance of the normalized fluxes at each wavelength grid, as the value and uncertainty of the stacked spectrum.
As shown in Fig. 4, these key lines are more significant in stacked spectra.The (relative) emission line fluxes are measured by fitting a set of Gaussian profiles to the line in stacked spectra, as well as individual spectra.We simultaneously fit [O ii], [Ne iii], Hδ, Hγ, Hβ, [O iii], Hα, and [S ii].The amplitude ratio of [O iii] λλ4960, 5008 doublets is fixed to 1:2.98 following Storey & Zeippen (2000).The centroids of Gaussian profiles are allowed a small shift of the corresponding rest-frame wavelengths of emission lines, within ±10 Å, in order to accommodate systematic uncertainties.The FWHMs of each line are not required to be the same, but set between [10, 25]Å, consistent with the rest-frame spectral   .Stacked grism spectra for galaxies residing in several mass bins at two redshift ranges, as shown in the upper (1.8 < z < 2.3) and lower (2.6 < z < 3.4) panels, respectively.Each mass bin contains 5 ∼ 11 galaxies, with the exact number of galaxies and corresponding mass range highlighted above each stacked spectrum.In each set of spectra, the blue curves represent the median stacked spectrum, the cyan bands mark the standard deviation flux uncertainties, and the red dashed curves show the best-fit Gaussian fits to multiple emission lines, while [S ii], Hα are across a discontinuous range among other lines (i.e., the [O iii] λλ4960,5008 doublets, Hβ, Hγ, Hδ, [Ne iii], and [O ii]) in the broken axes at right parts.The details of the stacking procedures are presented in Sect.3.5.resolution ∆λ ≈ 7Å corresponding to R ≈ 150 for NIRISS.We use the software LMFit 9 to perform the nonlinear leastsquares minimization, with the measured quantities summarized in Tab. 2. The stacked metallicity is estimated using the same methods as the individual galaxies outlined in Sect.3.2.Our later discussion will mainly focus on the stacked results.

RESULTS
From the joint analysis of the JWST/NIRISS and JWST/NIRCam data, we revisit the measurement of the MZR using the stacked spectra of the A2744-lensed field galaxies within the mass range of M * ∈ (10 6.9 , 10 10.0 )M ⊙ at z ∈ (1.8, 3.4), shown in Sect.4.1.We also perform a systematic investigation of the differences between 2D and 1D forward modeled fluxes of nebular emission lines from slitless spectroscopy, as detailed in Sect.4.2.

The MZR at the low mass end
Our key scientific result is the measurement of the gasphase MZR in the low mass range of log(M * /M ⊙ ) ∈ (6.9, 10.0) at z ∈ (1.8, 3.4).The slope of the MZR has been 9 https://lmfit.github.io/lmfit-py/shown to be a key diagnostic of galaxy chemical evolution and the cycling of baryons and metals through star formation and gas flows (see e.g., Maiolino & Mannucci 2019, and references therein).In particular, Sanders et al. (2021) argues that the shape of the MZR at z ∼ 2 − 3 is more tightly regulated by the efficiency of metal removal by gas outflows ζ out , rather than by the change of gas fractions with stellar mass µ gas (M * ). Henry et al. (2013a) observes a steepening of the MZR slope at z ∼ 2, suggesting a transition from momentumdriven winds to energy-driven winds as the primary prescription for galactic outflows in the low-mass end.
We find a clear correlation between metallicity and stellar mass for both individual galaxies and stacked spectra at z ∈ [1.8, 2.3] and z ∈ [2.6, 3.4], as shown in the left panel of Fig. 5.The z ∼ 2 and z ∼ 3 individual galaxy samples have Spearman correlation coefficients of 0.788, 0.688 with the p−value of 6.36×10 −7 , 3.98×10 −4 , respectively.We perform linear regression over the stacks to derive the MZR: where β is the slope and Z 8 is the normalization at M * = 10 8 M ⊙ , as the blue and red solid line with uncertainties at z ∼ 2, 3 in both panels of Fig. 5.We measure the MZR slope to be β = 0.223 ± 0.017 and β = 0.294 ± 0.010 for our galaxy samples at z median = 1.90 and z median = 2.88, respectively.We see moderate evolution in the MZR normalization from z ∼ 2 to z ∼ 3: ∆Z 8 = −0.11± 0.02.The stacked MZRs demonstrate good agreement with the individual results (linear fits are shown in the shaded regions in the left panel of Fig. 5.) The large uncertainty of the stacked metallicity in the z ∼ 3 lowest mass bin, comes from the limited number of galaxies.More importantly, all 5 galaxies within this bin are high-SFR galaxies (Fig. 2), which might explain their low stacked metallicity, under the assumption that the star-forming main sequence (Speagle et al. 2014) and the FMR (Mannucci et al. 2010) are valid below M * ≲ 8.A detailed study and characterization of incompleteness at the low mass end is beyond the scope of this paper, and is left for future work.We summarize our measurements in Table 3, along with other literature results.The right panel of Fig. 5 shows the comparisons to other observations and two cosmological hydrodynamic simulations.In addition to z ∼ 2, 3, we also include 3 latest MZR measurements at a very high redshift from JWST/NIRSpec for comparison.We measure the slope of the MZR to be β ∼ 0.25 for both z ∼ 2 and z ∼ 3. Our slopes at low mass are slightly lower than those found by Sanders et al. (2021), but ours are in lower mass ranges.The shallower normalization could be accounted for the MZR evolution from ours z median = 1.90, 2.88 to theirs z ∼ 2.3, 3.3.Furthermore, we follow their analytic model to understand what physical processes set the slope at the dwarf mass range.In the Peeples & Shankar (2011) [2.6, 3.4] (red squares), with their linear fits represented by shaded regions and solid lines.Right: comparison to other observational works, along with the IllustrisTNG100 simulation (Torrey et al. 2019) and the FIRE simulation (Ma et al. 2016).These colored lines are linear regressions of their respective results, with their parameters summarized in Tab. 3.
ISM is expressed as: Following the assumption by Sanders et al. ( 2021) that the gas fraction µ gas = 10 µ 0 M * −0.36 (µ 0 = 3.89, 3.96 for z ∼ 2, 3, respectively), the coefficient α = 0.7 • (0.64 + β), the nucleosynthetic stellar yield y/Z ISM = 10 9.2−(12+log(O/H)) the metal loading factors of inflowing gas accretion ζ in = 0, we calculate the loading factors of outflowing galactic winds ζ out at each stacked point and linear fit.We get: (5) And we find that log(ζ out /αµ gas ) is only a little bit above zero over the mass range, with ζ out ≈ 1.01 − 1.5 × αµ gas .Thus our results indicate that the shallower MZR may be attributed to a shallower M * scaling of the metal loading of the galactic outflows ζ out at the low mass end.We generalize their conclusions that outflows ζ out remain the dominant mechanism other than gas fraction µ gas that sets the MZR slope, and µ gas gradually carries more relative importance and rise to nearly the same order as ζ out for the low mass regime.
Our MZR slope β ∼ 0.25 is steeper than those reported in Li et al. (2022) at the same redshifts and similar mass range as in Tab. 3.Although we use the same NIRISS data of the A2744 lensed field, we only match 28 out of 50 galaxies with the on-sky distances(d2d) lower than 1 arcsec to the Abell catalogue of Li et al. (2022), and only 18/50 of them are in agreement with our metallicity measurements within 1σ confidence interval.This difference likely arises from the updated calibration files used in our NIRISS data reduction, and from our Bayesian approach in the metallicity inference using multiple line ratios to joint fit other than only [O iii]/[O ii] from Bian et al. (2018).In addition, we include the new JWST/NIRCam imaging data covering the rest-frame optical wavelength ranges for our sample galaxies (Paris et al. 2023), use more complex SFH (DPL), and employ the latest JWST-based lensing model (Bergamini et al. 2023a) for more reliable stellar mass estimates.Another source of difference is their choice of exponentially declining SFH (τ model) which may not be appropriate for our high-redshift star-forming galaxies (Reddy et al. 2012), and might introduce a significant bias in stellar mass M * estimation (Pacifici et al. 2015;Carnall et al. 2018Carnall et al. , 2019)).
In agreement with previous work, we also find a tendency for the slope of the MZR to flatten out in the low mass at around M * /M ⊙ ≲ 10 9 , although not as significant.As for higher redshift z ∼ 3−10, our inferred slopes β are consistent with those by Curti et al. (2023a); Nakajima et al. (2023), but our intercept Z 8 are ∼ 0.3 dex higher.At that time, the metal might be enriching and hence the MZR might be building up (Curti et al. 2023a), and it is not until the SFR peaks at "cosmic noon" z ∼ 2 − 3 that the MZR exhibits a higher intercept.
The MZR measurements are also sensitive to different strong line calibrations, especially for the intercept Z 8 (Kewley & Ellison 2008), as discussed in Sect.3.2.In Tab. 3, we also provide the MZR from stacks using the Sanders et al. ( 2023) calibration for comparison.Although the measured slopes are significantly steeper than our default B18 MZR, they are still consistent with Heintz et al. (2023) for dwarf galaxies at higher redshift.We fit the stacked result presented by Henry et al. (2021) in the similar mass range, which assumes Curti et al. (2017) calibration.Our slope agrees with theirs β = 0.22 ± 0.03, but the intercept is ∼ 0.1 dex higher.This agrees with Wang et al. (2022a); Li et al. (2022), who test that the calibrations of Bian et al. (2018) yielded a steeper MZR than the calibrations of Curti et al. (2017) when analyzing the same data.
Moreover, we compare our results with two simulation works presented separately in Fig. 5. Our individual measurements are largely compatible with the result of the simulations IllustrisTNG (Torrey et al. 2019).But several high metallicity galaxies lift the stacked MZR up high slightly, yielding a steeper slope than they predicted.Our measured slopes are in better agreement with the FIRE simulation results (Ma et al. 2016), which are capable of resolving high-z dwarf galaxies with sufficient spatial resolution.
In addition, all the MZRs discussed above are derived from galaxy populations residing in random fields.There has been continuous discussion about the environmental dependence of MZR shapes at high redshifts (Peng & Maiolino 2014;Bahé et al. 2017;Calabrò et al. 2022;Wang et al. 2023).
Here we raise one recent observation of the MZR at z ∼ 2.2 showing a much shallower slope (β = 0.14 ± 0.02), measured using the HST grism spectroscopy of 36 galaxies residing in the core of the massive BOSS1244 protocluster (Wang et al. 2022a).Our work presented here confirms the significant difference between the MZR slopes measured in field and overdense environments, indicating the change in metal removal efficiency as a function of the environment.

Investigation of the morphological broadening effect on measurements of line flux and metallicity
Since metallicity estimates heavily rely on line flux measurements, in this section we verify that different methodologies in deriving emission line fluxes from the NIRISS slitless spectroscopy with limited spectral resolution do not result in significant biases on the metallicity derivations.
For grism spectroscopy, it has long been recognized that the morphological broadening effect can change the overall spectral shape and flux levels of galaxies (see e.g., van Dokkum et al. 2011;Wang et al. 2019Wang et al. , 2020)).We thus systematically compare, for the first time, two methods to measure emission line flux from slitless spectroscopy, with and without the consideration of this morphological broadening effect.The 2D forward modeling analysis of Grizli is depicted in Sect.3.1.In this section, we describe the line profile fitting to 1D extracted spectra using LMfit.The morphology of a galaxy has already been taken into account when forward modeling its 2D spectrum by Grizli.The extracted 1D spectra are morphologically broadened along the dispersion direction, and can vary significantly in spectral slope and flux level for the same object due to the different projected 1D morphology (see Fig. 8,9 of Wang et al. 2019, for examples).Therefore, we regard the 2D line flux as the reference intrinsic value, and 1D flux as the measurement not corrected for the morphology.The difference has not yet been fully investigated, and thus demands immediate attention, with the upcoming advent of large slitless spectroscopic surveys, e.g., Euclid, Roman, and the Chinese Space Station Telescope (CSST).
In the top 3 panels of Fig. 6, we show the comparison between line flux measured from 2D or 1D spectra, and try to associate it with the half-light radius r 50 .The flux ratio of 2D to 1D deviates from 1 tangibly, and 2D flux modeled by Grizli are larger in most cases (47/48, 41/48, 43/48 for [O iii], [O ii], Hβ, respectively) than 1D flux fitted using LMfit by a median factor of ∼ 30% (with wide dispersion -0.3 -5, where minus factor means 2D flux is lower than 1D flux).This strong offset does not seem to be related to SNR.As expected, we find it does correlate with the half-light radius r 50 of the individual galaxies, although not as strong as the Pearson correlation coefficients R shown.The unit of r 50 is the pixel, and here 1 pixel corresponds to 0.03 arcsec, as illustrated in Sect. 2. Furthermore, Pearson R decreases as the SNR decreases from the first 3 brightest lines to Hβ, convincing us of this weak correlation.Linear fitting is employed in an attempt to describe this phenomenon, although it is based on limited data.This non-zero inconsistency first appears when we use 1D [O iii] flux to normalize our individual spectra for stacking.We rechecked our MZR using 2D [O iii] flux to normalize for stacking, and found the bias of metallicity is lower than 1σ.It indicates that the bias of the two flux measurements may be obscured by the stacking procedure, although we need larger a sample and more tests to verify this assertion.A more significant effect may be seen in the physical quantities directly determined by the line flux value, such as SFR.
Since the flux ratio of 2D to 1D exhibits a correlation with the half-light radius r 50 , we interpret this discrepancy as a morphological broadening effect.The morphological broadening of the spectrum is not due to physical factors such as velocity dispersion or radiative damping, but is simply an observational effect of the extended source (van Dokkum et al. 2011;Wang et al. 2020).For an ideal point source with no physical broadening effect, the emission line will be measured as a δ function.But if we could spatially resolve the galaxy, which is common in slitless spectroscopy, the emission line would be broadened as a result of the superposition of δ functions from individual pixels.Therefore, more parts of the line edge will be drowned in the noise, resulting in lower total line flux modeled by the Gaussian function.And of course, larger sources produce more broadening, yielding lower flux measurements.We, therefore, deem the top 3 panels of Figure 6 to be the first attempt to quantitatively analyze the impact of the morphological broadening effect.For large sources (r 50 > 10), the intrinsic flux could be several times larger than the broadened flux.
Although the 2D measurements are larger than the 1D results, in general, it seems that this bias is the same for all emission lines of the same source.As one can notice in the top 3 panels of Fig. 6, for a given source with the same abscissa r 50 , the corresponding ordinate values 2D/1D of all 3 lines are quite close to each other, although our naked eye can only recognize those outliers.And we have tested that these patterns are also independent of their SNRs.Moreover, we show the line flux ratio in the bottom 3 panels of Fig. 6, and they nearly follow the one-on-one line, with few outliers.That means even if this effect is not taken into account like in the 1D method, the flux ratios do not deviate from the 2D method significantly.Therefore, it indicates that the bias of the morphological broadening effect is systematic.We colorcode them with the metallicity or the dust extinction A v de-rived in Sect.3.2 using 2D Grizli flux ratio.The color patterns demonstrate the physical meaning of these line ratios, i.e., the gas-phase metallicity diagnostics /Hβ, and the dust extinction indicator Hα/Hβ.The dotted line in the lower right marks the 'intrinsic' line ratio in the absence of dust attenuation Hα/Hβ = 2.86.The few sources below it may be due to low SNR and measurement errors (see e.g., Nelson et al. 2016).
As a consequence, our key result of the metallicity measurement derived from the ratio of two lines in Sect.3.2, will not be greatly influenced by the 2D/1D flux measurement method.However the direct line flux (e.g., Hβ) and the derived quantity (e.g., SFR) of a single emission line could be biased, and for a large source, the intrinsic flux could be several times larger than the measured one.The coarse linear fitting here might describe the distinction between 2D/1D forward modeling flux of emission line to some extent.We interpret this discrepancy as a morphological broadening effect.We recommend carefully checking the way flux is measured to match the scientific requirement, and carefully forward modeling the spectrum through the convolution of the mor-phological broadening effect.The systematic offset, for the first time, may present an important guideline for future work deriving line fluxes with wide-field slitless spectroscopy, especially for large sky surveys to be conducted by e.g., Euclid, Roman, and CSST, where it is time-consuming for 2D emission line modeling.

CONCLUSIONS
We have presented a comprehensive measurement of the MZR at a dwarf mass range using grism slitless spectroscopy.The grism data are acquired by the GLASS-JWST ERS program, targeting the A2744 lensed field.From the joint analysis of the JWST/NIRISS and JWST/NIRCam data, we select a secure sample of 50 field galaxies with M * /M ⊙ ∈ [10 6.9 , 10 10.0 ] and 12 + log(O/H) ∈ [7.8, 8.7] at 2 redshift range z ∈ [1.8, 2.3] and z ∈ [2.6, 3.4], assuming the strong line calibration of (Bian et al. 2018).Our galaxies are divided into several mass bins and their spectra are stacked to increase the SNR.Then we apply our forward modeling Bayesian metallicity inference method to the stacked line fluxes.We derive the MZR in the A2744 lensed field as 12 + log(O/H) = β × log(M * /10 8 M ⊙ ) + Z 8 with β = 0.223 ± 0.017 and β = 0.294 ± 0.010 in these two redshift ranges z median = 1.90 and z median = 2.88, respectively, as well as a slight evolution: ∆Z 8 = −0.11± 0.02, as presented in Tab. 3 and Fig. 5. Our MZRs have slopes that are consistent with those reported by Sanders et al. (2021) at the higher mass end and similar redshifts, suggesting that gas outflow mechanisms with the same metal removal efficiency extend to the low-mass regime (≲ 10 9 M * ) at cosmic noon.This M * scaling of metallicity is well reproduced by the FIRE simulations (Ma et al. 2016).
In addition, we assess the impact of the morphological broadening on emission line measurement by comparing two methods of using 2D forward modeling and line profile fitting to 1D extracted spectra.We show that ignoring the morphological broadening effect when deriving line fluxes from grism spectra results in a systematic reduction of flux by ∼ 30% on average.The coarse linear fitting in Fig. 6 could characterize the impact of the morphological broadening effect on modeling the emission line flux to some extent.The direct value (e.g., Hβ) and derived quantity (e.g., SFR) of a single emission line flux could be biased, if one does not account for the galaxy morphology.However, this system-atic effect does not significantly influence the line ratio and its derived quantities, e.g., metallicity, dust extinction, age, etc..For this reason, we recommend careful inspection of the line modeling, especially for the next generation of large sky surveys, e.g., Euclid, Roman, and CSST.
We would like to thank the anonymous referee for the constructive comments that help us improve the clarity of this paper.This paper is dedicated to the memory of our beloved colleague Mario Nonino who passed away prematurely.We miss him and are indebted to him for his countless contributions to the GLASS-JWST project.This work is based on observations made with the NASA/ESA/CSA James Webb Space Telescope.The data were obtained from the Mikulski Archive for Space Telescopes at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-03127 for JWST.These observations are associated with program JWST-ERS-1324.We acknowledge financial support from NASA through grant JWST-ERS-1324.X. H. thanks Xiaolei Meng, Lei Sun, and Lilan Yang for the useful discussion.We thank the entire GLASS team that helped shape the manuscript.X.In Tab.A1, we show the observed and measured physical properties of all the 50 galaxies in our sample, including galaxy ID (ID Grism), coordinates (R.A. and Decl.) and grism redshift (z grizm ) analyzed by Grizli, the matched ID in photometry of Paris et al. (2023) (ID Photo.), the stellar mass M * estimated by SED fitting, the gravitational lensing magnification µ calculated using the model of Bergamini et al. (2023a), and the dust attenuation (A ν ), the de-redden Balmer emission line flux f Hβ (with its derived SFR), and the gas-phase metallicity 12 + log(O/H) jointly estimated using our Bayesian method.Note that M * , S FR have already been corrected by lensing magnification µ, but f Hβ has not.In Tab.A2, we exhibit the emission line flux measurements by 2D/1D method, which are discussed in detail in Sect.4.2.Note that all f line are not corrected by µ.

Figure 3 .
Figure3.The Mass-Excitation diagram of our sample, used to exclude possible AGN galaxies.The position of the likely AGN galaxies with the possibility of 0.8 and 0.3, are marked by the red and green curves.No significant possible AGN contamination is evident in our samples, with one galaxy (ID=03854) only slightly off by 1σ.

Figure 4
Figure4.Stacked grism spectra for galaxies residing in several mass bins at two redshift ranges, as shown in the upper (1.8 < z < 2.3) and lower (2.6 < z < 3.4) panels, respectively.Each mass bin contains 5 ∼ 11 galaxies, with the exact number of galaxies and corresponding mass range highlighted above each stacked spectrum.In each set of spectra, the blue curves represent the median stacked spectrum, the cyan bands mark the standard deviation flux uncertainties, and the red dashed curves show the best-fit Gaussian fits to multiple emission lines, while [S ii], Hα are across a discontinuous range among other lines (i.e., the [O iii] λλ4960,5008 doublets, Hβ, Hγ, Hδ, [Ne iii], and [O ii]) in the broken axes at right parts.The details of the stacking procedures are presented in Sect.3.5.

Figure 6 .
Figure6.Comparison between the emission line fluxes derived using the 2D/1D forward modeling methods, explained in detail in Sect.3.1 & 3.5, respectively.The top 3 panels show the galaxy radius vs. the flux ratio of 2D to 1D for each line.The 2D fluxes are tangibly higher than 1D fluxes (above the black line), and it seems systematic for all 3 brightest lines of each source (with the same r 50 ).We find a correlation between them (green line), although not so strong, with the Pearson correlation coefficients and the p-value exhibited in the top right corner, as well as the green result of linear fitting at the center.Their color marks the SNR of the flux from the 1D method, showing no significant correlation.The bottom 3 panels show the line flux ratio, while their color marks the metallicity or the dust extinction derived in Sect.3.2 using Grizli flux ratio.These two distributions nearly scatter across the equality line (in black) within the uncertainty.But there are several outliers and a slight systematic overestimation for 2D, which is more obvious for Hα/Hβ at the bottom right.
W. is supported by the Fundamental Research Funds for the Central Universities, and the CAS Project for Young Scientists in Basic Research, Grant No. YSBR-062.This research is supported in part by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013.We acknowledge support from the INAF Large Grant 2022 "Extragalactic Surveys with JWST" (PI Pentericci).B.M. is supported by an Australian Government Research Training Program (RTP) Scholarship.K.G. is supported by the Australian Research Council through the Discovery Early Career Researcher Award (DECRA) Fellowship (project number DE220100766) funded by the Australian Government.
QUANTITIES OF OUR SAMPLE

Table 1 .
Coefficients for the emission line flux ratio diagnostics used in this work.
b the line flux ratio R [S ii] is calibrated by polnomial with coefficients given by the 'best' row, and the uncertainty σ [S ii] is given by the 'upper' and 'lower' row, where the metallicity x is relative to solar x := 12 + log(O/H) − 8.69.

Table 2 .
Measured properties of the stacked spectra.groupNgalmassrangelogM med * [O iii]/Hβ [O ii]/Hβ [O iii]/[O ii] Note-The multiple emission line flux ratios are measured from the stacked spectra shown in Fig.4.The mass range and the median stellar mass log M med *are both logarithmic values log(M * /M ⊙ ).The metallicity inference is derived from the measured line flux ratios in the stacked spectra presented in each corresponding row, using the method described in Sect.3.2.Here we use the strong line calibrations prescribed byBian et al.  (2018, B18)and some others.See Table1for the relevant coefficients.
model, the metallicity of the Figure 5. MZR measurements for the star-forming field galaxies behind the A2744 cluster.Left: the individual (hollow) and the stacked (solid) result of our galaxy sample at z ∈ [1.8, 2.3] (blue triangles) and z ∈

Table A1 .
Measured Properties of Individual Galaxies

Table A2 .
Flux derived from 2D/1D forward modeling of the individual galaxies.
The first 4 columns are the same as Tab.A1.Columns 5-11 and 12-18 are the 2D/1D forward modeling flux, respectively for each emission line.The error bars shown in the table correspond to 1-σ confidence intervals.