Ultra-deep Keck/MOSFIRE Spectroscopic Observations of z ∼ 2 Galaxies: Direct Oxygen Abundances and Nebular Excitation Properties

Using deep near-infrared Keck/MOSFIRE observations, we analyze the rest-optical spectra of eight star-forming galaxies in the COSMOS and GOODS-N fields. We reach integration times of ∼10 hr in the deepest bands, pushing the limits on current ground-based observational capabilities. The targets fall into two redshift bins, of five galaxies at z ∼ 1.7 and three galaxies at z ∼ 2.5, and were selected as likely to yield significant auroral-line detections. Even with long integration times, detection of the auroral lines remains challenging. We stack the spectra together into subsets based on redshift, improving the signal-to-noise ratio on the [O iii]λ4364 auroral emission line and, in turn, enabling a direct measurement of the oxygen abundance for each stack. We compare these measurements to commonly employed strong-line ratios alongside measurements from the literature. We find that the stacks fall within the distribution of z > 1 literature measurements, but a larger sample size is needed to robustly constrain the relationships between strong-line ratios and oxygen abundance at high redshift. We additionally report detections of [O i]λ6302 for nine individual galaxies and composite spectra of 21 targets in the MOSFIRE pointings. We plot their line ratios on the [O iii]λ5008/Hβ versus [O i]λ6302/Hα diagnostic diagram, comparing our targets to local galaxies and H ii regions. We find that the [O i]/Hα ratios in our sample of galaxies are consistent with being produced in gas ionized by α-enhanced massive stars, as has been previously inferred for rapidly forming galaxies at early cosmic times.


INTRODUCTION
Tracing the chemical evolution of galaxies is key to understanding how galaxy growth and evolution occur over time.The metallicity of a galaxy is influenced by numerous mechanisms such as the reprocessing of gas into heavier elements through nucleosynthesis; metalenriched outflows driven by supernovae, AGN, and stellar winds; accretion of pristine hydrogen gas onto a galaxy; and accretion of enriched, recycled gas in the form of galactic fountains (Tumlinson et al. 2017;Davé et al. 2017).Observationally, metallicity commonly refers to the gas-phase oxygen abundance in the interstellar medium (ISM) of a galaxy since oxygen is the most abundant metal and produces strong rest-optical emission line features.The oxygen abundance in the * NHFP Hubble Fellow ISM of star-forming galaxies has been observed to correlate tightly with the stellar mass, encapsulated in what is referred to as the mass-metallicity relation (MZR).Early evidence for a MZR goes back to Lequeux et al. (1979) who measured oxygen abundances in a small sample of nearby blue compact dwarf galaxies.Later studies (e.g., Tremonti et al. 2004;Kewley & Ellison 2008;Andrews & Martini 2013) showed that there is a MZR that generally describes galaxies in the local universe.Furthermore, many works (e.g., Erb et al. 2006;Maiolino et al. 2008;Mannucci et al. 2009;Zahid et al. 2011;Steidel et al. 2014;Zahid et al. 2014;Yabe et al. 2015;Ly et al. 2016;Guo et al. 2016;Sanders et al. 2021) have revealed an evolution in the MZR with redshift, noting a change in the turnover mass and the normalization at higher z.Folding in the global star-formation rate (SFR) to the MZR yields the fundamental metallicity relation (FMR).This relation appears not to evolve through cosmic time at least as far back as z ∼ 3 (e.g., Mannucci et al. 2010;Sanders et al. 2021;Heintz et al. 2022), though there is evidence of deviations from the FMR at higher redshifts (e.g., Curti et al. 2023a).
The existence of these scaling relations with galaxy parameters gives insight into the processes that govern galaxy formation.The SFR, which is governed by the gas reservoir in a galaxy, is influenced by the baryon cycle, and is therefore tied to the chemical evolution in the ISM through the processes described above.Additionally, the stellar mass represents the integrated sum of star formation and is also related to the total metal production across a galaxy's lifetime.Overall, the three parameters that comprise the FMR probe the important mechanisms that determine galaxy evolution.The invariance of the FMR through a large portion of cosmic history suggests that galaxies are driven towards an equilibrium among inflow, star formation, and outflow (e.g., Davé et al. 2012;Peng & Maiolino 2014).These scaling relations are additionally useful for hydrodynamical simulations of galaxy formation since the comparison with observations provides further constraints on the subgrid physics determining the outputs (e.g., Ma et al. 2016;Davé et al. 2017;De Rossi et al. 2017;Torrey et al. 2019).Thus, measuring these galaxy parameters with high accuracy is instrumental in our understanding of galaxy formation and evolution.
Making robust measurements of the metallicity of a galaxy, however, can prove quite challenging.One of the more physically-motivated methods of determining the gas-phase oxygen abundance of a galaxy involves measuring the average electron temperature and density of its H ii regions.From these physical properties, one can determine the emissivities of the emission-line transitions from each ion species which, when scaled by their respective line flux measurements, yields the abundance of each ion relative to hydrogen (i.e.O + /H + and O 2+ /H + , see Izotov et al. 2006;Luridiana et al. 2015;Peimbert et al. 2017, for more detail).This method of abundance determination is often referred to as the "direct" method.However, to obtain the electron temperature, one must be able to detect a set of faint rest-optical auroral emission lines (e.g.[O iii]λ4364, [O ii]λλ7322, 7332), which can be a hundred times fainter than their nebular counterparts, requiring very long exposure times (Garnett 1992;Pérez-Montero 2017).The task becomes increasingly challenging when observing targets at z > 1 due to the varying transmission of Earth's atmosphere in the near-infrared (i.e., rest-frame optical) and the apparent faintness of targets due to their increased distance.Additionally, in the case of the [O iii]λ4364 line, whose strength relative to [O iii]λ5008 is temperature-sensitive, detection becomes more difficult in higher-metallicity galaxies.This challenge is due to the effects of more efficient metal-line cooling which leads to lower electron temperatures in the constituent H ii regions in more metal-rich galaxies, rendering direct metallicity measurements much more difficult.
In light of the challenges associated with the direct method, it is common to determine metal abundances using indirect metallicity indicators which rely on the ratios of strong emission lines (e.g., Dopita et al. 2013;Pilyugin & Grebel 2016;Bian et al. 2018;Curti et al. 2020;Nakajima et al. 2022).These relations translating strong-line ratios to metallicity are calibrated to photoionization models and/or measurements of direct metallicity in H ii regions and galaxies in the local universe.It is uncertain, however, whether these indirect indicators remain accurate at higher redshifts since conditions in the ISM of galaxies at earlier cosmic times may not resemble those in the local universe.Current observations suggest that galaxies at z ∼ 2 may be characterized by harder ionizing spectra and may also have N/O ratios that vary slightly from galaxies in the local universe at fixed oxygen abundance (e.g., Shapley et al. 2015;Strom et al. 2017;Shapley et al. 2019;Heintz et al. 2022).
The evolving conditions in the ISM of galaxies has typically been traced by measurements of both high-and low-ionization emission lines (e.g., [O iii]/Hβ, [N ii]/Hα, [S ii]/Hα).However, one low-ionization diagnostic that has been missing in analyses of galaxies at z > 1 is the [O i]λ6302/Hα line ratio.In studies of low-redshift galaxies, this line ratio offers insights into the hardness of the ionizing spectrum, the contribution of diffuse ionized gas (DIG), and the presence of shocks (e.g., Zhang et al. 2017).However, due to the intrinsic faintness of the [O i]λ6302 line, studies of this line diagnostic at high redshift have typically proven very difficult.
Similarly, due to the difficulty of obtaining auroralline measurements at high redshift with ground-based facilities, there only exists a small sample of z > 1 galaxies in the literature for which direct oxygen abundances have been measured (examples from the literature are discussed in Sanders et al. 2020).Additionally, the integration times for most ground-based near-infrared spectroscopic surveys do not reach the required depth to achieve significant auroral-line detections.The prevalence of auroral-line detections is, however, increasing as a result of observations made by the new James Webb Space Telescope (JWST) (e.g., Curti et al. 2023b;Nakajima et al. 2023;Sanders et al. 2023a).
The deep spectral observations analyzed in this study push the limits of ground-based, 10-m-class observatories and highlight the importance of JWST and other future state-of-the-art observatories in characterizing galaxy properties at and beyond cosmic noon.In this study, we utilize the MOSFIRE instrument (McLean et al. 2012) on the 10-m Keck I telescope to make deep observations of a sample of eight galaxies at z ∼ 1.7−2.5, reaching up to ∼10 hours of integration time in some bands.We additionally produce composite spectra from this sample of eight galaxies and determine their average characteristics (i.e., chemical abundances, electron temperatures, and densities).The depth of these observations additionally enables the detection of the [O i]λ6302 feature, beyond the reach of more typical spectroscopic samples with shallower integration times (e.g., Kriek et al. 2015).Based on these measurements, we investigate the position of our sample of z ∼ 2 galaxies in the [O iii]λ5008/Hβ vs. [N ii]λ6585/Hα, [S ii]λλ6718, 6733/Hα, and [O i]λ6302/Hα diagnostic diagrams (hereafter BPT diagrams1 ), the latter probing an unexplored parameter space at z > 1.
In Section 2 of this paper, we give an overview of the observational setup and data processing methods.In Section 3, we present the results of our analysis of the spectra in our sample.In Section 4, we compare our measurements with commonly-used metallicity indicators as well as consider the nature of [O i]λ6302 emission at high redshift.Throughout, we adopt the following cosmological parameters: H 0 = 70 km s −1 Mpc −1 , Ω m = 0.3, and Ω Λ = 0.7.Additionally, we assume a Chabrier (2003) IMF, solar abundances of 12 + log(O/H) ⊙ = 8.69 and 12 + log(N/H) ⊙ = 7.83, and a solar metallicity of Z ⊙ = 0.014 (Asplund et al. 2009).

Sample Selection and Observation Configurations
The target galaxies in this analysis reside in the COS-MOS and GOODS-N extragalactic legacy fields covered by the CANDELS and 3D-HST surveys (Grogin et al. 2011;Koekemoer et al. 2011;Momcheva et al. 2016).These galaxies were drawn from the MOSFIRE Deep Evolution Field (MOSDEF) survey (Kriek et al. 2015) as well as the sample of extreme emission-line galaxies (EELGs) presented by Tang et al. (2019) selected from the 3D-HST WFC3 grism emission-line catalog (Momcheva et al. 2016).Targets of interest were selected based on their probability of yielding auroralline detections.To this effect, the target sample was composed of galaxies that had bright [O iii]λ5008 emission and whose ratios among strong nebular emission lines suggested a high electron temperature based on relations observed in z ∼ 0 H ii regions (Sanders et al. 2017).Such thermal properties would result in brighter [O iii]λ4364 and [O ii]λλ7322, 7332 auroral lines, necessary for making direct metallicity measurements.We also required galaxy targets to lie within the following redshift intervals: 1.62 ≤ z ≤ 1.70, 2.32 ≤ z ≤ 2.61, and 2.95 ≤ z ≤ 3.18 to capture both auroral and strong nebular emission lines in the near-infrared atmospheric transmission windows covered by the Y, J, H, and Kbands.
In light of these selection criteria, we targeted a sample of 12 galaxies, which we refer to as "auroral" targets.Seven of the auroral targets were in the COSMOS field and five were in the GOODS-N field.Two of the galaxies from the COSMOS field are the subjects of a recent paper by Sanders et al. (2023b) in which their oxygen abundances were measured via the [O ii]λλ7322, 7332 auroral line doublet, representing the first such measurements beyond the local universe.
From the remaining sample of 10 [O iii] auroral targets, two from the COSMOS field were not considered in this study.The first of these targets was excluded because the galaxy was dithered on top of an adjacent object in the field, causing the target signal to be strongly contaminated by the negative trace of its neighbor on the slit.The second of these targets was at z = 3.12, placing Hα beyond the coverage of the K-band, and the remaining higher-order Balmer emission lines fell onto atmospheric sky lines.As a result, it was not possible to apply Balmer-decrement-based dust corrections on this particular target.Because of these considerations, our final auroral-target sample consisted of eight galaxies: three at z ∼ 2.5 in the COSMOS field and five at z ∼ 1.7 in the GOODS-N field.The properties of these targets are summarized in Table 1.We utilized the multiplexing capabilities of MOSFIRE to observe an additional 29 filler targets (14 in the COSMOS pointing and 15 in the GOODS-N pointing).For the subset of 15 filler-target spectra that contained Hα, Hβ, and [O i]λ6302 coverage, spanning z = 1.405 to z = 2.515, we analyzed them for [O i]λ6302 emission, and we refer to these targets as "non-auroral" targets.

Data Reduction and Flux Calibration
The two-dimensional (2D) reduction of the MOS-FIRE data was performed using an IDL data processing pipeline described in Kriek et al. (2015).The extractions of the 2D spectra were accomplished for each target on each mask individually using the bmep2 (Freeman et al. 2019) IDL program, and we used the same slit-loss correction routine as in Reddy et al. (2015) and Kriek et al. (2015).
In order to monitor seeing conditions and carefully combine individual frames, a slit was placed on a star in each mask.Each exposure was weighted according to the observed flux of the slit star.An issue arose with the flux calibrations from the targets on the COSMOS mask due to the fact that a galaxy in the field was dithered on top of the slit star for that mask, thereby contaminating the slit star spectrum that was used to apply the absolute flux scaling.Consequently, the default flux calibration for the COSMOS mask was unreliable.This effect had varying significance in each of the bands.In order to mitigate this source of systematic error and to ensure accurate band-to-band flux calibrations, we compared the spectra with available photometric observations and with corresponding spectral observations from the MOS-DEF survey.The existing MOSDEF spectra were taken with different slit mask configurations from those used for the new, deep observations and thus do not suffer from the same dithering issue.
The targets in the COSMOS field that had corresponding data in the MOSDEF survey were scaled multiplicatively such that the total flux of significantlydetected lines (>5σ) matched the flux of the same lines in the MOSDEF spectra.For targets on the COSMOS mask with no corresponding MOSDEF observations or without >5σ line detections in both data sets, the spectra in each band were scaled by the average of the scaling factors on the mask in each respective filter.On average, these scaling factors adjusted the flux calibrations in each band on the order of 3-30%.
Additionally, the spectra on the GOODS-N mask were compared with existing observations in order to ensure the robustness of the flux calibrations.Since the targets on this mask did not have corresponding MOSDEF observations, the spectra in each band were scaled to agree with broad-band ground-based photometry as well as 3D-HST (Skelton et al. 2014;Momcheva et al. 2016) photometric and spectroscopic measurements.Based on 11 galaxies with both MOSDEF coverage and photometric measurements, we find that the median photometric scaling is a factor of 1.2 higher than the scalings from the MOSDEF spectra.However, we prioritize the MOSDEF-spectra-based scalings when available, due to the better match in wavelength coverage compared to the photometric bandpasses as well as consistency with previous methodologies (see Sanders et al. 2023b)

SED and Emission-line Fitting
The spectral energy distributions (SEDs) of each target galaxy in the COSMOS field were fit across 43 photometric data points drawn from the 3D-HST catalog spanning from 3500 Å to 8 µm in the observed frame.Similarly, for the GOODS-N targets, 22 photometric points were fit across the same wavelength range.We corrected the near-IR photometric data for bright rest-frame optical emission lines using the emission-line fluxes determined from the MOSFIRE spectra analyzed in this study.The SEDs were fit with flexible stellar population synthesis models (Conroy & Gunn 2010) using the FAST fitting code (Kriek et al. 2009) in order to determine parameters such as stellar mass and age.We assumed an SMC attenuation curve with a stellar metallicity of 0.22 Z ⊙ , a delayed-τ star-formation history of the form t × exp (−t/τ ), and a Chabrier (2003) IMF.
We used the non-linear least squares algorithm scipy.curvefit() to fit a Gaussian profile to the emission line features in each of the individual spectra as well as the stacks.Uncertainties on the line flux measurements were determined using a Monte-Carlo simulation in which each spectrum was perturbed according to the error spectrum over 100 iterations.The weaker emission-line widths were tied to the velocity widths of Hα and [O iii]λ5008.Additionally, the Hα and [N ii] lines were fit simultaneously, with the fluxes of [N ii]λ6550 and [N ii]λ6585 being tied in a ratio of 1:3 respectively.Similarly to the [N ii] lines, the [O i]λ6302 and [O i]λ6365 lines were fixed with a flux ratio of 3:1 respectively since, in both cases, their relative strengths are fixed by quantum mechanical transition probabilities.For targets that appeared to have a broad/offset component to their emission-line profiles (e.g., non-auroral targets COSMOS-19812, 19985, 20062), we fit the brightest lines with a double Gaussian and reported only the narrow-component flux since the additional component is likely attributed to outflows or other gas not physically associated with H ii regions (Leung et al. 2017).The SED fits determined from the photometry were used to model the continuum and the Balmer stellar absorption troughs.The initial, non-emission-line corrected SEDs were used to obtain a continuum fit and estimate line fluxes, and these line flux estimates were then used to correct the near-IR photometry.In turn, we re-fit the emission lines using the corrected SEDs to obtain our final line flux measurements.These line fluxes are reported along with derived physical quantities in Table 3.

Composite spectra
We found that the signal-to-noise (S/N) on [O iii]λ4364 was very low (< 2σ) across most of the sample, so we created composite spectra (or "stacks") in order to boost S/N and derive average galaxy properties for each stack.We chose to stack our sample using three different configurations that are laid out in Table 4. Stack 1 (S1), consisted of the three z ∼ 2.5 [O iii] auroral targets in the COSMOS field, stack 2 (S2) consisted of the five z ∼ 1.7 GOODS-N auroral targets, and stack 3 (S3) consisted of all [O iii] auroral targets in this study.In addition to the [O iii] auroral-line stacks, we also created composite spectra that consisted of both auroral and filler targets in the MOSFIRE pointings that had coverage of the [O i]λ6302 line unaffected by atmospheric sky lines.We divided these "[O i]" stacks into two groups: a low-redshift (1 ≤ z < 2) stack consisting of 13 targets, and a high-redshift (2 ≤ z ≤ 3) stack consisting of 8 targets.
In order to create the composite spectra, we first used the Hα/Hβ Balmer decrement assuming Case B recombination (Osterbrock 1989), an intrinsic ratio of 2.86, and a Cardelli et al. (1989) extinction curve to correct for internal dust extinction.We then converted each spectrum from flux density (F λ ) to luminosity density (L λ ) by multiplying by 4πD 2 L (1 + z) where D L is the luminosity distance.Subsequently, each spectrum was normalized by Hα luminosity and shifted into the rest frame.Prior to stacking, each spectrum was interpolated and re-sampled to the same wavelength grid with 0.5 Å spacing.The resulting stacked spectrum was then multiplied by the median Hα luminosity of the component spectra in order to obtain units of luminosity density.We then averaged together the SED models of each component spectrum after normalizing by Hα luminosity, and the resulting composite SED curve was used to model the continuum of the stacks during the line-fitting procedure.We report line luminosity measurements for each of the [O iii] auroral stacks in Table 4. Additionally, we show the stacked spectra in Figure 1 with emission lines of interest labeled.In Figure 2, we plot the SFRs vs. stellar masses for our galaxy sample and compare them to typical values from the literature.The eight auroral targets shown in Table 1 have Hα and Hβ coverage, and we include them in the figure.Only 16 of the 29 filler targets, however, have simultaneous Hα and Hβ detections, and these are included as gray points in Figure 2. We calculate SFRs from dust-corrected Hα luminosities using a conversion factor of 3.236 × 10 −42 M ⊙ yr −1 erg −1 s based on models of stellar populations with sub-solar metallicity consistent with what we assume for SED fitting (Reddy et al. 2018).The galaxies in this sample agree well with the star-forming main sequence defined by larger populations of galaxies at similar redshifts.Shivaei et al. (2015) performed a linear fit to a sample of 185 galaxies in the range 1.37 ≤ z ≤ 2.61, and Topping et al. (2021) fit a sample of 285 galaxies in the range 1.37 ≤ z ≤ 1.70.These two empirical fits are shown as dashed and dotted black lines respectively, and they agree with the five GOODS-N galaxies at z ∼ 1.7 with the exceptions of GOODS-N-18462 and GOODS-N-14595, which are offset in log(SFR) by roughly -0.3 dex and +0.6 dex respectively.We additionally plot two main sequence relations defined by Speagle et al. (2014) at the median redshifts of our COSMOS and GOODS-N samples shown in orange and blue, respectively.As an additional point of reference, we overplot a sample of 280 z ∼ 2.3 galaxies distributed among five stacks by Sanders et al. (2021), and we re-scale the stacks to use the same Hα to SFR conversion factor we utilize in this study.For both the z ∼ 1.7 and z ∼ 2.5 galaxies in this study, we find agreement within 1σ relative to the respective main sequence fits with the exception of GOODS-N-18462, which falls slightly below the z ∼ 1.7 relation.Overall, this sample of galaxies is relatively representative in terms of SFR at fixed stellar mass based on larger samples in the same redshift range.

Abundance Determinations
Throughout this paper, we refer to "direct" oxygen abundances as those derived from determining the ion emissivities based on electron temperature and density measurements as opposed to "indirect" methods, which use the ratios of strong nebular emission lines empirically calibrated to local direct measurements or photoionization models.The direct method of oxygen abundance approximates a galaxy as a single H ii region and thus characterizes electron temperatures and densities based on globally-integrated spectra.In this H ii region approximation, it is conventional to further define two temperatures: the temperature associated with the high-ionization state and the low-ionization state of the ions of interest (e.g., T e (O 2+ ) and T e (O + ), respectively).Since the energy required to ionize neutral oxygen is similar to that of hydrogen, we expect that nearly all of the oxygen within H ii regions is ionized, either in the singly or doubly-ionized state, with negligible amounts in higher ionization or neutral states.Therefore, in order to determine the oxygen abundance directly, we sum the abundances of oxygen in its two most prevalent ionization states: Determining the abundances of each ionization species requires knowledge of the electron temperatures and densities associated with the respective ionization zones.In the O 2+ zone, this can be achieved through measurements of the [O iii]λλ4960, 5008 and the [O iii]λ4364 lines, where the [O iii]λ4364 transition originates from a different upper energy level than the [O iii]λ4960, 5008 transitions.In turn, measuring the ratios of these lines allows one to determine the electron temperature associated with the O 2+ zone.The same can be achieved for the O + ion, using the [O ii]λλ3727, 3730 and [O ii]λλ7321, 7332 lines which arise from different respective upper energy levels.One of the major challenges in determining direct oxygen abundances, however, is that the auroral lines produced by transitions from the upper energy levels are often intrinsically very faint, and their detection becomes the main limiting factor in obtaining direct abundance estimates.Ideally, one would measure the nebular and the corresponding faint auroral emission lines originating from the O + and O 2+ ions to directly determine the electron temperature for both ionization zones.However, since our sample of [O iii] auroral targets does not have coverage of the auroral [O ii] lines, we employ the following theoretical relation presented by Campbell et al. (1986) to infer T e (O + ):   T e (O + ) = 0.7 × T e (O 2+ ) + 3000 K (2) When converting a T e (O + ) measurement to T e (O 2+ ), observations suggest an intrinsic scatter of approximately 1300 K in the Campbell et al. (1986) relation (Berg et al. 2020;Rogers et al. 2021).Since we are instead converting T e (O 2+ ) to T e (O + ) using equation 2, we adopt an intrinsic scatter of 0.7 × 1300 K = 910 K, and add this in quadrature when determining the uncertainty on T e (O + ).
Electron temperatures, densities, and ionic abundances were determined using the PyNeb package (Luridiana et al. 2015).In order to compute the electron density n e , we used the getCrossTemDen() method to simultaneously solve for T e (O 2+ ) and n e , taking the [O ii]λ3727/[O ii]λ3730 ratio to be the density-sensitive tracer.With the output values of T e and n e from getCrossTemDen(), we computed the ionic abundances using the getIonAbundance() method, using the ratios of [O iii]λ4959 and [O ii]λ3727, 3730 relative to Hβ to compute the O 2+ /H + and the O + /H + abundances, respectively.Note-Non-detections are reported as 2σ upper limits.
We additionally calculated the nitrogen abundance in our galaxy sample since we have coverage of the [N ii]λ6585 line in all targets.The nitrogen abundance is ideally determined based on emission lines arising from the N + and N 2+ ions.However, we do not have coverage and/or detection of the necessary [N iii] emission lines, so we make the following approximation: where ICF(N) is the ionization correction factor accounting for higher ionization states, and it is defined as ICF(N) = N/N + .We approximate this ratio as N/N + ≈ O/O + since oxygen and nitrogen have similar ionization energies (Peimbert 1967).
One can directly measure the temperature within the N + zone by measuring the auroral-to-nebular line ratio [N ii]λ5756/[N ii]λ6585, analogous to the [O iii]λ4364/[O iii]λ5008 ratio for the O 2+ zone.However, we do not have coverage of the [N ii]λ5756 line in our spectra, so we make the approximation that T e (N + ) ≈ T e (O + ) since both ions should occupy the low-ionization zones.We also use the same electron density as that determined during the calculation of the oxygen abundance.We then used getIonAbundance() employing the [N ii]λ6585/Hβ ratio as the input to calculate the N + /H + abundance.The derived constraints on density, temperature, and ionic and total abundances are reported in Table 3 for the individual targets and Table 4 for the composites.

Auroral-line Detections
The [O iii]λ4364 line was detected at greater than 2σ significance in only two targets: COSMOS-18812 and GOODS-N-8240.For COSMOS-18812, the [O ii]λλ3727, 3730 line doublet fell onto a pair of sky lines, so it was not possible to constrain the electron density or the O + abundance.We tentatively detect [O iii]λ4364 emission in GOODS-N-8240 since the Gaussian fit to the line profile places the significance of this detection at the >2σ level.Upon visual inspection, we report the presence of emission lying a few angstroms blueward of the [O iii]λ4364 line in two of the targets: GOODS-N-6699 and GOODS-N-14595.For GOODS-N-6699, there is an emission feature detected at >3σ significance when centering a Gaussian profile at 4360 Å.It is uncertain whether this emission is associated with the [O iii]λ4364 line though it appears at a similar wavelength.The emission-line feature adjacent to [O iii]λ4364 in GOODS-N-14595 appears to be more closely centered on the expected central wavelength for the auroral oxygen line, with a significance of just 0.38σ when centering on 4360 Å.The spectra of these objects in this wavelength region can be seen in Figure 3. Curti et al. (2017) find a similar feature in a sample of spectral stacks and attribute it to a forbidden transition of singly-ionized iron.They also suggest that the strength of this 4360 Å contamination increases with increasing galaxy metallicity.However, we do not expect this contaminating feature to have a significant impact on our spectra given that strong-line metallicity indicators of our target sample suggest our targets are half solar metallicity or less.For completeness, we report [O iii]λ4364 measurements of the stacked spectra in two ways: fitting the emission around 4364 Å as a single Gaussian and fitting the feature at 4360 Å separately from [O iii]λ4364.As a shorthand, we refer to the feature at 4360 Å as [Fe ii]λ4360, and the simultaneous [O iii] and [Fe ii] line fits are denoted by an asterisk (*) in both Tables 3 and 4.
In Table 3, we see that in most cases, choosing a single vs. a double fit does little to change the [O iii]λ4364 flux.The exceptions to this are GOODS-N-8240 and GOODS-N-14595 where, in the former, the single fit yields a higher S/N ratio and, in the latter, the double fit yields a better S/N ratio.Since it is unclear to what extent the emission in GOODS-N-14595 can be attributed to [O iii]λ4364, we report the determination of physical quantities from the single-Gaussian fit to [O iii]λ4364.
We perform this same exercise on the stacked spectra and report the results in Table 4, with the simultaneous double Gaussian fits marked by asterisks.Since the emission at 4360 Å is only seen in GOODS-N targets, we check to see if the fitting technique has an effect on the measured line luminosities in stacks 2 and 3, finding that there is no significant effect on either stack.We do see an effect on stack 1 in that the single fit has a higher S/N ratio.Thus, we use the single Gaussian fit to [O iii]λ4364 in the stacks to determine physical conditions.
For the individual [O iii] auroral targets, only GOODS-N-8240 has a well-constrained oxygen abundance of 12 + log(O/H) = 8.02 +0.24  −0.17 , corresponding to ∼21% of the solar oxygen abundance.For the composite spectra, we find that the z ∼ 2.5 COSMOS stack has an oxygen abundance of ∼ 11% the solar value, while the z ∼ 1.7 GOODS-N stack has an oxygen abundance greater than ∼ 21% the solar value.For nitrogen, the abundances relative to the solar value are ∼ 4% and ≳ 13% for stacks 1 (z ∼ 2.5) and 2 (z ∼ 1.7).

DISCUSSION
We now turn to an analysis of the emerging trends in relations between strong-line ratios and direct oxygen abundance at high redshift enabled by our deep MOS-FIRE observations.In addition, the new analysis of the [O i]λ6302/Hα ratio in 21 galaxies beyond the local universe, while not representing a complete statistical sam-ple, hints at the properties of the ionized gas and the stellar populations in high-redshift galaxies.

Indirect Metallicity Indicators
Analyzing the accuracy of indirect metallicity indicators out to high redshifts is important for our understanding of the chemical evolution of galaxies across cosmic time.With our oxygen abundance measurements of the stacked spectra of auroral targets, we have constraints for the average metallicities of the galaxies considered in this study.We compare these stacks alongside measurements from the literature to strong-line metallicity indicators in order to understand how the accuracy of these indicators may shift with cosmic time.
In Figure 4, we show six strong-line ratios vs. oxygen abundance for our three stacks, and we plot the metallicity relations from Curti et al. (2020) determined from stacks of galaxy spectra in the local universe.
The short-hand labels for the strong-line ratios are defined as follows: R3 We note that below an oxygen abundance of 12+log (O/H) ≈ 8.1, there are fewer individual z ∼ 0 SDSS galaxies with >10σ [O iii]λ4364 detections, and the sample is biased toward higher specific SFR, representing a population more similar to our high-redshift sample than z ∼ 0 galaxies (refer to the discussion in the appendix of Sanders et al. (2021) for a detailed analysis of the lowmetallicity Curti et al. (2017Curti et al. ( , 2020) ) calibration sample).Additionally, we show the relationships between strong-line ratios and metallicity determined by Bian et al. (2018) for local galaxies selected to have emissionline properties analogous to those of high-redshift galaxies.Alongside these two line-ratio relations, we show the strong-line ratio vs. oxygen abundance for a large sample of H ii regions from the literature (compiled by Sanders et al. (2017) with data from Pilyugin & Grebel (2016), Croxall et al. (2015), and Toribio San Cipriano et al. ( 2016)) with a running median displayed as a solid black curve, and 1σ intervals shown as dotted lines to visualize the spread of the distribution.
Upon visual inspection, the oxygen abundances of the stacks lie within the distribution of galaxies from the literature (compiled by Sanders et al. (2020) with two additional galaxies from Sanders et al. (2023b)) shown as blue points.In the cases of log(R23) and log(N2), the Bian et al. (2018) curve serves as a better metallicity indicator to high-redshift galaxies compared to the Curti et al. (2020) curves.In the cases of the log(R3),  We also display z > 1 galaxies from the literature with direct oxygen abundance measurements as blue points (see Sanders et al. 2020Sanders et al. , 2023b)).Finally, we show the median relation of local H ii regions (see Sanders et al. 2017) as a solid black curve, with the 1σ spread illustrated as black dotted curves.
log(R2), log(O3O2), and log(O3N2) curves, the spread of galaxies is large compared to the differences between the Curti et al. (2020) and Bian et al. (2018) curves, so it is difficult to determine if there is a preference for one over the other.In general, the stacks agree most consistently with the distribution of H ii regions, though the galaxies from the literature are offset from the H ii regions to higher log(R3) and log(R23) at fixed oxygen abundance.
With a small existing sample size, it is difficult to make definitive conclusions about the accuracy of these strong-line ratios, especially considering that the sample of galaxies is biased towards bright, high electron temperature targets.For two of the individual auroralline targets where there were existing MOSDEF spectra (COSMOS-19439 and COSMOS-19752), we predicted the [O iii]λ4364 flux based on [O iii]λ5008 flux as well as T e predictions.For COSMOS-19439 and COSMOS-19753, we predicted auroral [O iii] line fluxes in the ranges of 0.8 − 1.9 × 10 −18 and 2.1 − 5.0 × 10 −18 erg s −1 cm −2 respectively.Since the 2σ upper limits are above our lowest line flux predictions, this suggests that the observations did not reach the required depth in the 10 combined hours of integration.This comparison demonstrates the limitations of 10-m-class ground-based observatories in this area of study and highlights the importance of JWST in building representative galaxy samples moving forward.

Insights from [O i] emission
We present the properties of our sample of galaxies in the [O iii]λ5008/Hβ vs.
[N ii]λ6585/Hα, [S ii]λλ6716, 6731/Hα, and [O i]λ6302/Hα BPT diagrams shown in Figure 5.In all diagrams, local Sloan Digital Sky Survey (SDSS; York et al. 2000;Abazajian et al. 2009) galaxies are shown as grayscale, twodimensional histograms.A sample of local H ii regions from the literature (see Sanders et al. 2017) is shown as a set of magenta points with an accompanying running median and a 1σ shaded region.In the [N ii] We additionally present a novel analysis of galaxies on the [O i] BPT diagram at z > 1.Including the filler targets, a total of nine galaxies yielded significant (>2σ) [O i]λ6302 detections, with two of the auroral [O iii] targets (COSMOS-19753 and GOODS-N-6699) yielding detections.In order to understand the general characteristics of the galaxies in regard to the [O i] BPT diagram, we constructed two composite spectra separated by redshift, choosing to include galaxies with coverage of the [O i] lines with the exception of galaxies whose [O i] feature fall on sky lines.These criteria result in two stacks with 13 galaxies in the 1 ≤ z < 2 stack and 8 galaxies in the 2 ≤ z ≤ 3 stack.The line ratios associated with these stacks are plotted as "plus" (+) symbols in Figure 5.
We see that these stacks follow a similar trend of relatively high [O iii]/Hβ relative to the SDSS sample and high [O i]/Hα relative to the locus of H ii regions.There are several factors that can influence the [O i]/Hα ratio in a galaxy, including contributions of DIG, the presence of shocks, and hardness of the ionizing spectrum, the latter of which appears to be relevant in z > 1 galaxies (Zhang et al. 2017;Shapley et al. 2019).
In the bottom three panels of Figure 5, we compare the line ratios from these  (Sanders et al. 2021;Topping et al. 2021), the comparison of these observations with photoionization models supports the picture of harder ionizing spectra from lowmetallicity, Fe-poor massive stars driving the line ratios of galaxies at higher redshifts (e.g., Steidel et al. 2016;Strom et al. 2017;Shapley et al. 2019;Sanders et al. 2020;Topping et al. 2020a,b;Runco et al. 2021;Cullen et al. 2021).
Though the harder ionizing spectrum is fully capable of explaining the enhancement in [O i]/Hα in these galaxies, it is also possible that shocks and varying contributions of DIG affect the BPT line ratios.With upcoming spectroscopic observations from JWST, an analysis of these effects may become more robust due to a larger sample of galaxies with a wider range of properties, for which we will also have detections of [O i].

Nitrogen abundances
We additionally comment on the nitrogen abundance patterns displayed by our [O iii] auroral galaxy sample.In the context of the star-forming galaxy population at z ∼ 2, the nitrogen abundances from the stacks are consistent with empirical predictions based on their average stellar masses.For example, Strom et al. (2022) determined the nitrogen abundances for a sample of 195 z ∼ 2 star-forming galaxies.Their linear fit to this sample predicts a nitrogen abundance of 12 + log(N/H) = 6.93 at a stellar mass of log(M/M ⊙ ) = 9.5 with an intrinsic scatter of 0.33 dex in abundance.Within their respective limits and uncertainties, all three of the stacks as well as GOODS-N-8240 and COSMOS-18812 have nitrogen abundances consistent with this prediction.
As well as analyzing the nitrogen abundance, we discuss the nitrogen to oxygen (N/O) ratio.The N/O ratios of galaxies and H ii regions are often used as a probe of the nucleosynthetic origin of nitrogen where, at low metallicity (12 + log(O/H) ≲ 8), log(N/O) is fixed at ∼ −1.5.This is referred to as the "primary" nitrogen regime since, at low metallicity, the nitrogen yield is tied to those of the α elements (Pilyugin et al. 2010;Pérez-Montero & Contini 2009;Izotov et al. 2006).At higher oxygen abundances, the nitrogen yield increases in proportion to the CNO abundances, and the N/O ratio increases, comprising the "secondary" nitrogen regime.Since the [O iii] auroral targets have significantly subsolar oxygen abundances on average, they should fall within the primary nitrogen regime.We plot the N/O ratio vs. oxygen abundance for our sample in Figure 6.
For stacks 1 and 3, where we have constraints on the N/O ratio, we find that their abundance pattern is consistent with those found in local H ii regions (e.g., Pilyugin et al. 2012).In addition, the N/O ratio for the stacks is similar to those measured within the z ∼ 2.2 KLEVER sample (Hayden-Pawson et al. 2022), though this sample spans higher oxygen abundances (12 + log (O/H) ∼ 8.2 − 8.7).However, for GOODS-N-8240, we find that its N/O ratio of log(N/O) = −0.99+0.22  −0.23 is slightly enhanced given its oxygen abundance of 12 + log(O/H) = 8.02 +0.24  −0.17 .Several hypotheses have been put forward to explain the abundance pattern of objects with enhanced N/O, one of which appeals to strong winds from Wolf-Rayet stars enriching the ISM (e.g., Pagel et al. 1986;Brinchmann et al. 2008;Masters et al. 2014).A more detailed analysis is required to determine the exact source of nitrogen enhancement in this target.
Another point of interest in studying the N/O ratio is to investigate its effects on trends in the [N ii] BPT diagram across cosmic time.Specifically, if there are significant differences between the N/O vs. O/H ratio at high redshifts compared to low-redshift observations, then an evolving N/O abundance pattern may play an important role in interpreting diagnostic line ratios involving nitrogen and oxygen (Curti et al. 2022;Hayden-Pawson et al. 2022).Because the stacks are consistent with the local N/O vs. O/H relation, there does not appear to be strong evidence for an evolution in the N/O ratio between z = 0 and z ∼ 2 based on this sample, though GOODS-N-8240 does represent an outlier in this regard.

CONCLUSIONS
We present an ultra deep rest-optical spectroscopic analysis of several z > 1, high-excitation galaxies with up to 10 hours of integration time in some bands, and we analyze their excitation properties as well as their oxygen abundances.We selected eight galaxies with strong nebular [O iii] emission and high predicted electron temperatures to maximize the chance of detecting the [O iii]λ4364 emission line.We detected [O iii]λ4364 in two targets, and we chose to stack the eight [O iii]auroral-selected galaxies to observe their general characteristics.Additionally, nine of the galaxies that were not targeted for auroral oxygen emission lines yielded [O i] detections, enabling the first analysis of high-redshift galaxies in this parameter space.Here are the key conclusions from this work: 1.When comparing the oxygen abundnces of the auroral-target stacks and galaxies in the literature on the strong-line indicator diagrams, we find that the stacks from this analysis are qualitatively consistent with the distribution found in the literature in both oxygen abundance and strong-line ratio.
In general, it is difficult to say with a small sample size whether the Curti et al. (2020) or Bian et al. (2018) curves better describe galaxies at z > 1.
In the case of log(R 23 ) and log(N 2 ), this may be the case.While the current sample size is limited, these results indicate that stacking analyses are promising.
2. When stacking together the galaxies with [O i] coverage (both auroral and non-auroral targets), we find that galaxies typically lie at higher [O i]/Hα at fixed [O iii]/Hβ relative to the median locus of local H ii regions.This offset is consistent with photoionization models with low-metallicity (1.0 × 10 −5 ≲ Z star ≲ 2 × 10 −3 ) stellar populations, supporting the picture that the line ratios in z > 1 galaxies are driven by harder ionizing spectra at fixed nebular oxygen abundance.
3. The N/O abundances of the [O iii] auroral stacks suggests that the nitrogen enrichment in our galaxy sample at z ∼ 2 is of primary origin and is consistent with the N/O vs. O/H primary abundance pattern seen in local H ii regions.Though the N/O abundance of GOODS-N-8240 is enhanced given its oxygen abundance, we do not find evidence that the line ratios in our galaxy sample are driven by an evolving N/O ratio with cosmic time.
The results of this analysis demonstrate the limits of 10-m-class ground-based facilities in the realm of nebular metallicity studies of galaxies at cosmic noon.Given that 10 hours of total integration time was still not enough to reach the required depth to consistently detect the [O iii]λ4364 line in all of the targets, we emphasize the importance of more sensitive facilities such as JWST and future 30-m-class observatories to make advances in this area of study.To date, JWST has already yielded a high number of auroral-line detections out to z ∼ 8 (e.g., Curti et al. 2023b;Williams et al. 2022;Nakajima et al. 2023;Tang et al. 2023;Sanders et al. 2023a).It is already playing an instrumental role in building up the sample of auroral-line measurements at cosmic noon and enabling improvements in strong-line metallicity calibrations at high redshfits.

Figure 1 .
Figure 1.Composite spectra of the [O iii] auroral targets.The black curve shows the luminosity density, while the error spectrum is shown in red in the inset axes.Each of the prominent emission lines is labeled and marked with a blue dotted line.

Figure 2 .
Figure 2. SFR vs. stellar mass.The colored squares indicate the auroral-line targets included in this study, with blue and orange corresponding to z ∼ 1.7 and z ∼ 2.5, respectively.The gray points indicate the non-auroral targets included in this study.The dashed black line shows a linear fit from Shivaei et al. (2015), while the solid blue and orange lines show the z ∼ 1.7 and z ∼ 2.5 SFR vs. stellar mass relations respectively from Speagle et al. (2014).The 1-σ scatter for each main sequence line is shaded in its respective color.The red triangles represent spectral stacks of z ∼ 2.3 galaxies from Sanders et al. (2021).

*
These line luminosities for [Fe ii]λ4360 and [O iii]λ4364 are determined by fitting a double Gaussian simultaneously to both features, whereas the [O iii]λ4364 luminosity without an asterisk is determined by fitting a single Gaussian profile.Oxygen abundances are determined from the single-Gaussian fit to [O iii]λ4364.

Figure 3 .
Figure 3. 2D and 1D spectra of the eight [O iii] auroral targets in this study.These spectral plots span from 4335 Å to 4370 Å, covering the Hγ and [O iii]λ4364 emission lines.The solid black line shows the flux density, and the shaded gray spectrum is the error on the flux density.We detect [O iii]λ4364 (labeled with an orange dotted line) in COSMOS-18812 and GOODS-N-8240; however, we report emission blueward of 4364 Å in the GOODS-N 6699 and 14595 spectra.

Figure 4 .
Figure 4. Comparison of the direct oxygen abundance measurements of our stacks with strong-line indicators from Curti et al. (2020) in orange and Bian et al. (2018) in purple.Strong-line indicators are plotted in solid lines over their quoted metallicity ranges, and extrapolations are shown by dashed lines.We also display z > 1 galaxies from the literature with direct oxygen abundance measurements as blue points (seeSanders et al. 2020Sanders et al. , 2023b)).Finally, we show the median relation of local H ii regions (seeSanders et al. 2017) as a solid black curve, with the 1σ spread illustrated as black dotted curves.

Figure 5 .
Figure 5. [N ii], [S ii], and [O i] BPT diagrams showing where the auroral and [O i] galaxy samples from this work lie in relation to SDSS galaxies in the local universe (grayscale, 2D histogram).We also compare the galaxies and stacks from this work to local H ii regions from the literature (compiled by Sanders et al. (2017)) and display a running median.The [O iii] auroral targets are shown by squares, while the non-[O iii]-auroral targets are shown by diamonds.The 1 ≤ z < 2 and 2 ≤ z ≤ 3 targets are displayed in blue and orange respectively.For comparison, stacks of z ∼ 1.5 and z ∼ 2.3 galaxies from Shapley et al. (2019) are shown as purple and red triangles respectively.Upper limits on [S ii] and [O i] detections are shown in black.In the bottom three panels, the same BPT diagrams are shown with CLOUDY photoionization models overlaid.The curves are color-coded by stellar metallicity indicated in the colorbar.
[O i] stacks to CLOUDY (Ferland et al. 2017) photoionization models following the prescription laid out in Jeong et al. (2020).The models are based on stellar spectra drawn from BPASS (Eldridge et al. 2017; Stanway & Eldridge 2018) where each model curve represents a 10 8.5 year-old stellar population with a constant star-formation history.Along each curve of fixed stellar metallicity (Z star ), we vary the ionization parameter and the nebular metallicity according to the Topping et al. (2020a) relation: log(U) = −1.06× [12 + log(O/H)] + 5.78.In both the [N ii] and the [O i] BPT diagrams, we find that both the 2 ≤ z ≤ 3 and 1 ≤ z < 2 [O i] stacks agree well with the very sub-solar metallicity (1.0 × 10 −5 ≲ Z star ≲ 2 × 10 −3 ) stellar population curves.Since not all of the galaxies in the [O i]stacks had wavelength coverage of the [S ii]λ6716, 6731 doublet, we do not include them on the [S ii] BPT diagram.Taken together with typical nebular oxygen abundances inferred from the MOSDEF survey

Figure 6 .
Figure 6.N/O vs. oxygen abundance for stacks 1 and 3 as well as for GOODS-N-8240 compared to local-universe H ii -region measurements from Pilyugin et al. (2012).

Table 1 .
[O iii]auroral targets and physical properties.

Table 3 .
Catalog of observed emission-line flux and physical properties for individual [O iii] auroral targets.
6.99 ± 0.11 11.52 ± 0.15 21.49 ± 0.12 15.76 ± 0.41 15.61 ± 0.10 14.03 ± 0.22 11.11 ± 0.12 10.30 ± 0.-- * These line fluxes for [Fe ii]λ4360 and [O iii]λ4364 are determined by fitting a double Gaussian simultaneously to both features, whereas the [O iii]λ4364 flux without an asterisk is determined by fitting a single Gaussian profile.** Since the [O ii] doublet was affected by sky lines in this target, we estimate the O 2+ /H abundance assuming a density of 100 cm −3 .Note-Non-detections are reported as 2σ upper limits, and Balmer line ratios that yield negative values of E(B-V) are set to 0.00.Oxygen abundances are determined from the single-Gaussian fit to [O iii]λ4364.