The Heavy Metal Survey: The Evolution of Stellar Metallicities, Abundance Ratios, and Ages of Massive Quiescent Galaxies since z ∼ 2

We present the elemental abundances and ages of 19 massive quiescent galaxies at z ∼ 1.4 and z ∼ 2.1 from the Keck Heavy Metal Survey. The ultradeep LRIS and MOSFIRE spectra were modeled using a full-spectrum stellar population fitting code with variable abundance patterns. The galaxies have iron abundances between [Fe/H] = −0.5 and −0.1 dex, with typical values of −0.2 [−0.3] at z ∼ 1.4 [z ∼ 2.1]. We also find a tentative logσv –[Fe/H] relation at z ∼ 1.4. The magnesium-to-iron ratios span [Mg/Fe] = 0.1–0.6 dex, with typical values of 0.3 [0.5] dex at z ∼ 1.4 [z ∼ 2.1]. The ages imply formation redshifts of z form = 2–8. Compared to quiescent galaxies at lower redshifts, we find that [Fe/H] was ∼0.2 dex lower at z = 1.4–2.1. We find no evolution in [Mg/Fe] out to z ∼ 1.4, though the z ∼ 2.1 galaxies are 0.2 dex enhanced compared to z = 0–0.7. A comparison of these results to a chemical evolution model indicates that galaxies at higher redshift form at progressively earlier epochs and over shorter star formation timescales, with the z ∼ 2.1 galaxies forming the bulk of their stars over 150 Myr at z form ∼ 4. This evolution cannot be solely attributed to an increased number of quiescent galaxies at later times; several Heavy Metal galaxies have extreme chemical properties not found in massive galaxies at z ∼ 0.0–0.7. Thus, the chemical properties of individual galaxies must evolve over time. Minor mergers also cannot fully account for this evolution as they cannot increase [Fe/H], particularly in galaxy centers. Consequently, the buildup of massive quiescent galaxies since z ∼ 2.1 may require further mechanisms, such as major mergers and/or central star formation.


INTRODUCTION
In the present-day universe, nearly all massive galaxies have quiescent stellar populations.Archaeological studies have revealed that they formed the bulk of their stars rapidly at z > 2 (e.g., Gallazzi et al. 2005;Thomas et al. 2005;McDermid et al. 2015).Thus, these studies imply that quiescent galaxies should already exist when the universe was only a fraction of its current age.
One way to understand the structural evolution of massive quiescent galaxies is through the "inside-out" growth scenario, in which compact galaxies still grow via gas-poor minor mergers after they have become quiescent (e.g., Bezanson et al. 2009;Naab et al. 2009;van Dokkum et al. 2010;Oser et al. 2012).An alternative explanation is suggested by the fact that the stellar mass function shows a tenfold increase in the number of massive quiescent galaxies since z ∼ 2 (McLeod et al. 2021).If galaxies that quench at lower redshift have larger sizes and stronger color gradients, the observed evolution of average properties could instead be explained by population growth (i.e., progenitor bias; e.g., Khochfar & Silk 2009;van Dokkum et al. 2010;Carollo et al. 2013;Poggianti et al. 2013).The relative importance of minor mergers vs progenitor bias remains a central question in massive galaxy evolution.
The evolution in elemental abundances and stellar population properties over cosmic time provides a unique insight into the assembly and star-formation histories of massive quiescent galaxies (e.g., Matteucci 1994;Trager et al. 2000;Conroy et al. 2014;Choi et al. 2014;Maiolino & Mannucci 2019;Beverage et al. 2021;Peng et al. 2015;Spitoni et al. 2017;Trussler et al. 2020).Unfortunately, measuring elemental abundances beyond the low-redshift universe poses a considerable challenge.At higher redshifts, key absorption features are faint and shifted to near infrared (NIR) wavelengths.Recently, ultra-deep spectroscopic surveys have started pushing these measurements beyond the low-redshift universe and find that the chemical properties of massive quiescent galaxies at z ∼ 0.7 closely resemble today's massive early-type galaxies (Choi et al. 2014;Leethochawalit et al. 2018;Beverage et al. 2023;Bevacqua et al. 2023).These studies show that the most massive galaxies were already in place six billion years ago, while the lowmass quiescent population have continued to grow since z ∼ 0.7 (see Leethochawalit et al. 2019;Beverage et al. 2023).
While minimal evolution is found out to z ∼ 0.7, the picture at higher redshift is less clear.The few existing stellar metallicity measurements of massive quiescent galaxies at z ≳ 1.4 show systematically lower [Fe/H] compared to z ∼ 0, but there is large scatter depending on the methods used to derive the metallicities (Lonoce et al. 2015;Onodera et al. 2015;Kriek et al. 2016Kriek et al. , 2019;;Carnall et al. 2022;Saracco et al. 2023;Zhuang et al. 2023).At z > 2, the only two existing measurements show strong α-enhancement, with [Mg/Fe]∼ 0.5 − 0.6 dex, indicative of extremely short star-formation timescales of < 100 Myr (Kriek et al. 2016;Jafariyazani et al. 2020).To further clarify the picture at z > 1, and assess the assembly of the quiescent galaxy population over the past 11 billion years, larger galaxy samples are needed.
To that end, we have executed the Keck Heavy Metal survey, an ultra-deep spectroscopic survey with LRIS (Oke et al. 1995;Rockosi et al. 2010) and MOSFIRE (McLean et al. 2010, 2012) on the Keck I Telescope (see Kriek et al. 2023).This survey collected high-S/N spectra of 21 massive quiescent galaxies at 1.4 ≲ z ≲ 2.2, covering multiple Balmer and metal absorption lines.
The Heavy Metal survey has more than tripled the number of individual galaxies at z ≳ 1.4 with measurements of stellar metallicities, ages, and elemental abundance ratios.Preliminary results of five Heavy Metal galaxies were presented in Kriek et al. (2019) and are updated in the current work.The survey design is presented in the companion survey paper (Kriek et al. 2023).In section 2 we describe our full-spectrum modeling technique.In section 3 we present the stellar population results.In section 4 we assess how the metal content of quiescent galaxies evolves between z ∼ 2.2 and z ∼ 0 by comparing our results with the LEGA-C (z ∼ 0.7) and SDSS (z ∼ 0) surveys.In section 5 we consider the implications of our results in the build-up of the massive quiescent galaxy population.In section 6 we summarize our results.Throughout this work we assume a flat ΛCDM cosmology with Ω m = 0.3 and H 0 = 70 km s −1 Mpc −1 .

The Heavy Metal Survey
The targets in this study are selected from the Keck Heavy Metal Survey; an ultra-deep rest-frame optical spectroscopic survey of massive quiescent galaxies at z = 1.3 − 2.2.The 21 quiescent targets were identified using the UltraVISTA K-band selected catalog (v4.1) by Muzzin et al. (2013), and were selected to be qui-Figure 1: Spectra (black) and corresponding 1σ uncertainties on the flux (gray), along with the best fitting alf models used to derive the ages and elemental abundances (red) of four example primary Heavy Metal Survey targets (two at z ∼ 1.4 and two at z ∼ 2.2).The other spectra can be found in Appendix A. Spectra are binned such that one pixel is ≃ 5 Å in the rest-frame.escent using U V J colors (Wuyts et al. 2016;Williams et al. 2009).The redshift intervals 1.3 < z < 1.5 and 1.92 < z < 2.28 were chosen such that key rest-frame NIR metal absorption features, namely Balmer lines, Hβ, and MgI, fall within the wavelength windows of the Keck/LRIS and Keck/MOSFIRE spectrometers.Primary targets were further required to have J < 21.5 and H < 21.7 at z ∼ 1.4 and z ∼ 2.1, respectively The z ∼ 1.4 targets were observed for a total of ∼17 hrs on Keck (∼5 hrs with LRIS and ∼12 hrs with MOSFIRE-J), while the z ∼ 2.1 targets were observed for ∼30 hrs (∼13 hrs with MOSFIRE-J and ∼16 hrs with MOSFIRE-H).These integration times provide sufficient S/N for modeling the faint stellar absorption features of individual massive quiescent galaxies.The typical S/N for the z ∼ 1.4 sample is 8 Å−1 and 15 Å−1 in the rest-frame for LRIS and MOSFIRE-J, respec-tively, while the S/N for the z ∼ 2.1 sample is 4 Å−1 and 12 Å−1 for MOSFIRE-J and MOSFIRE-H.For further details on the target selection, observing strategy, data reduction, galaxy sample characteristics, and stellar population properties (i.e., stellar masses) we refer to Kriek et al. (2023).
One of the 21 primary targets (ID:55878) is removed from this study before fitting due to strong line emission, likely originating from AGN (see Ma et al. 2023, in prep).Another galaxy is also removed (ID:59449) because there are no clear absorption lines in the spectra.

Stellar ages and elemental abundances
We derive chemical compositions and stellar ages of the 19 Heavy Metal targets using the full-spectrum absorption line fitter (alf) code (Conroy & van Dokkum 2012;Conroy et al. 2018).alf is designed to fit the

RESULTS
In the fitting presented here, we assume a model with a single burst of star formation (approximated as an SSP) and a fixed Kroupa (2001) IMF.We note that alf also allows for a young second SSP component to account for late-time star formation.Further, we turn off the alf hot horizontal branch star component, because we find it can bias the best-fit SSP ages and metallicities.In the end, the full-spectrum model has 26 free parameters: velocity offset, velocity dispersion, SSP age, isochrone metallicity, the abundances of 19 individual elements (Fe, O, C, N, Na, Mg, Si, K, Ca, Ti, V, Cr, Mn, Co, Ni, Cu, Sr, Ba, and Eu), and the emission line strengths of H, [OIII], and [NII].
The fitting is done in the rest-frame and the instrumental resolution of the data is accounted for by con-volving the grid of models to match the resolution of the spectrum.We measure this instrumental resolution for each spectrum by finding the average full width at half maximum (FWHM) of 20-25 skylines in the variance spectrum.The spectral continuum is also removed from the observations by fitting an 8th order polynomial to the ratio of the data to the model.Once normalized, the spectrum is fit using Markov-Chain Monte Carlo (Foreman-Mackey et al. 2013) with 1,024 walkers and a burn-in of 20,000 steps.The priors of the MCMC fit are uniform and the walkers are initialized around the solar abundance pattern.
After fitting all 19 galaxies, we visually inspect the best-fit models and the corresponding posterior distribution functions (PDFs) and remove any galaxy with poorly constrained parameters.We remove three galaxies because their PDFs do not follow normal distributions and/or they have ill-defined peaks.Two of these galaxies lack spectral coverage of key absorption features (D4000 break), leading to the unreliable abundance measurements, and the third galaxy has strong [O ii] and Hydrogen emission.We also remove two galaxies with best-fit stellar ages < 1 Gyr, as the stellar population models are not optimized for these young ages.As a sanity check, we also inspect correlations between all fitted parameters to ensure the relevant parameters are not being driven by other poorly constrained parameters.After removing these five galaxies, we are left with elemental abundances and stellar age measurements for ten galaxies at z ∼ 1.4 and four galaxies at z ∼ 2.1.
Five of the original 19 galaxies in this study were originally presented in Kriek et al. (2019).In this study, we have since re-extracted the 1D spectra using an improved optimal weighing procedure.We also assume a single SSP when fitting, whereas Kriek et al. (2019) assume a two-burst model.Nonetheless, the results presented here are consistent to within 1σ of what is found in Kriek et al. (2019).
Example spectra with corresponding best-fit models are shown in Figure 1, while the remaining spectra and fits can be found in Appendix A. Corner plots corresponding to the galaxies in Figure 1 are included in Appendix B. The best-fit parameters are listed in Table 1 for all 19 galaxies.Objects with poorly constrained abundances are noted and are removed from the following analysis.Finally, in Appendix D, we conduct a mock recovery test to check for systematic uncertainties in the alf fitting as a function of SNR.Systematic uncertainties remain <0.1 dex even at the lowest S/N (8 Å−1 ).We refer to Kriek et al. (2023) for tabulated details of the observations and SED fitting results.The results from all studies were derived using alf.Quiescent galaxies at z > 0.7 have lower [Fe/H], enhanced [Mg/Fe], and earlier formation times than galaxies at lower redshifts.
In this section, we present the stellar metallicities, abundance ratios, and stellar population ages of 14 massive quiescent galaxies from the Heavy Metal survey.Figure 2 shows the variation in [Fe/H], [Mg/Fe], and formation time (t form ) as a function of velocity dispersion σ v for the Heavy Metal galaxies at z ∼ 1.4 (blue triangles) and z ∼ 2.1 (purple stars).For comparison, we also include the trends of quiescent galaxies at z ∼ 0 and z ∼ 0.7 from Beverage et al. (2023).The z ∼ 0 trends are derived by re-fitting the stacked spectra of SDSS quiescent galaxies from Conroy et al. (2014), while the z ∼ 0.7 trends are based on 135 quiescent galaxies in the LEGA-C survey.Initially, Beverage et al. (2021) reported a -0.2 dex offset in [Fe/H] between LEGA-C and SDSS galaxies based on the spectra from the second LEGA-C data release (DR2).However, with the subsequent release, DR3, which doubled the sample size and improved data reduction (see van der Wel et al. 2021), Beverage et al. (2023) no longer found an offset in [Fe/H] between z ∼ 0.7 and z ∼ 0. Figure 2 also includes results from other studies of quiescent galaxies at z ∼ 1.5 (Carnall et al. 2022;Zhuang et al. 2023) and z ∼ 2.1 (Kriek et al. 2016;Jafariyazani et al. 2020).1 To ensure unbiased comparisons, we restrict the analysis to studies that utilize alf.
In the left panel of Figure 2 we show that the Heavy Metal galaxies have lower [Fe/H] compared to the z ∼ 0 and z ∼ 0.7 galaxies.Both the z ∼ 1.4 and z ∼ 2.1 galaxy populations have Fe-abundances between -0.5 and -0.1 dex, with typical uncertainties of 0.2 dex.There is a hint that the z ∼ 2.1 are slightly offset to lower [Fe/H] compared to z ∼ 1.4.We fit a σ-[Fe/H] relation to the z ∼ 1.4 population using a simple linear regression, with confidence intervals determined by perturbing the [Fe/H] of each data point according to their uncertainties and re-fitting 1000 times.The resulting relation is shown in the left panel of Figure 2 (blue line, with shaded confidence intervals).We find a positive slope of 0.52 +0.38  −0.23 , which is consistent with the z ∼ 0 and z ∼ 0.7 relations.However, there is a clear offset to lower [Fe/H].We do not fit a relation to the z ∼ 2.1 population because of the small sample size.
The observed evolution in [Fe/H] between z = 1.4−2.1 and z < 1 is corroborated by studies of other quiescent galaxies at similar redshifts.At z ∼ 1.4, galaxies in the VANDELS+KMOS survey (Carnall et al. 2022) and two lensed galaxies in the AGEL survey (Zhuang et al. 2023) have [Fe/H]∼ −0.25, in agreement with the z ∼ 1.4 Heavy Metal galaxies (see Figure 2).At z = 2.1, Kriek et al. ( 2016) measure [Fe/H]= −0.25 for a massive quiescent galaxy (COSMOS-11494), consistent with the values of the z ∼ 2.1 Heavy Metal galaxies (refer to Figure 2).Interestingly, the only other study at z ∼ 2.1 finds super-solar [Fe/H] for a lensed galaxy (MRG-M0138) at z = 1.98 (Jafariyazani et al. 2020).It was previously unclear whether this large variation in [Fe/H] is typical of the z ∼ 2.1 quiescent galaxy population.Now, with four Heavy Metal galaxies at z ∼ 2.1 with sub-solar [Fe/H], there is additional evidence suggesting that typical massive galaxies at z ∼ 2.1 are indeed Fe-deficient and that the Fe-abundance of MRG-M0138 may be anomalous.Nevertheless, the sample size is small and the large uncertainties on the Feabundances prevent any definitive statement about the z ∼ 2.1 population.
In the middle panel of Figure 2 we see that the Heavy Metal galaxies at z ∼ 1.4 have similar [Mg/Fe] to what is found in the SDSS (z ∼ 0) and LEGA-C (z ∼ 0.7) samples.Meanwhile, the z ∼ 2.1 Heavy Metal galaxies have [Mg/Fe] that are ≈0.2 dex higher than what is found at lower redshifts, though typical uncertainties are large (0.2 dex) and the sample size is small.We fit a σ-[Mg/Fe] relation to the z ∼ 1.4 Heavy Metal galaxies using the same method as with [Fe/H] and find a slope consistent with zero, and no significant offset compared to the z < 1 relations (blue line with shaded confidence intervals).These results are supported by the other studies of quiescent galaxies at z > 1.The two studies at z ∼ 1.4 find [Mg/Fe] consistent with the LEGA-C and SDSS, while the two at z ∼ 2 find enhanced [Mg/Fe] consistent with the z ∼ 2.1 Heavy Metal galaxies.
Finally, in the right panel of Figure 2 we show galaxy σ v vs. galaxy formation time.This age is determined using the best-fit stellar population age.We find that the formation time of the Heavy Metal galaxies range from 1 Gyr (z form = 9) and 4 Gyr (z form = 9) after the Big Bang, with an average formation time of 2.5 Gyr (z form = 3) and 1.5 Gyr (z form = 4) for the z ∼ 1.4 and z ∼ 2.1 samples, respectively.We note that the galaxy formation times are calculated using an SSP-equivalent age.We note that the galaxy formation times are calculated using an SSP age, and more complex SFH would have likely resulted in slightly earlier formation times.
The assumption of an SSP, however, does not bias the stellar metallicities or elemental abundance measurements (Serra & Trager 2007).In the next section we consider the results in Figure 2 in the context of galaxy evolution.

THE EVOLUTION OF MASSIVE QUIESCENT
GALAXY POPULATION SINCE z = 2.1 In this section, we investigate the changes of average stellar metallicities, abundance ratios, and ages of massive quiescent galaxies as a function of redshift and discuss implications for their evolution over the past 11 billion years.In Figure 3 we show the average [Fe/H], [Mg/H], t form , and [Mg/Fe] for quiescent galaxies in the SDSS, LEGA-C, and Heavy Metal surveys as a function of redshift.To allow for a fair comparison between surveys, we only consider galaxies in SDSS and LEGA-C with similar σ v as the Heavy Metal galaxies.Thus, the average velocity dispersions of the SDSS and LEGA-C galaxies used in Figure 3 Examining the top left panel of Figure 3, we observe an increase in the average [Fe/H] from z = 2.1 to z = 0, with distinct offsets of -0.2 and -0.3 dex at z ∼ 1.4 and z ∼ 2.1, respectively, between the Heavy Metal galaxies and the lower redshift samples.Moving to the middle upper panel, we find less evolution in [Mg/H] compared to [Fe/H]; the z = 1.4 sample is 0.1 dex lower compared to the z = 0.7 and z = 0 galaxies.The z = 2.1 galaxies have no [Mg/H] offset compared to lower redshift galaxies, although uncertainties remain substantial.Primarily due to the increase in [Fe/H], the average [Mg/Fe] decreases towards lower redshift, as revealed in the upper right panel.The z = 2.1 sample exhibits [Mg/Fe] elevated by 0.25 dex, while the z = 1.4 sample is only elevated by 0.08 dex.In the bottom left panel we show that the average t form changes from t form = 1.5 to t form = 4 Gyr from z = 2.1 to z = 0.
In the bottom middle panel of Figure 3  The behavior of the model is driven by the differential enrichment timescales of Fe and Mg.While Mg is predominantly synthesized in massive stars and promptly returned to the interstellar medium (ISM) through core-collapse supernovae (SNe), Fe is produced through the explosions of intermediatemass stars (Type Ia SNe; SNIa) and is released on delayed timescales.Therefore, galaxies with shorter starformation timescales experience less Fe-enrichment from The simple chemical evolution model used in this paper is illustrative, and is by no means meant to capture the complex feedback processes and star-formation histories likely present in the galaxies at hand.More detailed chemical evolution models, including inflows and outflows, along with varied star-formation histories, will help decipher the various factors shaping the elemental abundances and stellar metallicities of integrated stellar populations, especially as we move to higher redshifts (e.g., see N. Gountanis in prep.).
A visual comparison between the observations and the simple chemical evolution model highlights that the inferred star-formation timescales of the z ∼ 0 and z ∼ 0.7 samples is longer than the extreme timescales observed for the Heavy Metal galaxies, which are approximately 0.15 Gyr and 0.7 Gyr for the z ∼ 2.1 and z ∼ 1.4 samples, respectively.These inferred starformation timescales derived from [Mg/Fe], along with the results for t form , paint a picture of galaxy formation in which typical massive quiescent galaxies at higher redshift formed earlier and underwent more rapid star formation compared to their counterparts at z ∼ 0. To visually depict this scenario, we present the implied average star-formation histories in the lower right panel of Figure 3.We adopt a Gaussian parameterization for the star-formation histories, with the mean centered at t form and the FWHM corresponding to the star-formation timescale, which is determined by comparing the abundances to the chemical evolution model.It is important to note that this panel serves as an illustrative tool, and the choice of a Gaussian star-formation history is driven by its simplicity rather than a specific physical motivation.Nevertheless, it is evident that the star-formation histories of the Heavy Metal galaxies, particularly at z ∼ 2.1, are extremely short compared to the lower redshift samples.We also note that this panel does not take into account the evolution of the stellar mass function, nor the spread in formation redshifts and star-formation timescales within the sample.Instead, it shows the typical SFH at each redshift interval.
One possible explanation for the observed evolution is the growth of the quiescent galaxy population due to the continuous quenching of star-forming galaxies between z = 2.1 and z = 0 (i.e., progenitor bias; van Dokkum & Franx 2001;Carollo et al. 2013;Lilly et al. 2013).Galaxies that quench at later epochs presumably form their stars over longer timescales, resulting in later t form and more Fe-enrichment from SNIa (lower [Mg/Fe]) before quenching.Hence, the addition of these galaxies to the quiescent population steadily increases the average t form and [Fe/H], as observed.In fact, the number of quiescent galaxies with M * > 10 10.5 M ⊙ has increased tenfold since z = 2, as evident from the evolution of the stellar mass function (e.g., Conselice et al. 2022;McLeod et al. 2021).
An alternate explanation are mergers and late-time star formation, which impact the elemental abundances of individual galaxies; when galaxies accrete satellites or form new stars, their integrated elemental abundances are altered by the newly added stars.Indeed, the importance of (dry) minor mergers in the evolution of quiescent galaxies has been posited by observations that they were more compact and had flatter color gradients at earlier times (e.g., van Dokkum et al. 2008van Dokkum et al. , 2010;;Naab et al. 2009;Trujillo et al. 2009;Taylor et al. 2010;Suess et al. 2019bSuess et al. , 2020;;Miller et al. 2023) as well as by direct pair fraction measurements (Newman et al. 2012).Furthermore, Suess et al. (2023) show that tiny companions with stellar mass ratios 1:900 compromise some 30% of their total stellar mass budget.
The extent to which mergers, late-time star formation, and progenitor bias contribute to the evolution of the massive quiescent galaxy population remains a topic of ongoing debate (e.g., Bezanson et al. 2009;Hopkins et al. 2009;Carollo et al. 2013;Poggianti et al. 2013;van de Sande et al. 2013;Williams et al. 2017;Damjanov et al. 2019).In the following section we examine the elemental abundances and stellar ages of individual galaxies to help differentiate between the various evolutionary scenarios; if mergers and/or late-time star formation play only a small role in the evolution of massive quiescent galaxies, then the extreme galaxies observed at z ∼ 2 should persist as rare remnants in the z ∼ 0 population.

IMPLICATIONS FOR THE BUILD-UP OF MASSIVE QUIESCENT GALAXIES
In the previous section, we observed significant evolution in the average elemental abundances of massive quiescent galaxies from z ∼ 2.1 to z ∼ 0, suggesting that galaxies at higher redshift are typically metal-poor, have shorter star-formation timescales, and formed earlier.In this section, we examine the abundance ratios of individual galaxies over cosmic time to help differentiate between various evolutionary channels.
Figure 4 shows [Fe/H], [Mg/H], and [Mg/Fe] as a function of σ v (top row) and M * (bottom row).We compare the Heavy Metal galaxies to the abundances of individual LEGA-C galaxies (Beverage et al. 2023, gray circles,), and to galaxies in SDSS (black diamonds) presented by Zhuang et al. (2023).Additionally, we compare to alf-derived abundances of 41 galaxies from the MASSIVE survey, a volume-limited survey of the most massive galaxies in the local universe (black diamonds, Ma et al. 2016;Gu et al. 2022).We also include abundance results from other studies of quiescent galaxies at z ∼ 1.4 and z ∼ 2.1, with the same markers as Figure 2. We note that the SDSS fiber, and the LEGA-C and Heavy Metal slits cover similar physical extents within the galaxies (3-4 kpc).However, as galaxies were smaller at earlier times, the Heavy Metal and LEGA-C slits cover 1-1.2 R e , while SDSS only covers 0.4-0.8R e .The MASSIVE spectra were extracted from the inner R e /8 and therefore represent only the cores of nearby massive early-type galaxies.
In the previous section, we proposed two theories to explain the observed abundance differences between the Heavy Metal survey and z ∼ 0. In the first scenario, the quiescent galaxy population is continuously growing by the addition of newly quenched galaxies, driving the evolution of the observed average abundances ("progenitor bias," e.g., van Dokkum & Franx 2001;Carollo et al. 2013;Poggianti et al. 2013).In this scenario, we expect to find galaxies with properties similar to the Heavy Metal galaxies in the lower redshift universe, and thus the populations at higher and lower redshift should overlap.In other words, the Heavy Metal galaxies should constitute the lower metallicity and high-[Mg/Fe] tail of the z ∼ 0 samples.Indeed, the top panels of Figure 4 show that this is mostly the case for the Heavy Metal galaxies, though there are a few exceptions; most of the Heavy Metal galaxies with high σ v have significantly lower [Fe/H] and [Mg/H] than any galaxy found at z ∼ 0 and z ∼ 0.7.Furthermore, in the rightmost panel, we find that galaxies at z ∼ 2.1 are all [Mg/Fe]enhanced compared to the galaxies at lower redshifts.When considered as a function of stellar mass, the metallicity differences between the lower and higher redshift populations becomes more apparent.At all stellar masses, and particularly at the highest stellar masses, the Heavy Metal galaxies have [Fe/H] and [Mg/H] ≈ 0.2 dex lower than any individual galaxy in SDSS or LEGA-C.This result poses a problem for the progenitor bias scenario; if these Heavy Metal galaxies passively evolve to z ∼ 0, we would expect to find Fe-and Mgdeficient remnants in the SDSS and MASSIVE populations.Specifically, in this scenario 10% of the SDSS and MASSIVE galaxies should be direct descendants of z > 2 quiescent galaxies given the rapid evolution of the stellar mass function (McLeod et al. 2021).Thus, the z ∼ 0 galaxies are likely not the direct decedents of the z ∼ 2 Heavy Metal galaxies.It is important to acknowledge, however, that uncertainties are substantial and sample sizes are limited.Thus, the possibility remains that we may have simply missed the rare metal-poor remnants in the z ∼ 0 samples.We also note that the Heavy Metal were selected to maximize the number of bright quiescent galaxies in one field of view.As such, the environments of these galaxies may not be representative of the broader massive quiescent galaxy population at higher redshift.
As previously mentioned, minor mergers provide an alternate way to explain the observed evolution.In this picture, the z ∼ 1.4 − 2.1 quiescent galaxies may grow into the cores of today's massive early-type galaxies by accreting low-mass galaxies (e.g., Bezanson et al. 2009;Hopkins et al. 2009;van de Sande et al. 2013).This scenario can successfully explain the observed increase in stellar metallicity and decrease in [Mg/Fe], but only if the accreted galaxies form over long timescales (i.e.lower [Mg/Fe]) and are more metal-rich.There are, however, two problems with this scenario.Firstly, our observations show there is evolution within the inner 1 R e , while minor mergers are typically accreted onto galaxy outskirts (Naab et al. 2009;Oser et al. 2010).Secondly, given the mass-metallicity relation, the accreted lowmass galaxies will likely be metal-poor, and therefore cannot explain the increase in metallicity (Kirby et al. 2013).Major mergers with galaxies that have longer star-formation timescales (and thus lower [Mg/Fe]) solve both of these problems, as more massive galaxies tend to be more metal-rich and can more easily disrupt galaxy cores.However, the observed major merger rate is likely too low to fully account for the observed evolution (Oser et al. 2012;Nipoti et al. 2012;Newman et al. 2012) and major mergers alone cannot explain the size evolution (e.g., Hopkins et al. 2009;van Dokkum et al. 2010;Hilz et al. 2013;Belli et al. 2014).Therefore, a combination of major and minor mergers may be necessary to explain the observed evolution.
Finally, late-time star formation, likely in combination with or triggered by mergers, can increase the metallicities of individual galaxies while decreasing [Mg/Fe]; if the star-formation material is enriched in SNIa products by previous epochs of star formation, then the newly formed stars will be younger, more metal-rich, and more α-enhanced.It should be noted that a burst of central star-formation is expected in a wet merger scenario (e.g., Hopkins et al. 2006) and thus the two scenarios together are effective at explaining the absence of metal-poor and [Mg/Fe]-enriched galaxies by z ∼ 0 as observed in Fig- ure 4.
Crucially, all of the above scenarios rely on the assumption that at the same stellar masses, younger galaxies have higher metallicity and lower [Mg/Fe].For example, the progenitor bias scenario only works if the galaxies that quench at later times are more metal-rich and form over longer timescales.In Figure 5 we search for such trends in [Fe/H], [Mg/H], and [Mg/Fe] as a function of t form for the SDSS, LEGA-C, and Heavy Metal quiescent galaxies.In each panel, we remove the first-order mass dependence by subtracting the SDSS M * -abundance relations from all galaxies and then scaling them to the value of the M * -abundance relations at M * = 10 11 M ⊙ (e.g., Leethochawalit et al. 2019;Zhuang et al. 2023).At fixed stellar mass, we do, in fact, find a correlation between the quenching time (as traced by t form ) and stellar metallicities; younger SDSS and LEGA-C galaxies exhibit higher [Fe/H], marginally higher [Mg/H], and marginally lower [Mg/Fe], as previously found by Beverage et al. (2023) and Zhuang et al. (2023), respectively (see also Leethochawalit et al. 2018).
The trends with [Fe/H] and [Mg/Fe] can be naturally explained by galaxies joining the quiescent galaxy population at later times (higher t form ) having more time to form stars from gas that is rich in SNIa products (namely, Fe).However, the marginal trend with [Mg/H] is more challenging to explain, as Mg-enrichment is independent of star-formation timescale and instead primarily depends on the size of the gas reservoir at the time of star-formation quenching.Beverage et al. (2021) propose a model to explain the marginal trend with t form by invoking rapid gas removal during the quenching process.In summary, if galaxies at earlier epochs, that form earlier and over shorter timescales, quench more abruptly and have larger gas fractions at the time of quenching (e.g.effective AGN outflows), then [Mg/H] is truncated to lower values.Thus, the marginal trend with [Mg/H] may suggest that galaxies at earlier times experience more effective outflows (see also the outflow model proposed by Leethochawalit et al. 2019).
Furthermore, Figure 5 shows that high-z galaxies have a slight systematic offset to lower metallicities compared to local galaxies even when the dependence on M * and stellar age is removed.This same result was also found by Zhuang et al. ( 2023), but for a smaller galaxy sample.Thus, this result confirms the finding by Zhuang et al. ( 2023) and reinforces the need for mergers or latetime star formation to alter the chemical compositions and stellar ages of these extreme individual galaxies by z ∼ 0. Again, we note that comparisons at "fixed formation times" between galaxies at different redshift regimes should be approached with caution; the SSPequivalent ages can be biased young.As a result, a more complicated star-formation history may shift the Heavy Metal galaxies in Figure 5 towards even earlier formation times.Nonetheless, correcting for such a bias would not change our interpretation.
Figures 4 and 5 together demonstrate that progenitor bias, minor and major mergers, and late-time star formation are likely all responsible, to some extent, for the observed evolution in the elemental abundances of massive quiescent galaxies over cosmic time.This analysis further shows that the existing story of massive quiescent galaxy assembly -namely, that the z > 1 quiescent galaxies are the progenitor cores of today's massive early-type galaxies -is more complicated than initially anticipated.We come to a similar conclusion based on the dynamical masses of the Heavy Metal galaxies in Kriek et al. (2023).

SUMMARY AND CONCLUSION
In this paper, we present the stellar metallicities, elemental abundance ratios, and stellar ages of 19 massive quiescent galaxies from the Keck Heavy Metal survey.The galaxies are divided over two redshift intervals around z ∼ 1.4 and z ∼ 2.1.We fit the ultra-deep LRIS+MOSFIRE spectra using a full-spectrum modeling code with variable elemental abundance ratios.Combined, these measurements represent the largest sample of individual quiescent galaxies at z ≳ 1.4 with stellar metallicity and elemental abundance measurements.
We find that massive quiescent galaxies at 1.4 ≲ z ≲ 2.2 have subsolar Fe-abundances, ranging from [Fe/H] = −0.5 to −0.1 dex, with typical values of −0.2 and −0.3 dex at z ∼ 1.4 and z ∼ 2.1, respectively.We also find a trend between σ v and [Fe/H], with a best-fit slope consistent with what is found at lower redshifts.We find high [Mg/Fe] of ≈ 0.5 for the z ∼ 2.1 galaxies, while those at z ∼ 1.4 are lower with [Mg/Fe] ∼ 0.3.The ages range from 0.9-3.7 Gyr, implying formation redshifts z form = 2 − 8.
We examine the evolution of the massive quiescent galaxy population over cosmic time by comparing the average Heavy Metal elemental abundances and ages to those in the LEGA-C (z ∼ 0.7) and SDSS (z ∼ 0) surveys.The z ∼ 1.4 and z ∼ 2.1 galaxies have [Fe/H] that are 0.15 and 0.23 dex lower than what is observed at z ≲ 0.7.Interestingly, we find no evolution in [Mg/Fe] at z ≲ 1.4, though the z ∼ 2.1 galaxies have 0.2 dex higher [Mg/Fe].By comparing these results with a simple chemical evolution model, we find that galaxies at higher redshifts formed at progressively earlier epochs and shorter timescales, with the z ∼ 2.1 galaxies forming in only ≈ 150 Myr at z = 4.
The observed evolution of stellar population properties provides a unique insight into the assembly histories of massive quiescent galaxies.One possible explanation is the growth of the quiescent galaxy population.If the galaxies that join the population at later times are younger, more metal-rich, and less alpha-enhanced, then the average properties will evolve.However, this scenario cannot fully explain the observations because we do not find the direct descendants of the chemically extreme Heavy Metal galaxies at z ≲ 0.7.Minor mergers also fail to account for the observed evolution; due to the mass-metallicity relation, they cannot increase [Fe/H], in particular in the galaxy centers.Thus, major mergers and/or late-time star formation are required to explain the evolution of chemical properties across cosmic time.These results imply a more complicated assembly history than the minor-merger driven inside-out growth scenario previously inferred from the evolution in the sizes, color gradients, and central densities of quiescent galaxies.In our accompanying paper, Kriek et al. (2023), arrive at a similar conclusion based on the dynamical masses of the Heavy Metal galaxies.
This study was enabled by ultra-deep spectroscopic observations conducted with the most efficient groundbased spectrographs.With up to 16 hr of integration per galaxy per band, these spectra represent some of the deepest spectroscopic observations of individual galax-ies at z ≳ 1.4 with coverage of key absorption features.As a result, the Heavy Metal survey has tripled the number of individual stellar metallicity measurements at z ≳ 1.4.Nonetheless, sample sizes at these redshifts remain small, and the uncertainties on each individual measurement remain substantial.
In the future, JWST will provide even deeper spectroscopy of massive quiescent galaxies at z ≳ 1.4, extending the existing sample sizes, reaching to lower stellar masses, and enabling the measurements of more elements.Thus, JWST will revolutionize our understanding of the assembly of massive quiescent galaxies over comsic time.
We would like to thank Zhuyun Zhuang and Glenn Van de Ven for useful conversations.We acknowledge support from NSF AAG grants AST-1908748 and 1909942. A.G.B. is supported by the National Science Foundation Graduate Research Fellowship Program under grant No. DGE 1752814 and DGE 2146752 and by the H2H8 Association and LKBF.We gratefully acknowledge the efforts and dedication of the Keck Observatory staff for support, especially amid the challenges posed by a global pandemic.We recognize and acknowledge the very significant cultural role and reverence that the summit of Mauna Kea has within the indigenous Hawaiian community.We are deeply fortunate to have the opportunity to conduct observations from this mountain.In Figure 9, we present corner plots from the full-spectrum alf fitting.The four corner plots correspond to the same four spectra showcased in Figure 1.Next to each corner plot we provide information about the object: ID, redshift, and SNR ( Å−1 ).We present the four essential parameters, [Fe/H], [Mg/H], [Mg/Fe], and the age of the stellar population.The dashed lines in the plots represent the 1σ uncertainties.Though the uncertainties are large, the results follow a normal distribution.We find there to be a degeneracy between stellar age and metallicity.Target 213931 shows the strongest degeneracy, likely because the observations are a blend of two nearby systems, resulting in the superposition of stellar populations with different ages and metallicities.For the other galaxies, the age-Z degeneracy is considerably less pronounced.

C. COMPARISON WITH LICK INDICES
In this Appendix we compute absorption line indices for the Heavy Metal galaxies.The Lick/IDS system, first introduced by Faber et al. (1985) and updated by Worthey et al. (1994), is the most common method for measuring elemental abundances and stellar ages of nearby quiescent galaxies.However, as we continue to push observations to higher redshifts, absorption features become fainter and more contaminated by NIR skylines.As a result, Lick indices and their derived stellar population parameters become highly uncertain.Full-spectrum modeling was, in part, developed to account for these issues.By fitting the entire stellar continuum, noise peaks can be easily down-weighted, and all absorption features for a single element can be evaluated simultaneously.
In this Appendix, we check that the absorption line indices of the Heavy Metal galaxies agree with our main results from full-spectrum modeling.Specifically, we compute the <Fe> and Mgb indices, where <Fe>= 0.5 × (Fe5270 + Fe5335).We then compare these indices to simple spectrum models from Thomas et al. (2003).
Given the prevalence of skylines in the Heavy Metal spectra, it is crucial that we account for the prominent noise peaks.However, by default, the Lick/IDS system does not use the variance spectrum because it biases the equivalent widths.Thus, to overcome skyline contamination, we follow the approach of van der Wel et al. (2021).For each spectrum, we take the best-fitting alf model and compute its χ 2 .We then scale the noise to a χ 2 = 1.Any pixel that is more than 2σ or 20% offset from the model flux is masked.If more than 30% of the pixels in the index wavelength interval (including the pseudo-continuum intervals) are masked, the index is deemed unreliable.Four Heavy Metal galaxies are removed at this step.For the other ten galaxies, the masked pixels are replaced by a linear interpolation using the good pixels.We note that the alf models are only used to help identify which pixels should be masked.
Next, we correct the absorption line indices to a constant velocity dispersion.This step is important if we want to compare indices between objects and to stellar population models.In the Lick/IDS system, indices are corrected to have zero intrinsic velocity dispersion and to have a standard wavelength-dependent resolution, ranging from 8.4 Å to 11.5 Å (Worthey & Ottaviani 1997).We compute correction factors using alf models at a variety of velocity dispersions (e.g., see Davies et al. 1993;Kuntschner 2004).This correction factor is applied after measuring the equivalent widths.For the Heavy Metal galaxies, the correction ranges from C(σ v ) = 1.0 − 1.5.With the corrected Lick indices in hand, we finally estimate uncertainties by bootstrapping the spectrum and re-computing equivalent widths 10,000 times, taking the 16th and 84th percentile as the index uncertainty.
In Figure 10, we present <Fe> as a function of Mgb for 10 Heavy Metal galaxies.We compare these indices with a set of 2 Gyr models from Thomas et al. (2003).Drawing conclusions from this figure is challenging, as numerous Heavy Metal galaxies display index measurements with substantial uncertainties that span the entire model grid.This issue is particularly pronounced for galaxies at a redshift z ∼ 2.1.The increased uncertainties at higher redshifts can be attributed to the younger ages of these galaxies, resulting in less pronounced metal lines.Additionally, significant contamination from skylines in the near-infrared underscores the necessity for full-spectrum modeling, as opposed to index fitting, at higher redshifts.

D. PARAMETER RECOVERY WITH MOCK HEAVY METAL SPECTRA
In this Appendix, we test the robustness of our results as a function of SNR.The primary goal of this test is to check for any systematic uncertainties in full-spectrum modeling, especially as we push the limits of this fitting to All mock spectra are generated from a single alf model with stellar population parameters reflecting average values from Table 1.We degrade the model spectrum to the desired SNR at rest-frame 5000 Å using the Heavy Metal noise spectrum.In this way, we can replicate the observations, accounting for wavelength-dependent sensitivities.The model spectrum is also resampled to the LRIS+MOSFIRE resolution using spectres (Carnall 2017).We generate 45 realizations at SNRs = 8, 12, 20, 40 and fit them using the same method described in Section 2.2.For all realizations we compute the difference between the best-fit parameter and the known model parameters.
We present the results in Figure 11.The black rectangles show the average parameter offset for the 45 realizations, with the blue errorbars representing the 1σ scatter of these offsets.We also show the typical uncertainties on a single realization in red.The gray tick marks at the bottom of each panel represent the SNR of the individual Heavy Metal spectra.All but one Heavy Metal spectrum has SNR ≳ 10 Å−1 .σ v is recovered extremely well with no systematic uncertainties at SNR≳ 12.In contrast, the recovery of the stellar population age is a bit surprising; the systematic uncertainties are larger at higher SNRs [Actually, I think I know why this is.I used a pretty constraining upper limit

Figure 2 :
Figure 2: [Fe/H] (left), [Mg/Fe] (middle), and formation redshift (right) results for the z ∼ 1.4 (blue triangles) and z ∼ 2.1 (green stars) Heavy Metal targets as a function of stellar velocity dispersion.The black solid line shows the σ v -abundance relations of massive quiescent galaxies at z ∼ 0 (Conroy et al. 2014; Beverage et al. 2021), while the black dashed lines denote relations at z ∼ 0.7 (Beverage et al. 2023).We include a quiescent galaxy at z = 2.1 (green square; Kriek et al. 2016), a lensed quiescent galaxy at z = 1.98 (green diamond; Jafariyazani et al. 2020), a stack of quiescent galaxies at 1.0 < z < 1.3 (blue pentagon; Carnall et al. 2022), and two lensed quiescent galaxies at z = 1.016 and z = 1.37 (blue circles; Zhuang et al. 2023).The solid blue line in the left panel represents the best-fit σ v -[Fe/H] with the 16th and 84th percentile uncertainties on the fit shown in light blue.The results from all studies were derived using alf.Quiescent galaxies at z > 0.7 have lower [Fe/H], enhanced [Mg/Fe], and earlier formation times than galaxies at lower redshifts.
we compare the average [Mg/Fe] as a function of [Fe/H] with a simple closed-box chemical evolution model from Kriek et al. (2016, represented by a black dashed line).The model provides mass-weighted abundances for various different star-formation timescales (indicated in Gyr next to the black boxes).

Figure 3 :
Figure 3: Average elemental abundance and stellar age results for the Heavy Metal galaxies at z ∼ 1.4 and z ∼ 2.1, and for stacks of massive quiescent galaxies at z ∼ 0 and z ∼ 0.7 across various axes.Top row: the [Fe/H], [Mg/H], and [Mg/Fe] abundances as a function of redshift.Bottom row: the left panel shows the formation time as a function of redshift.In the middle panel we compare the average abundances on the [Fe/H]-[Mg/Fe] plane to a simple closed box chemical evolution model (black dashed line).Black squares represent predicted elemental abundances for various star-formation timescales, denoted in Gyr.The right panel shows the inferred star-formation history at each redshift interval.The age of the universe for each sample is marked with a dashed line.Typical quiescent galaxies at higher redshift formed earlier and over shorter timescales than massive quiescent galaxies today.

Figure 4 :
Figure 4: Individual measurements of [Fe/H], [Mg/H], and [Mg/Fe] as a function of σ v (top row) and M * (bottom row).The Heavy Metal galaxies are split into z ∼ 1.4 (blue triangles) and z ∼ 2.1 (green stars) samples.We include results from other studies of massive quiescent galaxies at similar redshifts (see legend in Figure 2).We compare these results to individual abundance measurements from the MASSIVE (z = 0; Gu et al. 2022, open circles), SDSS (z ∼ 0; Zhuang et al. 2023, black diamonds), and LEGA-C (z ∼ 0.7; Beverage et al. 2023, gray dots) surveys, with their typical uncertainties provided at the bottom of each panel.At constant M * and σ v , galaxies at z > 1 have different chemical properties than those at lower redshifts.

Figure 5 :
Figure 5: [Fe/H], [Mg/H], and [Mg/Fe] at a fixed stellar mass (M * = 10 11 M ⊙ ) as a function of formation time for the LEGA-C (gray circles), Heavy Metal (blue triangles and green stars), SDSS (red shading) and various other z > 1 studies (refer to the legend in Figure 2).Abundances at fixed stellar mass are calculated from the M *abundance relations at z ∼ 0 shown in Figure 4. Galaxies formed at later cosmic epochs have higher [Fe/H] and [Mg/H], and slightly lower [Mg/Fe].At a fixed t form , the Heavy Metal galaxies exhibit similar abundances to the SDSS and LEGA-C populations.

Figure 6 :
Figure 6: Spectra (black) and corresponding 1σ uncertainties on the flux (gray), along with the best-fitting alf models used to derive the ages and elemental abundances (red) of primary Heavy Metal Survey targets, sorted by redshift.Spectra are binned such that one pixel is ≃ 5 Å.

Figure 9 :
Figure 9: Four example corner plots from the full-spectrum alf modeling.In each panel, we show the probability distributions four different physical parameters.The dashed lines in the 1D distributions show the 1σ levels.For each corner plot we report the galaxy ID, redshift, and SNR ( Å−1 ) of the observations.

Figure 10 :
Figure 10: Lick/IDS indices Mgb and <Fe> for ten primary z ∼ 1.4 (blue triangles) and z ∼ 2.1 (green stars) Heavy Metal targets.A grid of models from Thomas et al. (2003) stellar population model (t age = 2 Gyr) is superimposed in gray.The Heavy Metal galaxies tend towards low metallicities and high [Mg/Fe], in general agreement with what is found from the full-spectrum fitting.Uncertainties are large, motivating the need for full-spectrum modeling.

Figure 11 :
Figure 11: Results from the parameter recovery of mock Heavy Metal spectra.We show systematic offsets in the stellar population fitting code as a function of SNR for four key parameters.At each SNR, we generate 20 mock spectra and fit their elemental abundances and ages following the procedure in Section 2. The black rectangles represent the median offset between the simulated and best-fit values for the 20 realizations.The blue errorbars represent the 16th and 84th percentiles of the best-fit values for the 20 realizations.The red errobars show the typical uncertainties on a single realization.The gray tick marks at the bottom of each panel show the true SNR of the individual Heavy Metal spectra.For the typical Heavy Metal galaxy, the systematic offsets are ≈-0.05dex for [Fe/H] and [Mg/Fe] and ≈0.05 dex for log Age.

Table 1 :
Heavy Metal Quiescent Galaxy Parameters