The Rest-Frame Submillimeter Spectrum of High Redshift, Dusty, Star-Forming Galaxies from the SPT-SZ Survey

We present the average rest-frame spectrum of the final catalog of dusty star-forming galaxies (DSFGs) selected from the South Pole Telescope SZ survey (SPT-SZ) and measured with Band 3 of the Atacama Large Millimeter/submillimeter Array (ALMA). This work builds on the previous average rest-frame spectrum, given in Spilker et al. (2014) for the first 22 sources, and is comprised of a total of 78 sources, normalized by their respective apparent dust masses. The spectrum spans $1.9$$<$z$<$$6.9$ and covers rest-frame frequencies of 240$-$800 GHz. Combining this data with low-J CO observations from the Australia Telescope Compact Array (ATCA), we detect multiple bright line features from $^{12}$CO, $[$CI$]$, and H$_2$O, as well as fainter molecular transitions from $^{13}$CO, HCN, HCO$^+$, HNC, CN, H$_2$O$^+$, and CH. We use these detections, along with limits from other molecules, to characterize the typical properties of the interstellar medium (ISM) for these high redshift DSFGs. We are able to divide the large sample into subsets in order to explore how the average spectrum changes with various galaxy properties, such as effective dust temperature. We find that systems with hotter dust temperatures exhibit differences in the bright $^{12}$CO emission lines, and contain either warmer and more excited dense gas tracers, or larger dense gas reservoirs. These observations will serve as a reference point to studies of the ISM in distant luminous DSFGs (L$_{\mathrm{IR}}$$>$$10^{12}$L$_\odot$), and will inform studies of chemical evolution before the peak epoch of star formation at $z=2-3$.


INTRODUCTION
In the last two decades, wide-field submillimeter (sub-mm) surveys have uncovered a population of very luminous (L IR > 10 12 L ), high redshift, dusty star-forming galaxies (DS-FGs), with implied star formation rates ranging from hundreds to thousands of solar masses per year (see review by Casey et al. 2014). The abundance of DSFGs in the early Universe enables them to contribute significantly to the average star formation rate density (∼ 80% at z = 2 − 2.5 to ∼ 35% at z = 5; e.g. Zavala et al. 2018Zavala et al. , 2021. The extreme star formation rates of DSFGs imply that they contain vast reservoirs of molecular gas, which is used to form stars. These reservoirs would be dominated by molecular hydrogen (H 2 , with M H 2 ∼ 5 × 10 9 − 1.1 × 10 11 M for z ∼ 1 − 3; e.g. Aravena et al. 2019). However, due to the difficulty of detecting molecular hydrogen, other species are commonly used to trace the bulk of the molecular gas in the interstellar medium (ISM). The rotational ground state of the second most abundant molecule, carbon monoxide ( 12 C 16 O J = 1 → 0 or simply CO(1-0)) is the most common and arguably direct tracer molecular material in the ISM (Carilli & Walter 2013). When the rotational ladder of CO can be measured, it is a powerful probe of the physical conditions of the emitting medium. However, a number of processes contribute to the excitation of CO, including star formation, shocks, mechanical heating, and potential activity from active galactic nuclei (AGN). Additionally, the approximate constant of proportionality relating gas mass and CO luminosity can vary by an order of magnitude, as a result of the metallicity and gas conditions of each galaxy (e.g. Bolatto et al. 2013). Constraints from other optically thin species, such as 13 CO and C 18 O, could enable a more accurate determination of gas mass, provided the relative abundances of these isotopologues. However, these species are less abundant and detections of these molecules are limited to a handful of lensed objects (e.g. Danielson et al. 2013), so their global abundances are relatively unknown for DSFGs.
Gravitational lensing provides the opportunity to study the gas content of these systems with both greater sensitivity and an-gular resolution than their unlensed counterparts. Flux-limited selections of samples discovered with wide-field mm/sub-mm surveys have been shown to efficiently select strongly lensed sources (Vieira et al. 2013;Hezaveh et al. 2013;Spilker et al. 2016). Though lensed DSFGs are relatively rare, (N < 1 deg −2 for S 850 µm > 100 mJy; e.g. Negrello et al. 2007), hundreds of gravitationally lensed DSFGs have been discovered via wide field surveys with the Atacama Cosmology Telescope (ACT; Marsden et al. 2014;Su et al. 2017;Gralla et al. 2020), Herschel (e.g. Negrello et al. 2010Negrello et al. , 2017Nayyeri et al. 2016;Bakx et al. 2018;Urquhart et al. 2022), Planck (e.g. Cañameras et al. 2015;Harrington et al. 2016;Berman et al. 2022) and the South Pole Telescope (SPT; Vieira et al. 2010Vieira et al. , 2013Everett et al. 2020). Gravitational magnification (typically µ ∼ 6; Spilker et al. 2016) enables the observation of lines otherwise too faint to see (Spilker et al. 2014). It also decreases the on-source time required for bright emission lines, facilitating large spectroscopic surveys. Early CO surveys of high redshift galaxies (Bothwell et al. 2013;Daddi et al. 2015) studied multiple 12 CO emission lines and provided the first constraints on the CO rotational ladder, also known as the spectral line energy distribution (SLED).
Telescopes such as the Atacama Large Millimeter/submillimeter Array (ALMA) and the Northern Extended Millimeter Array (NOEMA) have been conducting blind CO surveys targeting the same 12 CO emission lines (Bothwell et al. 2013;Weiß et al. 2013;Strandet et al. 2016;Reuter et al. 2020;Neri et al. 2020) in order to determine the spectroscopic redshifts of the targeted sources. The spectroscopic data collected also serves to probe the ISM of both the individual sources and the underlying DSFG distribution. The creation of composite spectra by combining survey data (Spilker et al. 2014;Boogaard et al. 2020;Birkin et al. 2021) places constraints on the "average" DSFG for bright molecular lines such as 12 CO, [CI] and H 2 O. With additional sources, the corresponding reduction in noise also enables the detection of much fainter molecular transitions from 13 CO, HCN, HNC, HCO + , and CN at high redshift. Taken together, the relative strengths of these lines provide first attempts to quantify the relative abundances of various molecules and their brightest isotopologues in DSFGs and provide insights about the interstellar chemistry at high redshift.
Molecules with high dipole moments, such as HCN, HNC, HCO + and CN are excited out of their J = 0 ground state in high density gas (n > 10 4 cm −3 ), where molecular clouds decouple from external tidal forces and UV background, allowing them to collapse. These molecules are therefore thought to be reliable tracers of the dense molecular gas associated with star formation (Gao & Solomon 2004b;Carilli & Walter 2013). Emission lines for the majority of these molecules are typically 1-2 orders of magnitude fainter than the bright CO lines targeted (e.g. Jiménez-Donaire et al. 2019) and the early detections were limited to a handful of highly magnified luminous objects (e.g. the "Cloverleaf" quasar, see Solomon et al. 2003 and APM 08279+5255, see Wagg et al. 2005). While more recent work has included small samples of highly lensed objects (e.g. a selection of strongly-lensed DSFGs from Béthermin et al. 2018), the number of observations at high redshift is still limited. However, an increasing number of studies indicate a trend between the dense gas tracer HCN and far-IR luminosity (Gao & Solomon 2004a;Graciá-Carpio et al. 2006;Juneau et al. 2009;García-Burillo et al. 2012), demonstrating the potential utility of dense gas tracers to trace molecular gas and potentially star formation.
Water is one of the most abundant molecules in galaxies, though most of it is frozen onto dust grains in cold environments (Tielens et al. 1991;Hollenbach et al. 2009). In warm environments, such as near stars or AGN, these icy dust grains heat through collisional excitation, sublimating water, and make water a tracer of star formation and AGN activity. Due to the high optical depth of its lines, water lines can be as luminous as CO lines in the same frequency range (van der Werf et al. 2011). Water has been shown to be directly proportional to the infrared luminosity (L IR ; Omont et al. 2013;Yang et al. 2013) and has shown potential to serve as a resolved tracer of star formation (Jarugula et al. 2019). In contrast, a deep absorption feature at the rest-frame frequency of 557 GHz due to the H 2 O(1 1,0 -1 0,1 ) transition has been observed for a DSFG at z ∼ 6 ( Riechers et al. 2022). Because this absorption line may also show absorption into the CMB, it could indicate the presence of cold water vapor.
With a large enough number of objects, we can split the sample to see how these molecular tracers are affected by various properties, such as effective dust temperature, IR luminosity, stellar mass, etc. The excitation of bright 12 CO emission has been shown to correlate well with dust temperature (Rosenberg et al. 2015). A previous attempt to understand the relative abundances of fainter molecular lines with respect to mid-IR was inconclusive (Spilker et al. 2014), as the sample size was not large enough to overcome the effects of individual source variations.
In this paper, we will use the data collected from the ALMA 3 mm redshift search published in Weiß et al. (2013), Strandet et al. (2016) and Reuter et al. (2020), which are briefly summarized in Sec. 2. The data are scaled and stacked according to Sec. 2.2 in order to create a composite spectrum, presented in Sec. 3. We analyze the average conditions of the ISM for this collection of DSFGs in Sec. 4, and place these results in context with other works from literature. Our conclusions are summarized in Sec. 5.

ALMA 3 mm Observations
This work utilizes data from the SPT-SZ DSFG catalog, and the full details of the target selection and ALMA 3 mm observations can be found in Reuter et al. (2020). We will briefly summarize the key points in this section.
The SPT DSFG catalog was selected from the 2500 deg 2 field of the SPT-Sunyaev-Zel'dovich survey (SPT-SZ; Vieira et al. 2010;Everett et al. 2020) with S 1.4 mm > 20 mJy. In order to refine their positions, the sources are also required to have been detected with the Large Apex BOlometer CAmera (LABOCA; Siringo et al. 2009) with S 870 µm > 25 mJy, bringing the number of sources in the flux-limited selection to 81. Over 70% of the sources in the sample are strongly gravitationally lensed by galaxies , and with a median magnification factor of µ 870 µm = 5.5. The remainder of the sample are either lensed by galaxy clusters or are unlensed protoclusters (e.g. SPT2349-56, see Miller et al. 2018;Hill et al. 2020;Wang et al. 2021). The sources span redshifts from z = 1.9 − 6.9, apparent IR luminosities from 1.3 − 29 × 10 13 L (integrated from 8 − 1000 µm) and dust temperatures from 21 − 79 K, with medians at z = 3.9 ± 0.2, apparent L IR = 8.2 × 10 13 L , and T dust = 54 K.
The ALMA 3 mm observations were obtained over several different ALMA Cycles, ranging from Cycle 0 to Cycle 7. 1 The first observations were conducted in ALMA Cycle 0 and spectra were obtained for 26 targets (Weiß et al. 2013). Followup surveys conducted in ALMA Cycles 1, 3, and 4 targeted additional sources, giving a total of 78 spectral scans obtained with ALMA (Strandet et al. 2016;Reuter et al. 2020). While there are 81 sources in the complete SPT DSFG sample, spectroscopic redshifts for three sources (SPT0538-50, SPT0551-48 and SPT2332-53; Greve et al. 2012;Strandet et al. 2016) were obtained with Z-Spec (Bradford et al. 2009) on the Caltech Submillimeter Observatory and no additional 3 mm data were obtained with ALMA. These sources are therefore not included in the analysis presented in this work.
Despite the changing capabilities of ALMA during our observations, the observing strategy remained the same for all of the 3 mm observations. The data were obtained by conducting a spectral sweep of ALMA Band 3 in 5 tunings, scanning the range between 84.2 and 114.9 GHz. The brightest sources were targeted in ALMA Cycle 0 and were scanned for roughly 5 minutes on-source, with between 14-17 antennas. The remaining fainter sources were scanned with both longer integration times and additional antennas in subsequent ALMA Cycles. Sources scanned in Cycles 1, 3, and 4 were scanned for an average of 10, 6 and 13 minutes on source with between 28-40, 34-41 and 38-46 antennas, respectively. Finally, three sources did not exhibit any emission lines in their initial 3 mm scans, so deeper observations were obtained in ALMA Cycle 7. These scans observed each target for 45-91 minutes on source with 42-49 antennas. Because integration times and number of antennas increased with each ALMA Cycle, the typical noise levels vary according to the ALMA Cycle and were 1.5 mJy beam −1 , 0.91 mJy beam −1 , 0.73 mJy beam −1 , 0.58 mJy beam −1 , and 0.24 mJy beam −1 over a 62.5 MHz bandwidth for Cycles 0, 1, 3, 4, and 7, respectively. The typical RMS values for individual SPT-SZ DSFGs can be found in Table E1 of Reuter et al. (2020). The spectral resolution of the data, ∼ 1.5 km/s over a 2 GHz baseband, is much higher than a typical galaxy line width. This allows many channels to be averaged over for each line, increasing significance.
As previously noted, this work utilizes the 3 mm spectra obtained in Weiß et al. (2013), Strandet et al. (2016) and Reuter et al. (2020). These spectra are central to not only this paper, but also for papers currently in preparation. The 3 mm line fluxes from mid-J CO lines will be combined with the line fluxes from low-J CO observations with ATCA. Spectral line energy distributions will be fitted for all sources, which will be examined in a following paper that is in preparation.

Stacking Methods
Because the details of creating a stacked spectrum are similar to those outlined in Spilker et al. (2014), we summarize the key points here and highlight the differences between the two methods.
The stacked spectrum presented in this paper is a composite of 3 mm spectral scans for the 78 sources observed with ALMA. Calibrated data cubes were created with CASA's TClean package and the spectra were extracted. The continuua of the spectra were fitted with a first order polynomial, excluding channels with signal to noise > 3σ, and then subtracted from each spectrum. The spectra are then re-scaled to a common rest frame (z = 3; as in Eq. 1 of Spilker et al. 2014). The choice to scale to z = 3 is intended to represent the typical redshift of DSFGs. Since surveys of unlensed DSFGs are largely selected at shorter wavelengths, they tend to be at lower redshift (z ∼ 2.3 − 2.9; Koprowski et al. 2014;Simpson et al. 2014;Simpson et al. 2017;Miettinen et al. 2015;Brisbin et al. 2017;Danielson et al. 2017;Micha lowski et al. 2017), while the median redshift of the SPT-SZ DSFGs is 3.9 (Reuter et al. 2020).
Because our sample is known to contain a large number (> 70%; Spilker et al. 2016) of gravitationally lensed sources, it is necessary to account for the magnification. Ideally, the magnification would be calculated by constructing gravitational lens models of the sources . However, in the absence of a complete set of gravitational lens models, the individual spectra are normalized by their respective apparent dust masses as a proxy for size, then multiplied by the average apparent dust mass ( M dust = 8.7 × 10 9 M ). It should be noted that a radius could also be derived from the same spectral energy density fit (e.g. Eqs. 1 and 2 from Weiß et al. 2007). Radius and dust mass are therefore very strongly correlated, with some intrinsic scatter from the simultaneous fitting of dust temperature. Since normalizing by the apparent values for radius and dust mass give similar results, we choose to normalize by the apparent dust mass throughout this work. The values for apparent dust mass used throughout this work can be found in Appendix A.
A variety of other source properties could also have been used in order to normalize the ALMA spectra, such as apparent L IR , 12 CO line luminosity or scaled 1.4 mm SED flux density (as in Spilker et al. 2014). While the line ratios derived in Spilker et al. (2014) were robust within ∼ 15% to the choice of normalization, 12 CO line ratios for the complete sample can vary from Each individual source is normalized by its respective property, then multiplied by the sample average: apparent L IR (blue; L IR = 9.1 × 10 13 L ), 1.4 mm flux (green; S 1.4mm = 18 mJy), apparent dust mass (yellow; M dust = 8.7 × 10 9 M ) and apparent radius derived from SED fitting (red; r = 7.5 kpc) and scaled according to the average of the sample. Bottom panel: The average velocityintegrated flux density normalized to average flux density of the J up = 3 line, the lowest frequency CO observation obtained from the ALMA 3 mm scans. The 16th and 84th percentiles for the normalization by dust mass are denoted by the shaded region, the others are omitted for clarity.
∼ 25% to as much as 65%. The large disparity is in part due to the large intrinsic scatter in popu-lation statistics shown in Fig. 1 for a few choices of normalization. It should be noted that normalization by IR luminosity will affect the shape of the 12 CO SLED, since the excitation is dependent on IR luminosity (e.g. Narayanan et al. 2008). Since the systems at high IR luminosity will be scaled to compare to lower luminosity, this will lead to a lower excitation, as seen in the top panel of Fig. 1. The 1.4 mm flux density (rest frame 350 µm) is taken from the modified blackbody models in Reuter et al. (2020) and represents a compromise between the apparent L IR and apparent dust mass normalizations. Because any results derived from a stacked spectrum are dependent on normalization, we acknowledge and emphasize that the stacked spectrum presented in this work is not uniquely defined.
To create the final composite spectrum of the SPT DSFGs, we employ the same methodology as Spilker et al. (2014). That is, the spectrum of each source is interpolated onto a grid spanning the rest frequencies of the ALMA 3 mm spectra, 250 − 770 GHz. The grid has a uniform spacing of 500 km/s, which roughly corresponds to the typical full width half maximum (FWHM) of observed 12 CO transitions. A weighted average is then performed on each frequency channel, with each contributing source weighted by the inverse standard deviation (Boogaard et al. 2021), rather than the inverse variance used in Spilker et al. (2014). Because the data were collected in different ALMA Cycles, the noise of individual spectra over a 62.5 MHz bandwidth can vary from the 1.4 − 2 mJy beam −1 typically observed in Cycle 0 to 0.2 − 0.3 mJy beam −1 for the three sources re-observed in ALMA Cycle 7 (details in Reuter et al. 2020). The average 3 mm per-channel noise for each source is shown in the top panel of Fig. 2 and is also given in Appendix A. Though normalization by dust mass (shown in the bottom panel) does mitigate some of the differences in ALMA Cycle, demonstrated by the average weights shown in the middle panel of Fig. 2 2 , the choice to weight by inverse standard deviation is a compromise between optimizing signal to noise and accounting for the range of noises due to the varying ALMA Cycles. It should be noted that each frequency channel of the contributing 3 mm spectrum is weighted individually and the average noise across the 3 mm bandwidth is shown in Tab. A. Weighting by the inverse variance of noise was used in various composite spectra (Spilker et al. 2014) in order to optimize for signal to noise. However, as noted in Sec. 2.2, if the noise is non-homogeneous (e.g. if there are very different numbers of antennas between observations), weighting by the inverse variance of noise can be sensitive to outliers. In order to mitigate the presence of outliers, one can choose to weight instead by inverse standard deviation (e.g. Boogaard et al. 2021 and this work) or use the median, rather than the average (e.g. Bothwell et al. 2013;Birkin et al. 2021). A test of the statistical noise properties is given in Appendix A, and demonstrates the noise is Gaussian across all ALMA Cycles.

12 CO Line Ratios
For ∼ 32% of the SPT DSFG sample, two or more CO emission lines fall within the 3 mm observation window and are detected in the blind survey (Reuter et al. 2020). Low-J CO lines were obtained from an ATCA CO survey detailed in Aravena et al. (2016). In addition to the published observations (17 sources), data was obtained after publication, giving a total of 33 observations for the J up = 1 − 2 lines. Sources that had multiple CO line observations either through ALMA or through the combi-nation of ATCA and ALMA data are so-called "double-lined" sources. These sources enable a comparison of the relative strengths of observed 12 CO emission lines within individual systems, shown in Fig. 3 as histograms, with the optically thick thermalized emission limit represented by the thick black line. . Histograms of 12 CO emission line ratios for sources with two or more observed emission lines, using integrated flux density. The J up = 3−6 lines were observed with ALMA, while J up = 1 − 2 lines were observed with ATCA. The dashed line represents the average line ratio obtained from the composite spectrum, and the thick black line represents the thermalized emission limit. The dotted line represents a stack using the double-lined sources only, while the thin black line corresponds to a stack of the singly-lined sources only.
The 12 CO emission line ratios from individual SPT systems also provide a direct comparison sample to the composite spectrum obtained from stacking, which is represented by the dashed line in Fig. 3. With the exception of CO(5-4)/CO(4-3), the emission line ratios obtained from the composite spectrum are in agreement with the individual ratios and below the optically thick thermalized emission limit.
While the composite spectrum necessarily contains all of the individual line ratios represented in these histograms, it differs in two key respects: each observation is normalized by dust mass and weighted by the inverse of the noise and also the stack contains many more observations of single emission lines. This is especially apparent in the CO(5-4)/CO(4-3) panel of Fig. 3, where the composite line ratio is above both the individual line ratios and the thermalized emission line. One reason for this apparent discrepancy is that because of the fixed observation window, the singly-lined sources being compared are necessarily at different redshifts. While Reuter et al. (2020) did not find a statistically significant evolution of dust temperature as a function of redshift, two populations of singly-lined sources being compared likely have different effective dust temperatures and luminosities. Additionally, some of the faintest and lowest mass sources observed in Cycle 7 (e.g. SPT0112-55, SPT0457-49 and SPT2340-59) are singly-lined sources for this analysis as they contain a single 12 CO emission line, while a high dust mass outlier (SPT0155-62) is a doublelined source. It is likely that the population of sources is simply not large enough yet to overcome the effects of individual source variations, in addition to the effects of potential redshift evolution of the 12 CO line properties given in the fixed observed frequency range.

THE STACKED SPECTRUM
We show the average 250 − 770 GHz spectrum of the SPT DSFG sample in Fig. 4, with the detected and potentially detected lines indicated. Due to the majority of sources lying in the redshift range 2.5 < z < 6.0, the stacked spectrum is the most sensitive to the rest frame frequency range 300−800 GHz. In order to properly assess the significance of these lines, we consider two factors: the statistical uncertainty of the indi-vidual ALMA spectra, and the intrinsic scatter of the underlying source population. Like the flux density, the statistical uncertainty of the individual ALMA spectra is scaled according to each source's dust mass in order to preserve the signal to noise ratio.
As noted in Spilker et al. (2014), the stacked spectrum is not ideal for the detection of individual molecular lines since the regular channel grid will likely split the line flux of a given line over multiple channels. In order to assess the flux density of these individual lines, spectra were constructed and centered on the line of interest, with 600 km/s channels. These spectra were then stacked according to the procedure detailed in Sec. 2.2, and a summary of detected lines and upper limits is given in Tab For many species, individual transitions fall below the 3σ detection threshold. However, by stacking all available transitions of a molecular species, we can constrain the total luminosity emitted by the molecule in the covered rest frequency range. In Tab. 1, we give the total luminosity in each species for all transitions that happen to fall within our frequency range.

12 CO SLED
Using the 12 CO fluxes taken from the composite spectrum, we construct a spectral line energy distribution (SLED) shown in Fig. 5, in comparison with both well-sampled SLEDs of individual objects as well as other statistical SLEDs. The Milky Way inner disk (Fixsen et al. 1999) and Antennae Galaxies (Zhu et al. 2003) represent objects in our local Universe. However, the high redshift, gravitationally lensed objects H1413+117 (the "Cloverleaf" quasar at z=2.56; Bradford et al. 2009), SMM J2135-0102 (the "Cosmic Eyelash" DSFG at z=2.32; Danielson et al. 2011) and APM 0827+5255 (a quasar at z=3.91; Weiß et al. 2007) are more analogous to the DSFGs in the SPT sample. While the full SPT sample normalized by L IR (see bottom panel of Fig. 5) shows an apparent flattening around J=5, analogous to local starburst galaxies M82 or NGC 253 (Panuzzo et al. 2010;Bradford et al. 2004), normalization by IR luminosity will affect the overall shape of the 12 CO SLED, since the excitation is dependent on IR luminosity. The intrinsic scatter of the full SPT sample is denoted by the shaded region around the dust mass weighting and suggests that while there is  Bothwell et al. (2013) is also normalized by L IR , the median rather than the average was used and it is denoted by a circle. a large variation in excitation, the excitation is similar to QSOs.
Rather than observe a single object for as many CO transitions as possible, statistical samples can be built through observing a large sample of sources with only a few or even single CO transitions. There are a wide variety of ways to combine these data (see Sec. 2.2 for a full discussion) in order to create a composite SLED. The first composite SLEDs were compiled in Bothwell et al. (2013) and Spilker et al. (2014)  However, because the SLEDs shown are constructed with different normalizations, it is difficult to compare the differences directly. While Bothwell et al. (2013) and Spilker et al. (2014) both utilize populations of DSFGs, they normalize by far-IR luminosity and 1.4 mm flux, respectively. As discussed in Sec. 2.2, the normalization quantity can make a 25% − 65% dif-  Kirkpatrick et al. (2019) contains between 58−75% strongly lensed sources (µ > 5; percentage dependent on J transition), which is similar to the ∼ 70% of strongly lensed sources (µ > 8; Spilker et al. 2016) found in the SPT sample. In spite of the choice of normalization and large intrinsic scatter of the SPT sample, the average of the SPT DSFGs does appear more energetic than many other DSFG SLEDs in the literature.
The excitation of the bright 12 CO emission lines has been shown to correlate well with dust temperature (Rosenberg et al. 2015). In order to test the effects of dust temperature on the shape of the SPT composite SLED, the SPT sample was split into three equal sub-samples according to dust temperatures from Reuter  Table 1. Ranked order of most luminous molecular cooling lines for the ALMA 3 mm bandpass for the redshift range 1.9 <z< 6.9. These lines are ranked according to the total luminosity (ΣL) summed across all transitions in our observed frequency range.
et al. (2020). The same stacking procedure was used for each sub-sample, and the resulting SLEDs are shown in Fig. 6 in comparison with the full sample. Because DSFGs are some of the most extreme star-forming galaxies, the hottest sources in the SPT sample exhibit much more excitation in the higher-J transitions than the lower dust temperature sources, which show an apparent flattening around J=5, like the local starburst galaxies. One important caveat is that the lower-J lines tend to be populated by lower redshift sources at lower dust temperature, while the higher-J lines tend to have higher redshift and dust temperature sources. While Reuter et al. (2020) found a slight correlation of dust temperature with redshift, the intrinsic scatter was sufficiently large that this effect was not statistically significant. However, the difference between the high and low dust temperature SLEDs can also be observed at intermediate dust temperature (shown in Fig. 6 as purple circles) and through J=3-5, where there is a more equal distribution of sources between the different temperature bins.

13 CO and C 18 O
Though 12 CO and its isotopologues, 13 CO and C 18 O, are all thought to originate from the same volume (e.g. Henkel et al. 2010;Zhang et al. 2018), 12 CO is optically thick even at moderate densities (n H 2 10 3 cm −3 ), due to its high relative abundance and low dipole moment. How-ever, the rare 13 CO and C 18 O isotoplogues are typically optically thin due to their low abundance and are therefore capable of probing the total molecular column density of cold gas regions (Carilli & Walter 2013). While both 12 C and 13 C are formed in and ejected from stars, 12 C is generally produced in high-mass stars at rapid timescales. In contrast, 13 C is a "secondary" species, since it is produced in longerlived, low to intermediate mass stars and its creation involves enriching 12 C seed nuclei (Wilson & Rood 1994). Since the creation of 13 C requires the ISM to be enriched by metals from previous generations of stars, the relative abundances of 12 C and 13 C are indications of the star formation history of a system. The emission strengths of the 12 CO and 13 CO lines are related to the abundances of their respective carbon isotopes, given a known optical depth. 13 CO is expected to be optically thin under most conditions, except the most dense, star-forming cores. Less is known about the formation of 18 O, though it is likely formed through the CNO cycle in massive stars and is enhanced in galaxies with recent massive star formation (Henkel & Mauersberger 1993). Due to its lower abundance and optical depth compared to 13 CO, C 18 O can be used to probe different gas column densities, provided the elemental abundances remain constant.
In the SPT DSFG composite spectrum, we detect 13 CO at > 3σ for the J up = 3 − 5 transitions and set limits for C 18 O. While the exact luminosities and significances can be found in Tab. B.1, we find that the line ratio between the J = 3 − 2 and J = 4 − 3 lines increases, with L 12 CO /L 13 CO ∼ 20 and L 12 CO /L 13 CO ∼ 30 for J = 3 − 2 and J = 4 − 3, respectively. This is likely due to a lower excitation in 13 CO due to less line trapping, which is related to lower optical depth. Past J = 4 − 3 however, the L 12 CO /L 13 CO ratio seems to remain constant within errors. However our measure-ments are only sensitive enough to constrain until the J = 5 − 4 transition.
The 12 C/ 13 C abundance ratios have been studied in the Milky Way using a variety of different molecules and have generally been found to steadily increase from ∼ 20 near the galactic center to ∼ 70 in our solar system (e.g. Langer & Penzias 1990;Aalto et al. 1991Aalto et al. , 1995Wilson 1995), likely due to differences in optical depth as one moves from the galactic center (Milam et al. 2005). In regions of high star formation, such as giant molecular clouds, the 12 C/ 13 C ratio is reported to be between ∼ 3 − 5, with more 13 C present in the clouds' central regions (Solomon et al. 1979;Polk et al. 1988). A recent survey of the J up = 1 line for 147 nearby main-sequence galaxies revealed an integrated intensity ratio of 10.9 with a standard deviation of 7.0 (Morokuma-Matsui et al. 2020), while local star-forming and starbursting galaxies exhibit higher ratios of 16.1 ± 2.5 (Méndez-Hernández et al. 2020). Taken together, these surveys imply that galaxies with increasing levels of star formation exhibit increasingly suppressed levels of 13 C, but there is a large intrinsic scatter in the sample. Indeed, some highly star-forming systems have been found to have brightness temperature 12 CO/ 13 CO ratios of ∼ 40 or even as high as ≥ 60 (Young et al. 2021;Sliwa et al. 2017). Additional studies have shown that the 12 C/ 13 C ratio is sensitive to environmental conditions such as the density of nearby galaxies (Alatalo et al. 2015) and star formation rate surface density (Davis 2014).
At high redshift (z > 2), samples with 13 CO and C 18 O are limited to observations of the bright CO transitions in gas-and metalrich galaxies amplified with gravitational lensing (Henkel et al. 2010;Danielson et al. 2013;Spilker et al. 2014;Béthermin et al. 2018;Zhang et al. 2018). Typical values of the 12 C/ 13 C ratio range from ∼ 20 − 40 for the most commonly observed J = 4 − 3 transition and the J = 4 − 3 ratio obtained from the composite spectrum, L 12 CO /L 13 CO ∼ 30, is consistent with that picture.
Because C 18 O is also associated with high star formation rate regions, 13 CO/C 18 O line ratios have been shown to differ in galaxies with differing star formation rates. In the local Universe, typical 13 CO/C 18 O ratios range from ∼ 4 − 8 (e.g. Henkel & Mauersberger 1993). However, in local ULIRGs and starbursts, ratios typically range from ∼ 1−2 (Greve et al. 2009;Sliwa et al. 2017;Méndez-Hernández et al. 2020). While we do not formally detect C 18 O, the limits on the 13 CO/C 18 O ratio range from ∼ 2−3 with an average of ratio 3.2 across all transitions. While these values are smaller than the typical local 13 CO/C 18 O ratios, they are slightly higher than those found in high redshift ULIRGs and starbursts (e.g. Danielson et al. 2013). However, one cannot directly use the observed abundance ratios alone to draw conclusions about the underlying abundances. Detailed radiative transfer calculations are needed to further interpret these results.

Dense Gas Tracers
Though 12 CO traces the total molecular gas, observations of nearby galaxies have found a close link between the dense gas and star formation activity (e.g. Gao & Solomon 2004a;Jiménez-Donaire et al. 2019). Molecules with higher critical densities and excitation energies such as CN, HCN, HNC and HCO + have been used as probes of the warm, dense molecular gas where stars form. In the SPT DSFG composite spectrum, we detect the J = 4 − 3 transitions of HCN, HNC, and HCO + , along with the J = 5 − 4 transition of HNC and the J = 3 − 2 transition of CN. While the J = 4 − 3 transition of the CN radical was detected in Spilker et al. (2014), it was only marginally detected with a significance of 2.9σ in this work. Along with the differences in stacking method detailed in Sec. 2.2, this line was also blended with the J = 5 − 4 transition of HCN (the de-blending process is discussed in Sec. 3), which also contributed to the decreased significance.
The flux densities of the dense gas tracers HCN, HNC and HCO + are shown in Fig. 7 compared with the 12 CO flux density as a function of the intrinsic IR luminosity from Reuter et al. (2020). The apparent IR luminosity was determined by fitting the photometry published in Reuter et al. (2020) according to a modified blackbody with an additional power law component. The intrinsic IR luminosity was then obtained by using either the magnification values from detailed lens models  or by using the median magnification factor of µ 870 µm = 5.5. Comparison samples include nearby LIRGs and ULIRGs from a variety of surveys (Baan et al. 2008;Zhang et al. 2014) and a few measurements from high redshift lensed objects (Guélin et al. 2007;Riechers et al. 2007Riechers et al. , 2010Danielson et al. 2013;Béthermin et al. 2018;Cañameras et al. 2021) as well as average values obtained from stacking (Spilker et al. 2014). The rotational ground levels of the dense gas molecules compared to CO are proxies for the fractional value of dense gas to total mass in the system, while the IR luminosity is proportional to the star formation rate (e.g. Murphy et al. 2011). Though some excited rotational ground levels are also shown in Fig. 7, it is difficult to compare because of the different excitation requirements for 12 CO and the dense gas tracers shown. Recent work (Rybak et al. 2022) has found that (U)LIRGs at high redshift have a lower HCN/CO than the local Universe for the J up = 1 transition. Though our results seem to also show lower HCN/CO ratios at high redshift, the SLED of HCN is a subject of ongoing study in the local Universe (e.g. Saito et al. 2018) and has not been explored at high redshift.
In the local Universe, it has been shown that the ground state HCN emission strongly corre-lates with IR luminosity over many orders of magnitude (Gao & Solomon 2004a;Usero et al. 2015;Bigiel et al. 2016), but it is unknown if this correlation continues at high redshift. At Cosmic Noon (2 < z < 4), the star formation rate was ∼ 10× higher than its present day level and is generally not well represented by the modes of star formation seen in nearby galaxies (Madau & Dickinson 2014;Genzel et al. 2011), except perhaps in rare local ULIRGs. It is unclear if the prodigious star formation rates observed in high redshift galaxies are due to increased gas accretion, or if an increase in the efficiency of the star formation process is required. One possibility is that at high redshift, the star formation rate still scales linearly with IR luminosity, but the dense gas fraction is enhanced, driven by some combination of gravitational instabilities from mergers and gas accretion, and extreme stellar feedback from supernovae or radiation pressure (Bournaud et al. 2010). However, recent work (Rybak et al. 2022) indicates that the dense gas fraction is suppressed rather than enhanced. Another possibility is that the efficiency with which dense gas forms stars depends on the local conditions of the galaxy (Usero et al. 2015;Bigiel et al. 2016). More observations, especially of HCN, are necessary to more fully understand the relationship between dense gas and star formation at high redshift.
The ratios between dense gas tracers can also constrain the physical conditions of the gas.
For instance, the HNC/HCN and HCO + /HCN ratios have been suggested to provide a discriminator between photo-dissociation regions (PDRs) and X-ray-dominated regions (XDRs) or more complex dynamics (Meijerink et al. 2007;Aalto et al. 2007), though X-ray driven chemistry can change abundance ratios (e.g. Juneau et al. 2009). For the J = 4−3 transition, we find a HNC/HCN ratio of ∼ 0.6 and a HCO + /HCN ratio of ∼ 0.9, which are lower than previously obtained values in Spilker et al.  . Averaging over all transitions gives ratios of ∼ 0.9 and ∼ 1.0 for HNC/HCN and HCO + /HCN respectively, in agreement with both previously obtained SPT values and measurements from other high redshift lensed galaxies (Cañameras et al. 2021). For the HNC/HCN ratio, measurements are in the typical range of PDRs (HNC/HCN 0.1-1.0; Hirota et al. 1998;Hacar et al. 2020), and the differences between the overall HNC/HCN ratio and the individual transitions suggest that the HNC/HCN ratio has some dependence on the rotational quantum number. In contrast, since the J = 4−3 and averaged measurements for the HCO + /HCN ratio agree within errors and are consistent with other measurements at high redshift, the HCO + /HCN ratio does not appear to exhibit any frequency dependence. As in Sec. 4.1, we again order the SPT DSFG sample by the dust temperatures from Reuter et al. (2020) and split the sample into two bins according to temperature, as shown in Fig. 8. While the majority of the transitions of the optically thin 13 CO and C 18 O exhibit differences between the so-called "hot" and "cold" DSFG populations, these populations are consistent within errors to the full sample. The dense gas tracers HCN, HNC and HCO + for hotter DS-FGs are potentially more energetic in some transitions than their cooler counterparts. Though more observations are needed to increase the statistical robustness of the sample, this could indicate the presence of more excited dense gas or a more massive dense gas reservoir in hotter DSFGs. Additionally, the "hot" and "cold" DSFG populations show no difference for the cyanide radical, CN, and all three populations show indications of absorption in the J up = 6 transition. It should be emphasized that these results are speculative since many of these transitions have not been detected in this work and are also dependent on the normalization (see Sec. 2.2 for details). However, as orders of mag-nitude more sources are expected to be discovered with SPT-3G, these results could highlight interesting future areas of study.

Water
Like the molecules discussed in Sec. 4.3, H 2 O emission arises from warm gas in dense molecular filaments and structures. In the SPT-DSFGs, the H 2 O(2 1,1 -2 0,2 ) and H 2 O(4 1,4 -3 2,1 ) emission lines (ν rest = 752 GHz, E up = 137 K and ν rest = 380 GHz, E up = 324 K, respectively) are detected, along with the H 2 O(1 1,0 -1 0,1 ) absorption line (ν rest = 557 GHz, E up = 61 K; see Tab. B.1 for details). The H 2 O(2 1,1 -2 0,2 ) emission line is excited through the absorption and pumping of the 101 µm transition line and traces the infrared field. Studies Yang et al. 2013;González-Alfonso et al. 2014) show that this line is correlated with IR luminosity over several orders of magnitude in both local and high-z LIRGs and could also be used to trace star formation (e.g. Jarugula et al. 2019). Because the H 2 O(4 1,4 -3 2,1 ) emission line is a higher energy transition, it has been detected in very dense regions, such as in the circumnuclear disk around a quasar (e.g. Stacey et al. 2020). Since this line was found to be spatially coincident with the CO(11-10) emission line, it is likely that H 2 O(4 1,4 -3 2,1 ) is found in highly excited regions, like high-J CO lines. Radiative transfer models (e.g. Liu et al. 2017) also suggest that the H 2 O(4 1,4 -3 2,1 ) requires extreme (T dust ∼ 100 − 200) conditions in order to be excited. Given the highly excited CO line fluxes shown in Fig. 5, at least a subset of the SPT DSFGs exhibit such conditions.
A deep absorption feature due to the H 2 O(1 1,0 -1 0,1 ) ground state transition was also observed in the SPT DSFG stack. Though we perform a continuum subtraction in order to obtain the stacked spectrum, we measure the con-tinua of the input spectra beforehand and create a stacked continuum in order to determine the equivalent width. We measure an equivalent width of 199 ± 24 km/s for the H 2 O(1 1,0 -1 0,1 ) transition measured in our stacked spectrum. This is the only absorption line securely detected in our stack, and indicates the presence of some low excitation gas backlit against the continuum, along the line of sight. This absorption line has also been seen in Riechers et al. (2022) for a DSFG at z=6.34, where the absorption feature was detected at the ∼ 2σ level, relative to the CMB, indicating the presence of cold water vapor. However, more study would be needed to determine if this effect is also occurring in the individual DSFGs present in our stacked spectrum.

Carbon hydrides
The carbon hydride CH has been observed locally in both dense photodissociation regions (e.g. Habart et al. 2010) and diffuse molecular clouds (e.g. Chastain et al. 2010). Because CH exhibits constant abundance ratios relative to molecular hydrogen (e.g. Liszt & Lucas 2002), it is considered to be a reliable tracer of molecular hydrogen. Additionally, because neutral hydrides like CH are most abundant in gas with a large molecular fraction (Gerin et al. 2016), measurements of this molecule can be combined with other diagnostic lines to determine the molecular fraction. CH was first detected at high redshift in Spilker et al. (2014) by combining all transitions in the 3 mm window. It has again been detected in the stacked spectrum presented in this work and also at 532 GHz and 536 GHz. It has been suggested that the CH doublet is sensitive to both the fine structure constant and electron mass ratio (de Nijs et al. 2012) and measurements of CH at 532 and 536 GHz could provide a test of these fundamental constants with time. However, precise calibration with another line (e.g. H 2 O(1 1,0 -1 0,1 ) at 557 GHz) would be needed to study these effects.

SUMMARY AND CONCLUSIONS
We have presented the average rest-frame millimeter spectrum of the complete SPT DSFG sample. By stacking the 3 mm spectra obtained with ALMA Band 3 for a total of 78 objects from z = 1.9 − 6.9, we are able to probe faint ISM diagnostic lines and study the characteristics of high redshift DSFGs. Our conclusions are as follows: 2. The bright 12 CO transitions from J up = 3 − 6 were combined with low-J ATCA observations in order to produce a SLED similar to, but more excited than, many comparable SMG samples. Because dust temperature has been shown to correlate with the CO SLED shape (e.g. Rosenberg et al. 2015) and the SPT sample tends to have higher dust temperatures than comparable samples (Reuter et al. 2020) it is not surprising that the CO SLED is more excited than others. However, we highlight the importance of using comparable weights when comparing other composite spectra. To further investigate this temperature dependence, we split the sample according to dust temperature, reveal-ing a stark difference between the lower temperature systems, which exhibit subthermal behavior and show an apparent flattening near J up = 5 and higher temperature systems near local thermal equilibrium.
3. The J up = 3 − 5 transitions of 13 CO were detected, and limits were placed on J up = 6, along with all accessible transitions of C 18 O. However, constraints set on the 12 CO/ 13 CO and 13 CO/C 18 O ratios are largely consistent with previously obtained ratios at high redshift and indicate that the DSFGs at high redshift exhibit suppressed levels of 13 C compared to local ULIRGs, though more observations would be needed to conclude anything about 18 O.
4. Spectral line distributions were created using the additional detections of 13 CO, C 18 O, HCN, HNC, HCO + and CN. Splitting the spectra according to dust temperature revealed that hotter systems appear to either contain warmer and more excited dense gas tracer (HCN, HNC and HCO + ) transitions or simply more dense gas. However, there is no difference between hot and cold systems for the optically thin 13 CO and C 18 O molecules.
5. We also robustly detect the H 2 O(2 1,1 -2 0,2 ) and H 2 O(4 1,4 -3 2,1 ) emission lines, as well as the H 2 O(1 1,0 -2 0,1 ) absorption line. While the emission lines trace the densest regions of DSFGs and can potentially be used as indicators of star formation, the presence of the absorption line detected at 557 GHz indicates that there could be reservoirs of lower excitation gas, backlit against the continuum along the line of sight.
6. Though this is the largest composite spectrum of DSFGs to date, orders of magni-tude more sources are expected to be discovered with SPT-3G (Benson et al. 2014) and CMB-S4 (Abazajian et al. 2019). Spectroscopic follow-up of these sources will be able to uncover the earliest stages of the dust-obscured Universe. The weights used in the stacked spectrum are given in Tab. A. The spectra were first re-scaled to a common rest frame (described in Sec. 2.2 and then normalized by the apparent dust mass of the source, as a proxy for size. The apparent dust mass was calculated with the same photometry and method described in Reuter et al. (2020), except was calculated using κ 850 µm / m 2 kg −1 = 0.038 (Draine 2003) instead of the value given in Reuter et al. (2020). However, because the weights are scaled according to the average apparent dust mass of the sample, ( M dust = 8.7 × 10 9 M ), the value of κ does not contribute to the relative weight of an individual source in the stack. We also give the average noise across a 62.5 MHz bandwidth in Tab. A for the 3 mm spectra obtained with ALMA. Each frequency channel of the contributing 3 mm spectrum is weighted individually, so the average is an approximate value for the weight of each source.
The original stacked spectrum given in Spilker et al. (2014) was weighted by the inverse variance of noise, in order to optimize for signal to noise. Because of ALMA's changing capabilities during our observations, we choose to weight by the inverse standard deviation instead, in order to mitigate the presence of outliers. In order to test the statistical noise properties of the sample, we use the same procedure described in Spilker et al. (2014). That is, we select a line-free area of the spectrum (513 GHz), shuffle the indices, scale and stack the spectra for a total of 3000 trials. We then find the signal to noise ratio of the central bin for each trial, as was done in Spilker et al. (2014). We find that the noise for each ALMA Cycle is well-described by Gaussian statistics, as shown in the resulting distributions in Fig. A.1.

B. INDIVIDUAL EMISSION LINES
Tab. B.1 gives a summary of lines detected in the stacked spectrum, along with upper limits for an assortment of ISM diagnostic lines. The luminosities of these lines were obtained using the stacking procedure described in Sec. 2.2 for ALMA 3 mm data. Lines with S/N> 3 are shown in bold. The luminosities are given with statistical error, which is derived from the noise of the contributing 3 mm spectra. These quantities are given along with the 16th and 84th percentiles of the luminosity distribution, as an illustration of the intrinsic scatter of the underlying population. The signal to noise distribution of a line-free region of the stacking spectrum, shown as an illustration of the noise properties of the stacked spectrum. The channels of the contributing spectra were randomized and stacked for a total of 3000 trials, split by ALMA Cycle. For each ALMA Cycle, the resulting distribution is equivalent to a true Gaussian, with a width of σ = 1.

Source
Dust mass 3 mm noise Source Dust mass 3 mm noise (10 9 M ) (mJy) (10 9 M ) (mJy) Table A.1. Weights used in stacked spectrum. The relative importance of each contributing source depends on two components: the apparent dust mass and the uncertainties from each frequency channel of the respective 3 mm spectra, also shown in the top panel of Fig. 2. The apparent dust masses were calculated with the same photometry and method described in Reuter et al. (2020), but using κ 850 µm / m 2 kg −1 = 0.038 (Draine 2003). Each frequency channel of the contributing 3 mm spectrum is weighted individually, but the average values for each source are given above.