Detection of Lyman Continuum from 3.0 < z < 3.5 Galaxies in the HETDEX Survey

, , , , , , , , , , , , , , , , , and

Published 2021 October 21 © 2021. The American Astronomical Society. All rights reserved.
, , Citation Dustin Davis et al 2021 ApJ 920 122 DOI 10.3847/1538-4357/ac1598

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/920/2/122

Abstract

Questions as to what drove the bulk reionization of the universe, how that reionization proceeded, and how the hard ionizing radiation reached the intergalactic medium remain open and debated. Observations probing that epoch are severely hampered by the increasing amounts of neutral gas with increasing redshift, so a small, but growing, number of experiments are targeting star-forming galaxies (z ∼ 3) as proxies. However, these studies, while providing fantastic detail, are time intensive, contain relatively few targets, and can suffer from selection biases. As a complementary alternative, we investigate whether stacking the already vast (and growing) numbers of low-resolution (Δλ/λ = 800) Lyα-emitting (LAE) galaxy spectra from the Hobby–Eberly Telescope Dark Energy Experiment (HETDEX) can be used to measure ionizing photons (rest-frame 880–910 Å) escaping their galaxy hosts. As a blind survey, HETDEX avoids the biases from continuum-selected galaxies, and its planned 540 deg2 coverage promotes the statistical power of large numbers. In this paper, we confirm the feasibility of Lyman continuum detection by carefully selecting a sample of 214 high-redshift (z ∼ 3) LAEs from a subset of HETDEX observations, stacking their spectra and measuring a ≳3σ detection of 0.10 μJy rest-frame Lyman continuum emission, uncorrected for attenuation in the intergalactic medium, over the full sample stack (3.0 < z < 3.5 and −22.0 ≲ MUV ≲ −19.0).

Export citation and abstract BibTeX RIS

1. Introduction

As the universe began to assemble galaxies from z ∼ 15 and onward (Bromm & Yoshida 2011), extreme ultraviolet (EUV) or Lyman continuum photons (here, defined specifically as those photons near but shortward of 912 Å) began filling the universe, ionizing the bulk, neutral gas between the galaxies so that by 5 ≲ z ≲ 8, the intergalactic medium (IGM) was once again (almost) entirely ionized (Finkelstein 2016; Stark 2016; Becker et al. 2021). This period that witnessed this change of state in the hydrogen gas of the IGM, roughly 5 ≲ z ≲ 15 (though the beginning redshift is rather more uncertain than the end), is commonly called the Epoch of Reionization (EoR) (Gunn & Peterson 1965; Finkelstein et al. 2019, and many others).

While it is well accepted that the driver of this reionization is photoionization and that O-type stars and active galactic nuclei (AGN)/quasars are prodigious producers of the EUV photons necessary to completely ionize hydrogen, the large-scale mechanics and evolution of the EoR are less well understood. Simulations and limited observations suggest a "Swiss-cheese" model, with bubbles of ionized gas growing and merging into larger and larger regions until only small pockets of neutral gas remain (Zaroubi 2013). Almost certainly the young galaxies are the origin of these extragalactic bubbles in which they reside, but whether the principal actors are massive stars in (1) small, young star-forming galaxies, (2) older, more massive galaxies, (3) accreting supermassive black holes (SMBH) in the galaxy centers, or (4) some other source is unclear.

Ideally, this issue could be resolved empirically by counting the Lyman continuum photons escaping from various galaxy hosts during the EoR, but these photons are unavailable to be counted as they are consumed in the reionization of the intergalactic hydrogen. Consequently, we need to observe galaxies near the EoR but after reionization is complete. Unfortunately, as we approach those higher redshifts, increasing numbers of remnant clumps of neutral hydrogen cloud our lines of sight such that by z ≳ 4 the Lyman continuum photons are effectively all absorbed (Haardt & Madau 1995; Cowie & Hu 1998; Meiksin 2006; Overzier et al. 2012; Vanzella et al. 2018). If, however, we look at galaxies near redshift ∼3, some of the ionizing photons escaping their hosts survive to reach our telescopes. We can then use these lower-redshift galaxies as analogs of their EoR predecessors, matching as closely as possible against various properties such as halo mass, stellar mass, metallicity, stellar population, and star formation rate (Shapley et al. 2003, 2006; Vanzella et al. 2015; Shapley et al. 2016; Steidel et al. 2018, and others).

The deep spectroscopic observations needed to measure these photons from individual galaxies are difficult and time consuming in this redshift range, so preselection of targets is necessary. This preselection is typically performed with a filter dropout in broadband photometric imaging, bracketing the Lyman break (and the so-named Lyman-break Galaxies or LBGs) (Steidel & Hamilton 1993; Steidel et al. 1996; Dickinson 1998). This is a robust technique, but not without issues, as is discussed in Section 5, and there is no guarantee that a targeted galaxy will be in a clear sight line (generally devoid of dense clumps of H i). There is also the real risk, particularly without deep broadband imaging accompanying the spectra, that what is thought to be Lyman continuum is actually from lower-redshift interlopers (Vanzella et al. 2010, 2012; Steidel et al. 2018; Pahl et al. 2021). These galaxies tend to be fairly evolved and, as a consequence of the selection technique, moderately large and continuum bright.

On the other hand, Lyα-emitting (LAE) galaxies tend to be less massive, UV-fainter, and more difficult to detect in broadband imaging than their LBG siblings. They appear to increase as a fraction of the overall population with increasing redshift (Finkelstein 2016; Kornei et al. 2010; Stark et al. 2010; Trenti et al. 2010; Jose et al. 2013; Naidu et al. 2020; Ito et al. 2021; Santos et al. 2021), making them more representative of EoR galaxies than the z ∼ 3 LBGs. As such, they represent an important subpopulation in the understanding of galaxy evolution and of Lyman continuum escape, while simultaneously being (historically) more difficult to observe as spectroscopic identification is necessary. However, blanketing large areas of sky with low-cost, fast observations (meaning, shallow, low-resolution spectra) would vastly increase the number of sampled LAEs. Even though Lyman continuum emission would rarely be detected above the noise for individual galaxies, we can stack the spectra, marginalizing over sight lines and galaxy orientations, to boost the signal.

The Hobby–Eberly Telescope Dark Energy Experiment (HETDEX) (Hill et al. 2008; G. J. Hill et al. 2021, in preparation; K. Gebhardt et al. 2021, in preparation) is a massive, multiyear blind spectroscopic survey of some 540 deg2 of sky. HETDEX employs the Visible Integral-field Replicable Unit Spectrograph (VIRUS) on the upgraded 10 m Hobby–Eberly Telescope (HET). VIRUS has up to 74 pairs of integral field spectrographs fed by more than 30,000 fibers (G. J. Hill et al. 2021, in preparation). HETDEX is mapping out the 3D positions in R.A., decl., and redshift (α, δ, z) of ∼106 LAE galaxies (detected due to their bright emission in Lyα) approximately z ∈ (1.9, 3.5) as well as a similar number of low-redshift (z ∈ (0.0, 0.5)) galaxies. The low-z galaxies are detected mostly in [O ii] 3727 Å and to a lesser extent, [O iii] 4959, 5007 Å, and the Balmer lines (excluding Hα), as well as midrange redshifts (z ∈ (0.7,1.9)) with [Mg ii] 2795 + 2802 Å, [C iii] 1909 Å, and [C iv] 1549 Å.

In 2020 August, HETDEX completed its second updated data set (designated iHDR2) with roughly 1.5 million emission-line detections from the first three years of observations. Using the detections from the spring 2020 observations with higher (>6) emission-line signal-to-noise ratios (S/Ns), as defined in K. Gebhardt et al. (2021, in preparation, hereafter KG21), we select galaxies with Lyα emission at the higher-redshift end (3.0 < z < 3.5) of the survey, where the Lyman continuum region falls within the HETDEX spectral range. The data set is further refined to include only those LAEs that are, as best we can determine, free from any significant contamination from lower-z foreground objects (see Section 3). We then stack the sample spectra to detect Lyman continuum emission.

This feasibility study suggests the massing of large samples of LAEs, particularly z ≳ 3, is now practical. This enables the investigation of the ensemble properties of this subset of the galaxy population, farther down the galaxy mass function that may, by the virtue of their increasing numbers and possibly enhanced Lyman continuum leakage (Yan & Windhorst 2004; Bouwens et al. 2015; Finkelstein et al. 2015, 2019), be from the class of objects largely responsible for the ionizing photon budget of reionization. Translating the observations of these z ∼ 3 LAEs to galaxies at z ≳ 6 will be addressed in work and will be approached via the mapping of similar properties (halo mass, stellar mass, star formation rates, etc.) onto modeled galaxies for the EoR, correcting for bias and number density evolution. This also holds the promise of an enormous catalog of galaxies, specially curated as a test bed to study the physics of how EUV photons escape into the IGM.

The paper is arranged as follows: Section 2 provides an overview of the HETDEX survey and its data processing pipeline. Section 3 describes the critical steps taken to ensure a pure sample of 3.0 < z < 3.5 LAE galaxies, free from foreground contamination. Section 4 reviews the data analysis and stacking methods. Section 5 highlights a limited comparison to a similar study and discusses various biases and issues. Section 6 presents a brief summary with the conclusion that rest-frame Lyman continuum flux has been detected with stacked HETDEX observations and previews future work.

Throughout the paper, a concordance (flat-ΛCDM) cosmology with ΩΛ = 0.7, Ωm = 0.3, and H0 = 70 km s−1 Mpc−1 is assumed. All magnitudes are in the AB system.

2. Observations

The point-spread function (PSF) weighted, extracted spectra and astrometric solutions from HETDEX are taken as the initial, primary inputs to this work. An extensive discussion of the HETDEX project (science goals, design and instrumentation, data reduction pipeline, modeling and calibration, emission-line detection, and more) is presented in KG21 and G. J. Hill et al. (2021, in preparation). Briefly, each HETDEX observation consists of three 360 s exposures with VIRUS, offset in a triangular dither pattern to fill in the spatial gaps in the fiber layout of the (up to) 74 integral field unit (IFU) spectrographs. Each VIRUS IFU covers a 50 × 50 arcsec2 area with 448, 1farcs5 diameter fibers arranged in a hexagonal grid pattern with 1/3 fill factor on the IFU. Each IFU feeds light from the prime focus of the 10 m Hobby–Eberly telescope to a pair of spectrographs covering ∼3500–5500 Å with Δλ/λ = 800. The IFUs are arranged in a square grid on 100'' centers. Thus, each exposure contains more than 3 × 104 low-resolution spectra, and an observation of three dithered exposures covers more than 50 arcmin2 sky area spread over an 18' diameter field of view with a 1:4.6 fill factor.

While the most recent HETDEX Data Release (version 2.1, at the time of this writing) contains 3266 observations (each as three dithers per pointing) from 2017 January through 2020 June, only the 816 observations within the first half of 2020 (January through June) and inside the primary HETDEX spring field (roughly R.A. 160°–245° and decl. 45°–57°) are included for consideration in the current work. This set of observations is further reduced to 662 by enforcing reasonable throughput (larger than 0.095 at 4540 Å; the sample median is above 0.120) and seeing FWHM (below 2farcs6; with the sample median better than 1farcs7) (see KG21 for details). This restricted base set of observations is selected to focus on regions where deep, supplemental photometric imaging is available as well as to expedite the review and classification process (see Section 3), which includes a lengthy visual inspection component. All LAEs included in this work have deep r (∼26) coverage from an extensive Subaru Hyper Suprime-Cam (HSC) survey specially executed in support of HETDEX. A detailed description and analysis of this and other photometric surveys used in HETDEX is provided in D. Davis et al. (2021, in preparation, hereafter DD21).

Reported r magnitudes are from variable (typically 0farcs5–1farcs0 semimajor axis) elliptical apertures (Bertin & Arnouts 1996; Barbary 2016) placed on the HSC imaging. Where no counterpart is found or the aperture magnitude is negative or fainter than the limiting magnitude, that limit is interpreted as the magnitude. For HSC-r, the 5σ limiting magnitude is ∼26.5 (∼25.5) for a 1farcs0 (2farcs0) diameter circular aperture. As the seeing FWHM in this HSC survey varies between 0farcs6 and 1farcs0 and the LAEs of this work are point-source like, 26.5 is adopted as the limiting magnitude.

The HSC-r band uses the SDSS-r filter (Doi et al. 2010), which conveniently places the rest-frame 1500 Å flux squarely within HSC-r, while avoiding the Lyα line for the redshift range of this work, allowing the simple use of the distance modulus adjusted r-magnitude as the absolute rest-frame UV magnitude:

Equation (1)

where DL is the luminosity distance in parsecs and K is the K-correction, which is set to 0 as r probes the UV region for the redshift range of this paper. The $2.5\mathrm{log}(1+{z}_{\mathrm{Ly}\alpha })$ term for the bandpass compression is often included in the K term but is separated out here for clarity.

3. Data Selection

Prior to consumption in this work, the data are reduced (producing sky-subtracted, wavelength-rectified, flux-calibrated spectra for each fiber), and initial detections are made, combining multiple fibers inside apertures and weighting according to a point-source model. Details of the reduction pipeline and detection methods can be found in KG21.

The principal investigative tool for detections is the HETDEX Emission Line eXplorer (ELiXer), which combines the HETDEX reduced data and archival imaging surveys into reports for each emission-line detection to aid line identification and observation diagnostics. It performs automated and semiautomated characterization and classifications, using multiple techniques and information sources, including Bayesian-based Lyα versus [O ii]λ 3727 analysis (Leung 2017; Farrow et al. 2021), photometric catalog matching, emission-line groups and flux ratios, redshift-corrected physical extents, etc. The ELiXer software is described in detail in DD21.

As it is absolutely critical that the data sample be as free as possible of misidentifications and from faint continuum interlopers along or near the line of sight, strict selection criteria are applied. Each LAE in this sample is manually vetted and is, within the spatial resolution limits of the overlapping imaging, required to be effectively isolated—meaning without any significant overlap in PSF from other sources. As such, the selection criteria are chosen not only to isolate a clean sample (of LAEs) but also to keep the number of detections manageable for the manual examination. The full iHDR2 catalog contains approximately 1.5 × 106 individually fitted (KG21) emission-line entries belonging to a few × 105 unique astrophysical objects, including well over 1 × 105 LAEs. Because the final sample is down-selected to include only 214 galaxies, certainly most valid LAEs are left "on the table" to be used in future work with more sophisticated contamination mitigation.

Figure 1 presents some basic summary statistics for the final sample. Individual panels are also referenced throughout this work, but in brief, from left to right and top to bottom, the panels show the sample distributions of (Panel 1) the S/N of the Lyα emission line, (Panel 2) the redshift, (Panel 3) the rest-frame Lyα equivalent width (EW) using the estimated continuum from the HETDEX spectra), (Panel 4) the Lyα line luminosity, (Panel 5) the r apparent magnitude, and (Panel 6) the absolute UV magnitude. Where appropriate, the fractions of the histogram bins with candidate LAEs below the imaging survey's magnitude limit (that is, without an imaging counterpart or with an aperture magnitude fainter than the nominal limit) are marked with red hashes. The Lyα line luminosity simply uses the definition of luminosity,

Equation (2)

where F is the integrated line flux and DL is the luminosity distance.

Figure 1.

Figure 1. Panels are enumerated from left to right then top to bottom beginning on the upper left. Panel 1: histogram of Lyα S/N for this work. The trend of increasing LAE count with lower S/N is expected to continue down below S/N > 5, where they begin becoming statistically indistinguishable from noise. Panel 2: redshift histogram of final sample (3 < z < 3.5). The limits are selected so that the Lyman continuum region and the Lyα line fall cleanly within the HETDEX spectral range. Panel 3: the Lyα rest-frame equivalent width (EW) distribution of this sample estimated from the HETDEX spectra. Because this estimate includes wavelengths blueward of Lyα, there may be extra IGM attenuation and the true EW is likely slightly lower. Panel 4: Lyα line luminosity distribution of this sample, with a median near 1.2 × 1043 erg s−1. Panel 5: the estimated apparent magnitudes from aperture photometry on HSC-r. The HSC-r imaging has a 5σ limiting depth of ∼26.5 mag. The r magnitude was used only in classification steps and the specific value is increasingly unimportant fainter than 25 (DD21). Panel 6: the estimated UV absolute magnitude (uncorrected for IGM attenuation, etc.).

Standard image High-resolution image

The initial sample selection criteria seek to eliminate obvious contaminants, spurious detections, and undesirable (to this work) galaxies (specifically AGN with obvious continuum or broad lines, potentially interacting galaxies, or LAEs near bright foreground objects). HETDEX defines its own specifications on the acceptable contamination rate and implements its own procedures to ensure the specifications are met (KG21). However, where the principal science objectives of HETDEX rely primarily on accurate 3D space mappings (α, δ, z) and is generally less concerned with the LAE galaxy type (AGN, interacting, etc.) or whether there is some contamination from the scattered light of nearby objects (so long as it does not affect the redshift determination), this work requires additional refinement to eliminate these contaminants. The first step in reducing the potential LAE candidates from the set of all candidate detections in iHDR2 is a simple threshold filter over several criteria, described in the following subsections. Although the selection criteria are necessarily presented in a sequence below, except where noted, the actual execution is as a compound query executed as a single statement.

3.1. Initial Emission-line Database Selection

  • 1.  
    Emission Wavelength: Because the spectral region of interest is in the Lyman continuum (880–910 Å) and the HETDEX observable wavelength range is 3470–5540 Å, the minimum observed emission-line wavelength is selected as 4860 Å (or z ≈ 3 for Lyα; see panel 2 in Figures 1 and 4). This ensures the blue end of our targeted Lyman continuum range is captured by the VIRUS spectrographs, allowing for some small variation in each detector's wavelength range and avoiding the edgemost pixels. The wavelength limits for the Lyman continuum are chosen to select the photons responsible for reionization (those blueward of the Lyman limit) in the region surrounding the host galaxies without excessive contamination from outside sources. The (red) upper bound is fixed by the Lyman limit itself while the (blue) lower limit is a bit softer and is established by the mean free path of Lyman continuum photons in 3 ≤ z ≤ 4. The optical depth of the IGM at z ∼ 3 results in a mean free path of Δz ≈ 0.18, which in turn sets the blue limit to ∼870 Å at z = 3 and ∼880 Å by z = 4 (Haardt & Madau 1998; Rudie et al. 2012). Because the 880–910 Å range is commonly used in other publications, it is adopted here as well for consistency.
  • 2.  
    Emission-line FWHM and g-band Apparent Magnitude: To exclude AGN and brighter contaminants, the maximum allowed fit FWHM to the emission line is set at 600 km s−1 with a bright g-band magnitude limit of 23. While AGN in the redshift range of interest are certainly LAEs, in this work we are interested in the far more common, simple star-forming galaxies (though AGN and broad-line/bright LAEs will be included in future work). This selection criterion does not absolutely guarantee the complete exclusion of AGN (particularly faint, narrow-line Type II AGN), but the 600 km s−1 limit is well below the more typical ∼1000–2000 km s−1 used in classifying Type I (broad-line) AGN (Antonucci 1993; Urry & Padovani 1995; Steidel et al. 2002; Baron et al. 2016, and others). Future work will explore the possible contamination by Type II AGN, but here the impact is assumed to be negligible. The photometric g coverage for this sample is incomplete and the g magnitude used here is computed by passing the HETDEX collected spectrum through the SDSS g filter (Doi et al. 2010) using the Python speclite package (version 0.8; Kirkby 2020). The estimated apparent g magnitude is limited by the HETDEX flux sensitivity to 24.5–25.0 (depending on the observing conditions, exposure time, and the VIRUS IFU) (KG21), and most of the bandpass-related computation is based on additional photometric imaging (generally r for this work).
  • 3.  
    S/N and χ2 Model Fit: Although there is good recovery (at the time of this writing, the detailed HETDEX survey recovery rate at the lower S/N range is still being evaluated) down to an S/N of approximately 5 for the Lyα emission line, and the absolute number of LAEs grows rapidly with decreasing S/N (Figure 1 panel 1), the relative number of false detections also increases with decreasing S/N and, in an effort to keep this preliminary manual examination sample manageable, an emission-line S/N minimum threshold of 6.0 is enforced. Closely related to the S/N cut, a χ2 limit of 1.2 on the single Gaussian emission-line fit is imposed to remove potentially questionable detections. While there certainly are real sources with a poor χ2 fit (particularly broad, noisy, or strongly asymmetric lines) this criterion performs generally well. This selection, combined with the redshift range and emission-line FWHM cuts, resulting in a moderately bright Lyα line luminosity distribution centered near 1.2 × 1043 erg s−1 (panel 4 in Figure 1).
  • 4.  
    P(Lyα) Classification: As an obvious condition, candidates that are not strongly indicated as LAEs are excluded from the initial selection with a minimum required value of 0.8 on the ELiXer composite P(Lyα) classification value. While expressed as a real number (0–1), P(Lyα) is not a proper probability but does represent a form of confidence in the classification of the emission line as Lyα. The full details of the computation are presented in DD21. In brief, however, it is a weighted combination of an analysis incorporating several EW estimates, redshifts, and line fluxes (Farrow et al. 2021), based on the Bayesian analysis in Leung (2017). Also factoring in are emission-line positions, flux ratios, and other estimated physical and spectral properties from the HETDEX spectra and available photometry. The Lyα EW is estimated as
    Equation (3)
    where F is the integrated line flux as fit by HETDEX (KG21) and C is the continuum flux density as estimated, in this work, from either the r aperture magnitude or the SDSS g magnitude from the HETDEX spectrum, as described earlier. While the Lyα line appears within g for the redshifts of this work, g-band photometry is often not available or is not as deep as r and so the r magnitude is used, assuming a flat continuum between Lyα and r, as the primary continuum estimate in the EW calculation. This does not alter the efficacy of the EW analysis on which the emission-line discrimination is based and, in fact, may produce a cleaner separation (Adams et al. 2011; Leung 2017). The Lyα EW from the HETDEX spectra is more similar to a proper EW in that it estimates the continuum nearer the emission line (as opposed to the r magnitude). The sample distribution of this EW estimate is shown in panel 3 of Figure 1 with the caution that the continuum estimates include wavelengths blueward of Lyα, subject to increased IGM attenuation, as well as the Lyα line itself. These two issues at least partly cancel but are sources of error in the EW estimates. Future work will refine the EW estimates and the current results are shown here only in the context of the broad description of the data set. With that caveat, the smallest rest-frame emission-line EW in the final data set is ∼40 Å for the r-continuum estimate and ∼26 Å for the HETDEX spectrum estimate, both greater than the commonly used 20 Å EW minimum as an LAE discriminator (Cowie & Hu 1998; Gronwall et al. 2007; Adams et al. 2011). In approximately 15% of the final sample, the imaging counterpart cannot be detected by the Python photometric package (sep, Barbary 2016). However, in all cases, the single emission line is obvious with the assumption that the counterpart is undetected because it is fainter than the imaging limit (here, 26.5 is adopted for HSC-r; Figure 1, panel 5). The classification of the emission line as Lyα based on the EW analysis, when the continuum is fainter than 25th magnitude, is supported (Leung 2017; DD21; KG21).
  • 5.  
    Spatially Resolved Candidates: LAEs at z ∼ 3 are expected to be a few kiloparsecs to maybe 10 kpc in radius (Ribeiro et al. 2016) and should be generally unresolved in the HSC-r imaging (seeing FWHM 0farcs6–1farcs0). Any HSC-r detected galaxies with an sep elliptical aperture effective radius or ELiXer circular aperture radius (DD21) ≥2farcs0 (≳15 kpc for z ∼ 3) are rejected as they could be misclassified as an AGN or foreground galaxy or include a slightly offset line-of-sight interloper, extending the projected profile. Here the effective radius is defined as
    Equation (4)
    where a is the semimajor axis and b is the semiminor axis of the aperture ellipse. It is also possible that the candidate galaxy is actually two or more interacting LAEs (this is also true even when the candidate is unresolved), but without the higher spatial resolution of HST imaging, it is difficult to impossible to distinguish these cases. In regions of the HETDEX survey with multiband HST coverage this becomes possible using SED comparisons as in (Pahl et al. 2021), but given the large HETDEX fiber size (K. Gebhardt et al. 2021, in preparation), it would challenging to separate the individual LAE spectra. These could be common and are certainly interesting cases for reionization as interactions can lead to enhanced star formation (Keel et al. 1985; Lawrence et al. 1989; Jogee et al. 2009, and others).
  • 6.  
    Neighbor Flux Contamination: Regions around known, nearby large galaxies (such as M101 in a region defined by R.A. (J2000) ∈ [210fdg2, 211fdg3] and decl. (J2000) ∈ [54fdg1, 54fdg8]) are completely excluded from consideration to avoid the risk of contamination from flux contributed by the outer regions of foreground galaxies. Additionally, no candidate is permitted to have any detected object (using the sep Python package) extending to within 2'' (resolved or unresolved) of the HETDEX detection position. This reduces the chances of emission contamination and source confusion, with the astrometric (centroid) accuracy ∼ 0farcs5 and the FWHM seeing generally better than 2'' (KG21).For slightly larger separations, the PSF-weighted flux of all imaging found sources within 9 arcsec2 of the candidate LAE are summed and the candidate LAE is removed from the sample if the potential flux contamination exceeds a threshold. As a note pertaining to the execution sequence, this check is performed after the initial query conditions described above and prior to the manual inspections described in the next subsection. The flux for each neighboring source (and the LAE candidate) is computed with simple aperture photometry (again, in HSC-r) and convolved with the HETDEX PSF for the observation of the candidate LAE. For simplicity, each source is assumed to be unresolved (noting that larger, extended sources within this area would automatically cause the LAE candidate to be rejected) with a flat spectrum over the bandpass. The (moffat) modeled PSF-weighted fraction (KG21) of flux in the overlap with the LAE candidate and every other source is summed over and the candidate is rejected if that sum (over all on-sky neighbors within the 9 arcsec2) is larger than 25% of the candidate LAE's r flux. The choice of the 25% threshold is somewhat rough and is intended to be similar to the average uncertainty in the aperture magnitudes that provide the flux estimates. For the final sample of 214 galaxies, the biweight midvariance location of the fractional PSF-weighted flux overlap is 0.01 ± 0.015, far below the 25% limit, and that potential flux contribution is ignored. More sophisticated modeling with a flux correction may be used for future samples to allow some relaxation of this criterion.

These initial selection criteria and automated conditions reduce the set of detections from over 1.5 million to approximately 1000 for visual inspection.

3.2. Visual Inspection

The final step in producing the data set for this work is the manual inspection of the remaining detections that pass the previous selection conditions. Despite best efforts to impose rigorous, repeatable conditions, this is not free from observer subjectivity. Future data selections with improved automation (in development at the time of this writing) will seek to minimize this subjectivity.

The visual inspection examines the following conditions and excises detections that meet any of the following:

  • 1.  
    Poor Imaging: If the photometric imaging within the few arcseconds around a candidate LAE is unclear, corrupt, incomplete, or shallow (magnitude limit < 25), it cannot be reliably examined for the potential foreground interlopers. Any such detections with poor imaging are removed.
  • 2.  
    Data Issues: The 2D HETDEX spectra are examined for cosmic-ray strikes, hot pixels, bad columns, or other data corruption in the Lyman continuum spectral region in the nearest four fibers (in order of increasing distance from the detection center position) and in the stacked sum of all contributing fibers. Any issues in these cutouts result in the candidate's exclusion. Similar data issues in the Lyα region do not warrant removal unless they cast doubt on the actual detection or classification as Lyα.
  • 3.  
    Bad Neighbors: As an extension of the Neighbor Flux Contamination condition in the previous subsection, the photometric imaging cutouts for each remaining LAE candidate are examined for the presence of neighboring sources that could be contributing unwanted flux, including any suggestion of very faint, low-surface-brightness interlopers that could be missed by the automated source extraction. Larger area (30'' × 30'') zoom-outs of the images are also examined for any potential contaminants (generally, stars or small, but spatially resolved, nearby galaxies) that could be sources of scattered-light/flux contamination. Again, any candidates with suspect neighbors are excluded.
  • 4.  
    Unidentified Emission Lines: The presence of any unidentified emission lines—that is, any visually apparent or ELiXer Gaussian fitted (DD21) emission lines in the spectra that are inconsistent with the classification as an LAE at the assumed redshift or suggestive of a second, blended spectrum from a lower-z line-of-sight interloper—in the full 1D spectrum suggests some contamination (blending) of spectra. While the ELiXer application performs automated line searching and fitting, this manual check is an additional safety against any convincing emission lines that ELiXer fails to identify. Candidate LAEs with such spectra are rejected.

Following this examination, 214 detections remain to comprise the final data set for this investigation (again, see Figure 1 for summary statistics). Figure 2 presents a subselection of 15 candidate LAEs as they appear in the HSC-r imaging with Table 1 listing the basic positional and brightness data for those galaxies (with the full listing made available in a machine-readable format). The LAE candidates all appear compact and point-source like, with no apparent foreground interlopers.

Table 1. Candidate LAE Galaxies for This Work

Detection IDDetection NameR.A. (degree)Decl. (degree) z r-mag r-err
2101626973HETDEX J131833.31 + 505207.1199.6388150.868643.380725.020.141
2101677478HETDEX J121602.84 + 504206.1184.0118450.701693.118225.060.145
2101677714HETDEX J121552.27 + 504126.2183.9678050.690623.351024.860.054
2101732424HETDEX J123400.37 + 504626.3188.5015650.773962.998424.930.089
2101849487HETDEX J131652.22 + 504250.0199.2175850.713893.032325.530.052
2101734356HETDEX J130453.80 + 504930.4196.2241750.825123.292324.500.105
2101850445HETDEX J131807.08 + 503938.9199.5294850.660803.027024.280.072
2101982710HETDEX J124748.14 + 562033.2191.9505856.342563.051723.930.070
2102041642HETDEX J135500.52 + 553033.5208.7521555.509303.101524.780.121
2102103229HETDEX J142151.25 + 490031.5215.4635649.008753.494125.110.099
2102161475HETDEX J135234.31 + 504156.7208.1429750.699083.266325.020.085
2102263894HETDEX J145659.60 + 500906.7224.2483550.151863.285924.810.094
2102294452HETDEX J130448.56 + 552515.2196.2023355.420893.004623.940.069
2102475817HETDEX J131213.18 + 542651.4198.0549254.447623.070424.370.094
2102517234HETDEX J130019.57 + 540833.0195.0815354.142503.255324.340.156
.....................

Note. Basic data for the objects in Figure 2. The listing for all 214 candidate LAEs for this work is made available in a machine-readable format. Entries with r − err = 0 are fainter than the imaging magnitude limit of 26.5.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

Figure 2.

Figure 2. Examples of HSC-r imaging of LAE candidates. The HETDEX identifiers are printed in the upper left and the candidate LAEs are highlighted with gold ellipses defined by the sep Python package (Barbary 2016). The axis scale is in arcseconds. The basic data for these candidate LAEs are presented in Table 1. These galaxies are generally unresolved and appear free from foreground interlopers.

Standard image High-resolution image

4. Analysis

The primary objective of this work is to determine the feasibility of detecting Lyman continuum photons in HETDEX data and, as such, the analysis at this stage focuses only on detection with limited consideration given to characterization. Future work will enhance the analysis, correcting for IGM attenuation that is here expressly ignored.

The main input data for this work is the PSF-weighted, coadded spectra from the accepted detections selected as described in Section 3. Additionally, for each set of exposures, the "empty" or "sky" fibers are stacked and their average residuals are subtracted from the galaxy stacks in a supplemental cleaning step. The HETDEX data reduction pipeline handles all of the cleaning, rectification, throughput adjustment, differential atmospheric refraction (DAR) correction, and other steps necessary to combine the individual fibers over multiple exposures into a single 1D spectrum per detection (KG21).

4.1. "Sky Fiber" Selection and Stacking

"Sky fibers" are effectively empty fibers (i.e., those that do not fall on any apparent astrophysical object), and they play an important role in this work. After HETDEX processing, any residual flux in these fibers is ostensibly due to random noise, including instrumental noise, and imperfect sky measurement and subtraction. While the majority of fibers in the typical HETDEX exposure fall on empty sky (KG21), we are interested in those freest from any measurable light from the PSF wings of nearby objects or otherwise scattered along the light path as the most representative of the true sky. Functionally, sky fibers, within each three-exposure set, are identified by the following procedure that acts on the post-sky-subtracted, rectified spectra:

  • 1.  
    To remove fibers with a possible measured continuum, any fibers with spectra exhibiting an average flux density (over 500 Å wide bins) outside of ±5 × 10−18 (erg s−1 cm−2 Å−1) or ±4 × 10−18 (erg s−1 cm−2 Å−1) over (almost) the full width (3600–5400 Å) are eliminated. The 3600–5400 Å range is selected to avoid the edges of the detector and the prominent sky lines near 3545 and 5462 Å. The choice of bounding flux densities is an empirical definition and based on the HETDEX flux limits. Because most (50%–70%) fibers fall on empty sky, the values are selected to be small enough to exclude HETDEX-detectable continuum and potential oversubtraction of sky (KG21) and still include roughly half of all fibers for a typical HETDEX exposure set. Here, by "typical" we mean those exposure sets without large errors (instrument failures) or significant foreground objects (bright stars or relatively nearby galaxies, like M101) that cover a large fraction of the fibers.
  • 2.  
    Fibers with potential emission lines are also removed. For expediency, rather than attempt to fit ∼1000 Gaussian profiles per single-fiber spectrum (with ∼100,000 spectra per exposure set), here a possible emission line is defined in a sliding window of three wavelength bins, totaling 6 Å wide, with the integrated flux over the central wavelength bin exceeding 5 × 10−17 (erg s−1 cm−2) and the two immediately adjacent wavelength bins exceeding 4 × 10−17 (erg s−1 cm−2) each. These emission-line flux thresholds are selected based on a typical HETDEX flux limit under good conditions (noting that the actual flux limit is a function of seeing FWHM, throughput, wavelength, CCD response, fiber, etc.) (KG21).
  • 3.  
    The remaining fibers are sorted by their integrated flux over 3600–5400 Å and one-third of the fibers in the sum of the two tails of this approximately normal distribution are eliminated (i.e., keeping the central two-thirds or roughly 1σ). This additional down-selection is simply an extra precaution against the influence of outliers on this residual. Figure 3 shows a representative example of the distribution of sky fiber flux (before the tails are removed). Here we display the histogram, peaking very close to zero, of the summed fluxes in the 23,000 identified sky fibers in exposure set #18 taken on 2020 June 25.
  • 4.  
    The remaining fibers (roughly 30% of all fibers or about ∼25,000 fibers per exposure set) are considered effectively empty.

If the background subtraction was perfect, the distribution would center on zero (it is very close; Figure 3) and no wavelength bins or ranges of bins should vary from zero with anything but random noise. To be clear, we use the term "background" to encompass both the true sky background and instrumental noise. Figure 4 reveals, however, there is some small average residual, particularly on the blue end of the spectrum, that must be corrected. This figure (the stacked background residual spectrum from exposure set #18 taken on 2020 June 25) is typical of HETDEX observations in the fraction of sky fibers and in the residual spectrum. Even though there are larger residuals in the blue, the peak near 3550 Å and 3 × 10−19 (erg s−1 cm−2 Å−1) is still far below the stacked, residual-subtracted galaxy flux in the Lyman continuum region presented later in this paper.

Figure 3.

Figure 3. Example histogram of the integrated flux (3600–5400 Å) in qualifying "sky fibers" (prior to excluding the tails). The fluxes come from observation #18 on 2020 June 25 with ∼88,000 total fibers and ∼23,000 identified sky fibers. The distribution median is 1.54 × 10−17 erg s−1 cm−2 with a standard deviation of 21.1 in the same units. See also Section 4.1.

Standard image High-resolution image
Figure 4.

Figure 4. Example biweight of the background residual spectrum (observation #18 from 2020 June 25). The red horizontal lines indicate the observed-frame spectral region that corresponds to 880–910 Å at the labeled redshift. The separation along the flux density axis is artificial. There is a clear increase in the background residual flux on the blue end, primarily due to instrument effects (KG21). The background residual for each specific observation is subtracted from the corresponding galaxy spectra prior to stacking. For a discussion of the selection of "sky fibers" and the creation of background residual spectra, see Section 4.1.

Standard image High-resolution image

Stacking of the spectra in the defined sky fibers (again, within each data sample included exposure set) is performed on a per-wavelength basis, with the wavelength bins aligned on their "native" observed-frame (2 Å) boundaries, using a biweight scheme, not a sum or average. The biweight measure of the central location (Beers et al. 1990) is similar to a median and is selected for its stability and robustness against outlier influence, reducing or eliminating any bias toward the relatively few, brighter sources. The biweight implementation used in this work is slightly modified with an additional weight applied as the inverse uncertainty in the flux measurements. Hereafter, we refer to this modification of the standard biweight location (ζbiloc) simply as the "weighted biweight:"

Equation (5)

where M is an initial guess (typically the sample median) and the weights, vi , are defined as

Equation (6)

and

Equation (7)

where xi are the data and c a tuning constant (commonly 6; see Beers et al. 1990).

The biweight location "shifts" the median by the average difference of points to the median (xi M). This average excludes outliers and assigns higher weights to points close to the median than points that are farther away. As this shift is a weighted average, we can include the additional inverse uncertainty weights wi by modifying the weights ${v}_{i}^{2}$:

Equation (8)

where

Equation (9)

This "normalization" of the additional weights, wi , is necessary to ensure that vi and $\widetilde{{w}_{i}}$ have comparable values and therefore similarly large contributions to the final weights. The weighted biweight location becomes

Equation (10)

4.2. Background Residuals Correction

Although great pains are taken to remove the background signal from the galaxy spectra in the primary sky subtraction steps, two additional steps are taken to further refine this correction. First, to more consistently subtract the background spectrum from the galaxy spectra, each of the LAE candidate galaxy spectra are re-extracted from the original HETDEX data using the full-field sky subtraction to replace the IFU local sky subtraction used in the original iHDR2 catalog. Here, "full-field" refers to all fibers in all IFUs for an exposure set and "local" refers only to those fibers in the single IFU where the galaxy detection is made. This approach greatly expands the number of fibers used to compute the sky and reduces the potential for oversubtraction that can result from diffuse (or stray) light in a large fraction of the fibers in a single IFU (KG21; G. Zeimann et al. 2021, in preparation).

Second, for every galaxy spectrum, the independently selected sky fibers (as previously described) for all IFUs and exposures that contributed to the calculated full-field sky background for that exposure set are collected and averaged (again, using a weighted biweight). The specific PSF model for that exposure set is then applied to the resulting residual background spectrum and subtracted from the galaxy spectrum in the observed frame. Logically, this is equivalent to subtracting this background residual from the individual galaxies fibers before they are PSF weighted and combined. An example of the background residual from one exposure set (#18 from 2020 June 25) is shown in Figure 4.

In an additional check, the region around the particularly bright sky line at 3535–3555 Å is masked from the galaxy spectra. The resulting galaxy-weighted biweight stacks and Lyman continuum region averages (not shown) exhibit no discernible differences, so no extra correction for this sky line (beyond those described above) is applied.

4.3. Galaxy Spectra Stacking

Generally, aside from the primary Lyα emission line, the S/N of astrophysical photons over other wavelength ranges for our z ≳ 3 LAEs is too low for detection in individual sources. We therefore employ stacking to boost the signal. Because the observed Lyα line is found at different wavelengths (4860–5500 Å), the cleaned and corrected galaxy spectra are first redshifted to their own rest frames, as determined from the Lyα line, so they can be aligned for the weighted biweight stacking. The individual redshifting produces wavelength bins of slightly different widths, from 0.44 to 0.50 Å. The largest redshift results in the narrowest bins and is adopted as the grid onto which all other rest-frame spectra are linearly interpolated. The spectra are then aligned on the Lyα bin, for now ignoring systemic Lyα velocity offsets (Verhamme et al. 2018) (discussed later), and the weighted biweight is computed for each wavelength bin. The uncertainty is reported as a standard error on the biweight location, with the square root of the biweight midvariance (or biweight scale) as σ. The relatively small uncertainty, roughly 0.05 Å–0.12 Å in the rest frame, in the Gaussian fit line center (KG21) is also ignored as it is much smaller than the systemic velocity error and, unlike the velocity error, averages to zero over the sample. As with the sky fibers, averaging is performed with the modified (weighted) biweight.

Figure 5 (top panel) presents the result of stacking the full sample of candidate LAEs. Because of their different redshifts but uniform observed-frame wavelength range, the wavelength bins in the extreme outer blue and red ends of the weighted biweight stack, well beyond the Lyman continuum and Lyα regions, include fewer data points (lower panel) and are truncated from the next figures.

Figure 5.

Figure 5. The rest-frame stack of the galaxy spectra, with correction for background residuals but without correction for IGM attenuation. The lower panel shows the number of galaxies from the sample that contribute to the corresponding wavelength bins in the upper panel. Due to the fixed observed-frame wavelength range and the variation in redshift of the galaxies, the wavelength regions blueward of the marked Lyman continuum and redward of Lyα have fewer contributors.

Standard image High-resolution image

Figures 6 and 7 show the main galaxy spectra stacks for this work, with the wavelength bin data listed in Table 2. To reduce the jitter and boost S/N, the final galaxy-weighted biweight stacks are summed over bins 11 elements wide (∼5 Å, given the uniform rest-frame bin width of ∼0.45 Å). The black line represents the binned stack of the entire sample. Also shown are binned stacks of the brightest one-third (in blue) and faintest two-thirds (red) by MUV.

Table 2. Subsample Stacked Spectra

 Full Sample  Brightest One-third  Faintest Two-thirds  Faintest One-third 
λ FluxFlux Err λ FluxFlux Err λ FluxFlux Err λ FluxFlux Err
....................................
878.134.22991.1694876.433.62872.0956878.134.88271.4072877.323.66571.9114
883.033.73581.1480881.352.14481.9637883.035.13111.4022882.233.22281.9740
887.931.16491.0221886.270.27331.9557887.932.18731.2408887.140.89111.7336
892.833.25630.9949891.191.00771.8225892.834.64751.1933892.051.65111.7201
897.731.51300.9396896.11−1.20521.6673897.731.72141.1473896.951.75071.5381
902.623.84800.9080901.040.52281.6001902.625.30711.1018901.864.71671.5161
907.520.42920.8444905.962.71521.5920907.520.36511.0064906.772.38841.4004
912.421.37960.8228910.88−1.01361.4604912.421.52020.9989911.681.17201.4586
....................................

Note. Spectra of the galaxy stacks from Table 3 and Figures 6 and 7. The center wavelength of each bin (λ) is reported in angstroms and the flux and flux err are in erg s−1 cm−2 × 10−17 integrated over the ∼5 Å wide wavelength bins. Shown here are the data in the Lyman continuum region of each weighted biweight stacked spectrum. The complete listing is made available in a machine-readable format. Null entries, with Flux Err = 0 or with blank columns, are present only for alignment across the four samples. For easier comparison to the reported EUV emission flux densities, Figures 6 and 7 are plotted in μJy instead of the CGS units in this table.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

Figure 6.

Figure 6. Flux density (presented in μJy here) weighted biweight stack of all 214 galaxy spectra (without a systemic Lyα velocity correction) binned to ∼5 Å. Also shown are the brightest one-third (median MUV = −21.0) and faintest two-thirds (median MUV = −20.0) along with the locations of the Lyman series (in red text) and some common absorption features (in black text). For readability, the Bright one-third and faint two-third stacks are shown without their errors. See Figure 7 for an expansion of the Lyman continuum region.

Standard image High-resolution image
Figure 7.

Figure 7. Expanded view of the Lyman continuum region (Figure 6). For readability, the full sample weighted biweight stack is shown without its error. The flux (presented in μJy here) in the Lyman continuum region for the full sample is well above and inconsistent with the zero flux line. The full sample weighted biweight of individual EUV flux densities, Method 1 (shown as the green horizontal line), is 0.10 ± 0.036 μJy (see Section 4.4). The separation between the bright one-third stack and the faint two-third stack suggests a possible trend of increasing Lyman continuum emission with decreasing host UV luminosity over the limited range of MUV of this work.

Standard image High-resolution image

In all three weighted biweight stacks, some continuum on either side of Lyα is clearly present (at least redward of the Lyman limit), and absorption for some of the Lyman series is apparent. Other common absorption features are also marked, with several prominently represented, which will be useful for future stellar population fitting. Of note is what appears to be a P Cygni profile of [O vi] (λ1032, 1038 Å) as a signature of the young stellar populations (Leitherer et al. 1999) in these LAEs.

As is expected simply by the selection, the brightest one-third shows increased continuum over the faintest two-thirds until near and blueward of the Lyman limit, where the bright subsample falls off rapidly. This could be consistent with an expected trend to a higher EUV escape with decreasing UV luminosity (and decreasing halo mass) (Nakajima et al. 2018; Finkelstein et al. 2019), though the faint-end limit of this sample is far too bright and the sample size far too small for any conclusions. At a minimum, the absence of a statistically significant detection of EUV emission in the MUV brightest one-third versus the clear detection in the fainter subsamples suggests an MUV limit where EUV escape declines rapidly. Even with the decreased HETDEX sensitivity to the blue end of the spectra range, the Lyman continuum region in the full sample weighted biweight stack of galaxy spectra from this work is clearly above and inconsistent with zero flux (Figure 7), supporting the claim of flux detection in this limited sample.

4.4. Lyman Continuum Averaging

Three averaging methods are applied to the Lyman continuum region with somewhat different detection significances but with consistent results. In the first method, the weighted biweight measure of central location, hereafter "weighted biweight," of the flux density of the Lyman continuum region for each individual (rest-frame) spectrum is computed and the weighted biweight of those individual EUV flux densities is calculated. For this method, the result, 0.10 ± 0.036 μJy, has the lowest significance, in part due to the larger error from the fewer wavelength bins (∼66 per rest spectrum) but is the most straightforward as the Lyman continuum regions from the individual spectra are taken as whole inputs. It is this result that is reported as the primary finding as it is the most conservative of the three methods. In the second method, all Lyman continuum region wavelength bins from all sample spectra are placed in a single vector (∼14,000 elements) and the weighted biweight is computed. The results are consistent, 0.11 ± 0.024 μJy, with a larger significance. In the third method, the weighted biweight of the Lyman continuum region of the resulting spectrum from the full stack (described above) is computed. Again, the results are consistent, 0.12 ± 0.021 μJy with the highest significance of the three methods, but due to the weighted biweight of the stacking, each wavelength bin from individual spectra does not necessarily receive the same weighting, and so a given spectrum may contribute somewhat differently to each ∼0.5 Å bin in the final stack. However, this is only a small caveat in the interpretation and does not affect the conclusions, noting that all three methods provide very similar statistics.

This same procedure is repeated for a few subselections of the data, and the results are presented in Table 3. As noted in the previous section, the trend to a higher absolute EUV emission with decreasing MUV persists even with this small sample size and low significance. This result might support models of reionization that favor an early start with most of the ionizing radiation budget supplied by numerous, fainter star-forming galaxies (see Finkelstein et al. (2019) for an extensive discussion). Additional data will allow for substantial refinement in both S/N and luminosity bin resolution.

Table 3. Summary of Subsample Properties with EUV (900 Å) Emission

DescriptionCount $\tilde{z}$ ${\tilde{M}}_{\mathrm{UV}}$ Method 1Method 2Method 3
   ±σ (μJy) ± SE(μJy) ± SE(μJy) ± SE
Full Sample2143.16−20.35 ± 0.7110.10 ± 0.0360.11 ± 0.0240.12 ± 0.021
Brightest one-third723.15−20.99 ± 0.2730.01 ± 0.0540.01 ± 0.0430.02 ± 0.037
Faintest two-thirds1423.17−20.00 ± 0.5320.15 ± 0.0450.15 ± 0.0290.18 ± 0.026
Faintest one-third723.18−19.40 a ± 0.3270.13 ± 0.0660.12 ± 0.0410.14 ± 0.035

Notes. Summary of basic properties of the sample subselection (note: the ̃ overscore indicates a median). The redshift distribution of each subsample are effectively the same with almost identical median and scatter over [3.00, 3.49]. All reported statistics are the weighted biweight with the error on MUV as the subsample biweight scale. The error on the three EUV escape columns are the standard errors. Method 1 computes the flux density in the Lyman continuum region of each individual candidate LAE spectrum and then performs a weighted biweight over those flux densities. Method 2 concatenates all Lyman continuum wavelength bins across all LAE candidates and performs a weighted biweight on the resulting single vector. Method 3 aligns the rest-frame wavelength bins of all candidate LAE spectra and stacks with a weighted biweight and then computes the flux density in the Lyman continuum region of the resulting stacked spectrum. See Section 4.4.

a The faintest one-third ${\tilde{M}}_{\mathrm{UV}}$ may be fainter as 30 of the 72 galaxies are at or below the r imaging flux limit.

Download table as:  ASCIITypeset image

4.5. Lyα Line Dispersion and Systemic Velocity Offset

The complex resonant scattering of Lyα photons (Field 1959; Moses 1980, and others) means that the emission line typically separates into a blue peak and a red peak, with the blue peak often highly suppressed, resulting in a (sometimes large) line dispersion from the many scatterings and a systemic velocity offset to the galaxy when fitting only the red peak. This can create a problem when aligning the spectra for stacking and certainly for any precise characterization of the Lyα line. These rest-frame velocity offsets can be in the hundreds of km s−1 (Shapley et al. 2003; Berry et al. 2012; Verhamme et al. 2018) and are generally correlated with the Lyα line FWHM. The low spectral resolution of VIRUS (R ∼ 800) amounts to an emission-line minimum FWHM resolution of ∼375 km s−1 (KG21; G. Zeimann et al. 2021, in preparation) for our Lyα lines. This is sufficient for line resolution, except for the narrowest emission lines (39 of 214 Lyα lines are below 375 km s−1 FWHM). Velocity offsets to the fitted line center are measurable with greater precision (∼30 km s−1, or Δz ∼ 0.0005 KG21).

With the moderate range of luminosities and line widths of the LAEs in this work and the more modest objective of simply detecting Lyman continuum emission, combined with the relatively broad spectral region over which the Lyman continuum was averaged, these complications are largely ignored. With that caveat, an approximate correction for the systemic velocity offset of the red peak of Lyα (${V}_{\mathrm{peak}}^{\mathrm{red}}$) from its host galaxy is explored with Equation (2) from Verhamme et al. (2018):

Equation (11)

Applying this relation to the individual spectra prior to stacking, and using the 375 km s−1 minimum FWHM resolution for those lines below that limit, produces a ≲0.1% overall change in individual redshifts, around 200 km s−1 averaged over the entire sample (well above the HETDEX line center precision), and results in no significant difference in the Lyman continuum flux density using any of the averaging methods. Methods 1, 2, and 3 exhibit changes of +2%, +5%, and +10%, respectively. These corrections are all far less than the uncertainty in the weighted biweight of the Lyman continuum emission and have no meaningful impact on the detection at this level. With the current small sample size and moderate scatter in the Lyα line FWHM, the spectral stack is constructed from a somewhat inhomogeneous sample, and a singular correction applied poststacking would not be appropriate. However, with expanded future data sets allowing for stacks of larger numbers of more uniform luminosities, Lyα line widths, etc,, the velocity correction, in particular, could become more important in resolving ensemble spectral features and will be revisited.

Figure 8 presents an overview of the Lyα velocity correction. The top panel shows essentially the full-width view of the full sample LAE weighted biweight stack with the corrected spectrum (red) over-plotted with the uncorrected spectrum (blue). The lower two panels highlight the Lyman continuum region (left) and the Lyα line (right), respectively. The shift in the line center is clearly visible in the lower-right panel but is still relatively small and makes no significant difference over the Lyman continuum region (lower-left panel).

Figure 8.

Figure 8. Galaxy spectra weighted biweight stack with (red) and without (blue) approximate velocity corrections (Verhamme et al. 2018). The averaged velocity offset of the stack was ∼200 km s−1 (to the red), well above the HETDEX line center sensitivity (±30 km s−1), and makes no measurable difference in the computed Lyman continuum flux density.

Standard image High-resolution image

4.6. Error Systematics

Systematics in error analysis are difficult to quantify and can have a particularly large impact with smaller data sets. In an effort to report the most accurate detection significance, the following procedure (comparing subset standard errors on the mean to the standard deviations of the means of the subsets) is used to estimate a correction for the combined systematics (statistical, measurement, and instrument).

  • 1.  
    Randomly subdivide the sample into (m) nonoverlapping, nonrepeating proper subsamples of size (n) (where n is ≤ an integer of the total sample size divided by m).
  • 2.  
    Create a single vector of fluxes over the target wavelength range (i.e., 880–910 Å for the Lyman continuum region) across all (n) spectra for each of the (m) subsets and take the standard error of those fluxes (${\sigma }_{n}/\sqrt{n}$). Here, n is actually the subset size × the number of wavelength bins (which varies slightly from 65 depending on the subset maximum redshift), but for compactness of notation is represented by the number of galaxies in the subset.
  • 3.  
    Compute the mean of each of the (m) subsample flux vectors and take the standard deviation of those means (σm ).
  • 4.  
    Repeat steps (1)–(3) 10,000 times with different random subset draws.
  • 5.  
    Compute the mean of the σm and ${\sigma }_{n}/\sqrt{n}$ values over those 10,000 draws and take the ratio of those means $\left({\overline{\sigma }}_{m}/({\overline{\sigma }}_{n}/\sqrt{n})\right)$.
  • 6.  
    Repeat (1)–(5) for each combination of (m, n) (down to a minimum m of 2).
  • 7.  
    Fit a line through the ratios (indexed by n) and extrapolate to the full data set to find the systematic correction factor (to divide into the full data set standard error).

For this data set, the ratio (${\overline{\sigma }}_{m}$/(${\overline{\sigma }}_{n}/\sqrt{n}$)) fit reaches 1:1 near (n = 82 (for m = 2)), indicating that the sample is already sufficiently large to not need an additional systematics correction applied to the standard error.

4.7. Spectral Contamination

This work assumes that there is no contamination of the measured flux resulting from misclassification of the emission line, PSF scattered light from sky-adjacent astrophysical objects, or extra contributions from line-of-sight, low-surface-brightness interlopers. While multiple steps were taken to exclude these potential contaminants from the data set (see Section 3), no additional corrections, beyond the background residual subtraction (Section 4.1 ), are applied to the individual or stacked spectra. However, the galaxy stack is examined for indications of emission lines that would suggest the presence of misclassified [O ii] (λ3727) or [O iii] (λ5007), for example, within the data set. The galaxy spectra are aligned and stacked assuming the rest-frame wavelength of each potential misclassified emission line, without a correction for any velocity offset (as would be appropriate for these nonresonant line contaminants), and checked for emission lines supporting the assumption that either of these contaminants is present. No such emission is identified. Additional contaminants, [Mg ii] (λ2795+λ2802) and [C iii] (λ1909), are also specifically considered. The [C iii] line does not appear as a redshifted single line within the observed spectral range considered by this work (4860–5540 Å) and would be expected to be accompanied by [C iv] (λ1549), which is not found. The [Mg ii] doublet (at the edge of resolution as separate lines by VIRUS; KG21) likewise does not appear by itself for any part of the wavelength range of interest and should be accompanied by [C ii] (λ2326) over the entire range and by [C iii] after [Mg ii] passes 5088 Å. The [Mg ii] line is often broad and frequently (but not always) associated with AGN, and the maximum emission-line FWHM cut of 10 Å (see Section 3) and absence of continuum in individual galaxy spectra made this an unlikely contaminant. [Mg ii] can also be narrow and appear similar to Lyα; however, for the narrow cases, no evidence of either [C ii] or [C iii] is found in the sample, nor do any of the line shapes suggest a doublet peak, favoring the bluer (2795 Å) peak as is expected for [Mg ii] (Chisholm et al. 2020).

5. Discussion

As a qualitative reference and for some context, we select the Keck Lyman continuum Spectroscopic Survey (KLCS) (Steidel et al. 2018) as a representative of prior studies and perform a limited comparison. Additional z ∼ 3 surveys (including Vanzella et al. 2015; Shapley et al. 2016; Bian et al. 2017; Rivera-Thorsen et al. 2019, among others) will be discussed in the future, examining sample differences and trends, and KLCS is presented here primarily as a sanity check on the selection and stacking methods presented in this work.

5.1. Comparison to the Keck Lyman Continuum Spectroscopic Survey

Because the continuum-selected KLCS sample median is ∼2× brighter (in r and MUV) than that of the galaxies in our sample, the 52 brightest galaxies from this work are chosen to stack as their median magnitude (r = 24.5, MUV = −21) matches that of the KLCS sample. We note that the MUV distributions of the two samples are quite different, roughly symmetric about −21 for the KLCS and strongly biased to fainter magnitudes in this work (see Figure 1, panels 5 and 6). Additionally, because the HETDEX sample is specifically selected from LAEs (generally young, blue, and compact) there may be a bias for galaxies with enhanced Lyman continuum escape. For consistency with the KLCS, a flux normalization is applied to the resulting weighted biweight stacked spectrum. Due to the lower S/N of the HETDEX data, the normalization is applied to the stack and not individually as in the KLCS case. This flux normalization of the bright galaxy stack from this work approximates the f1500 normalization of the KLCS sample (where the KLCS spectra are individually normalized to their flux averaged over 1475–1525 Å prior to stacking). Because the rest-frame spectral range of the samples from this work does not extend to 1500 Å, a rough translation is employed using the ratio of the normalized flux densities in the KLCS sample over the fairly flat and feature free regions near 1500 Å to that near 1300 Å (which is covered in the stack from this work) as a scaling factor (specifically, between 1468 Å and 1496 Å to that of 1268–1296 Å). The bright galaxy-weighted biweight stack is normalized by its own f1300 × the above ratio (i.e., 0.58 ± 0.042 μJy × 1.09). While the subsample size is below the threshold where systematics can have a small impact, as the purpose of this comparison is simply to check the overall curve shape and verify key features, no systematics correction was applied.

This comparison to the KLCS sample is shown in Figure 9. For readability and to aid in the comparison, the bright subsample (shown in black) and the KLCS sample (shown in blue, without error) are each smoothed with a Gaussian kernel (σ = 2.2 Å or 5× the 0.45 Å width of the rest-frame galaxy stack wavelength bins). Although the bright subsample of this work is of a much lower spectral resolution and S/N than the KLCS sample, the two stacks align well and reproduce much the same broad shape and absorption features, lending additional validation to the methodologies of this work. Though of limited value given the large noise in this sample of 52, the observed 〈f900/f1500〉 = 0.07 ± 0.182, including an IGM correction (Haardt & Madau 1995; Inoue & Iwata 2008; Steidel et al. 2018). While consistent with 〈f900/f1500out = 0.057 ± 0.006 in Steidel et al. (2018), it is also consistent with zero and a nondetection of Lyman continuum.

Figure 9.

Figure 9. Comparison of a similar subsample (52 out of 214) from this work, based on matching the median r and MUV to that of Steidel et al. (2018). For readability, each spectrum was smoothed with a Gaussian kernel (σ = 2.2 Å). While this subsample is of significantly lower signal to noise and spectral resolution compared to KLCS and the MUV distribution of the subsample in this work skews to fainter magnitudes, the two stacked spectra overlap well, matching up with significant absorption features. See Section 5.1.

Standard image High-resolution image

5.2. Faint Foreground Contamination

In an update to Steidel et al. (2018), Pahl et al. (2021) use deep HST imaging to identify several previously undetected, very faint line-of-sight interlopers contaminating the KLCS sample, leading to the removal of 2/15 of the 3σ Lyman continuum detections and another 2 from the nondetections (4/124 total). This reduces the full sample ionizing emissivity in the original KLCS findings by ≲10% (from 6.0 to 5.5 × 1024 erg s−1 Hz−1 Mpc−3) and the median Lyman continuum flux density in the 3σ detections by ≲5% (from 0.044 to 0.042 μJy). Pahl et al. (2021) also perform an alternate ionizing emissivity estimation, retroactively applying it to the Steidel et al. (2018) sample, and find a decrease of almost 25% (from 7.2 to 5.5 × 1024 erg s−1 Hz−1 Mpc−3).

While we do not find evidence of significant foreground contamination in our sample, it is a possibility and an issue that we will continue to explore with an expanded HETDEX LAE sample. As a simple check for this work, and using the 3%–13% (4/124 and 2/15) foreground contamination rate in the previous paragraph as a rough guide, we examine the impact of similar contamination on the Lyman continuum detection. We impose a 10% contamination rate uniformly applied across our entire sample with a worst-case scenario of exact line-of-sight colocation of foreground contaminants with LAE candidates. The contaminants are simulated as simple, flat (in fν ) spectra with 27 in g, assuming brighter foreground galaxies with strong emission lines would be caught by the inspections in Section 3. We run 1000 iterations, randomly selecting 10% of the sample in each run, subtract the simulated contaminant spectrum from those LAE spectra, and perform the Lyman continuum stacking analyses. We find that up to 20%–25% of the flux measured in the Lyman continuum region, across all three Methods (Section 4 and Table 3), could come from undetected contamination. As a logical check, we repeat this same test, but remove the randomly impacted LAE candidates under the assumption that the contaminants are now detected. As expected, there is no significant change versus the original weighted biweight measure of the Lyman continuum emission in any of the three Methods and, due to the slightly smaller sample size, the uncertainty in those measures increases by ∼5%. The presence of undetected contamination will have an impact on future work exploring the escaping fraction of EUV photons. Some improvements in contamination detection are anticipated, particularly where deeper, multiband imaging is available, and more sophisticated simulations will be employed to account for undetected contamination, but this does not change the conclusion of this work of the detection of EUV emission in the ensemble of LAE galaxies.

5.3. Biases and Caveats

As noted in the introduction, much of the practical motivation for this work is to provide an alternative survey of z ∼ 3 Lyman continuum leakers that is not subject to the same biases and limitations of other surveys that rely on target preselection (typically using the Lyman break) and deep spectroscopic observations of those targets individually. That is not to say that the methods presented here are free from biases, just that the biases are perhaps less significant taken in the context of galaxy ensemble, and some biases, notably in identifying Lyman continuum leakers (described below) are desirable for this work.

With the notable exception of the relatively small-area MUSE IFU surveys (Drake et al. 2017; Feltre et al. 2020 and others), most other z ∼ 3 surveys rely on target preselection generally using the filter dropout to identify LBGs in the desired redshift range. This is a robust selection method but does immediately introduce a selection bias that favors galaxies with a strong Lyman break, i.e., galaxies that are UV bright and Lyman continuum faint, which may tend toward older, more massive galaxies (Gawiser et al. 2007; Finkelstein 2016; Jose et al. 2013; Santos et al. 2020, 2021). UV faint galaxies, particularly those with a stronger Lyman continuum can be missed as they would be faint in the imaging with a weaker Lyman break. Thus, LBG galaxy surveys may favor galaxies with low EUV emission. This work's emission-line technique, on the other hand, has no preselection and is only flux limited and thus can frequently include fainter galaxies (as noted versus the KLCS above). It does, however, introduce the obvious bias that only galaxies with sufficiently strong Lyα emission (KG21) are included and is blind to non-LAE (or weaker LAE) galaxies. As strong Lyα emission appears common in Lyman continuum leakers, though strong Lyα emission is not sufficient, by itself, to predict Lyman continuum leakage (Kornei et al. 2010; Bian & Fan 2020), there is a potential to favor finding galaxies that are more likely to have stronger EUV emission. This does not appear to be a strong bias as evidenced by the consistent-with-zero detection of EUV emission in this work's MUV-brightest subsample.

From the ground, except for extreme cases, z ∼ 3 galaxies are generally unresolved point sources and a slight misalignment of spectrograph singular apertures (a slit or one fiber) with respect to the source centroid can result in an under measurement of flux. Such misalignments can be the result of mechanical positioning errors of the aperture or centering on a centroid that is not colocated with the emission to be studied, though the latter should be much less of an issue given the unresolved nature of the sources and sufficiently large apertures. Since HETDEX is a blind survey, no fiber prepositioning is involved and centroids are found empirically (KG21) from the wide multifiber coverage. With the exception of sources that are detected near the edge of an IFU, which are excluded from this work, the loss of light due to fiber placement is minimized.

Many previous preselection works stack spectra from a single instrument from data collected over one or a few consecutive (or nearly consecutive) nights and thus have generally similar observing conditions and instrument states. In contrast, though individual HETDEX detections are built from three sequential 6-minute exposures, HETDEX stacks are made from observations scattered over many days, seasons, and years with varying sky conditions and instrument states. This makes background subtraction considerably more heterogeneous for HETDEX data and, though great care is taken in the background subtraction (KG21 and this work, Section 4.2), it subjects the spectra in this work to larger uncertainties. Fortunately, with the projected very large number of spectra (∼100× those of this work), the ensemble uncertainties will be minimized.

The selection criteria (Section 3), which are intended to ensure a contamination-free sample, may also introduce some bias in this work. We are primarily cautious of two criteria. First, the high EW (and other factors) used in the P(Lyα) metric (Section 3.1, bullet 4) requirement could introduce a bias against LAEs with weaker Lyα emission and stronger continuum, and thus potentially reduced Lyman continuum leakage, in an effort to avoid contamination from a misidentification of [O ii]. Second, the restriction to somewhat isolated galaxies (Section 3.1 bullet 6 and Section 3.2) could also introduce an environmental bias, though that is unclear as there is no attempt to establish the physical distances between our candidates and their on-sky neighbors. In future data sets, as improvements are made in the classification of HETDEX galaxies (DD21), the bias from the P(Lyα) selection will be reduced and added spectral deblending capabilities will help eliminate unwanted angular/spatial environmental selection effects.

6. Summary

The full sample ≳3σ (0.10 ± 0.036 μJy) detection of Lyman continuum flux (Figure 6), uncorrected for IGM attenuation, supports the conclusion that stacking the relatively low signal-to-noise, low spectral resolution HETDEX spectra is a valid approach for investigating Lyman continuum photon escape from these EoR analogous galaxies. While the small size of the greatly down-selected data set in this work restricts meaningful sample subdivision and further immediate investigation, the significance of this detection is encouraging and suggests higher-S/N stacks with tight selection criteria for more homogeneous subsamples are possible.

In the next phase of this investigation, we conservatively expect to increase the 3.0 < z < 3.5 LAE sample size by two orders of magnitude, selecting from a larger initial HETDEX LAE population and by pushing down to lower emission-line S/N (∼5.0) and relaxing some of the subselection criteria (section 3) where corrections can be made instead of outright rejection. The much larger sample provides for refined subsampling and higher-S/N stacks, enabling meaningful stellar population fitting and the computation of intrinsic EUV escape fractions. This will allow a detailed examination of galaxy properties (UV luminosity, Lyα properties, halo mass, star formation rate, etc.) as they relate to Lyman continuum photon production and escape.

Stacking well over 100× the number of LAE galaxies of other surveys also affords substantial advantages. By averaging over many sight lines and galaxy orientations, the large individual variations in IGM transmission and potential ISM escape vectors are marginalized out in the aggregation. Additionally, as a blind survey, this data set will not be subject to the same Lyman break/continuum selection biases and is able, on average, to push into the fainter sources, most similar to typical EoR counterparts (Trenti et al. 2010; Finkelstein et al. 2019, and others). Consequently, we should be able to identify the primary sources of ionizing photons in the EoR and shed additional light on the progression of reionization. Furthermore, this represents a unique and vast data set for galaxies that serve as laboratories for exploring the physics of the escape of hard ionization radiation.

The authors thank the anonymous referee for the helpful comments, which assisted in focusing and improving this manuscript.

HETDEX is led by the University of Texas at Austin McDonald Observatory and Department of Astronomy with participation from the Ludwig-Maximilians-Universität München, Max-Planck-Institut für Extraterrestrische Physik (MPE), Leibniz-Institut für Astrophysik Potsdam (AIP), Texas A&M University, The Pennsylvania State University, Institut für Astrophysik Göttingen, The University of Oxford, Max-Planck-Institut für Astrophysik (MPA), The University of Tokyo, and Missouri University of Science and Technology. In addition to Institutional support, HETDEX is funded by the National Science Foundation (grant AST-0926815), the State of Texas, the US Air Force (AFRL FA9451-04-2-0355), and generous support from private individuals and foundations. We also acknowledge the support from NSF-2008793. S.L.F. acknowledges support from the National Science Foundation through grant AST-1908817.

Observations were obtained with the Hobby–Eberly Telescope (HET), which is a joint project of the University of Texas at Austin, the Pennsylvania State University, Ludwig-Maximilians-Universität München, and Georg-August-Universität Göttingen. The HET is named in honor of its principal benefactors, William P. Hobby and Robert E. Eberly.

The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing high-performance computing, visualization and storage resources that have contributed to the research results reported within this paper. URL: http://www.tacc.utexas.edu.

The Institute for Gravitation and the Cosmos is supported by the Eberly College of Science and the Office of the Senior Vice President for Research at the Pennsylvania State University.

Please wait… references are loading.
10.3847/1538-4357/ac1598