

THE ARECIBO LEGACY FAST ALFA SURVEY: THE α.40 H i SOURCE CATALOG, ITS CHARACTERISTICS AND THEIR IMPACT ON THE DERIVATION OF THE H i MASS FUNCTION


Published 2011 October 17 © 2011. The American Astronomical Society. All rights reserved.
Citation: Martha P. Haynes et al. 2011 AJ 142 170. DOI: 10.1088/0004-6256/142/5/170

1538-3881/142/5/170

ABSTRACT

We present a current catalog of 21 cm H i line sources extracted from the Arecibo Legacy Fast Arecibo L-band Feed Array (ALFALFA) survey over ∼2800 deg² of sky: the α.40 catalog. Covering 40% of the final survey area, the α.40 catalog contains 15,855 sources in the regions 07h30m < R.A. < 16h30m, +04° < decl. < +16°, and +24° < decl. < +28° and 22h < R.A. < 03h, +14° < decl. < +16°, and +24° < decl. < +32°. Of those, 15,041 are certainly extragalactic, yielding a source density of 5.3 galaxies per deg², a factor of 29 improvement over the catalog extracted from the H i Parkes All-Sky Survey. In addition to the source centroid positions, H i line flux densities, recessional velocities, and line widths, the catalog includes the coordinates of the most probable optical counterpart of each H i line detection, and a separate compilation provides a cross-match to identifications given in the photometric and spectroscopic catalogs associated with the Sloan Digital Sky Survey Data Release 7. Fewer than 2% of the extragalactic H i line sources cannot be identified with a feasible optical counterpart; some of those may be rare OH megamasers at 0.16 < z < 0.25. A detailed analysis is presented of the completeness, width-dependent sensitivity function, and bias inherent in the α.40 catalog. The impact of survey selection, distance errors, current volume coverage, and local large-scale structure on the derivation of the H i mass function is assessed. While α.40 does not yet provide a completely representative sampling of cosmological volume, derivations of the H i mass function using future data releases from ALFALFA will further improve both statistical and systematic uncertainties.


1. INTRODUCTION

The evolution of baryons within their dark matter halos and the morphologies of the resulting systems depend on the merger and accretion history of the parent halos. Major efforts of galaxy evolution studies today focus on how galaxies acquire the gas which fuels their star formation and what processes drive the distinctions between the red sequence and the blue cloud. Still, our view of the extragalactic universe is only as complete as our methods for cataloging the galaxies that populate it. While the public wide-area optical/IR and associated spectroscopic surveys are good at detecting luminous ellipticals, bright spirals, and bursting or active galaxies, they are substantially less complete in tracing the low surface brightness, dwarf, and gas-rich galaxy populations that actually dominate the local population. Each catalog derived from an individual survey has its own built-in limitations and biases which affect our ability to construct a true census of the present-day universe.

Because of its relatively simple physics, the H i line provides a useful tracer of the cool gas mass and of the star formation potential in nearby galaxies and probes the very population of modest luminosity, gas-rich objects which are often underrepresented in surveys selected by optical/IR properties. While it is clear that most stars form out of molecular rather than atomic hydrogen, the molecular clouds themselves develop through the collapse of overdensities in the more diffuse, neutral medium. Thus, while the connection of H i to star formation is indirect on small scales, the global H i content serves as a tracer of relative star formation potential. However, at present, H i line measurements yield H i masses $M_{\rm H\,I}$ for far fewer galaxies than those for which stellar masses M* are available from optical/IR wide-area surveys. In fact, only now are H i surveys adequate in terms of volume sensitivity to sample a cosmologically significant volume (Martin et al. 2010).

After the pioneering results delivered by small-scale surveys such as the Arecibo H i Strip Survey (Zwaan et al. 1997) and the Arecibo Dual Beam Survey (ADBS; Rosenberg & Schneider 2002), the advent of multi-feed array receivers on large single-dish telescopes made possible wide-area 21 cm H i line surveys, such as the H i Parkes All-Sky Survey (HIPASS; Barnes et al. 2001; Meyer et al. 2004; Wong et al. 2006) and the companion H i Jodrell Bank All-Sky Survey (Lang et al. 2003). While covering a large fraction of the sky, these surveys failed to sample a cosmologically fair volume because their mean depth was too shallow, typically <40 Mpc, and they were limited in both angular and spectral resolution and in sensitivity. As a result, HIPASS sampled only sparsely both the most H i-rich (but rare) objects and the lowest halo mass systems (detectable only if very nearby and with very narrow H i line widths) and, because of the large Parkes antenna beam (15.5'), suffered from confusion in the identification of optical counterparts (OCs).

The advent of a similar seven-feed array at Arecibo ("ALFA," the Arecibo L-band Feed Array) has enabled a second-generation wide-area extragalactic H i line survey, ALFALFA, the Arecibo Legacy Fast ALFA survey (Giovanelli et al. 2005a, 2005b; Giovanelli 2007; Haynes 2007). Initiated in 2005 February, survey observations are now more than 90% complete. In this paper, we present the catalog of H i detections covering about 40% of the planned survey sky area, referred to hereafter as the α.40 catalog. Both by design and because of improvements made possible by the accumulation and analysis of more survey data, the catalog presented here both extends and supersedes earlier ones presented by Giovanelli et al. (2007), Saintonge et al. (2008), Kent et al. (2008), Martin et al. (2009), and Stierwalt et al. (2009). In addition, the ALFALFA data release presented here includes, where applicable, a cross-reference to the optical survey data set corresponding to Data Release 7 (DR7) of the Sloan Digital Sky Survey (SDSS; Abazajian et al. 2009).

The availability now of a large body of ALFALFA data, constituting 40% of the expected final survey, allows us to undertake an examination of the characteristics of its catalog of H i sources. Martin et al. (2010) and Toribio et al. (2011a) have presented earlier considerations of survey characteristics for subsets of the α.40 catalog specifically in the context of using the ALFALFA survey to derive the H i mass function (HIMF) and to establish a standard of normal H i content for galaxies in low-density environments, respectively. Here, we examine the full α.40 catalog, discuss its identification of OCs, and compare parameters derived from its measurements with those available in the previous compilation of targeted H i line observations presented by Springob et al. (2005b). We also present a more detailed look at the completeness of α.40 and how H i source catalog limitations in general can affect measurements of the HIMF.

This paper is organized as follows. In Section 2, we discuss the observational strategy, sky coverage, and data processing associated with the production of the ALFALFA data set and its final data products. Section 3 presents the α.40 catalog of H i sources. The identification of the OCs of the H i sources is discussed in Section 4. In that section, we present the cross-match of the α.40 catalog to the SDSS DR7 database and discuss those circumstances under which the ALFALFA detection is not associated with an OC. A comparison of the H i line parameters derived from the ALFALFA survey with those extracted from the large targeted H i data set presented in Springob et al. (2005b) is used in Section 5 to validate the photometric and spectral calibration underlying the ALFALFA source parameters. An analysis of the survey completeness and reliability is presented in Section 6 followed in Section 7 with a discussion of how the α.40 survey characteristics impact its cosmological applications, in particular, the derivation of the HIMF. A brief summary of the main points of this paper is given in Section 8.

2. DATA

The ALFALFA observing strategy has been discussed in detail in Giovanelli et al. (2005a) and B. R. Kent & R. Giovanelli (2011, in preparation). Of particular note to this data release, observations during a given observing session use the ALFA seven-beam receiver parked on the meridian with data acquired in "almost fixed" drift-scan mode; minor motion of the telescope is permitted so that the position of the central beam tracks in constant J2000 declination. With the feed arm positioned along the meridian at azimuths near 180° (for declinations north of the Arecibo zenith at decl. = 18°21') or 360° (for declinations south of zenith), the feed array is rotated by 19° so that the seven beams sweep out tracks equally spaced in declination by about 2.1'. In nearly all circumstances, a given observing run is dedicated to a single declination track. The two-dimensional (time versus frequency) drift-scan data sets are converted from FITS to IDL format and run through an initial bandpass calibration and subtraction, normally within 24 hr of acquisition.

In contrast to traditional total power, position-switched pointed observations, a drift-scan survey (of which ALFALFA is certainly not the first example) collects spectra continuously (almost) without moving the telescope. In the case of the ALFALFA survey, the sampling rate is 1 Hz, i.e., a spectrum of 4096 spectral channels (a "record") is recorded every second for each polarization of every beam of the feed array. The slowly changing characteristics of the bandpass with time can thus be monitored effectively. The ALFALFA pipeline does so by separately monitoring the behavior of each spectral channel across the time domain, through a robust, low-order polynomial fit (which skips over sources), outside of the spectral region dominated by Galactic emission. For each 600 record unit (a 10 minute drift "scan"), we thus obtain a two-dimensional map of the bandpass which can be "subtracted" from each spectral record. Such "sky subtraction" is thus conceptually similar to that of the traditional position-switching mode, although the duration of the "off" is much longer than that of the "on" source, so that the noise contributed by the "off" is negligible; this gains $\sqrt{2}$ in sensitivity with respect to standard position-switching observations, in which "on" and "off" contribute equal noise. During the same processing step, continuum subtraction is also performed, and a separate continuum map is recorded.
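The per-channel bandpass estimate described above can be sketched in a few lines. This is only an illustration, not the ALFALFA pipeline: the function name `estimate_bandpass`, the polynomial order, and the clipping threshold are assumptions, and iterative sigma clipping is used here as a stand-in for whatever robust estimator the pipeline actually applies.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def estimate_bandpass(scan, order=2, clip_sigma=3.0, n_iter=5):
    """Model the slowly varying bandpass of one drift scan.

    scan: array of shape (n_records, n_channels), e.g. 600 one-second records.
    Each channel is fit independently along the time axis with a low-order
    polynomial; records deviating by more than clip_sigma (e.g. a source
    drifting through the beam) are excluded and the fit is repeated.
    All parameter values are illustrative assumptions.
    """
    n_rec, n_chan = scan.shape
    t = np.linspace(-1.0, 1.0, n_rec)  # scaled time axis for a stable fit
    model = np.empty((n_rec, n_chan), dtype=float)
    for ch in range(n_chan):
        y = scan[:, ch].astype(float)
        keep = np.ones(n_rec, dtype=bool)
        for _ in range(n_iter):
            coeffs = P.polyfit(t[keep], y[keep], order)
            fit = P.polyval(t, coeffs)
            resid = y - fit
            # Robust scatter estimate from the median absolute deviation.
            mad = np.median(np.abs(resid[keep] - np.median(resid[keep])))
            sigma = 1.4826 * mad
            if sigma == 0:
                break
            new_keep = np.abs(resid) < clip_sigma * sigma
            if np.array_equal(new_keep, keep):
                break
            keep = new_keep
        model[:, ch] = fit
    return model


# Synthetic 10 minute scan: drifting bandpass, noise, and one source
# passing through channel 3 around the middle of the scan.
rng = np.random.default_rng(0)
records = np.arange(600)
scan = 10.0 + 0.002 * records[:, None] + rng.normal(0.0, 0.05, (600, 8))
scan[280:320, 3] += 1.0
residual = scan - estimate_bandpass(scan)
```

In this toy case the source survives in `residual` because the records it occupies are clipped out of the fit, which is the sense in which the fit "skips over sources".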

For spectral channels affected by Galactic H i emission, such "sky subtraction" is not an option, and the bandpass subtraction cannot be applied in the same manner as for spectral channels away from the Galactic signal. In this case, the spectral shape of the bandpass across the Galactic emission region is adopted as a linear interpolation between the two Galactic emission-free sides of the spectrum. Thus, the flux calibration of Galactic features processed by the standard ALFALFA pipeline is not accurate.
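A minimal sketch of the interpolation across the Galactic emission region follows. The function name, the channel indices, and the number of edge channels averaged on each side (`n_edge`) are hypothetical choices made for illustration; the text specifies only that the bandpass is linearly interpolated between the two emission-free sides.

```python
import numpy as np


def interpolate_bandpass_across_gap(bandpass, gap_start, gap_stop, n_edge=10):
    """Replace bandpass[gap_start:gap_stop] by a straight line.

    The line joins the mean bandpass level in the n_edge channels just below
    the gap to the mean level in the n_edge channels just above it.
    n_edge is an illustrative assumption, not a pipeline parameter.
    """
    bp = np.asarray(bandpass, dtype=float).copy()
    if gap_start <= 0 or gap_stop >= bp.size or gap_start >= gap_stop:
        raise ValueError("gap must lie strictly inside the spectrum")
    chans = np.arange(bp.size)
    left = slice(max(gap_start - n_edge, 0), gap_start)
    right = slice(gap_stop, min(gap_stop + n_edge, bp.size))
    x_anchor = [chans[left].mean(), chans[right].mean()]
    y_anchor = [bp[left].mean(), bp[right].mean()]
    gap = chans[gap_start:gap_stop]
    bp[gap] = np.interp(gap, x_anchor, y_anchor)
    return bp


# Toy spectrum: a linear bandpass with "Galactic" emission in channels 90-109.
channel = np.arange(200)
true_bandpass = 1.0 + 0.01 * channel
observed = true_bandpass.copy()
observed[90:110] += 5.0
recovered = interpolate_bandpass_across_gap(observed, 90, 110)
```

Because the interpolated bandpass ignores any real structure under the Galactic line, fluxes measured there inherit that error, which is why the calibration of Galactic features is described as inaccurate.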

Each two-dimensional bandpass-subtracted data set for each beam and each polarization is examined interactively and flagged for radio frequency interference (RFI); regions characterized by lowered quality (due to standing waves, gain instabilities, etc.) are assigned a lower weight. While this step (known as "flagbb") is laborious, the fact that the continuum information is retained and the RFI is not median filtered away enables the further use of the data set to look for H i absorption, for the derivation of upper limits at arbitrary positions in three dimensions, and for stacking analysis (Fabello et al. 2011). The flattened and flagged two-dimensional line and continuum maps are archived as Level I data sets.

Once the set of drift scans providing full coverage for a complete strip in declination is flagged in this manner, the set of evenly gridded data cubes is generated. Details of the gridding process are given in B. R. Kent & R. Giovanelli (2011, in preparation) and summarized here. The grids are square in the angular dimension, 2.4° on a side, and evenly sampled at 1' spacing. Their center positions on the sky are spaced 8 minutes apart in R.A. and centered on odd integer declinations; the spatial dimensions of a grid are 144 × 144 pixels. For convenient access using modest data processors, each spatial grid is split into four, partially overlapping subgrids, each covering 1024 frequency channels. The grid generation algorithm also converts the spectral intensities from units of antenna temperature to mJy beam−1 in flux density, correcting for zenith angle variations in the gain of the telescope. A first step in the examination of the grids performs an astrometric fit to the continuum sources within them; this fit is then used to subtract off the residual telescope pointing errors (Giovanelli et al. 2007; Kent et al. 2008; B. R. Kent & R. Giovanelli 2011, in preparation). Grids are then flat-fielded and rebaselined in both the angular and spectral dimensions to improve their quality by accounting for variations in gain, calibration, and other systematic blemishes. "Flat fielding" here corresponds to the process by which pixel-to-pixel variations within each channel map, caused mainly by continuum fluctuations, are accounted for. For spectral channels away from Galactic emission, extragalactic H i sources are typically small in comparison with the angular size of ALFALFA data cubes ("grids" of 2.4° × 2.4°). Large-scale variations in the continuum level, which may not have been effectively removed by the bandpass subtraction procedure, can be identified by robustly fitting a two-dimensional surface (in the angular domain) to the channel map and then removed by subtracting that surface. In the absence of very strong continuum sources, this correction is generally small and does not affect noise statistics in any significant way.
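The angular flat fielding of a single channel map can be illustrated as follows. This is a sketch under stated assumptions: the surface is taken to be a low-order two-dimensional polynomial, robustness is obtained by sigma clipping, and the function name and defaults are hypothetical; the actual form of the surface used by the survey is not specified in the text.

```python
import numpy as np


def flatfield_channel_map(channel_map, order=1, clip_sigma=3.0, n_iter=3):
    """Subtract a robustly fitted 2D polynomial surface from one channel map.

    channel_map: 2D array (e.g. 144 x 144 pixels at 1' spacing).
    Pixels deviating strongly from the current fit (compact sources) are
    excluded before refitting. Order and thresholds are assumptions.
    """
    ny, nx = channel_map.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    x = (xx - nx / 2.0) / nx  # scaled coordinates keep the fit well conditioned
    y = (yy - ny / 2.0) / ny
    terms = [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]
    design = np.stack([term.ravel() for term in terms], axis=1)
    z = channel_map.ravel().astype(float)
    finite = np.isfinite(z)
    keep = finite.copy()
    surface = np.zeros_like(z)
    for _ in range(n_iter):
        coeffs, *_ = np.linalg.lstsq(design[keep], z[keep], rcond=None)
        surface = design @ coeffs
        resid = z - surface
        mad = np.median(np.abs(resid[keep] - np.median(resid[keep])))
        sigma = 1.4826 * mad
        if sigma == 0:
            break
        keep = finite & (np.abs(resid) < clip_sigma * sigma)
    return channel_map - surface.reshape(ny, nx)


# Toy channel map: a continuum gradient, noise, and one compact source.
rng = np.random.default_rng(1)
rows, cols = np.mgrid[0:144, 0:144]
source = 5.0 * np.exp(-((rows - 72) ** 2 + (cols - 72) ** 2) / (2 * 2.0**2))
raw_map = 0.02 * cols + rng.normal(0.0, 0.1, (144, 144)) + source
flat_map = flatfield_channel_map(raw_map)
```

Since a compact source occupies only a small fraction of the pixels, clipping it out lets the surface follow the large-scale gradient without absorbing source flux.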

After the angular flat fielding is performed, residual, localized spectral baseline features are also removed by subtracting low-order polynomial fits to the signal-free portions of the spectral domain around emission features. These arise, for example, from standing waves produced by multiple reflections of continuum source emission within the optical path.

Signal extraction is applied following Saintonge (2007a), and once a catalog of candidate detections has been obtained, the grid is interactively examined, the global profiles are extracted, fluxes are measured, OCs are identified, and remarks are recorded. It should be noted that this interactive process improves the definition of source parameters beyond the model fitting used by the automatic signal extractor; this point, and the resultant reliability and completeness of the catalog, is discussed more fully in Section 7. The final catalog of sources is constructed following a process of culling poorer quality detections where a source is contained in adjacent overlapping grids and running a series of data quality checks.

The catalog presented here supersedes previous ALFALFA data releases for several reasons mainly having to do with (1) the increased size of the available data set which yields better understanding of pointing errors, gain variations, and other instrumental artifacts, (2) improved SDSS coverage since the first catalogs were produced, (3) improvements in the algorithm used to make global profile measurements, and (4) increased contiguous coverage. Some earlier measurements tended to underestimate fluxes for the brightest and more extended sources, a systematic effect for which a correction is now applied (see Section 5 for the comparison of flux density measurements with published values). In most cases, changes to the flux density measurements included in earlier data releases are minor, but the current catalog is intended to replace the earlier ones entirely. It should be noted that further revisions of parameters for sources located near edges of the current grid coverage will come in the future in those cases when a newer grid in an adjacent strip better encompasses the source or contributes a higher quality data set. By its nature as a cumulative drift-scan survey, the harvest of ALFALFA will both grow and improve over time.

The full ALFALFA survey is intended to cover 7000 deg² of sky in two regions of high Galactic latitude within 18° of the Arecibo zenith. The survey will cover all declinations in the range 0° < decl. < +36°. Since all observations are conducted during nighttime hours, the two regions are referred to as "spring" and "fall." The "spring" region extends over 07h30m < R.A. < 16h30m, while the "fall" ALFALFA region covers 22h < R.A. < 03h. Some sources are found outside the stated R.A. boundaries where the actual drift-scan observations extended beyond the nominal map area. Some priority has been given to completing areas within the SDSS spectroscopic survey footprint, and the pace of observing has been dictated by the availability of telescope time. Figure 1 illustrates the area of the sky contained in the α.40 catalog presented here: regions 07h30m < R.A. < 16h30m, +04° < decl. < +16°, and +24° < decl. < +28° (the "spring" region) and 22h < R.A. < 03h, +14° < decl. < +16°, and +24° < decl. < +32° (the "fall" region).
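As a quick check on the nominal α.40 boundaries quoted above, the following sketch tests whether a position falls inside the four rectangular regions and sums their solid angles. The names `ALPHA40_REGIONS`, `in_alpha40_footprint`, and `region_area_deg2` are illustrative; the boundaries are treated as exact rectangles in R.A. and decl., which ignores the sources found slightly outside the nominal R.A. limits.

```python
import math

# (ra_min_h, ra_max_h, dec_min_deg, dec_max_deg) taken from the text.
# A region with ra_min > ra_max wraps through 0h.
ALPHA40_REGIONS = [
    (7.5, 16.5, 4.0, 16.0),    # spring
    (7.5, 16.5, 24.0, 28.0),   # spring
    (22.0, 3.0, 14.0, 16.0),   # fall
    (22.0, 3.0, 24.0, 32.0),   # fall
]


def _ra_in_range(ra_hours, ra_min, ra_max):
    """True if ra_hours lies in (ra_min, ra_max), allowing wrap through 0h."""
    ra_hours %= 24.0
    if ra_min <= ra_max:
        return ra_min < ra_hours < ra_max
    return ra_hours > ra_min or ra_hours < ra_max


def in_alpha40_footprint(ra_hours, dec_deg):
    """True if the position lies inside any nominal alpha.40 rectangle."""
    return any(
        _ra_in_range(ra_hours, ra_min, ra_max) and dec_min < dec_deg < dec_max
        for ra_min, ra_max, dec_min, dec_max in ALPHA40_REGIONS
    )


def region_area_deg2(ra_min, ra_max, dec_min, dec_max):
    """Solid angle of an R.A./decl. rectangle in square degrees."""
    delta_ra_deg = ((ra_max - ra_min) % 24.0) * 15.0
    delta_sin = math.sin(math.radians(dec_max)) - math.sin(math.radians(dec_min))
    return delta_ra_deg * math.degrees(delta_sin)


total_area = sum(region_area_deg2(*region) for region in ALPHA40_REGIONS)
```

The summed area of the idealized rectangles comes out somewhat below, but close to, the ∼2800 deg² quoted for the catalog; the small difference is expected given the approximate boundaries.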


Figure 1. Sky distribution, in equatorial coordinates on an Aitoff grid projection, of the current α.40 catalog detections. Upper panel: the "fall ALFALFA sky" (anti-Virgo direction) region; lower panel: the "spring ALFALFA sky" (Virgo direction) region. Blue, red, and green symbols identify the Code 1 (best quality), 2 (priors), and 9 (HVC) sources, respectively. The green diagonal lines in each panel trace the supergalactic plane and SGL ± 10°.


3. CATALOG PRESENTATION

We present in Table 1 the measured parameters for 15,855 detections, 15,041 of which are certainly associated with extragalactic objects. An additional 814 are detected at velocities which suggest they may not be extragalactic but are more likely to be Galactic high velocity cloud (HVC) features. The contents of Table 1 are as follows.

  • 1.  
    Column 1: entry number in the Arecibo General Catalog (AGC), a private database of extragalactic objects maintained by M.P.H. and R.G. The AGC entry normally corresponds both to the OC and the H i line source except in the cases of HVCs and other H i sources which cannot be associated with an optical object with any high degree of probability. In those cases, the AGC number corresponds only to the H i detection. An AGC number is assigned to all ALFALFA sources; it is intended to be used as the basic cross-reference for identifying and tracking ALFALFA sources as new data acquired in overlapping regions supersede older results. Note that in previous ALFALFA catalogs, an index number was used, a practice no longer employed; a cross-reference to these older identifications is provided in Table 2. The designation of an ALFALFA source referring only to its H i emission (without regard to its OC) should be given using the prefix "H i" followed by the position of the H i centroid as given in Column 3 of Table 1.
  • 2.  
    Column 2: common name of the associated OC, where applicable. Further discussion of the process of assigning OCs is presented in Section 4.1.
  • 3.  
    Column 3: centroid (J2000) of the H i line source, in hhmmss.sSddmmss, after correction for systematic telescope pointing errors, which are on the order of 20'' and depend on declination. The systematic pointing corrections are derived from an astrometric solution for the NRAO Very Large Array Sky Survey (NVSS) radio continuum sources (Condon et al. 1998) found in the grids. As discussed in Giovanelli et al. (2007) and Kent et al. (2008), the assessment of centroiding errors is complicated by the nature of three-dimensional grid construction from the two-dimensional drift scans, which are often acquired in widely separated observing runs, and, for resolved/confused sources, by unknown source structure. As those authors suggest, the best assessment of H i centroid error is accomplished by comparison of the H i centroids with the positions of the adopted OCs. An analysis of the positional offsets of the H i centroids from the positions of the OCs yields a relation for the median error in the H i position $\mathrm{err}_{\rm med,\,H\,I}$ as a function of the signal-to-noise ratio (S/N; see Column 8), for the α.40 sample:
    Equation (1)
    On average, the positional offset is about 18'', but it can, in rare instances, exceed 1'; those cases are noted in the comments included in Table 2.
  • 4.  
    Column 4: centroid (J2000) of the most probable OC, in hhmmss.sSddmmss, associated with the H i line source, where applicable. The OC has been identified and its likelihood assessed interactively using tools provided through the SkyView Web site or the SDSS Explore Tool, in addition to the NASA/IPAC Extragalactic Database (NED) and the AGC, applying judgment-based criteria including redshift (when known), size, morphology, and optical color. The optical positions are normally estimated to be accurate to 3'' or better, but the errors may be larger in exceptional cases (very low surface brightness or peculiar, disturbed objects). The process of assignment of the most probable OC is discussed in Section 4.1. It should be noted that only one OC is assigned per H i source although in reality confusion within the telescope beam is a possibility. Suspected cases of confusion or ambiguous assignment of the OC are noted in the comments included in Table 2.
  • 5.  
    Column 5: heliocentric velocity of the H i source, cz in km s−1, measured as the midpoint between the channels at which the flux density drops to 50% of each of the two peaks (or of one, if only one is present) at each side of the spectral feature; see also Springob et al. (2005b). The error on cz to be adopted is half the error on the width, tabulated in Column 6.
  • 6.  
    Column 6: velocity width of the H i line profile, W50 in km s−1, measured at the 50% level of each of the two peaks, as described in Column 5 and corrected for instrumental broadening. No corrections due to turbulent motions, disk inclination, or cosmological effects are applied. The estimated error on the velocity width, εw, in km s−1, follows in parentheses. This error is the sum in quadrature of two components: a statistical error and a systematic error associated with the subjective guess with which the person performing parameter extraction estimates the spectral boundaries of the feature, flagged during the interactive assessment of candidate detections. In the majority of cases, the systematic error is significantly smaller than the statistical error; thus, the former is ignored.
  • 7.  
    Column 7: integrated H i line flux density of the source, S21, in Jy km s−1. This value corresponds to the total H i line flux measured on the integrated spectrum obtained by spatially integrating the source image over a solid angle of at least 7' × 7' and dividing by the sum of the survey beam values over the same set of image pixels (see Shostak & Allen 1980; B. R. Kent & R. Giovanelli 2011, in preparation). Estimates of integrated flux densities for very extended sources with significant angular asymmetries can be misestimated by our algorithm, which is optimized for measuring sources comparable with or smaller than the survey beam. A special catalog with parameters of extended sources will be produced after completion of the survey. The issue is especially severe for extended HVCs that exceed in size that of the ALFALFA data cubes. In these specific cases, only the flux in the knots of emission is measured. In general, the HVCs have been cataloged here applying the same kind of S/N selection threshold as for the extragalactic signals, with the exception of the southern extension of Wright's cloud, where, in addition to a bulk measurement of the portion of the cloud lying within this region, a selection of the brightest knots was measured to trace the structure. See Column 12 and the corresponding comments for individual objects. The estimated uncertainty of the integrated flux density, in Jy km s−1, is given in parentheses.
  • 8.  
    Column 8: S/N of the detection, estimated as
    $${\rm S/N} = \left(\frac{1000\,S_{21}}{W_{50}}\right)\frac{w_{\rm smo}^{1/2}}{\sigma_{\rm rms}} \qquad (2)$$
    where S21 is the integrated flux density in Jy km s−1, as listed in Column 7; the ratio 1000S21/W50 is the mean flux density across the feature in mJy; wsmo is either W50/(2 × 10) for W50 < 400 km s−1 or 400/(2 × 10) = 20 for W50 ⩾ 400 km s−1 (wsmo is a smoothing width expressed as the number of spectral resolution bins of 10 km s−1 bridging half of the signal width; the raw spectra are sampled at 24.4 kHz ∼ 5.5 km s−1 at z ∼ 0); and σrms is the rms noise figure across the spectrum measured in mJy at 10 km s−1 resolution, as tabulated in Column 9.
  • 9.  
    Column 9: noise figure of the spatially integrated spectral profile, σrms, in mJy. The noise figure as tabulated is the rms as measured over the signal- and RFI-free portions of the spectrum, after Hanning smoothing to a spectral resolution of 10 km s−1.
  • 10.  
    Column 10: adopted distance in Mpc, DMpc. For objects with czcmb > 6000 km s−1, the distance is simply estimated as czcmb/H, where czcmb is the recessional velocity measured in the cosmic microwave background (CMB) reference frame (Lineweaver et al. 1996) and H is the Hubble constant, adopted to be 70 km s−1 Mpc−1. For objects with czcmb < 6000 km s−1, we use the local universe peculiar velocity model of Masters (2005), which is based on data from the SFI++ catalog of galaxies (Springob et al. 2009) and results from analysis of the peculiar motions of galaxies, groups, and clusters, using a combination of primary distances from the literature and secondary distances from the Tully–Fisher relation. The resulting model includes two attractors, with infall onto the Virgo Cluster and the Hydra–Centaurus Supercluster, as well as a quadrupole and a dipole component. The transition from one distance estimation method to the other is set at czcmb = 6000 km s−1 because the uncertainties in the two methods become comparable at that distance. Primary distances from the published literature are adopted where available. When the galaxy is a known member of a group (Springob et al. 2009), the group systemic recessional velocity czcmb is used to determine the distance estimate according to the general prescription just described.
  • 11.  
    Column 11: logarithm of the H i mass $M_{\rm H\,I}$, in solar units, computed via the standard formula $M_{\rm H\,I}=2.356\times 10^5 D_{\rm Mpc}^2 S_{21}$ and assuming the distance given in Column 10. No correction for H i self-absorption has been applied.
  • 12.  
    Column 12: this column contains three relevant coded flags.
    The first code, assigned as an integer value of 1, 2, or 9, refers to the category of the H i detection, defined as follows.
      Code 1 refers to sources whose S/N and general qualities make them reliable detections. These signals exhibit a good match between the two independent polarizations observed by ALFALFA, a spatial extent consistent with the telescope beam (or larger), an RFI-free spectral profile, and an approximate minimum S/N threshold of 6.5 (Saintonge 2007a). These criteria lead to the exclusion of some candidate detections with S/N > 6.5; likewise, some features with S/N slightly below this soft threshold are included, due to optimal overall characteristics of the feature, such as well-defined spatial extent, broad velocity width, and obvious association with an OC. We estimate that the detections with Code 1 in Table 1 are nearly 100% reliable; the completeness and reliability of the α.40 catalog are discussed in Section 7.
      Code 2 refers to sources categorized as "priors." They are sources of low S/N (≲6.5), which would ordinarily not be considered reliable detections by the criteria set for Code 1, but which have been matched with OCs with known optical redshifts coincident (to within their errors) with those measured in the H i line. We include them in our catalog because they are very likely to be real. In general, however, they should not be used in statistical studies which require well-defined completeness limits; this point is further discussed in Section 7.
      Code 9 refers to objects assumed to be HVCs; no estimate of their distances is made.
    Of the 15,855 sources included in this data release, 11,941 are classified as source Code 1, 3100 are Code 2, and 814 are Code 9.
    The second code, assigned as an alphabetic character, refers to a category reflecting the status of the cross-identification of the ALFALFA detection with an entry in the SDSS DR7 database, as judged by the ALFALFA team. This code is used to identify galaxies which lie outside the SDSS DR7 sky footprint or for which there are clearly issues with the identification. It should be noted that this code refers only to the cross-match with SDSS DR7. The cross-reference and basic parameters of the OCs are given in Table 3. This code and its interpretation are as follows.
      I: "identified": the PhotoObjID is set but no other indicative flags have been applied; this code applies whether or not there is an SDSS spectroscopic counterpart.
      O: "outside DR7": the SDSS OC lies outside of the SDSS DR7 footprint and thus no DR7 cross-match can be performed.
      U: "unidentified": no SDSS OC has been identified, but the object lies within the SDSS DR7 footprint.
      N: "no DR7 photometric ID": no SDSS DR7 photometric source has been identified; assignment of this code can result from proximity to a bright star, satellite trails, incomplete coverage, or other reasons.
      M: "missing": the OC is in the SDSS DR7 footprint region but neither a PhotoObjID nor a SpectObjID is returned by queries of the SDSS DR7 database.
      P: "photometry suspect": the SDSS DR7 photometry for the associated PhotoObjID is suspect for some reason as judged by the ALFALFA team. Assignment of this code often is associated with the identification of multiple near-equal-flux photometric objects within an obviously single OC. Such cases apply often to very large optical objects or to faint, low surface brightness and/or patchy systems. The optical photometry associated with the SDSS "parent" object may be adequate but caution should be exercised.
      D: "displaced SDSS object": the SDSS Photo/SpectID is displaced from the optical galaxy center, as identified by the ALFALFA team. The PhotoObjID may be legitimate; often this is the brightest photometric "child." Because of the displacement, the SDSS redshift may not reflect the systemic recessional velocity of the galaxy.
      T: "two SDSS objects": the SDSS PhotoObjID associated with the galaxy center is displaced from the target associated with the SDSS SpectObjID, as judged by the ALFALFA team, i.e., the best PhotoObjID does not coincide with the SpectObjID. Usually, the SpectObjID is an off-center H ii region or other bright knot within the target galaxy.
      S: "superposed SDSS object": the SDSS redshift corresponds to a superposed foreground star or background QSO.
      B: "bad SDSS solution": the SDSS redshift is unreliable or rejected for some unspecified reason.
    The third code, given as an asterisk where applicable, indicates that a comment regarding the H i detection and/or the assignment of the OC is included for this source in Table 2.
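The derived quantities in Columns 8 and 11 can be recomputed from the tabulated measurements, which gives a simple consistency check on any row of Table 1. The sketch below assumes that the S/N takes the form (1000 S21/W50) wsmo^(1/2)/σrms, built from the quantities defined under Column 8, and uses the H i mass formula quoted under Column 11; the function names are illustrative, and the input values are those listed in Table 1 for AGC 331405 and AGC 331061.

```python
import math


def smoothing_width(w50):
    """Number of 10 km/s bins spanning half the line width, capped at 20."""
    return w50 / (2.0 * 10.0) if w50 < 400.0 else 400.0 / (2.0 * 10.0)


def signal_to_noise(s21, w50, sigma_rms):
    """S/N from integrated flux (Jy km/s), W50 (km/s), and rms noise (mJy).

    Assumed form: mean flux density in mJy times sqrt(w_smo), over the rms.
    """
    mean_flux_mjy = 1000.0 * s21 / w50
    return mean_flux_mjy * math.sqrt(smoothing_width(w50)) / sigma_rms


def log_hi_mass(dist_mpc, s21):
    """log10 of the H i mass in solar units: M = 2.356e5 * D^2 * S21."""
    return math.log10(2.356e5 * dist_mpc**2 * s21)


# Table 1 values: (S21, W50, rms, distance) for two Code 1 sources.
agc331405 = dict(s21=2.62, w50=315.0, rms=2.05, dist=143.8)
agc331061 = dict(s21=1.13, w50=260.0, rms=2.40, dist=85.2)

snr_331405 = signal_to_noise(agc331405["s21"], agc331405["w50"], agc331405["rms"])
snr_331061 = signal_to_noise(agc331061["s21"], agc331061["w50"], agc331061["rms"])
logm_331405 = log_hi_mass(agc331405["dist"], agc331405["s21"])
logm_331061 = log_hi_mass(agc331061["dist"], agc331061["s21"])
```

For these two rows the recomputed values agree with the tabulated S/N (16.1 and 6.5) and log H i mass (10.11 and 9.29) to within the rounding of the inputs, which supports the assumed form of the S/N expression.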

Table 1. Properties of H i Detections

   AGC  Name      H i Coords       Opt. Coords            cz  W50 (εw)   S21            S/N    rms    Dist  log MHI  Codes
                  (J2000)          (J2000)          (km s−1)  (km s−1)   (Jy km s−1)         (mJy)   (Mpc)     (M⊙)
   (1)  (2)       (3)              (4)                   (5)  (6)        (7)            (8)    (9)    (10)     (11)  (12)
331061  456-013   000002.5+155220  000002.1+155254      6007  260 (45)   1.13(0.09)     6.5   2.40    85.2     9.29  1 I
331405            000003.3+260059  000003.5+260050     10409  315 (8)    2.62(0.09)    16.1   2.05   143.8    10.11  1 I
102896            000006.8+281207  000006.0+281207     16254  406 (17)   2.37(0.12)    11.2   2.31   227.4    10.46  1 I *
102574            000009.1+280543                       −368  23 (3)     1.29(0.08)    11.2   5.05                   9 U *
102975            000012.3+290137                       −367  23 (3)     2.85(0.07)    26.7   4.69                   9 U *
102571            000017.2+272359  000017.3+272403      4654  104 (3)    2.00(0.06)    19.0   2.29    65.9     9.31  1 I
102976            000019.0+285931                       −365  26 (2)     2.53(0.11)    18.3   5.76                   9 U *
102728            000021.2+310038  000021.4+310119       566  21 (6)     0.31(0.03)     7.5   1.92     9.1     6.78  1 I
102575            000028.0+280845                       −371  33 (7)     0.47(0.03)     8.6   2.11                   9 U *
 12896  478-010   000030.1+261928  000031.4+261931      7653  170 (10)   3.14(0.08)    22.0   2.44   104.5     9.91  1 I *
102729            000032.1+305152  000032.0+305209      4618  53 (6)     0.70(0.04)    10.5   2.02    65.4     8.85  1 I
102576            000035.3+262712                       −430  21 (2)     0.60(0.04)    11.7   2.50                   9 U *
102730            000040.1+315610  000039.5+315618     12631  79 (23)    0.66(0.05)     7.3   2.25   175.8     9.68  1 I
102578            000042.3+263311                       −429  22 (3)     0.67(0.04)    12.8   2.44                   9 U *
101866            000050.1+141612  000047.9+141639     10877  291 (149)  0.79(0.11)     4.1   2.52   150.3     9.62  2 I *
 12901  499-035   000059.5+285431  000058.9+285441      6896  395 (5)    5.03(0.11)    25.2   2.24    93.7    10.02  1 I *
102731  FGC290A   000109.3+305221  000106.4+305247      7366  257 (8)    1.33(0.08)     8.9   2.08   100.5     9.50  1 I
102977            000108.7+284738                       −364  22 (3)     2.03(0.11)    13.8   6.64                   9 U *
102861            000110.1+320425                       −181  22 (1)     7.30(0.06)    55.0   4.69                   9 U *
102732            000114.8+312218  000115.0+312227     12532  292 (5)    1.54(0.09)     9.1   2.20   174.3    10.04  1 I
101869            000127.1+142431  000131.4+142427     12639  183 (16)   1.00(0.09)     6.4   2.57   175.4     9.86  1 I *
102733            000129.8+311418  000130.0+311403     12581  134 (12)   1.03(0.08)     8.6   2.29   175.0     9.87  1 I
 12911  N7806     000131.5+312629  000130.1+312631      4767  231 (23)   1.40(0.08)     9.4   2.19    67.5     9.18  1 I *
331082 433-016 000134.5+150448 000134.0+150454 6368 118 (8) 2.72(0.08) 21.4 2.60 85.9 9.67 1 I
748776   000142.4+135019 000141.3+135033 6337 53 (5) 0.65(0.05) 8.7 2.27 89.9 9.09 1 I

Only a portion of this table is shown here to demonstrate its form and content. Machine-readable and Virtual Observatory (VO) versions of the full table are available.
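The H i masses in Column 11 can be cross-checked against Columns 7 and 10. The sketch below assumes the standard relation for optically thin H i, M(H i)/M⊙ = 2.356 × 10^5 D^2 S21, with D in Mpc and S21 in Jy km s−1; that relation is not stated in this section and is used here as an assumption. The function name `log_hi_mass` is hypothetical.

```python
import math


def log_hi_mass(s21_jy_kms: float, dist_mpc: float) -> float:
    """Return log10 of the H i mass in solar masses.

    Assumes the standard optically thin relation
    M_HI = 2.356e5 * D^2 * S21 (D in Mpc, S21 in Jy km/s).
    """
    if s21_jy_kms <= 0 or dist_mpc <= 0:
        raise ValueError("flux and distance must be positive")
    return math.log10(2.356e5 * dist_mpc ** 2 * s21_jy_kms)


# Rows from the printed portion of Table 1: (AGC, S21, Dist, tabulated log M_HI)
rows = [
    (331061, 1.13, 85.2, 9.29),
    (102728, 0.31, 9.1, 6.78),
]
for agc, s21, dist, tabulated in rows:
    print(agc, round(log_hi_mass(s21, dist), 2), tabulated)
```

For these two rows the computed values agree with the tabulated ones after rounding to two decimals. HVCs (Code 9) have no distance estimate and therefore no mass entry, so they should be skipped rather than passed to this function.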


Table 2. Comments on Individual Sources

AGC Cat.ID. H i Code Comment
102896   1 In region affected by RFI; parameters uncertain; near smaller AGC 102897 (000005.5+281129, unknown cz) at 0.7 arcmin
102574   9 HVC; first of two knots near the top of the grid; see also AGC 102575 at 5.1 arcmin
102975   9 HVC; part of filament that stretches through most of this grid
102976   9 HVC; part of a filament that extends beyond this grid into 0004+29
102575   9 HVC; second of two knots near the top of the grid; see also AGC 102574 at 5.1 arcmin
12896   1 Near AGC 331800 (MCG+04-01-009, 000031.6+261818, cz = 7754) at 1.2 arcmin
102576 2-4 9 Compact HVC; one of two nearby knots (the other is AGC 102578)
102578 2-5 9 Compact HVC; one of two nearby knots (the other is AGC 102576)
101866   2 Ambiguous OC; several near including AGC 103024 (000049.5+141532, unknown cz) at 1.2 arcmin; others may be background
12901   1 Small companion at 0.4 arcmin AGC 103021 (000057.5+285427, unknown cz)
102977   9 HVC; faint south end of a filament that stretches through most of this grid
102861   9 HVC 110.7-29.6 part of nice arc
12911   1 Multiple system NGC 7805/6; UGC 12908 = NGC 7805 group; blend?
101869   1 AGC 101869 (000149.5+142623, cz = 12568) at 5.7 arcmin
102862   9 HVC 110.5-31.0 part of nice arc
102978   9 HVC; part of filament that stretches through most of this grid
102735   1 Optical identification with bluer galaxy in pair; AGC 102831 (000250.0+281725, unknown cz) at 0.3 arcmin
102863   9 HVC 110.8-30.0 part of nice arc
102979   9 HVC; part of filament that stretches through most of this grid
749126   9 HVC 1-6.04-45.19
102864   9 HVC 110.7-30.7 part of nice arc
749127   9 HVC 105.34-47.32
102981   1 OC identified with larger of pair; second is AGC 103015 (000250.0+281725, unknown cz) at 1.6 arcmin
7   1 OC identified with larger of pair; second is AGC 100849 (000306.3+155834, unknown cz) at 1.2 arcmin
100011   2 Poor spatial and spectral definition

Only a portion of this table is shown here to demonstrate its form and content. Machine-readable and Virtual Observatory (VO) versions of the full table are available.


The full content of Table 1 is available in the online version of the journal and will be made available also through our public digital archive site17 and the ALFALFA project data site.18

In addition to the H i emission sources presented in Table 1, it is expected that the ALFALFA spectral cubes will also contain evidence for H i in absorption. Darling et al. (2011) discuss a pilot program which uses an adaptation of the ALFALFA pipeline to search for H i absorption along the line of sight to NVSS sources in a small number of the ALFALFA cubes. The known H i absorber in the interacting system UGC 6081 was recovered. Because the standard ALFALFA reduction is not designed to look for such phenomena, the H i absorption detection is not included in Table 1, and the reader is referred to Darling et al. (2011) for its parameters.

Table 2 contains comments about entries in Table 1 which have been recorded in the course of extracting source parameters and identifying the OCs. The second column contains a cross-reference to the catalog identification used in earlier papers, which is no longer used. We repeat in Column 3 the H i detection code assigned for each source (the first code in Column 12 of Table 1 described above). It should be noted that angular separations given in these comments reference the centroid of the H i source, not the position of the OC. These notes are somewhat heterogeneous in nature, having been incorporated during the process of data reduction by the individual responsible for source extraction. Since the extraction has been performed over a period of several years, the databases available to the person making the comments have evolved; thus, the mention of nearby neighbors is not intended to be complete and should not be used in any derivation of local density. In some cases these notes identify issues with data quality, certainty of the OC or parameter extraction. The presence of a note does not mean necessarily that parameters are less certain than their errors indicate, as we have a tendency to err on the conservative side of casting doubt. They are included here because they provide an additional contribution to the legacy value of the data set.

Subsets of the α.40 catalog have been included in the derivation of the H i mass function (Martin et al. 2010) and the H i width function (Papastergis et al. 2011); both papers include discussion of the sample characteristics, limitations, and biases. Similar to figures shown in those papers, Figure 2 illustrates the distributions of (top to bottom) redshift cz, W50, log S21, log S/N, and log $M_{{\rm H\,\mathsc{i}}}$ for the full ALFALFA α.40 sample presented in Table 1, while Figure 3 shows the corresponding Spaenhauer plot. Further discussion of the impact of survey characteristics on cosmological issues and specifically on the derivation of the H i mass function is given in Section 7.


Figure 2. Histograms of the distributions of redshift cz, W50, log S21, log S/N, and log $M_{{\rm H\,\mathsc{i}}}$ (top to bottom) for the α.40 catalog sample presented in Table 1.


Figure 3. Spaenhauer diagram for the α.40 catalog sample presented in Table 1. The superposed blue (upper) curve traces the HIPASS completeness limit, while the red (lower) curve traces that survey's detection limit. The vertical dashed line indicates the outer limit in distance corresponding to the HIPASS bandpass edge; HIPASS did not sample any volume at larger distances. The vertical overdensity of points evident at 17 Mpc is the Virgo Cluster; the paucity of points at ∼225 Mpc arises because many nights of ALFALFA observations are contaminated by strong RFI generated by the FAA radar at the San Juan airport. A less pronounced gap evident at ∼85 Mpc arises from occasional, much milder contamination from a harmonic of the radar at 1380 MHz and from rare burst events associated with the US Air Force Nuclear Detonation detection system aboard the GPS satellites, which transmits at 1381 MHz.


4. OPTICAL COUNTERPARTS OF ALFALFA SOURCES

The principal aim of ALFALFA is to catalog all gas-bearing extragalactic objects in the local universe. An integral part of understanding this H i census is identifying the stellar counterpart associated with each H i source or, even more importantly, establishing that no such counterpart exists. During the ALFALFA data reduction process, optical images from the Palomar Digital Sky Survey (DSS2) and, where available, the SDSS are interactively examined alongside the ALFALFA H i data set and the most probable OC is identified and recorded. While this assignment may not be correct in individual cases, it provides a first approach to understanding the relationship between the H i source and its stellar counterpart. The notes included in Table 2 record comments on this process made by the ALFALFA team member performing this interactive stage of the data analysis. In this section, we describe the process by which OCs are identified and discuss unresolved issues, provide a cross-reference of sources to the SDSS DR7 database, and summarize general results on the evidence for "optically dark" galaxies.

4.1. Identifying Optical Counterparts

We make use of Virtual Observatory tools embedded in the IDL-based ALFALFA reduction package (B. R. Kent & R. Giovanelli 2011, in preparation) to access several public imaging and catalog databases at several stages in the data reduction process. During the process of H i parameter measurement (the routine called "galflux"), both DSS2(B) and SDSS images are examined to identify interactively the most probable OC of each ALFALFA source. Because of their generally superior quality and ancillary information, preference is given to the SDSS images where they are available. Entries in our internal AGC database as well as those listed in the NASA/IPAC Extragalactic Database (NED) can be retrieved and examined. The ALFALFA team member processing each source uses the available public information on color, morphology, redshift, and separation from the H i centroid, in combination with his/her scientific judgment, in assigning an OC. It should be noted, however, that because the cataloged data presented here were reduced over a three-year time period, not all current information/data were available at the time this assignment was made. Consistency checks are made later to look for redshift discrepancies or cases of large positional offset.

With that caveat in mind, Figure 4 shows several examples which illustrate the process of identification of OCs and the uncertainties inherent in it. Each panel shows a 3' × 3' SDSS g-band image centered on the H i centroid. The superposed circle marks the OC identified in Table 1; the size of each circle is arbitrarily chosen for the best illustration of the target. The panels are intended to illustrate some of the challenges of assigning the OC by highlighting four specific cases as follows.

  • 1.  
    The upper left image is centered on the best-fit position of the H i source detected at HI095452.2+142907, a weak source of S/N = 7.3. The corresponding OC AGC 193821 is identified as the small galaxy SDSS J095453.79+142910.0 22'' from the H i centroid and partly contaminated by the diffraction spike of the bright foreground star; the galaxy is more evident in the DSS2(B) image. There is no further optical information.
  • 2.  
    The upper right image is centered on the position of HI123120.9+050402, a marginal ALFALFA detection with an S/N = 4.9. The OC AGC 220720 is identified as VCC 1347 = CGCG 042-143 = J123117.00+050429.3, a small spiral galaxy offset from the H i centroid by about 64''; the large offset is not surprising given the low S/N of the H i detection. The SDSS optical redshift is 9830 ± 30 km s−1, slightly off the H i cz of 9873 ± 4 km s−1. Because of the low S/N of the H i emission profile but the coincidence with an optical galaxy with an adequately close redshift match, the optical identification is made and the source is designated as a "prior" and assigned an ALFALFA detection category code of 2.
  • 3.  
    The lower left image is centered on the position of HI152240.3+055017, a very narrow (W50 = 24 km s−1) feature at cz = 1796 km s−1. The OC is identified as a dwarf galaxy AGC 258471 better evident in the DSS2(B) image at J152238.7+054945; the SDSS pipeline identifies at least five photometric objects within the low surface brightness emission associated with the dwarf so that its magnitude is poorly measured. The offset of the H i centroid from the optical object is about 38''.
  • 4.  
    The lower right image is centered on the position of HI160743.9+272201, a source of S/N = 10.9. As evident in the SDSS g-band images, there are several objects in the field, including a close pair associated with SDSS spectroscopic target J160744.75+272140.2 = KUG 1605+275 NED01 with a redshift from the SDSS of 23676 ± 31 km s−1. The redshift is too high to be associated with the ALFALFA H i source; several other galaxies in the vicinity of this system have similar redshifts. Careful examination of the SDSS image shows a second object, which is not identified in the SDSS photometric database and which appears to be partly overlapping with J160744.75+272140.2 but in its foreground at J160743.9+272201. We identify the H i source with this foreground blue galaxy which becomes AGC 749361.

Figure 4. Illustrative examples of issues related to the identification of the OCs of ALFALFA H i sources. Each panel is a 3' × 3' frame extracted from the Montage data product of SDSS g-band images centered on the position of the ALFALFA H i source. In each frame, the superposed circle, of arbitrary size, identifies the adopted OC. See the text for details of individual cases.


We emphasize again that, because of the ALFALFA centroid position uncertainty and its relatively large beam size, assignment of the most probable OC is a reasonable but not a perfect process. Furthermore, it will continue to be a dynamic one, striving for improvement when new data provide improved detail. For example, the current data set does not yet include a systematic incorporation of data from the SDSS III survey or its DR8.
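The H i-to-OC offsets quoted in the four cases above can be checked directly from the coordinate strings. The following is a minimal sketch, assuming positions written in the compact hhmmss.s±ddmmss form used in Table 1 (with any "HI" or "J" prefix removed); the helper names `parse_coord` and `separation_arcsec` are hypothetical, and the haversine formula is used simply as a convenient, numerically stable choice for small angles.

```python
import math
import re

# Compact J2000 position: hhmmss(.s) sign ddmmss(.s); accepts ASCII or Unicode minus.
_COORD = re.compile(
    r"^(\d{2})(\d{2})(\d{2}(?:\.\d+)?)([+\-−])(\d{2})(\d{2})(\d{2}(?:\.\d+)?)$"
)


def parse_coord(text: str) -> tuple[float, float]:
    """Return (RA, Dec) in degrees from a string like '123120.9+050402'."""
    match = _COORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {text!r}")
    hh, mm, ss, sign, dd, dm, ds = match.groups()
    ra_deg = 15.0 * (int(hh) + int(mm) / 60.0 + float(ss) / 3600.0)
    dec_deg = int(dd) + int(dm) / 60.0 + float(ds) / 3600.0
    if sign != "+":
        dec_deg = -dec_deg
    return ra_deg, dec_deg


def separation_arcsec(coord_a: str, coord_b: str) -> float:
    """Great-circle separation in arcseconds (haversine formula)."""
    ra1, dec1 = (math.radians(v) for v in parse_coord(coord_a))
    ra2, dec2 = (math.radians(v) for v in parse_coord(coord_b))
    hav = (
        math.sin((dec2 - dec1) / 2.0) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2
    )
    return math.degrees(2.0 * math.asin(math.sqrt(hav))) * 3600.0


# Case 2 above: H i centroid versus the adopted OC (VCC 1347).
print(round(separation_arcsec("123120.9+050402", "123117.00+050429.3"), 1))
```

For case 2 this gives a separation of roughly 64'', in line with the offset quoted in the text. Because the H i centroids are listed only to 0.1 s in right ascension and 1'' in declination, recomputed offsets can differ from the quoted ones by an arcsecond or two.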

4.2. Cross-reference with the SDSS

Although not available at the time of earlier ALFALFA data releases, the completion of the SDSS legacy survey has afforded us the opportunity to cross-reference the ALFALFA and SDSS data sets where the two share footprints. As a new feature of this and future ALFALFA catalog releases, here we provide in Table 3 the cross-identifications of ALFALFA sources with the photometric and spectroscopic catalogs associated with the SDSS, in this instance, with the data release DR7 (Abazajian et al. 2009). Entries in Table 3 are as follows.

  • 1.  
    Column 1: the source AGC number, identical to Column 1 of Table 1.
  • 2.  
    Column 2: the H i detection category code, identical to the first (integer) code in Column 12 of Table 1.
  • 3.  
    Column 3: the SDSS cross-reference category, identical to the second code in Column 12 of Table 1.
  • 4.  
    Column 4: the SDSS DR7 photometric catalog object identification number (PhotoObjID), where applicable.
  • 5.  
    Column 5: the SDSS DR7 spectroscopic catalog object identification number (SpectObjID), where applicable.
  • 6.  
    Column 6: the r-band model magnitude corresponding to the photometric object or its SDSS parent.
  • 7.  
    Column 7: the (u − r) color associated with the OC from the SDSS as reported in the DR7. This value is used in Figure 7, and in order to allow direct comparison with Figure 9 of Baldry et al. (2004), it has not been corrected for extinction or redshift.
  • 8.  
    Column 8: the redshift corresponding to the SDSS DR7 spectroscopic catalog object, extracted from the SDSS DR7 database, where applicable.
  • 9.  
    Column 9: the error on the redshift given in Column 8, extracted from the SDSS DR7 database, where applicable.

Table 3. The ALFALFA–SDSS DR7 Cross-reference

AGC H i Code SDSS PhotoObjID SpectObjID rmodel (u − r) z εz
(1) (2) (3) (4) (5) (6) (7) (8) (9)
331061 1 I 587730775499407375 211330582074884096 14.77 1.59 0.02002 0.00010
331405 1 I 587740589481525478   15.11 1.97    
102896 1 I 758874370996764887   15.26 2.26    
102571 1 I 758874297994314032   16.10 1.39    
102728 1 I 758874299066483769   18.93 2.04    
12896 1 I 758874370460680283   13.98 1.29    
102729 1 I 758874299066548754   18.32 1.43    
102730 1 I 758874299602960817   16.94 1.55    
101866 2 I 587730773351989400 211330580741095424 15.15 2.48 0.03613 0.00010
12901 1 I 758874371533308165   13.69 2.64    
102731 1 I 758874372069392715   16.01 1.70    
102732 1 I 758874299603223055   14.91 1.93    
101869 1 I 587727221413707929 211330580573323264 15.82 1.85 0.04189 0.00010
102733 1 I 758874299603288292   15.85 1.82    
12911 1 I 758874299603222635   13.25 3.00    
331082 1 I 587730774425796793 211330582490120192 14.87 1.46 0.02123 0.00007
748776 1 I 587730772815184088   16.96 1.14    
102734 1 I 758874372605739306   15.90 1.35    
101873 1 I 587727223561257129 211330582536257536 16.35 2.38 0.04254 0.00009
102735 1 I 758874299603419848   18.38 0.67    
101877 1 I 587727221413773686 211330580648820736 16.70 1.40 0.01734 0.00033
102980 1 I 758874371533635834   15.87 1.76    
12920 1 I 758874298531316055   15.13 2.04    
100006 1 I 758874372606001325   14.28 2.70    
100008 1 I 758874372069982254   16.35 1.48    

Only a portion of this table is shown here to demonstrate its form and content. Machine-readable and Virtual Observatory (VO) versions of the full table are available.
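Where Table 3 lists an SDSS redshift, it can be compared with the H i velocity in Table 1, in the spirit of the redshift consistency checks mentioned in Section 4.1. The sketch below is illustrative only: it assumes the simple conversion cz = c·z, and the tolerance `tol_kms` is a hypothetical value chosen for the example, not a threshold used by the ALFALFA team.

```python
C_KMS = 299792.458  # speed of light in km/s


def optical_cz(z_sdss: float) -> float:
    """Convert an SDSS redshift to cz in km/s, using cz = c * z."""
    return C_KMS * z_sdss


def velocity_offset(cz_hi: float, z_sdss: float) -> float:
    """H i cz minus optical cz, in km/s."""
    return cz_hi - optical_cz(z_sdss)


def is_discrepant(cz_hi: float, z_sdss: float, tol_kms: float = 300.0) -> bool:
    """Flag a cross-match whose velocities differ by more than tol_kms.

    tol_kms is a hypothetical tolerance for illustration only.
    """
    return abs(velocity_offset(cz_hi, z_sdss)) > tol_kms


# Rows present in the printed portions of both Table 1 and Table 3:
# (AGC, H i cz in km/s, SDSS z)
pairs = [
    (331061, 6007.0, 0.02002),
    (101866, 10877.0, 0.03613),
]
for agc, cz_hi, z in pairs:
    print(agc, round(velocity_offset(cz_hi, z), 1), is_discrepant(cz_hi, z))
```

Neither of these two sources is flagged with the example tolerance. A flagged source is a candidate for visual inspection, not a confirmed misidentification, since SDSS codes such as D, T, and S already mark cases where the SDSS redshift may not reflect the systemic velocity.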


It is important that potential users understand the limitations associated with this ALFALFA–SDSS cross-reference. As noted by Giovanelli et al. (2007) and discussed in Section 3, the ALFALFA H i centroid accuracy is of order 20'', but the positional uncertainty increases as the S/N decreases, as given in Equation (1). Furthermore, as is well known, the standard SDSS image reduction pipeline suffers from source blending and, more importantly, shredding, particularly for sources whose light distributions are patchy or of low surface brightness. The current ALFALFA reduction process includes an interactive step of direct examination of the SDSS imagery, and issues associated with blending/shredding are noted immediately. However, earlier ALFALFA data sets which predated the release of DR7 were not subject to such individual cross-examination. While attempts have been made to flag and check suspicious cases, it is likely that some misidentifications remain.

The DR7 photometric catalog object identification number given in Column (4) is the PhotoObjID whose magnitude and position given in SDSS DR7 correspond most likely to the OC; the actual "best magnitude" may be associated with the SDSS pipeline "parent." Users are cautioned to understand fully issues associated with blending, shredding, and poor sky subtraction and to make use of warning flags and other quality indicators when using the photometry associated with the photometric object given here. Particularly relevant discussions of background subtraction issues are given in West et al. (2010) and Blanton et al. (2011). Similarly, the spectroscopic identification refers to the most probable and most closely related SDSS spectroscopic target. This cross-match likewise can suffer from issues of positional offset, signal-to-noise, etc., and should be treated with similar caution. The SDSS DR7 cross-reference category given in Column 3 of Table 3 (and also as one of the two codes given in Column 12 of Table 1) provides further comment on quality issues as identified by members of the ALFALFA team. However, because some of the processing of ALFALFA data predates the release of SDSS DR7, this code assignment should not be considered complete: many but not all sources have been revisited after the release of DR7. The intent of providing the cross-reference is to make statistical studies more convenient and potentially homogeneous. But again, we emphasize the importance of visual examination of individual cases where such attention is critical to the drawing of scientific conclusions.

Of the 15,041 extragalactic (i.e., non-HVC) objects listed in Table 1, 2312 lie outside the SDSS DR7 footprint and 199 are classified as "dark" (see Section 4.3). Of the ones with identified OCs and included in DR7, 11,740 are assigned SDSS code "I" (meaning the SDSS photometric identification is acceptable and there is no issue with the spectroscopic identification where such exists), while the others are given a code in Table 3 indicating a recognized issue with either the SDSS photometry or spectroscopy. The ALFALFA fall portion of the sky contains some regions for which only photometry is available; in the spring region, the photometric and spectroscopic footprints overlap more completely. Of the 11,240 ALFALFA spring sky galaxies with a corresponding SDSS photometric ID (of any code), 9377 (83%) have an associated entry in the SDSS spectroscopic catalog and 1863 (17%) do not.

Figures 5 and 6 provide graphical illustrations of the relative strengths and weaknesses of the ALFALFA and the SDSS surveys as tracers of the large-scale structure in the local universe. Figure 5 shows a cone diagram of a four-degree wide slice of the ALFALFA spring sky centered on decl. = +26° and including the full ALFALFA bandpass redshift range cz < 18,000 km s−1. The upper cone extends over the full cz range covered by ALFALFA and the lower one, only the inner cz < 9000 km s−1. Blue open circles mark the locations of galaxies detected by ALFALFA, while red filled ones denote objects with redshifts from the SDSS DR7. The falloff in the density of blue points follows the distribution seen in Figure 3. The "finger of God" radial line-up of optical-cz (red) points so prominent in the lower diagram is the Coma cluster A1656. Galaxies in that cluster are well known to be strongly H i deficient (Giovanelli & Haynes 1985; Magri et al. 1988) so that ALFALFA detects very few of them. As indicated by the numbers superposed on the diagram, the number of SDSS spectroscopic targets in the full ALFALFA volume is about three times the number of ALFALFA H i sources; in the inner volume illustrated in the bottom diagram, that ratio drops to 2 and to ∼1 for field galaxies at cz  < 5000 km s−1. While strong bias against finding H i sources in the regions of rich clusters is clearly evident, the H i-bearing galaxies trace well the large-scale supercluster structures and include some of the most isolated objects found in this nearby volume.


Figure 5. Cone diagrams showing the distribution of α.40 H i sources (blue open circles) and those with optical redshifts from the SDSS (filled red circles) within the spring sky strip covering +24° < decl. < +28°. The upper diagram shows the volume extending over the full ALFALFA bandwidth to 18,000 km s−1 (including regions impacted by terrestrial interference). The bottom diagram contains only the volume to 9000 km s−1.


Figure 6. Cone diagrams showing the distribution of α.40 H i sources (blue open circles) and those with reported optical redshifts (filled red circles) within the fall sky strip covering +24° < decl. < +28°. The upper diagram shows the volume extending over the full ALFALFA bandwidth to 18,000 km s−1 (including regions impacted by terrestrial interference). The bottom diagram contains only the volume to 9000 km s−1. The lack of coverage by the SDSS is evident in the paucity of optical redshifts in comparison with Figure 5.


For comparison, Figure 6 shows a similar cone plot covering a four-degree wide slice of the ALFALFA fall sky centered on decl. = +26°. The SDSS spectroscopic survey did not cover this region; the red filled circles mark objects with optical redshifts available from the literature. In this part of the sky, ALFALFA sources contribute the majority of redshifts even at its outer boundary. It should be noted that the slice of the sky sampled in Figure 5 covers a strongly overdense region of the local universe, the Coma–A1367 supercluster, whereas the fall region lies to the south of the main filament of the Pisces–Perseus supercluster and includes a portion of the void in front of it. As in all studies of the local universe, the actual large-scale structure contained in the survey volume can leave a strong imprint on the observed distribution of galaxies and their properties in limited samples. Further discussion of the impact of large-scale structure on cosmological inference is included in Section 7.

Making use of the SDSS cross-reference tabulation, Figure 7 presents a color–magnitude diagram (CMD) for the α.40–SDSS overlap sample for comparison with similar diagrams extracted from the SDSS photometric survey alone. A similar CMD was independently constructed by Toribio et al. (2011a) for a sample of ALFALFA galaxies found in low-density environments. In Figure 7, gray scale and contours show the distribution of the H i-selected sample and the axes correspond to the range illustrated in Figure 9 of Baldry et al. (2004). The superposed dashed line shows the optimum divider used by Baldry et al. (2004) to separate galaxies on the red sequence (above the curve) from those in the blue cloud (below it) and given by their Equation (11). For the purpose of comparison with their Figure 9, no corrections for redshift or internal extinction have been applied to the magnitudes used to construct Figure 7. Figure 7 can also be compared with Figure 4 of Tempel et al. (2011) who used a large sample of galaxies from SDSS DR7 and did apply a K-correction; in their figure, those authors also categorize separately elliptical and spiral galaxies according to the SDSS catalog parameter fdeV, the fraction of the galaxy's luminosity contributed by the de Vaucouleurs profile. Clearly, the α.40 catalog is dominated by blue spiral galaxies and is strongly biased against the red sequence. As discussed by Tempel et al. (2011), some of the luminous, red objects are truly red, luminous, and gas-bearing objects; other luminous objects appear red because they are edge-on disks for which the internal extinction correction is significant. Still, Figure 7 confirms the conclusion of Masters et al. (2010) that the most luminous gas-rich population includes a significant fraction of red galaxies. 
Further discussion of the stellar and star-forming properties of the ALFALFA sample, as derived from spectral energy distribution fitting of the photometry provided by the SDSS in the optical and by the Galaxy Evolution Explorer (GALEX) satellite in the FUV/NUV, will be presented in S. Huang et al. (2011a, in preparation) and S. Huang et al. (2011b, in preparation).
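To make the red sequence/blue cloud separation concrete, the sketch below places a galaxy relative to the Baldry et al. (2004) divider using the (u − r) color and r-band model magnitude of Table 3 together with the distance of Table 1. Two assumptions should be stated plainly: the divider is written in the form commonly quoted for their Equation (11), C'(M_r) = 2.06 − 0.244 tanh[(M_r + 20.07)/1.09], whose coefficients should be verified against that paper; and the absolute magnitude is computed from the distance modulus alone, with no extinction or K-correction, following the uncorrected treatment described for Figure 7. All function names are hypothetical.

```python
import math


def abs_mag_r(r_model: float, dist_mpc: float) -> float:
    """Absolute r-band magnitude from the distance modulus only
    (no extinction or K-correction applied)."""
    if dist_mpc <= 0:
        raise ValueError("distance must be positive")
    return r_model - 5.0 * math.log10(dist_mpc) - 25.0


def baldry_divider(m_r: float) -> float:
    """(u - r) divider between red sequence and blue cloud.

    Coefficients as commonly quoted for Baldry et al. (2004), Eq. (11);
    treat them as an assumption to be checked against that paper.
    """
    return 2.06 - 0.244 * math.tanh((m_r + 20.07) / 1.09)


def is_red_sequence(u_minus_r: float, r_model: float, dist_mpc: float) -> bool:
    """True when the color lies above the divider at the galaxy's M_r."""
    return u_minus_r > baldry_divider(abs_mag_r(r_model, dist_mpc))


# Rows combining the printed portions of Table 1 (Dist) and Table 3 (rmodel, u - r):
# (AGC, u - r, rmodel, Dist in Mpc)
sample = [
    (331061, 1.59, 14.77, 85.2),
    (12911, 3.00, 13.25, 67.5),
]
for agc, color, r_mag, dist in sample:
    print(agc, is_red_sequence(color, r_mag, dist))
```

With these inputs AGC 331061 falls below the divider and AGC 12911 above it. As the text notes, a red color in uncorrected photometry does not by itself imply an old stellar population, since edge-on disks can appear red through internal extinction.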


Figure 7. Gray-scale CMD, based on SDSS DR7 photometry, for the ALFALFA–SDSS overlap sample using the model magnitudes and colors as given in Table 3. The x and y ranges are matched to Figure 2 of Baldry et al. (2004) for comparative purposes. The superposed dashed line is the optimum divider given as Equation (11) of that paper which separates the red sequence from the blue cloud.


4.3. ALFALFA Detections without Optical Counterparts

One of the scientific drivers behind blind H i surveys is the possibility of contributing gas-rich but optically "dark" galaxies to the extragalactic census. Previous analyses by, e.g., Briggs (1990), of the statistics of targeted H i line surveys have shown that such objects must be rare; otherwise there would have been more sources detected serendipitously in the random off-source positions observed by the total power position-switching observing mode used for most of those earlier surveys. Indeed, perhaps the best example of an optically dark galaxy is the southwest component of HI1225+01 (Giovanelli & Haynes 1989; Chengalur et al. 1995), but it is not a purely isolated object, being located on the outskirts of the Virgo Cluster and part of a binary system with its dwarf galaxy companion to the northeast. Of the 4315 H i sources reported in the HIPASS catalog, 84% were identified with one or more possible OCs (Doyle et al. 2005). Most of the remainder are located at low galactic latitude where Galactic extinction strongly inhibits the hunt for the stellar counterpart. In fact, Doyle et al. (2005) investigated through follow-up observations the 13 HIPASS sources without OCs and with AV < 1 mag and concluded that not a single one could be claimed as an isolated dark galaxy. Some might be intergalactic in the sense of being associated with tidal debris fields or fragments of very extended H i disks, but there were always nearby, visible (stellar) objects at the same redshift.

Because of ALFA's superior angular resolution at L-band in comparison with that of the Parkes telescope (4' versus 15.5'), we are able to centroid the position of the ALFALFA H i sources to better than 20'' on average and likewise to identify their OCs with greater certainty. Only 1013 of the 15,855 sources presented in Table 1 do not have assigned OCs. Of those, 814 blank field objects have observed velocities which fall within the range characteristic of emission associated with some Milky Way population. All of these are assigned an H i source category code of 9 in Column 12 of Table 1. They are likely to be HVCs, although a few isolated objects with narrow velocity widths and small angular sizes (either barely resolved or unresolved) are candidate low-mass extragalactic halos (Giovanelli et al. 2010). Their distribution and nature will be discussed elsewhere.

Of the remaining 199 H i sources (<2% of the total extragalactic population) whose velocities suggest that they are truly extragalactic, we have individually examined closely the SDSS and/or DSS2(B) fields to look for OCs; comments derived from that examination are included in Table 2. Some of these objects do not lie in the region covered by SDSS, making the identification of OCs more difficult, but by design, only a few lie in regions of significant optical extinction.

Roughly 3/4 of the "dark" H i sources are located in fields where objects of similar redshift are found, albeit beyond the reasonable limits of coincidence given by the ALFALFA pointing accuracy. A number can be linked to previously known extended H i distributions such as the Leo Ring (Stierwalt et al. 2009), the tail of NGC 4254 (Haynes et al. 2007; Kent et al. 2007), the extended tail of NGC 4532/DDO 137 (Koopmann et al. 2008), or the intergroup gas found in the NGC 7448/7463/7464/7465 group (Haynes 1981). Among the blank field H i detections with SDSS data (including DR8) and not contaminated by the presence of bright foreground stars, only about 50 remain as candidates to be isolated "dark" objects. These objects are the targets of a follow-up program that will confirm their reality as H i sources with the Arecibo single-pixel L-band receiver, localize the H i emission via H i synthesis observations, and search for associated low surface brightness stellar emission via optical imaging.

4.4. OH Megamaser Candidates

OH megamasers (OHMs) are powerful line sources associated with the starburst nuclei in merging galaxy systems. Briggs (1998) has pointed out that OHMs at z ∼ 0.17 may contaminate a blind extragalactic survey such as ALFALFA. The OHM phenomenon is extremely rare in the local universe; only about 100 OHMs are known out to a redshift of 0.265 (Darling & Giovanelli 2002). The main 18 cm OH lines occur at rest frequencies of 1665 and 1667 MHz. In OHMs, the emission at 1667.359 MHz dominates; that line is redshifted into the ALFALFA observing band for sources with 0.16 < z < 0.25. Using the large targeted survey for OHMs by Darling & Giovanelli (2002) as a baseline for the expected flux density and spectral characteristics of OHMs, it is probable that a few of the ALFALFA sources without OCs may in fact be OHMs with 0.16 < z < 0.25. Confirmation that an ALFALFA source is in fact an appropriately redshifted OHM and not an optically dark H i galaxy will require follow-up H i synthesis observations to localize the line emitting region and optical/IR spectroscopy to confirm the redshift.

Already, however, there are four OHM candidates which can be identified as such because the line emission occurs at frequencies higher than 1422 MHz. Under the assumption that the emission arises from the H i 21 cm line, the observed cz would be too strongly blueshifted to be plausibly interpreted as an extragalactic or Galactic H i source. The properties of these four objects are given in Table 4 and optical images obtained from either the SDSS or DSS2(B) are shown in Figure 8. The entries in Table 4 are as follows.

  1. Column 1: entry number in the AGC.
  2. Column 2: centroid (J2000) of the emission line source, in hhmmss.sSddmmss, as in Column 3 of Table 1. The designation of the candidate then adopts the identifier "OHMcand" plus this centroid position.
  3. Column 3: position of the identified OC, in hhmmss.sSddmmss.
  4. Column 4: zopt, redshift of the OC, where known.
  5. Column 5: zOH, redshift of the candidate OHM assuming its emission is dominated by the OH line at 1667.359 MHz.
  6. Column 6: cz21, heliocentric velocity if the emission were associated with the H i line, in km s−1.
  7. Column 7: FOH, OH line flux density, in Jy km s−1.
  8. Column 8: S/N of the OH line emission, defined as in Column 8 of Table 1.
  9. Column 9: rms noise in the vicinity of the line emission, defined as in Column 9 of Table 1.

Figure 8. Optical images of the four best OHM candidates listed in Table 4. The image of AGC 102850 comes from the DSS2(B) while the others are SDSS-g; each image is 3' on a side.

Table 4. OH Megamaser Candidates

AGC     OHM Coords (J2000)      Opt. Coords (J2000)     zopt     zOH    cz21      FOH          S/N   rms
        (hh mm ss.s+dd mm ss)   (hh mm ss.s+dd mm ss)                   (km s−1)  (Jy km s−1)        (mJy)
(1)     (2)                     (3)                     (4)      (5)    (6)       (7)          (8)   (9)
102708  000337.0+253215         000336.1+253204         ...      0.169  −1335     0.91         5.7   2.33
102850  002958.8+305739         002958.2+305832         ...      0.172  −596      0.46         6.7   2.09
181310  082311.7+275157         082312.7+275138         0.16783  0.168  −1551     2.17         15.9  2.18
228040  124540.5+070337         124545.7+070347         ...      0.172  −624      0.33         5.1   2.11

In all four instances, there is a small object visible in public imaging databases which can be identified as the likely OC.

  1. AGC 102708 = OHMcand000337.0+253215 is likely associated with SDSS J000336.02+253204.0, a very tiny object also evident in DSS2(B). There is no NED entry or redshift measurement.
  2. AGC 102850 = OHMcand002958.8+305739 is likely associated with 2MASX J00295817+3058322, a well-formed spiral galaxy. There is no confirming redshift measurement.
  3. AGC 181310 = OHMcand082311.7+275157 is likely associated with SDSS J082312.61+275139.8, also known as IRAS 08201+2801 and 5C 07.206, a known ULIRG. For this single object, the optical redshift cz = 50,314 km s−1, z = 0.167830 ± 0.000041 from the SDSS confirms the identification as an OHM; its OHM emission was previously discovered by Darling & Giovanelli (2001).
  4. AGC 228040 = OHMcand124540.5+070337 is likely associated with SDSS J124545.66+070347.3, a spiral galaxy viewed at high inclination as evident in Figure 8. No confirming redshift measurement is available.
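The relation between the cz21 and zOH columns of Table 4 can be illustrated with a short calculation. This is only a sketch: it assumes that cz21 follows the optical convention (cz = c × z), and the function names are hypothetical. The OH rest frequency is the value quoted in the text; the H i rest frequency is the standard 1420.405751 MHz.

```python
# Rest frequencies in MHz. F_OH is the value quoted in the text.
F_HI = 1420.405751   # H I 21 cm line
F_OH = 1667.359      # dominant OH megamaser line
C_KMS = 299792.458   # speed of light in km/s


def observed_frequency(cz21_kms):
    """Observed frequency (MHz) of a line catalogued at cz21 under the
    H I assumption. ASSUMPTION: optical convention, cz = c * z."""
    return F_HI / (1.0 + cz21_kms / C_KMS)


def z_oh_from_cz21(cz21_kms):
    """Redshift implied if the same line is actually OH at 1667.359 MHz."""
    return F_OH / observed_frequency(cz21_kms) - 1.0


# AGC number and cz21 for the four candidates in Table 4.
for agc, cz21 in [(102708, -1335.0), (102850, -596.0),
                  (181310, -1551.0), (228040, -624.0)]:
    print(agc, round(observed_frequency(cz21), 2),
          round(z_oh_from_cz21(cz21), 4))
```

Under these assumptions all four lines fall above 1422 MHz, and the derived redshifts agree with the zOH column to within about 0.001.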

OHMs may also be identified in the subset of low S/N sources not included in the current catalog because they do not meet the criteria of Codes 1 and 2.

By the simplest argument based on the fraction of the usable ALFALFA bandwidth above 1422 MHz and assuming that these four candidates are, in fact, OHMs, it is possible that half of the "dark galaxy" candidates discussed in Section 4.3 might be OHMs at 0.175 < z < 0.245. A similar estimate comes from considering the α.40 volume and the OHM luminosity function at low z (Darling & Giovanelli 2002). A more systematic approach to the identification of OHMs throughout the full bandpass and using the three-dimensional ALFALFA data set is currently being undertaken by members of the ALFALFA collaboration.

5. VALIDATION OF ALFALFA H i PARAMETERS

Most targeted extragalactic H i line flux densities are measured from spectra obtained using a total power position-switching technique. As outlined in Section 2, the ALFALFA data set is generated using a very different approach whereby ALFA drift-scan data are obtained months and sometimes years apart and without Doppler tracking. The two-dimensional data sets (frequency versus time) for the two individual polarizations of each of the seven beams are bandpass subtracted and flagged for RFI. After the acquisition of all the drifts for a region of the sky, the three-dimensional spectral grid is then generated. As with any new survey, it is critical to verify that the spectral scales (velocity and flux density) at each grid point are accurate.

The Cornell Digital H i archive presents a large compilation of digital H i line spectra obtained using pointed observations of optically selected targets (Springob et al. 2005b) which have been digitally analyzed using similar algorithms to those adopted for ALFALFA. Because those spectra were obtained with a variety of single-dish telescopes and spectrometers, careful attention was paid to correct for instrumental effects such as pointing errors, source extent, instrumental broadening, and spectral smoothing. Corrections for the various effects were modeled and tested to produce a homogeneous catalog of extracted properties with their associated error estimates. Here we present the validation of the ALFALFA velocity, velocity width, and flux density scale by comparison of α.40 catalog parameters with the previous targeted H i line observations of sources which have been re-detected by ALFALFA. Of the 2073 galaxies which are contained both in the Springob et al. (2005b) and the α.40 catalogs, 1887 are classified as ALFALFA Code 1 sources and 186 are Code 2 detections.

5.1. Validation of the ALFALFA Velocity Scale

The ALFALFA "minimum-intrusion" observing mode acquires data without Doppler tracking, i.e., in topocentric mode. Heliocentric corrections are applied in the Fourier domain, whereby the appropriate velocity shift at each point (each spectrum associated with each one second record for each polarization of each beam) is calculated, converted to a phase gradient across the bandpass and applied to the Fourier transform of each spectrum. The inverse Fourier transform then gives the spectrum in the heliocentric rest frame which is used thereafter to yield the systemic velocity cz and the H i profile velocity width W50.
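The Fourier-domain shift described above rests on the shift theorem: multiplying the transform of a spectrum by a linear phase gradient displaces the spectrum, including by a fraction of a channel. The sketch below illustrates only that principle and is not the ALFALFA pipeline. It assumes a single uniform shift expressed in channels, and the function name `fourier_shift` is hypothetical.

```python
import numpy as np


def fourier_shift(spectrum, shift_channels):
    """Shift a 1-D spectrum by a (possibly fractional) number of channels
    by applying a linear phase gradient to its Fourier transform.
    ASSUMPTION: one uniform shift for the whole spectrum."""
    n = spectrum.size
    k = np.fft.fftfreq(n)                       # frequency in cycles per channel
    phase = np.exp(-2j * np.pi * k * shift_channels)
    return np.fft.ifft(np.fft.fft(spectrum) * phase).real


# A well-sampled Gaussian line centered on channel 100.
channels = np.arange(256)
line = np.exp(-0.5 * ((channels - 100.0) / 4.0) ** 2)

# Shift by 2.5 channels and measure the new centroid.
shifted = fourier_shift(line, 2.5)
centroid = np.sum(channels * shifted) / np.sum(shifted)
print(round(float(centroid), 3))
```

For a whole number of channels the result coincides with a simple circular roll of the array; the value of the phase-gradient approach is that it handles sub-channel shifts without explicit interpolation.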

It is important to note that the specific definitions of H i systemic velocity and the global profile velocity width are not uniformly adopted in the literature. For ALFALFA, we adopt the same convention as that used by Springob et al. (2005b), that is, polynomials are fit to each side of the two-horned profile and then cz and W50 are measured at the level of 50% of the peak intensity on either horn: cz is then the midpoint and W50 is the full width at that level. Where appropriate (face-on galaxies; dwarf systems), a single Gaussian provides the best fit and is similarly measured. Figure 9 illustrates the comparison of the two parameters for the α.40-Springob et al. (2005b) H i archive overlap sample. In both panels, the vertical axis shows the residual ALFALFA–H i archive. The occurrence of outliers is expected because (1) ALFALFA spectra correspond only to 40 s per beam of integration time on source, whereas the targeted spectra are generally of much longer integration, (2) targeted spectra are affected by pointing errors either in the coordinates used to position the telescope or intrinsic telescope pointing inaccuracy, and (3) some pointed spectra, taken with smaller single-dish telescopes, are blended with emission from close companions. A few cases of gross disagreement are explained by errors in the velocity scales of very old H i data which were acquired in the days before significant information was written into data headers, when the setup of the back-end electronics and spectrometer required physical cabling and hand dial-setting at the start of each observing run and when records of frequency offsets for different quadrants of the spectrometer were kept only on handwritten index cards; these cases are noted in the comments in Table 2. The appearance in the lower panel of some outliers at relatively low W50 reminds us that at low S/N or in the presence of residual baseline structure, broad widths may be underestimated. The dependence of the sensitivity on line width is discussed in Section 6. As evident in Figure 9, the comparison of the velocity scales reveals no systematic offsets and agreement within the expected errors.
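The 50%-of-horn-peak convention for cz and W50 can be illustrated on a noise-free synthetic profile. The sketch below is a simplification and not the procedure of Springob et al. (2005b): it replaces the polynomial fits to the profile edges with linear interpolation between channels, and it locates the two horns with a crude flux-weighted split. All function names are hypothetical.

```python
import numpy as np


def half_power_crossing(v, s, i_peak, step):
    """Walk outward from a horn peak (step = -1 or +1) and linearly
    interpolate the velocity where the flux falls to 50% of that peak.
    SIMPLIFICATION: linear interpolation instead of a polynomial fit."""
    level = 0.5 * s[i_peak]
    i = i_peak
    while 0 <= i + step < s.size and s[i + step] > level:
        i += step
    j = i + step
    if not 0 <= j < s.size:
        raise ValueError("profile does not drop to 50% within the spectrum")
    frac = (s[i] - level) / (s[i] - s[j])
    return v[i] + frac * (v[j] - v[i])


def measure_cz_w50(v, s):
    """Return (cz, W50): midpoint and full width between the 50% points
    measured on the low- and high-velocity horns separately."""
    # Crude horn finder: split the profile at the flux-weighted median channel.
    mid = int(np.argmax(np.cumsum(s) >= 0.5 * s.sum()))
    i_lo = int(np.argmax(s[:mid + 1]))
    i_hi = mid + int(np.argmax(s[mid:]))
    v_lo = half_power_crossing(v, s, i_lo, -1)
    v_hi = half_power_crossing(v, s, i_hi, +1)
    return 0.5 * (v_lo + v_hi), abs(v_hi - v_lo)


v = np.arange(900.0, 1500.1, 5.0)   # velocity axis in km/s

# Single Gaussian (e.g. a face-on or dwarf system), sigma = 30 km/s.
gauss = np.exp(-0.5 * ((v - 1200.0) / 30.0) ** 2)
cz_g, w50_g = measure_cz_w50(v, gauss)

# Toy two-horned profile with unequal horn heights.
horns = (1.0 * np.exp(-0.5 * ((v - 1100.0) / 20.0) ** 2)
         + 0.8 * np.exp(-0.5 * ((v - 1300.0) / 20.0) ** 2))
cz_h, w50_h = measure_cz_w50(v, horns)
print(round(cz_g, 1), round(w50_g, 1), round(cz_h, 1), round(w50_h, 1))
```

For the single Gaussian the recovered width should be close to the analytic full width at half maximum, about 2.355 times the dispersion. With real, noisy spectra the edge fits matter, which is why the pipeline convention uses polynomials rather than channel-to-channel interpolation.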

Figure 9. Comparison of systemic velocity cz (upper panel) and velocity width measurements W50 (lower panel) obtained by the ALFALFA survey and values given in the Cornell Digital H i archive (Springob et al. 2005b).

5.2. Validation of the ALFALFA Flux Density Scale

As discussed in van Zee et al. (1997), practical limitations and instrumental uncertainties restrict the accuracy with which H i line flux densities can be measured to not better than a few percent. Despite regular calibration via the injection of a noise diode, drifts in the electronic gain, amplifier instabilities, sidelobe variations, and standing waves (caused by multiple reflections within the optical path of cosmic continuum sources or terrestrial RFI) induce variations in the total power, while baseline irregularities and data loss due to RFI impact the measurement of flux density in noisy data. As discussed in Springob et al. (2005b), H i line flux densities derived from targeted (pointed) observations are typically accurate to not better than 15%; older data sets taken when amplifiers were substantially less stable than today are probably accurate to not better than 25%.

Calibration of the ALFALFA data set is performed in two separate stages. First, during the course of an observing run, a noise diode, calibrated by the engineering staff in the lab, is fired once every 600 s. The data stream then includes a record with this additional power source (the "cal-on" record). All observing runs contain at least 9 such calibrations; longer ones may contain as many as 60. A polynomial fit is performed to the ratio of the total power with the calibration diode on, versus when it is off, for the whole set for an observing block. This polynomial is then used to correct the individual records of the drift-scan data. The second stage of calibration is performed on the data after grid construction, making use of the radio continuum sources which they contain. A comparison is made of the flux densities of the sources contained in the grids with published values in the NVSS (Condon et al. 1998), and then an average correction factor is applied to tie the ALFALFA flux scale to the NVSS. Further details on calibration of the ALFALFA data set are given in B. R. Kent & R. Giovanelli (2011, in preparation).

Even when gain corrections for frequency dependence and other effects are correctly calibrated out, H i line flux densities observed with single-point observations must be corrected for beam dilution and, often, pointing errors. In addition to the inaccuracy of telescope pointing, particularly important in early Arecibo observations, the input positions used to point the telescope were accurate only at the 0.5'–1' level for some of the oldest observations used to acquire the archival data reanalyzed by Springob et al. (2005b).

Springob et al. (2005b) report both raw (as observed) and corrected values of the H i line flux density for galaxies observed via pointed observations of optically selected targets. The true H i line flux density was derived by applying corrections for telescope pointing errors, errors in the positions used to point the telescope (both of which apply more importantly to older data sets), and for partial resolution by the telescope beam. The latter is derived by adopting a hybrid correction for source extent that is based on a modeled H i distribution scaled by the optical size and an average telescope beam power pattern. As a drift-scan mapping survey, ALFALFA flux densities are not subject to such corrections. Figure 10 shows the comparison of the H i line flux densities measured by ALFALFA with the values given in the Springob et al. (2005b) H i archive. The latter are corrected for pointing and position errors and for source extent (but not for internal H i absorption). The vertical axis shows the ratio of the H i line flux densities reported in the two catalogs. ALFALFA detection Code 1 objects are shown as blue open circles; the lower S/N Code 2 objects are shown as red filled circles. Since the error in the H i line flux density for both surveys depends on the H i line flux density itself as well as the S/N of the spectrum and the magnitude of the corrections applied to the pointed data, the increasing scatter seen in the ratio at low H i line flux densities is as expected. When corrections for source extent are applied to the pointed data, the flux density scales are coincident within the errors.

Figure 10. Top: comparison of H i line flux density measurements S21 for the 1888 galaxies in common between α.40 and Springob et al. (2005b). The vertical axis displays the ratio of the H i line flux density detected by ALFALFA to the corresponding value corrected for source extent and pointing errors (but not internal H i absorption) reported by Springob et al. (2005b). ALFALFA Code 1 detections are plotted as blue open symbols, while Code 2 (priors) detections are shown as red filled circles. The flaring of the ratio at low fluxes is expected. Bottom: similar comparison with 347 galaxies detected by HIPASS. No Code 2 sources were detected by HIPASS.

Among the sources with the highest H i line flux density, Figure 10 suggests that the ALFALFA flux algorithm may be missing some flux. The total flux should be recovered since ALFALFA is a mapping survey, but the H i line flux from very extended sources, especially those located toward the edges of the constructed grids, could be lost due to the finite grid size and the bandpass subtraction and grid flattening schemes. For those, alternatives to the standard pipeline processing tools will be developed after completion of the main survey.

In order to assess the contribution of very diffuse, extended H i in the vicinity of nearby, isolated galaxies, Haynes et al. (1998) and Hogg et al. (2007) observed a carefully selected sample of ∼100 galaxies with both the former 42 m telescope and the Green Bank Telescope (GBT). As they note, the uncertainty in the H i line flux density for their high signal-to-noise data is <1%; on the other hand, the uncertainties contributed by fitting the polynomial baseline and defining the boundaries of the emission profile are considerably larger. Because an unblocked aperture should deliver reduced standing waves and minimal stray radiation, flux densities measured with the GBT should be more accurate than the ones measured with a complex instrument like Arecibo. At this point, there are only 12 galaxies in common with the sample observed very accurately with the former 42 m telescope by Haynes et al. (1998), not enough for conclusive results. These issues will be explored in a future work.

Although it might be expected to serve as the better data set to use in examining systematic uncertainty and testing for missing flux from extended/bright sources, the northern HIPASS survey (Wong et al. 2006) does not in practice provide an adequate comparison sample for several reasons. First, as mentioned above, flux calibration uncertainties do not dominate most H i flux density errors; baseline uncertainties, noise and beam effects do. A drawback of the northern HIPASS catalog is that some of it suffers from residual baseline ripple, particularly when observations were made during the daytime (Wong et al. 2006). Second, the sensitivity difference between Arecibo and Parkes means that the S/N of most ALFALFA/Springob et al. (2005b) detections is typically much higher than that of HIPASS. This fact alone, on top of the baseline issues, gives the Springob et al. (2005b) spectra a significant advantage over HIPASS in terms of parameter accuracy. Third, although there are 1000 galaxies in the northern HIPASS data set (Wong et al. 2006) at decl. > +2°, only ∼350 lie in the overlap with the present α.40 catalog. Lastly, there are no Code 2 sources detected by HIPASS (which is not surprising, given its much poorer sensitivity) so we cannot make the comparison between Codes 1 and 2. However, for the record, we include in the lower panel of Figure 10 the similar comparison of flux densities from ALFALFA and HIPASS; clear cases of confusion within the Parkes beam are not included in this analysis. Curiously, ALFALFA detects more flux density than HIPASS in some cases; examination of those reveals that they are mainly ALFALFA detections with broad W50 for which HIPASS detects, at much lower S/N, a lower H i line flux density and a narrower W50, clearly missing some of the H i line emission detected by ALFALFA.

6. ALFALFA SOURCE COMPLETENESS AND RELIABILITY

The practical exploitation of any survey requires an understanding of its source sensitivity, completeness, and reliability. In comparison with previous blind H i surveys, ALFALFA offers a much richer data set which itself can be used to probe the robustness of its source catalog.

Source extraction and parameter measurement for ALFALFA is performed in a two-step process, which includes automated as well as interactive procedures. Initial source extraction is performed by a fully automatic matched-filtering method (Saintonge 2007a, 2007b). The algorithm uses templates which vary in shape as a function of profile width (Gaussian for narrow profiles, Hermite functions for broad profiles), and outperforms algorithms based on smoothing followed by peakfinding (see Figure 3 in Saintonge 2007a). The reliability (i.e., the fraction of detections that correspond to real sources) of this automated method is estimated to be ≈95% for sources with S/Nextractor > 6.5, a value that was determined by performing source extraction on regions of the ALFALFA data cubes expected to be devoid of cosmic signals (corresponding to the velocity range −2000 km s−1<cz < −500 km s−1, see Section 5.4 and Figure 8 in Saintonge 2007a). Source candidates are then visually inspected and source parameters are interactively measured and cataloged. It should be noted that the parameters of the final cataloged sources (e.g., S21, S/N, W50, etc.) do not generally coincide with the values determined by the automatic signal extractor, because the two procedures use different definitions and calculation methodologies for the parameters (Giovanelli et al. 2007), and the human intervention is designed to optimize the measurement accuracy and further improve the reliability of the catalog by rejecting spurious detections that correspond to low-level RFI, poorly sampled data, and residual baseline fluctuations. We therefore expect the final reliability of ALFALFA Code 1 detections to be very close to 100%.

However, as discussed by Saintonge (2007a), the reliability of ALFALFA sources extracted by the matched-filtering algorithm drops precipitously below an S/N of 6.5. The Code 2 sources included in Table 1 fall below this nominal ALFALFA S/N detection threshold, but are included in the catalog because they coincide with an optical galaxy of known (prior) and coincident redshift. Although these sources should not be included in statistical studies which require careful consideration of survey completeness and sensitivity limits, the vast majority are likely real H i line sources, and the gas in them will also contribute to the overall H i density in the local universe. Hence, we include them in the following analysis of ALFALFA completeness and their impact on measurements of cosmological parameters.

The two-step process used to identify, extract, and measure the ALFALFA detections presented in Table 1 results in a catalog of reliable detections that is dependent on both the integrated H i line flux density of a given source and its global H i line profile width, W50. Like all fixed integration-time spectroscopic surveys, ALFALFA is more sensitive to narrow H i profiles than to broader ones of the same integrated line flux. Based on the demonstrated performance during the single-pass precursor observations of the observing equipment and the signal extraction pipeline, Giovanelli et al. (2005b) predicted the specific relationship between the integrated flux density detection threshold (S21, th in Jy km s−1) and the profile width (W50, in km s−1) of a source in terms of the S/N required for inclusion in the catalog:

Equation (3)

Note that the above expression differs from that given in Equation (2) of Giovanelli et al. (2005b) (numerical factor of 0.15 versus 0.22) because that work adopts the rms appropriate to the single-drift maps used in the precursor observations. One of the principal conclusions of Giovanelli et al. (2005b) was that the two-pass strategy adopted for ALFALFA would improve on that employed by the precursor program by a factor of 1.5.
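The width dependence of the detection threshold can be sketched numerically. The functional form below is an assumption made for illustration, not a transcription of Equation (3): the threshold is taken to grow as the square root of W50 for narrow profiles and linearly for broad ones, with a break width of 200 km s−1 that is also assumed. Only the numerical factors 0.15 and 0.22 are taken from the text, and the function name is hypothetical.

```python
def s21_threshold(w50_kms, snr=6.5, coeff=0.15, w_break=200.0):
    """Integrated flux density threshold (Jy km/s) for a given profile
    width and required S/N.
    ASSUMED FORM: coeff * snr * (W50 / w_break)**0.5 below the break and
    coeff * snr * (W50 / w_break) above it; w_break = 200 km/s is assumed.
    """
    x = w50_kms / w_break
    return coeff * snr * (x ** 0.5 if w50_kms < w_break else x)


# Narrow profiles are detectable at lower integrated flux than broad ones.
for w50 in (25.0, 100.0, 200.0, 400.0):
    print(w50, round(s21_threshold(w50), 3))

# Ratio of the single-drift (0.22) to two-pass (0.15) coefficients.
print(round(0.22 / 0.15, 2))
```

The ratio of the two coefficients is close to the factor of 1.5 improvement quoted for the two-pass strategy.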

"Sensitivity" is a qualitative term that can be defined in terms of the survey "completeness." We refer to the completeness of the ALFALFA survey as that fraction of cosmic sources of a given integrated flux density and within the survey solid angle that are detected by ALFALFA and included in the α.40 catalog. Other blind H i surveys (e.g., ADBS, HIPASS) have estimated the completeness of their catalogs as a function of profile width (W50) by examining their ability to recover synthetic sources of known characteristics (peak flux, S/N, W50, etc.) injected into the spectral cubes. One of the motivations for such an approach is to assess the reliability of sources in the presence of non-Gaussian noise. As noted by Saintonge (2007a, see Section 5.4 and Figure 8 of that paper), the impact of non-Gaussian noise on the automatic signal extractor developed for ALFALFA is generally minimal above S/N = 6.5. Its presence, principally in the form of the very broad spectral standing waves resulting from reflections in the telescope focal structure (e.g., Briggs et al. 1997), is responsible for the upturn at large W50 in Equation (3). At the narrower widths, there is no evidence that a Gaussian assumption is unfair.

Now that a significant ALFALFA data set exists, the data themselves can be used to derive the true sensitivity limits. The analysis of the real survey data is motivated both by a desire to use the real observables rather than predictions of the performance of the observing equipment and signal extraction pipeline, and especially by the fact that the ALFALFA survey has actually outperformed its predictions, as discussed in Appendix A of Martin et al. (2010). Hence, we follow a different method to determine the ALFALFA completeness that makes no use of "fake sources," but relies instead on the α.40 catalog itself. For a flux-limited sample from a uniformly distributed population, number counts will follow a power law with an exponent of −3/2. We then can determine the onset of incompleteness when our data deviate from this form. Briefly, the details of this method consist of the following steps.

  1. The Code 1 sources are divided into 32 equally spaced bins in log W50.
  2. For each width bin, we count the number N of detected sources in logarithmic intervals of flux density to determine the dN/dlog S21 histogram; apart from the impact of large-scale structure in the survey volume, number counts are expected to follow a power law with an exponent of −3/2.
  3. For each bin in log W50, we plot $S_{21}^{3/2}\,dN/d\log S_{21}$ versus S21; see Figure 11 for three representative width bins. This distribution should be flat if all sources are accounted for. A downturn at low S21 thus marks the onset of incompleteness.
  4. We fit an error function to each histogram (red dashed lines) and assume completeness over the well-sampled range of S21 over which the distribution shows a flat plateau.
  5. We calculate the integrated flux density where the ALFALFA completeness crosses 90%, 50%, and 25% (vertical red lines mark the 90% completeness in each bin). In practice, the distributions drop off in the same way, such that the 50% and 25% limits occur at a constant offset in log S21 from the 90% value across all bins.
  6. The values of $S_{21,90\%}$ for each W50 bin are then fit with the combination of two straight lines, similar to Equation (4), with a break at W50 = 300 km s−1.
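The steps above can be sketched for a single width bin using a mock catalog. This is an illustration under stated assumptions, not the procedure applied to α.40: the mock fluxes, the parametrization of the error function, the grid-search fit, and all function names are hypothetical choices made here.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf, otypes=[float])


def completeness(log_s, log_s50, width):
    """ASSUMED roll-off: an error function rising from 0 (faint) to 1 (bright),
    equal to 0.5 at log_s50."""
    return 0.5 * (1.0 + _erf((log_s - log_s50) / width))


def scaled_counts(s21, edges):
    """S^{3/2} dN/dlogS in logarithmic flux bins (flat for a complete sample)."""
    counts, _ = np.histogram(np.log10(s21), bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, 10.0 ** (1.5 * centers) * counts / np.diff(edges)


def fit_log_s90(centers, y, plateau):
    """Least-squares grid search for the roll-off; returns log S at 90%."""
    mu = np.linspace(centers.min(), centers.max(), 201)[:, None, None]
    width = np.linspace(0.02, 0.5, 49)[None, :, None]
    model = plateau * completeness(centers[None, None, :], mu, width)
    chi2 = ((y - model) ** 2).sum(axis=-1)
    i, j = np.unravel_index(np.argmin(chi2), chi2.shape)
    # completeness = 0.9 where erf(t) = 0.8, i.e. t = 0.9062
    return mu[i, 0, 0] + 0.9062 * width[0, j, 0]


# Mock width bin: cumulative counts N(>S) proportional to S^{-3/2} above 0.1 Jy km/s.
rng = np.random.default_rng(0)
n = 400_000
s_all = 0.1 * (1.0 - rng.random(n)) ** (-2.0 / 3.0)

# Thin the mock sample with a known completeness (log_s50 = 0.0, width = 0.1).
detected = rng.random(n) < completeness(np.log10(s_all), 0.0, 0.1)

centers, y = scaled_counts(s_all[detected], np.linspace(-1.0, 1.0, 21))
plateau = y[centers > 0.4].mean()      # bright, fully complete bins
log_s90 = fit_log_s90(centers, y, plateau)
print(round(float(log_s90), 2))
```

In this mock the input 90% point lies at log S21 ≈ 0.09, so the fit can be checked against a known answer. Repeating the fit in each log W50 bin would give the set of 90% limits to which the two straight lines of step 6 are fit.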

Figure 11. Three representative examples of the $S_{21}^{3/2}\,dN/d\log S_{21}$ distribution, used to evaluate completeness. Data points with error bars (1σ Poisson) represent the distribution of Code 1 sources in a low (upper panel), intermediate (middle panel), and high (bottom panel) profile width bin. The downturn of the distributions at low S21 marks the limit where the survey completeness falls below unity. The red dashed line corresponds to an error function fit to the data, while the vertical red solid line represents the flux where the survey completeness is 90% according to the fit, S21, 90%, Code1. Values of S21, 90%, Code1 for each width bin (W50) are used to derive the 90% completeness line of the survey presented in Equation (4). A similar analysis has been used for the combined catalog of Code 1 and 2 sources.

The resulting 90% completeness limit (red solid line in the upper panel of Figure 12) for Code 1 sources can be expressed as

Equation (4)

where S21 is in Jy km s−1 and W50 is in km s−1. As mentioned before, the 50% and 25% completeness limits occur at a constant offset from the 90% value. The derived offsets for the Code 1 sources only are

Equation (5)

Figure 12. Distribution of α.40 extragalactic sources in the profile width vs. integrated flux density (log W50–log S21) plane. The upper panel shows the distribution of Code 1 detections only, while the lower panel shows the same for the whole α.40 catalog, including Code 1 (blue symbols) and Code 2 (green symbols) detections. In both panels, the solid red line corresponds to the 90% completeness limit, while the red dash-dotted line corresponds to the 50% ("sensitivity limit") and the red dotted line to the 25% ("detection limit") completeness limits. See Section 6 for the analytical expressions for the plotted limits, as well as for an explanation of the derivation method.

Of the 15,041 extragalactic objects in the α.40 sample, 3100 are categorized as Code 2 detections (low signal-to-noise detections with prior optical detection). The lower panel of Figure 12 shows the corresponding plot of the distribution of sources in the log W50–log S21 plane for the α.40 extragalactic catalog, including the Code 2 detections which are shown as green symbols. These additional H i sources are expected to have a lower detection threshold, clearly evident in the lower panel of Figure 12. An analysis identical to the above can be performed including both the Code 1 and 2 sources, yielding a relation for the combined catalog (red solid, dash-dotted, and dotted lines in the bottom panel of Figure 12):

Equation (6)

and

Equation (7)

Excluding the Code 2 sources from the HIMF analysis, as did Martin et al. (2010), guarantees that only the more confident detections with well-understood selection criteria are used. It could be argued that including Code 2 sources in the analysis would add value to the determination of the HIMF. This is discussed further in Section 7.1. In practice, statistical studies with stringent requirements on sensitivity limits should use only Code 1 sources and Equation (4). With the proper caution associated with the incomplete nature of Code 2 sources, the combination of Code 1 and Code 2 sources and Equation (6) can be used in studies which can benefit from a larger sample.

In both cases, the 50% completeness limit can be considered the "sensitivity limit" of the survey, since it is the most relevant completeness limit for the derivation of galaxy statistical distributions, such as the HIMF and the H i width function. Rosenberg & Schneider (2002) have shown that adopting a step function cut at the 50% completeness limit of a survey produces approximately the same statistical results as adopting the survey's full completeness function. The 25% completeness limit can be identified with the "detection limit" of the survey, that is the integrated flux density level below which a source has only a small chance of being detected and cataloged.

We remind the reader that the quoted limits given here refer to the full α.40 catalog, and hence are representative of the average ALFALFA data cube noise properties. However, because of variations in noise among and within grids and because some localized regions are entirely contaminated by RFI, limits on the H i flux density at arbitrary positions (e.g., upper limits for non-detections) must be computed individually, by specific inspection of the spectrum noise properties of the data cubes and their associated "weights grid" and the continuum maps. It is the availability of such ancillary information which enables the use of the full ALFALFA data set for stacking (Fabello et al. 2011) to probe statistical ensembles more deeply.

As the previous generation blind H i survey, HIPASS (Meyer et al. 2004) set the standard for survey completeness; by design, ALFALFA was intended to surpass and supersede HIPASS. A reasonable comparison of the impact of the different source detection schemes (including the absolute level of flux density sensitivity) may be made by comparing the distribution of the highest-mass galaxies in HIPASS and ALFALFA. For example, one might have anticipated that the original HIPASS peak-flux density detection scheme (Meyer et al. 2004) could bias the catalog against edge-on (extremely wide) profiles at the highest masses, and such a bias could explain the finding of Martin et al. (2010) that the HIPASS HIMF underestimated the number density of the highest-mass galaxies. Figure 13 shows a comparison of the distribution of profile widths in α.40 (open histogram) and HIPASS (filled histogram), for objects with log $M_{{\rm H\,\mathsc{i}}}/M_{\odot } > 10.0$. No obvious difference which would explain a lack of high-mass sources in the HIPASS catalog is apparent. While the peak-flux density threshold detection could introduce such a bias, it is apparent that the matched filtering technique subsequently applied to the HIPASS data set recovers high-width objects as does the technique used in ALFALFA. Instead, we attribute the lack of extremely high-mass sources in the HIPASS catalog to that survey's limited redshift extent and its lowered sensitivity near its bandpass redshift limit, both of which resulted in inadequate sampled volume and thus an undercounting of the rare, highest-mass H i disks.

Figure 13. Distribution of profile widths in α.40 (open histogram) and HIPASS (filled histogram) for objects with log $M_{{\rm H\,\mathsc{i}}}/M_{\odot } > 10.0$.

Furthermore, because of its lower sensitivity, poorer angular and spectral resolution and source detection scheme, HIPASS was limited in its ability to probe the very low mass and narrow-width H i sources. The spectrometer setup employed by HIPASS yielded a raw resolution of 13.2 km s−1 and of 18 km s−1 after Hanning smoothing; the narrowest objects included in the HIPASS catalog have W50 = 30 km s−1. In contrast, ALFALFA's velocity resolution is 11 km s−1 after smoothing is applied, and the α.40 catalog thus includes sources with extremely narrow velocity widths. Although the signal extraction algorithm adopted by Saintonge (2007a) applied a minimum template width of 30 km s−1, the refined final process of parameter extraction based on individual examination of the emission region permits finer width estimation. In fact, 289 of the extragalactic objects included in Table 1 have W50 < 30 km s−1. Figure 14 examines the distributions of low H i mass systems and their profile velocity widths in the two surveys; ALFALFA is clearly superior in its ability to probe the lowest-mass systems. This increased sensitivity to very narrow H i line emission enhances ALFALFA's ability to probe the lowest H i masses, which in turn robustly constrains the faint-end slope of the H i mass function, α. In fact, at the lowest H i masses, $\log M_{{\rm H\,\mathsc{i}}}/M_{\odot } <$ 8.0, the HIPASS catalog includes only 40 objects while the α.40 catalog contains 339. The ability of ALFALFA to sample narrower H i line sources is also critical for the derivation of the H i width function and its relation to the halo mass function (Papastergis et al. 2011).

Figure 14. Distribution of profile widths W50 in ALFALFA (open circles) and HIPASS (filled circles, enlarged for visual clarity), for objects with log $M_{{\rm H\,\mathsc{i}}}/M_{\odot } < 8.0$. The overplotted horizontal dashed line shows the profile width cutoff at 30 km s−1, the limit for inclusion in the HIPASS catalog.

7. THE IMPACT OF ALFALFA SURVEY CHARACTERISTICS ON DERIVATION OF THE HIMF

In drawing conclusions from blind H i surveys about the H i-selected population in the local universe, it is critical to understand the biases in the survey due to its sensitivity limits, the uncertainties in the H i line flux densities and distances that lead to uncertainties in the derived H i masses, and the impact of large-scale structure in the survey volume. Toribio et al. (2011b) used a subsample of ALFALFA H i sources identified in low-density environments to establish a standard of normal H i content and performed an analysis of the completeness of the particular version of the ALFALFA catalog they used. Martin et al. (2010; see also Martin 2011) provided an overview of important effects that impact the derivation of the HIMF by two methods commonly used to derive mass and luminosity functions, namely the 1/Vmax method and the two-dimensional stepwise maximum likelihood (2DSWML) method. In the context of applications such as the derivation of the HIMF by those two methods, we discuss here in greater detail the magnitude and character of the α.40 survey properties, limitations, and biases. It is particularly important to understand these effects now because we anticipate the "100% ALFALFA survey" to be available in the near future. The large increase in the number of galaxies available for that analysis will decrease the statistical uncertainties on the measurements, thus amplifying the relative impact of systematics and biases. Additionally, at that stage it will be less practical to create thousands of realizations to help understand the various effects. The results presented in this section will provide a baseline and dictate procedure for the final measurement of the H i mass function from the completed ALFALFA survey.

7.1. The Limits of ALFALFA: Code 2 "Prior" Sources and the RFI-imposed Redshift Cutoff

Selection effects related to the Code 2 sources in α.40 are poorly determined. Because they require redshift information derived from other sources, they are subject to the limitations of the availability of such confirming data. Additionally, ALFALFA's sensitivity as a function of distance is strongly affected by RFI, especially in the frequency range contaminated by the aviation radar at the San Juan airport (1350 MHz, corresponding to cz ∼ 15,600 km s−1). For these reasons, Martin et al. (2010) included only objects with Code 1, detected within 15,000 km s−1. Yet it may be argued that the additional information contained in Code 2 sources, dipping to lower flux limits, could provide additional insight. A first evaluation of the value added to the HIMF by Code 2 sources relates to the observation that most Code 2 sources fall near $M^*_{{\rm H\,\mathsc{i}}}$, a region of the HIMF well sampled by Code 1 sources: the value added is thus likely to be negligible. We explore this expectation numerically.
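The correspondence between the radar frequency and the contaminated velocity range follows from the redshift of the 21 cm line. A minimal sketch, assuming the standard rest frequency of the H i line and the optical convention cz = c (f_rest/f_obs − 1); the function name is illustrative and not part of any ALFALFA software:

```python
# Standard physical constants (not taken from the paper).
HI_REST_FREQ_MHZ = 1420.405751  # rest frequency of the 21 cm H I line
C_KM_S = 299792.458             # speed of light in km/s


def cz_from_frequency(freq_mhz: float) -> float:
    """Return cz in km/s for the H I line observed at freq_mhz.

    Uses z = f_rest / f_obs - 1 (optical velocity convention).
    """
    z = HI_REST_FREQ_MHZ / freq_mhz - 1.0
    return C_KM_S * z


# Frequency of the airport radar quoted in the text.
radar_cz = cz_from_frequency(1350.0)
print(f"cz at 1350 MHz: {radar_cz:.0f} km/s")
```

The result lies close to the ∼15,600 km s−1 quoted above, just beyond the 15,000 km s−1 cutoff adopted by Martin et al. (2010).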

7.1.1. The Code 2 Sources

Because of the requirement that Code 2 sources be identified with an OC of known (prior) redshift, most often contributed by optical/IR surveys like SDSS, those sources may be biased toward overdensities, toward those regions of the local volume that have been included in specific targeted or wide-area redshift surveys, such as the Virgo Cluster, and in particular toward those regions of the sky that have been covered in the spectroscopic catalogs of the SDSS.

Does the inferred HIMF change if Code 2 sources are included in its derivation? We account for H i mass and flux density errors by creating 500 realizations of an HIMF that includes sources of both Code 1 and Code 2, and compare those to 100 realizations of the fiducial HIMF published in Martin et al. (2010) which contained only the Code 1 objects. We use the 2DSWML method, but do not jackknife resample. As did Martin et al. (2010), we restrict the analysis to the contiguous areas contained in α.40 and limited to cz < 15,000 km s−1. Over the same volume, the inclusion of Code 2 sources increases the sample size used for this analysis from the 10,021 included in Martin et al. (2010) to 11,177.

Figure 15 displays the H i mass function found when Code 2 sources are included in the analysis. The parameters of the function are not strongly affected by the inclusion of these sources. We find ϕ* = (4.8 ± 0.3) × $10^{-3}\ h_{70}^{3}$ Mpc$^{-3}$ dex$^{-1}$, $\log (M_*/M_{\odot }) + 2 \log h_{70}$ = 9.96 ± 0.02, and α = −1.29 ± 0.02. These correspond to $\Omega _{{\rm H\,\mathsc{i}}}$ = (4.1 ± 0.3) × $10^{-4}\ h_{70}^{-1}$ found by integrating the Schechter function, or $\Omega _{{\rm H\,\mathsc{i}}}$ = (4.2 ± 0.1) × $10^{-4}\ h_{70}^{-1}$ when summing the binned measurements directly. The fiducial HIMF which includes only Code 1 objects as reported by Martin et al. (2010) finds ϕ* = 4.8 ± 0.3, $\log (M_*/M_{\odot })$ = 9.96 ± 0.02, α = −1.33 ± 0.02, $\Omega _{{\rm H\,\mathsc{i}}}$ (analytical) = 4.3 ± 0.3, and $\Omega _{{\rm H\,\mathsc{i}}}$ (summed) = 4.4 ± 0.1, all in the same units as expressed for the results with both Code 1 and 2 sources.
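The analytical values of $\Omega _{{\rm H\,\mathsc{i}}}$ quoted above can be checked directly from the Schechter parameters. The sketch below assumes the per-dex Schechter form ϕ(M) = ln(10) ϕ* (M/M*)^(α+1) exp(−M/M*), for which the integrated H i mass density is ϕ* M* Γ(α + 2), and the standard critical density of 2.775 × 10^11 h^2 M⊙ Mpc−3. The function name and the choice h70 = 1 (h = 0.7) are illustrative assumptions, not a description of the pipeline actually used:

```python
import math

# Critical density in units of h^2 Msun Mpc^-3 (standard value, assumed here).
RHO_CRIT_H2 = 2.775e11


def omega_hi_schechter(phi_star, log_m_star, alpha, h=0.7):
    """Omega_HI obtained by integrating a Schechter HIMF analytically.

    phi_star   : normalization in Mpc^-3 dex^-1 (for h70 = 1)
    log_m_star : log10 of the turnover mass in solar masses
    alpha      : faint-end slope
    Assumes rho_HI = phi* x M* x Gamma(alpha + 2), valid for alpha > -2.
    """
    rho_hi = phi_star * 10.0 ** log_m_star * math.gamma(alpha + 2.0)
    return rho_hi / (RHO_CRIT_H2 * h * h)


# Parameters quoted in the text for the two samples.
omega_code1 = omega_hi_schechter(4.8e-3, 9.96, -1.33)   # Code 1 only
omega_code12 = omega_hi_schechter(4.8e-3, 9.96, -1.29)  # Code 1 + Code 2
print(f"Code 1:     {omega_code1:.2e}")
print(f"Code 1 + 2: {omega_code12:.2e}")
```

With ϕ* and M* unchanged, the flatter slope alone lowers the integrated density by a few percent, in line with the small shift between the two quoted values.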

Figure 15. HIMF found via the 2DSWML method (without jackknife resampling) when Code 2 sources are included. The best-fit Schechter function is overplotted as a dashed line, with the best-fit parameters displayed. While $\Omega _{{\rm H\,\mathsc{i}}}$ and the overall Schechter function shape are not changed, the inclusion of the additional sources does slightly flatten the faint-end slope compared to results obtained using only Code 1 sources (Table 5).

Encouragingly, these results indicate that the ALFALFA survey's detection coding scheme does not systematically exclude significant sources of H i gas energy density in the local universe. Rather, the agreement between the Code 1 and the Code 1+2 HIMFs suggests that our robust understanding of the survey's sensitivity extends to those weaker sources identified as Code 2 objects.

The only potentially significant impact is on the faint end of the HIMF, influencing both the slope and the points there. The slope parameter α is flattened in the Code 1+2 case, though the 1σ uncertainty ranges of the two values just barely touch. In Figure 16, we compare the residuals (the best-fit, fiducial HIMF Schechter model, subtracted from the binned data) for the case where we consider only Code 1 objects (top panel) and the Code 1 + 2 case (bottom panel). The figure clearly demonstrates that the Code 1 + 2 HIMF measured fewer low-mass objects per unit volume, thereby yielding a flatter slope. This is unsurprising because, in comparison to H i surveys, optical surveys undersample dwarf, low surface brightness galaxies. The typical Code 2 detection is a galaxy near L*, and the redshift distribution of Code 2 sources lacks the smattering of low-redshift objects present in an H i-selected sample. As a result, Code 2s add very few additional sources at the lowest redshifts, at which low H i masses are detectable. This is an example of the fact that adding Code 2 sources to the sample is more likely to subtract than to add value to the result: "more is less."

Figure 16. Residuals (best-fit Schechter model subtracted from binned data) of H i mass functions calculated using only Code 1 sources (top) and both Code 1 and 2 sources (bottom). In both cases, the comparison model is the fiducial, Code 1-only Schechter function given by Martin et al. (2010). The zero-residual reference line is overplotted as a dashed line.

We note that in the process of source extraction, a second set of marginal H i line detections has been identified which coincide with possible OCs for which no redshift measurement is available. Because the reliability of these detections is still too uncertain, they are not included in the current α.40 catalog. Follow-up observations, to be made after the main survey is completed, will be undertaken to confirm the reality of the H i line detections. This program will contribute additional low H i line flux density sources to the final ALFALFA catalog in this region of the sky.

7.1.2. The Full Redshift Extent of the ALFALFA Survey

(Unfortunately) we live on a planet occupied by technologically active humans. Figure 6 of Martin et al. (2010) illustrates the relative spectral weight within the 40% ALFALFA survey volume as a function of observed heliocentric velocity. A relative weight of 1.0 indicates that the entire surveyed volume was accessible for source extraction and produced high-quality data. As is also evident in the deficit of sources near a distance of ∼225 Mpc seen in Figure 3, the FAA radar at the San Juan Airport contaminates the frequencies corresponding to sources at cz between 15,000 and 16,000 km s−1, rendering the detection of sources in this range impossible when the transmitter is on. Beyond 16,000 km s−1, ALFALFA's sensitivity recovers, but at the corresponding distance, it is sensitive only to the most massive galaxies. As a result, this distant volume contributes only a small number of galaxies to the overall α.40 sample.

For these reasons, the analysis of the HIMF in Martin et al. (2010) neglected galaxies beyond 15,000 km s−1, so that the results would not be influenced by the large spectral weight gap. This exclusion was especially important in the case of the 2DSWML method, since the 1/Vmax method allows the inclusion of explicit corrections for known missing volumes. 2DSWML, by contrast, determines the shape of the HIMF by comparing counts in H i mass bins to a built-in description of ALFALFA's flux density sensitivity as a function of distance and width. The large gap, which is not anticipated by this approach, may have caused problems in the analysis were those objects included. Martin et al. (2010) felt it was safer to limit the first measurement of the HIMF to regions where the spectral weights are relatively smooth, that is, to galaxies within 15,000 km s−1. Here, we revisit the issue and consider the influence, if any, of including the full redshift extent of the survey in the HIMF analysis.

In particular, we would expect that the increased bin counts at the very highest H i masses may increase the statistical significance of the HIMF measurement there. Such a possibility is of interest because Martin et al. (2010) determined that ALFALFA is more sensitive to high-mass galaxies than HIPASS was, with HIMF results indicating that previous blind H i surveys have missed a significant percentage of the most massive H i disks. To test this possibility, we calculated the HIMF using both methods and using all of the Code 1 sources out to 18,000 km s−1. For each method, we created 250 realizations of the survey to account for flux density, distance, and mass errors. The fit parameters and $\Omega _{{\rm H\,\mathsc{i}}}$ values are displayed in Table 5 for both the 1/Vmax and 2DSWML methods. It is worth noting that the 2DSWML result is distorted, likely because of the influence of the inaccessible volume and the inability of this method to correct for it. In fact, the 2DSWML result drastically underestimates $\Omega _{{\rm H\,\mathsc{i}}}$, shifts $\log (M_*/M_{\odot })$ to a higher value, and flattens out the low-mass slope α. On the other hand, the 1/Vmax method continues to function as expected and results in a reliable measurement. Both of the 1/Vmax results are consistent with each other, including the measured values for $\Omega _{{\rm H\,\mathsc{i}}}$, but the 2DSWML method in the presence of the redshift gap performs poorly. This result confirms the decision by Martin et al. (2010) to limit their analysis to the volume within cz < 15,000 km s−1.

Table 5. H i Mass Function Fit Parameters by Redshift Extent

| Sample and Fitting Function | α | ϕ* ($10^{-3}\ h_{70}^{3}$ Mpc$^{-3}$ dex$^{-1}$) | $\log (M_*/M_{\odot }) + 2 \log h_{70}$ | $\Omega _{{\rm H\,\mathsc{i}}}$, fit ($\times 10^{-4}\ h_{70}^{-1}$) | $\Omega _{{\rm H\,\mathsc{i}}}$, points ($\times 10^{-4}\ h_{70}^{-1}$) |
|---|---|---|---|---|---|
| 1/Vmax, 15,000 km s−1 (a) | −1.33 (0.04) | 3.1 (0.6) | 9.95 (0.05) | | 4.4 (0.1) |
| 1/Vmax, 18,000 km s−1 (a) | −1.34 (0.03) | 3.8 (0.6) | 9.92 (0.04) | | 4.3 (0.1) |
| 2DSWML, 15,000 km s−1 | −1.34 (0.02) | 4.7 (0.3) | 9.96 (0.01) | 4.3 (0.3) | 4.4 (0.1) |
| 2DSWML, 18,000 km s−1 | −1.26 (0.02) | 3.4 (0.2) | 10.00 (0.01) | 3.0 (0.2) | 3.1 (0.1) |

Note. (a) In the 1/Vmax case, pure Schechter functions provide a poor fit to the faint-end slope α, and the sum of a Schechter and a Gaussian function is used to complete the fit. The Gaussian component parameters are not shown in the table, given that they are not expected to be physical.

This analysis provides further evidence of the relative strengths and weaknesses of the two available methods for estimation of the HIMF. While the 2DSWML approach provides a powerful statistical tool, it functions as a "black box" method that cannot be manipulated by additional knowledge of the survey. In some cases, this may be an advantage, but in the case of ALFALFA where we have detailed information about the survey volume, the survey sensitivity, and other factors contributing to the HIMF, the 1/Vmax method provides a clearer path and a more understandable answer.
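One way to see how the 1/Vmax method can absorb a known gap in spectral coverage is to weight each volume element by the fraction of the survey that was usable at that velocity. The sketch below is a toy illustration under stated assumptions: a Euclidean volume element, a pure Hubble-flow mapping d = cz/H0 with H0 = 70 km s−1 Mpc−1, a nominal solid angle of 2800 deg2, and a hypothetical weight function that drops to zero between 15,000 and 16,000 km s−1. The real spectral weights (Figure 6 of Martin et al. 2010) vary more smoothly, and none of these function names come from the ALFALFA pipeline:

```python
import math

H0 = 70.0  # km/s/Mpc, the ALFALFA standard value


def accessible_volume(d_max_mpc, solid_angle_sr, weight, n_steps=20000):
    """Weighted survey volume (Mpc^3) out to d_max_mpc.

    Integrates weight(cz) * d^2 over distance with the midpoint rule,
    then multiplies by the solid angle (Euclidean geometry assumed).
    """
    step = d_max_mpc / n_steps
    total = 0.0
    for i in range(n_steps):
        d = (i + 0.5) * step
        total += weight(H0 * d) * d * d * step
    return solid_angle_sr * total


def full_weight(cz):
    """Idealized survey with no missing volume."""
    return 1.0


def radar_gap_weight(cz):
    """Hypothetical weight: nothing is detectable inside the radar band."""
    return 0.0 if 15000.0 <= cz <= 16000.0 else 1.0


SOLID_ANGLE = 2800.0 * (math.pi / 180.0) ** 2  # deg^2 converted to steradians
d_max = 18000.0 / H0                           # Mpc

v_full = accessible_volume(d_max, SOLID_ANGLE, full_weight)
v_gap = accessible_volume(d_max, SOLID_ANGLE, radar_gap_weight)
lost_fraction = 1.0 - v_gap / v_full
print(f"fraction of volume lost to the gap: {lost_fraction:.3f}")
```

Under these assumptions, a source detectable to 18,000 km s−1 loses roughly an eighth of its accessible volume to the gap, so its 1/Vmax weight increases accordingly. A maximum likelihood scheme that assumes a smooth sensitivity function has no equivalent place to insert that information.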

7.2. Uncertainties in the H i Mass

On the low H i mass end, uncertainties in the conversion from H i line flux density to H i mass are the primary source of error on the HIMF. Unlike the practice in the derivation of most HIMF results in the literature, the error analysis undertaken here and by Martin et al. (2010) has taken this explicitly into account. Because the HIMF is based on binning galaxies by H i mass and then considering each bin as an independent data point, it is not straightforward to carry H i mass uncertainties through analytically. Instead, the ALFALFA HIMF's uncertainties due to mass errors are calculated through the creation of many hundreds of realizations, each with randomly assigned mass (i.e., distance and flux density) errors. Here, we elaborate further on the distance estimate scheme used in ALFALFA, the biases that would be introduced by using alternative schemes (i.e., a pure Hubble flow model) and the overall impact of distance and flux density errors on the H i mass estimates used to construct the HIMF.

The distance estimation scheme adopted for ALFALFA was described by Martin et al. (2010) and is summarized briefly here. When distances are based on the adopted flow model, we employ the model's error estimates, constrained by the fit of the model to the observed velocity field. When distances are estimated using pure Hubble flow, the error is estimated to be ∼10%. We fix a minimum error of 163 km s−1, based on the local velocity dispersion measured by Masters (2005). To demonstrate the importance of using the full suite of available information to estimate distances, Figure 17 compares the primary distances (used in α.40) to the values that would be obtained assuming pure Hubble flow.

Figure 17. Primary distances from the literature vs. estimates based only on pure Hubble flow, with the ALFALFA distance uncertainty estimates overplotted. The dashed line indicates a one-to-one correlation.

In their estimate of galaxy masses for the HIMF, the HIPASS team assumed Hubble flow. This is not a safe assumption, particularly in the regions of the sky surveyed in α.40. The Virgo Cluster, in particular, represents a strong deviation from any assumed relationship between distance and recessional velocity. Masters et al. (2004) showed the danger of assuming pure Hubble flow, especially because of the small redshifts accessible to blind H i surveys. Those authors concluded that the low-mass slope of the HIPASS HIMF was underestimated due to neglecting peculiar velocities, and predicted that a survey in the direction of Virgo could severely underestimate the low-mass slope.

Given the large-scale structure in the α.40 volume, we would expect the HIMF to vary strongly if a poor choice of distance estimate were made. In order to test this, we have re-calculated the 2DSWML estimate of the HIMF using pure Hubble flow to estimate distances. That is, we converted the observed heliocentric velocities into the CMB rest frame, and then assumed $D_{\rm Mpc} = cz_{\rm cmb}/H_0$, where we adopt the ALFALFA standard $H_0$ = 70 km s−1 Mpc−1. In this case, we have no ideal estimate of the distance error, and therefore use 10% of the Hubble flow distance or the local dispersion value 163 km s−1, whichever is greater. As usual, flux density errors are also folded into the mass uncertainties. Once again, we create 250 realizations to estimate uncertainties.
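The pure Hubble-flow distance and its adopted uncertainty reduce to a few lines. This is a minimal sketch of the rule just described, assuming that the 163 km s−1 dispersion floor is converted to a distance by dividing by H0 (the text does not spell out the conversion); the function name is illustrative:

```python
H0 = 70.0                      # km/s/Mpc, the ALFALFA standard value
LOCAL_DISPERSION_KM_S = 163.0  # local velocity dispersion floor from the text


def hubble_flow_distance(cz_cmb_km_s):
    """Return (distance, error) in Mpc for a CMB-frame velocity.

    The error is the larger of 10% of the distance and the distance
    equivalent of the local velocity dispersion (assumed to be 163 / H0).
    """
    distance = cz_cmb_km_s / H0
    error = max(0.10 * distance, LOCAL_DISPERSION_KM_S / H0)
    return distance, error


for cz in (1000.0, 7000.0):
    d, err = hubble_flow_distance(cz)
    print(f"cz = {cz:6.0f} km/s -> D = {d:6.1f} +/- {err:4.1f} Mpc")
```

With these numbers the dispersion floor dominates below cz = 1630 km s−1, which is the regime where the lowest H i masses are detected and where the choice of distance model matters most.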

The resulting H i mass function and Schechter fit parameters are displayed in Figure 18. As anticipated (Masters et al. 2004), the use of Hubble flow has caused a serious underestimate of the faint-end slope α. ALFALFA's success at robustly measuring the HIMF depends not only on large sample size over a cosmologically significant volume, but also on the selection of a reasonable model for distance estimation.

Figure 18. H i mass function obtained via the 2DSWML method when distances, and therefore masses, are obtained assuming pure Hubble flow with $H_0$ = 70 km s−1 Mpc−1. As anticipated by Masters (2005), the adoption of pure Hubble flow yields an underestimate of the low H i mass slope α.

Given the discussion of distance errors and their large impact on the HIMF and its uncertainties by Masters et al. (2004), it is reasonable to ask how large the H i mass errors are when both distance and flux density errors in the α.40 sample are taken into account. To obtain robust estimates of H i mass errors, we created many thousand realizations of each galaxy in the α.40 sample and applied distance and flux density errors. The result, displayed in Figure 19, compares the HIMF mass bin into which galaxies would nominally fall assuming a perfect measurement of distance and flux density (along the abscissa) to the mean mass of the galaxies assigned to that bin once realistic uncertainties are taken into account. The horizontal uncertainties indicate the 1σ spread of potential "true" masses falling into nominally assigned mass bins. From the figure, it is clear that ALFALFA's measurement of the HIMF and $\Omega _{{\rm H\,\mathsc{i}}}$ is not prone to large uncertainties above $10^{8.0}\,M_{\odot }$. In the mass range of interest to the missing satellites problem, dwarf galaxy studies, and the low H i mass slope of the HIMF, that is, below $10^{8.0}\,M_{\odot }$, galaxies can easily be misassigned to bins, even when a realistic distance model is being used. Depending on the large-scale structure in the survey volume, this effect would lead to either an overestimate or underestimate of α. We therefore take great care to account, conservatively, for mass uncertainties.
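The realization procedure can be illustrated for a single galaxy. The sketch below assumes the standard optically thin relation M_HI = 2.356 × 10^5 D^2 S (D in Mpc, S in Jy km s−1) and independent Gaussian errors on distance and flux density; the two example galaxies, their error values, and the function names are hypothetical and chosen only to contrast a nearby dwarf with a more distant source:

```python
import math
import random
import statistics


def hi_mass(distance_mpc, flux_jy_km_s):
    """H I mass in solar masses from the standard optically thin relation."""
    return 2.356e5 * distance_mpc ** 2 * flux_jy_km_s


def log_mass_realizations(d, d_err, s, s_err, n=5000, seed=1):
    """Draw n realizations of log10(M_HI) with Gaussian distance and flux errors."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        d_i = rng.gauss(d, d_err)
        s_i = rng.gauss(s, s_err)
        if d_i > 0.0 and s_i > 0.0:  # discard unphysical draws
            values.append(math.log10(hi_mass(d_i, s_i)))
    return values


# Hypothetical nearby dwarf: the 163 km/s floor gives ~2.3 Mpc error at 10 Mpc.
near = log_mass_realizations(d=10.0, d_err=163.0 / 70.0, s=1.0, s_err=0.1)
# Hypothetical distant galaxy with a 10% distance error.
far = log_mass_realizations(d=100.0, d_err=10.0, s=1.0, s_err=0.1)

spread_near = statistics.stdev(near)
spread_far = statistics.stdev(far)
median_near = statistics.median(near)
print(f"nearby dwarf: median log M = {median_near:.2f}, scatter = {spread_near:.2f} dex")
print(f"distant source: scatter = {spread_far:.2f} dex")
```

Because the mass scales as D^2, the fractional distance error enters twice, so the nearby low-mass source scatters across a noticeably wider range in log mass than the distant one even though both have the same fractional flux error.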

Figure 19. Average (mean) mass falling into each HIMF bin. The estimated 1σ uncertainty of a galaxy's H i mass is overplotted as error bars, along with a dotted line indicating a one-to-one relationship.

7.3. The Impact of Large-scale Structure in α.40

Because blind H i surveys are relatively shallow, with ALFALFA probing the local universe only out to z ∼ 0.06, inhomogeneity in the survey volume has a strong impact on the derived H i mass function. This effect is particularly true in the case of the 1/Vmax method, which is not as robust against large-scale structure, but the 2DSWML method is not completely immune from these effects. To test the homogeneity of a sample, the usual statistical test applied is the V/Vmax test (Schmidt 1968). Much like the 1/Vmax method, this test considers the maximum volume out to which each source in a survey can be detected. By comparing the actual volume the source was detected in to the accessible volume, homogeneity in the sample can be evaluated; the expectation value 〈V/Vmax〉 is 0.5 in a homogeneous volume.
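The behavior of the V/Vmax statistic can be demonstrated with a toy sample. The sketch below assumes Euclidean volumes, so that V/Vmax = (d/d_max)^3, and compares a homogeneous mock sample with one in which a fraction of the sources has been moved into a nearby clump. The clump distance, the clumped fraction, and the range of d_max are arbitrary illustrative choices, not properties of α.40:

```python
import random


def mean_v_over_vmax(distances, max_distances):
    """Mean of V/Vmax for sources at `distances` detectable out to `max_distances`."""
    ratios = [(d / d_max) ** 3 for d, d_max in zip(distances, max_distances)]
    return sum(ratios) / len(ratios)


rng = random.Random(42)
n_sources = 20000

# Each mock source has its own maximum detectable distance (Mpc).
d_max = [rng.uniform(20.0, 250.0) for _ in range(n_sources)]

# Homogeneous case: sources uniform in volume within their own d_max.
homogeneous = [dm * rng.random() ** (1.0 / 3.0) for dm in d_max]

# Clustered case: move 15% of the sources into a toy overdensity at 17 Mpc.
clustered = list(homogeneous)
for i in range(int(0.15 * n_sources)):
    clustered[i] = 17.0

mean_homogeneous = mean_v_over_vmax(homogeneous, d_max)
mean_clustered = mean_v_over_vmax(clustered, d_max)
print(f"homogeneous: <V/Vmax> = {mean_homogeneous:.3f}")
print(f"clustered:   <V/Vmax> = {mean_clustered:.3f}")
```

The homogeneous sample returns a mean close to 0.5, while concentrating sources in a nearby structure pulls the mean below 0.5, the same sense of departure as a nearby overdensity would produce in a real survey.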

In the case of the α.40 volume, 〈V/Vmax〉 = 0.45. This indicates that, at 40% completion, the survey does not yet contain enough volume to fully "smooth out" the effects of large-scale structure. This is reflected in Figure 20 where 〈V/Vmax〉 is shown for each bin of H i mass. The most obvious feature in this figure, the dip near log ($M_{{\rm H\,\mathsc{i}}}/M_{\odot }$) ∼ 8.4, is due to overdensities in the sample volume, primarily the Virgo Cluster. Galaxies of that mass are found preferentially within those overdense regions, rather than filling the full volume over which ALFALFA's sensitivity could detect them, and this concentration causes the dip.

Figure 20. Typical (mean) value of V/Vmax, binned by H i mass. Error bars are Poisson counting uncertainties. The solid line indicates 〈V/Vmax〉 = 0.5, while the dashed line indicates 〈V/Vmax〉 = 0.45 for the α.40 sample.

It is clear that α.40 does not, yet, constitute a representative slice of the universe; as the survey progresses, we anticipate that the full sample will pass the V/Vmax test. Another, perhaps more intuitive, way to view the impact of voids and clusters in α.40 is to compare the redshift distribution of cataloged galaxies to the prediction based on the survey's selection function (i.e., the percentage of galaxies at a given distance that are detectable in ALFALFA). The selection function is determined by the 2DSWML analysis of the HIMF, and when combined with the measurement of the HIMF, predicts the redshift distribution for a homogeneously distributed set of galaxies selected from the HIMF.

Figure 21 compares this expectation to the observations in α.40. The bumps and dips in the histogram represent overdensities and underdensities, respectively, in the survey volume. For example, the Virgo Cluster explains the enhancement near 1000 km s−1. The Pisces–Perseus supercluster and its foreground void also make clear imprints in this figure.

Figure 21. Observed redshift distribution of α.40 galaxies (histogram) compared to the expected distribution obtained via the survey's selection function.

7.3.1. Subregions of the α.40 Catalog

If α.40 does not constitute a representative sampling of the universe, then statistical studies of the sample's characteristics, like the HIMF, may be subject to biases from large-scale structure. Because of the catalog's size, we can assess the impact of large-scale structure by examining separate subregions of it. The α.40 sample is made up of three large, contiguous areas. In the Northern Galactic hemisphere, α.40 covers 07h30m < α < 16h30m in two separate blocks: 4° < δ < 16° and 24° < δ < 28°. We refer to these subregions as α.40.North1 and α.40.North2, respectively. In the Southern Galactic hemisphere, α.40 covers 22h < α < 03h, 24° < δ < 32°, referred to as α.40.South. Taken as a whole, α.40 covers enough cosmological volume for the effects of large-scale structure on the derivation of the HIMF to begin to become minimal, but reducing its coverage further leads to a situation in which the HIMFs derived for individual subregions are strongly affected by overdensities and underdensities within their volume.

Figure 22 displays the HIMFs for the three subregions: α.40.North1, North2, and South, from top to bottom. The fit parameters and values of $\Omega _{{\rm H\,\mathsc{i}}}$ are given in Table 6 along with the fiducial 2DSWML HIMF for the entire α.40 sample for comparison. The largest subregion by a significant margin is α.40.North1, which contributes over 50% of the ∼10,000 galaxies in the α.40 HIMF sample. As expected, the HIMF for this region, when isolated, follows the HIMF for the sample as a whole. Because of the large volume in this region, the HIMF displayed in the top panel of Figure 22 is smooth and featureless.

Figure 22. HIMF estimated for separate subregions of the α.40 catalog via the 2DSWML method with Schechter fit parameters. Top panel: results for the α.40.North1 region. Middle panel: same, for the α.40.North2 region. Bottom panel: same, for the α.40.South sample. See Table 6 for further quantitative details.

Table 6. 2DSWML HIMF Schechter Parameters by Region

| Sample and Fitting Function | α | ϕ* ($10^{-3}\ h_{70}^{3}$ Mpc$^{-3}$ dex$^{-1}$) | $\log (M_*/M_{\odot }) + 2 \log h_{70}$ | $\Omega _{{\rm H\,\mathsc{i}}}$, fit ($\times 10^{-4}\ h_{70}^{-1}$) | $\Omega _{{\rm H\,\mathsc{i}}}$, points ($\times 10^{-4}\ h_{70}^{-1}$) |
|---|---|---|---|---|---|
| North1 | −1.35 (0.02) | 4.4 (0.3) | 9.98 (0.02) | 4.3 (0.4) | 4.4 (0.1) |
| North2 | −1.25 (0.04) | 5.6 (0.6) | 9.92 (0.02) | 4.2 (0.5) | 4.3 (0.2) |
| South | −1.30 (0.04) | 4.1 (0.5) | 9.96 (0.3) | 3.6 (0.5) | 3.5 (0.2) |
| Whole α.40 | −1.34 (0.02) | 4.7 (0.3) | 9.96 (0.01) | 4.3 (0.3) | 4.4 (0.1) |

In the case of the smaller samples in the middle and bottom panels of Figure 22, features due to large-scale structure are clearly visible. Because of the inhomogeneity of the surveyed volume, the HIMFs do not follow the prescribed Schechter function. In the case of the α.40.North2 subsample, the faint-end slope is better fit on its own, in which case it is measured to be α = −1.4 ± 0.1.

In every case, the "bumps" and wiggles in the subregion HIMFs correspond to the cone diagram distributions in Martin et al. (2010). In essence, the combination of the ALFALFA survey's sensitivity and the scaling of survey volume with redshift leads to preferred distances for each of the H i mass bins in the HIMF (or preferred H i masses for every distance in the survey). A dip, for example, in the HIMF corresponds to an overdensity at the preferred distance for those H i mass scales. While the 2DSWML method has been designed to be less sensitive to large-scale structure, the volumes of these subregions are too small for these effects to average out.

Such techniques can only work with the data they are given, but the 1/Vmax approach allows for explicit correction for known structures. When these corrections are included in the 1/Vmax analysis of these subregions, the (unshown) results are very similar to those provided here. These corrections are based on the IRAS point-source catalog redshift survey (PSCz; Branchini et al. 1999) density correction (see Section 7.3.2), but imperfections in this correction lead to the same bump and dip features. An additional weakness of the 1/Vmax density correction is that the counts can only be increased for galaxies that do end up in the sample, making the correction significantly less useful in voids. By contrast, 2DSWML essentially "self-corrects" for overdense and underdense regions. Rather than looking at volumes and scaling counts by 1/Vmax, 2DSWML constructs the relationship between bins by scaling the counts themselves and therefore automatically scales the HIMF downward for regions that are overdense and upward for regions that are underdense.

This consideration of subregions within α.40 makes clear the impact that large-scale structure can have on blind H i surveys and the importance of cosmologically significant volumes before global conclusions can be drawn.

7.3.2. Large-scale Structure Correction from Previous Surveys

As described in Martin et al. (2010), the 1/Vmax method can be corrected to account for large-scale structure in the survey volume. Essentially, overdense regions are considered to represent more effective volume (Σ1/Veff, rather than Σ1/Vmax) and vice versa for underdense regions, so that galaxies in various environments are weighted appropriately (Springob et al. 2005a; Rosenberg & Schneider 2002).

While this correction is successful, it does rely on data sets external to the ALFALFA survey. In Martin et al. (2010), the density map derived from the PSCz (Branchini et al. 1999) was used to correct for large-scale structure. However, other options exist, in particular other PSCz maps (smoothed to different levels) and the density reconstruction derived from the 2MASS Redshift Survey (2MRS; Erdoğdu et al. 2006). The large-scale structure correction used has a large influence on the final HIMF estimate; a ∼20% effect on the Schechter parameters, compared to neglecting the density correction, was reported in Martin et al. (2010). Given the magnitude of the effect, it is important to consider the impact that a different choice would make. In particular, since this portion of HIMF analysis is likely to always rely on external information, examining it here may be helpful in the future for the 100% ALFALFA sample.

The parameter of interest reported by PSCz is the overdensity δ, defined relative to the average number density of galaxies found in those surveys:

$\delta = \frac{n - \bar{n}}{\bar{n}}$ (8)

In the case of the ALFALFA survey and the HIMF, we are primarily interested in the average value of δ interior to each source's maximum detectable distance or, in other words, the average value of δ in the volume over which the source could have been detected. Both the 2MRS and PSCz density maps report the value of δ in equal-volume cells throughout their survey volume. The PSCz map was chosen because of its greater sensitivity in the nearby survey regions of α.40, where the HIMF was especially vulnerable to the impact of large-scale structure.

While PSCz was a good choice for the analysis of the α.40 HIMF, there are actually several choices of maps available from Branchini et al. (1999), with the primary differences being the smoothing size of each volume cell and the maximum distance out to which the density fields were reconstructed. In Martin et al. (2010), the chosen map extended to 240 h−1 Mpc and was smoothed with a Gaussian kernel of width 3.2 h−1 Mpc. The alternative options include a map that extends to only 120 h−1 Mpc with a 3.2 h−1 Mpc kernel, and one that extends to 240 h−1 Mpc with a larger Gaussian kernel of 7.7 h−1 Mpc. The smoothing scale of PSCz maps can lead to underestimation of density contrasts. Because the primary effect of the large-scale structure correction is on the lowest-mass bins of the HIMF, it is important to explore and understand the influence of this outside data set.

Upon examination of the average interior overdensities determined for α.40 in each map, we find that the PSCz.240.G3.2 map used in Martin et al. (2010) for the 1/Vmax analysis of the HIMF represents an extreme estimate of the impact of large-scale structure within the survey volume. It is the most conservative option, given that it attaches lower weight to those galaxies found in nearby overdensities, particularly the Virgo Cluster, to prevent them from artificially boosting the faint-end slope.

In order to quantify the effect of these options on the resulting HIMF, we use the PSCz.120.G3.2 and PSCz.240.G7.7 to reanalyze the α.40 HIMF. Where the maps do not reach the full redshift extent of the α.40 sample, we set the average interior δ to 0 for galaxies beyond the distance limit. In order to fit Schechter function parameters in each case, we use the same uncertainty estimates for each H i mass bin point as presented in Martin et al. (2010), as the PSCz map applied would only change the magnitude of each point and not its fluctuation due to H i mass uncertainties.
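A minimal sketch of how the fallback to δ = 0 could enter the per-source weights follows. It assumes, for illustration only, that the density correction rescales each source's maximum volume by (1 + δ), so that sources in overdense regions receive lower weight; the exact weighting is the one described in Martin et al. (2010), and the function and argument names here are hypothetical.

```python
import numpy as np

def density_corrected_weights(v_max, distance, delta_interior, map_limit):
    """Per-source 1/V weights with a large-scale structure correction.

    v_max          : maximum detectable volume of each source
    distance       : distance of each source (same units as map_limit)
    delta_interior : average interior overdensity of each source
    map_limit      : outer radius of the density map
    """
    v_max = np.asarray(v_max, dtype=float)
    distance = np.asarray(distance, dtype=float)
    delta = np.array(delta_interior, dtype=float)  # copy, input is not modified
    # Beyond the map, no density information exists: use the mean density.
    delta[distance > map_limit] = 0.0
    # Assumed form of the correction: effective volume = V_max * (1 + delta).
    return 1.0 / (v_max * (1.0 + delta))

# Two sources with equal V_max; only the first lies inside a 120 h^-1 Mpc map.
w = density_corrected_weights([100.0, 100.0], [50.0, 200.0], [1.0, 1.0], 120.0)
print(w)
```

With this form, a source at δ = 1 inside the map carries half the weight of an otherwise identical source beyond the map limit.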

Figure 23 shows the results, focusing on the low-mass end of the HIMF, since H i mass bins with $M_{\rm H\,I} > 10^{8.0}\,M_{\odot}$ are not affected by the large-scale structure volume correction. The different large-scale structure corrections function effectively as a scaling in each bin, so that each option follows the fiducial case closely. Both PSCz.120.G3.2 and PSCz.240.G7.7 boost the faint-end slope, indicating that they are overcounting galaxies in the nearby overdensities, namely the Virgo Cluster. This analysis verifies that PSCz.240.G3.2 was the most conservative choice for correcting the 1/Vmax HIMF for the effects of large-scale structure. The changes to the low-mass slope α and the turnover mass M* are displayed in Table 7, along with the measured 2DSWML parameters for reference. It is clear that the PSCz map with the greatest extent and the smallest smoothing radius is most appropriate for estimating the α.40 H i mass function.

Figure 23. Low-mass end of the HIMF, showing dependence on the chosen PSCz density reconstruction map. The fiducial 1/Vmax HIMF reported in Martin et al. (2010) is shown as a filled circle, with two other maps represented by squares and triangles.

Table 7. 1/Vmax HIMF Schechter Parameters by PSCz Map

| PSCz Map | α | $\log\,(M_*/M_{\odot})$ |
| --- | --- | --- |
| 2DSWML Result | −1.33 (0.02) | 9.96 (0.02) |
| PSCz.240.G3.2 | −1.33 (0.03) | 9.95 (0.04) |
| PSCz.120.G3.2 | −1.39 (0.03) | 9.96 (0.05) |
| PSCz.240.G7.7 | −1.44 (0.04) | 9.98 (0.06) |
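To give a feel for what these slope differences mean at low mass, the sketch below evaluates a Schechter function per dex of H i mass with the α and M* values of Table 7 and compares each alternative map to the fiducial one at 10^7 solar masses. The normalization φ* is not listed in the table, so the same arbitrary value (φ* = 1) is assumed for every map; the ratios are therefore indicative only, and the function names are hypothetical.

```python
import numpy as np

# (alpha, log10 M*) for the three PSCz maps, copied from Table 7.
TABLE7 = {
    "PSCz.240.G3.2": (-1.33, 9.95),
    "PSCz.120.G3.2": (-1.39, 9.96),
    "PSCz.240.G7.7": (-1.44, 9.98),
}

def schechter_per_dex(log_m, alpha, log_m_star, phi_star=1.0):
    """Schechter function per dex: ln(10) phi* x^(alpha+1) exp(-x), x = M/M*."""
    x = 10.0 ** (np.asarray(log_m, dtype=float) - log_m_star)
    return np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)

def ratio_to_fiducial(log_m, name, fiducial="PSCz.240.G3.2"):
    """Ratio of a map's Schechter function to the fiducial one at log_m."""
    num = schechter_per_dex(log_m, *TABLE7[name])
    den = schechter_per_dex(log_m, *TABLE7[fiducial])
    return float(num / den)

for name in TABLE7:
    print(name, round(ratio_to_fiducial(7.0, name), 2))
```

Under the equal-normalization assumption, the steeper slopes of the two alternative maps translate into a noticeably larger abundance at the lowest masses, which is the sense of the boost discussed in the text.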

8. SUMMARY

This paper presents the cataloged parameters for 15,855 H i line detections extracted from ∼2800 deg2 of high galactic latitude sky observed by the ALFALFA survey. A (pleasant) surprise for us has been the higher than expected ALFALFA detection rate, 5.6 sources per deg2, or, including only the objects that are certainly extragalactic, 5.3 sources per deg2. This latter detection rate is a factor of 29 greater than the rate of 0.18 sources per deg2 achieved by HIPASS. The characteristic resolution of the ALFALFA spectral grids is about 4'; the positions of the H i sources can be determined to an accuracy typically better than 20''. Using the publicly available SDSS and DSS2 imaging data sets, we have assigned probable OCs to more than 98% of the 15,041 extragalactic detections and provide a cross-reference to the SDSS DR7 photometric and spectroscopic databases. An additional 814 H i line detections cannot be identified with stellar counterparts but lie within velocity ranges characteristic of the galactic/circumgalactic HVCs. Roughly 3/4 of the optically "dark" extragalactic H i sources are located in fields containing galaxies of known optical redshift; many are likely to be associated with tidal debris fields. We identify four objects as candidate OHMs redshifted to z ∼ 0.17; one of those is a rediscovery of a previously recognized OHM and is associated with a galaxy of the same optical redshift (Darling & Giovanelli 2001). Future work will explore the OHM candidates more systematically throughout the ALFALFA bandpass and will also search for evidence of H i in absorption (Darling et al. 2011). Unsurprisingly, a census of the H i bearing population of galaxies in the local universe is strongly biased against galaxies on the red sequence, but some luminous, red galaxies are detected in the H i line. In particular, ALFALFA provides a rich sampling of the low-to-moderate-density universe at z ∼ 0.

As a major ALFALFA data release, the α.40 catalog presented here supersedes the data sets published by our team previously. In particular, the H i line flux densities reported here are based on further improvements in the software used for parameter extraction and increased knowledge of the system performance. The ALFALFA reduction pipeline may miss flux for sources which are very large compared to the beam size and offset from the center of the standard grids. However, a comparison of the H i line flux densities reported here with those derived from pointed single-dish observations, corrected for beam dilution and pointing errors, shows no systematic offsets except for the very largest and very strongest sources. The latter will need to be evaluated on a case-by-case basis in grids produced and analyzed separately from the standard process and, where applicable, corrected for sidelobe contamination (Dowell 2010; J. D. Dowell et al. 2011, in preparation).

The goals and expectations of the ALFALFA survey were outlined in Giovanelli et al. (2005a), and survey source sensitivity and reliability were discussed in Saintonge (2007a). As discussed previously, the integrated H i line flux density threshold of a blind H i survey like ALFALFA increases with H i line profile width (Martin et al. 2010; Toribio et al. 2011b). With the availability of the large α.40 data set, we test those expectations and give quantitative descriptions of the completeness and sensitivity of the ALFALFA survey as functions of log W50. In addition to the highest quality, highly reliable (Code 1) H i detections, the α.40 catalog presented in Table 1 also includes sources of lower S/N which coincide in position and redshift with known optical galaxies (the "priors"). Because the availability of such prior information is highly dependent on the selection functions of other surveys, these additional objects should not be used in studies which require stringent consideration of statistical completeness. However, the vast majority are likely to be valid H i detections and hence they can be included in studies where the number of sources is most critical (e.g., peculiar velocity studies). Future work will be undertaken to confirm these detections and an additional set of low S/N possible detections which coincide with galaxies of unknown redshift.

The sensitivity of ALFALFA and the thorough understanding of its performance enable a robust measurement of the HIMF, and in particular, of its faint-end slope α and the energy density of neutral hydrogen $\Omega_{\rm H\,I}$ at z = 0. On the low-mass end of the HIMF, ALFALFA improves on previous blind H i surveys in terms of sample size, angular and spectral resolution, sampling of cosmic volume, and a treatment of distances that does not assume pure Hubble flow. At the lowest H i masses, ALFALFA's finer velocity resolution is an important factor in obtaining a full count of the gas-rich dwarf population.

On the high-mass end, previous H i surveys have overlooked the locally rare population of very massive H i disks. We have evaluated the possible impact on the derived HIMF of missing sources at both the broad and narrow width ends, particularly in comparison with the HIPASS catalog (Meyer et al. 2004). We conclude that HIPASS did not recognize the richness of the very high H i mass population, not because it failed to identify the systems with the broadest widths but because it did not have adequate sensitivity at large distances and was limited to only 64 MHz of bandpass. It is ALFALFA's combination of sensitivity, spectral and angular resolution, frequency, and sky coverage which yields a robust census of the H i bearing population at z = 0.

With ALFALFA still only 40% complete, we have shown that the 2DSWML and 1/Vmax methods yield results on the HIMF in good agreement, but that the loss of significant volume in the ALFALFA survey beyond 15,000 km s−1 reduces the performance of the 2DSWML approach if that region is included. A realistic treatment of distance and flux density uncertainties, translated into mass uncertainties, avoids the strong bias in α and the shape of the HIMF introduced by an assumption of Hubble flow in the local volume. While α.40 does not yet provide a completely representative sampling of the local cosmological volume, our method for including the impact of large-scale structure is a conservative choice, and future data releases from ALFALFA will further improve both statistical and systematic uncertainties. We look forward to completing the ALFALFA survey.

We thank the staff of the Arecibo Observatory, particularly Phil Perillat, Ganesh Rajagopalan, Arun Venkataraman, Hector Hernandez, and the telescope operations staff, for their invaluable help in support of the acquisition of the data used to produce this catalog, and Tom Shannon and Adam Brazier for their critical support of hardware and database development at Cornell. This work has been supported by the NSF grant AST-0607007 to R.G. and M.P.H. and by a Brinson Foundation grant. T.J.B., D.W.C., G.L.H., S.J.U.H., D.A.K., R.A.K., J.R.M., A.A.O., R.P.O., J.L.R., and E.M.W. acknowledge support for the Undergraduate ALFALFA Team from NSF grants AST-0724918, AST-0725267, AST-0725380, AST-0902211, and AST-0903394. J.L.R. acknowledges support from NSF AST-000167932. K.S. acknowledges support from the Natural Sciences and Engineering Research Council of Canada (NSERC).

This research has made use of data obtained from or software provided by the US National Virtual Observatory, which is sponsored by the National Science Foundation and of Montage, funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computation Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology. Montage is maintained by the NASA/IPAC Infrared Science Archive.

We acknowledge the use of NASA's SkyView facility (http://skyview.gsfc.nasa.gov) located at NASA Goddard Space Flight Center and the NASA/IPAC Extragalactic Database (NED) which is operated by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration. The Digitized Sky Surveys were produced at the Space Telescope Science Institute under U.S. Government grant NAG W-2166. The images of these surveys are based on photographic data obtained using the Oschin Schmidt Telescope on Palomar Mountain and the UK Schmidt Telescope. The plates were processed into the present compressed digital form with the permission of these institutions. The Second Palomar Observatory Sky Survey (POSS-II) was made by the California Institute of Technology with funds from the National Science Foundation, the National Geographic Society, the Sloan Foundation, the Samuel Oschin Foundation, and the Eastman Kodak Corporation.

Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the participating institutions, the National Science Foundation, the US Department of Energy, the NASA, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web site is http://www.sdss.org/. The SDSS is managed by the Astrophysical Research Consortium for the participating institutions. The participating institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max Planck Institute for Astronomy, the MPA, New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.
