Biasing Relation, Environmental Dependencies, and Estimation of the Growth Rate from Star-forming Galaxies


Published 2020 December 11 © 2020. The American Astronomical Society. All rights reserved.
Citation: Adi Nusser et al. 2020, ApJ, 905, 47. DOI: 10.3847/1538-4357/abc42f


0004-637X/905/1/47

Abstract

The connection between galaxy star formation rate (SFR) and dark matter (DM) is of paramount importance for the extraction of cosmological information from next-generation spectroscopic surveys that will target emission line star-forming galaxies. Using publicly available mock galaxy catalogs obtained from various semianalytic models (SAMs), we explore the SFR–DM connection in relation to the speed-from-light method for inferring the growth rate, f, from luminosity/SFR shifts. Emphasis is given to the dependence of the SFR distribution on the environmental density on scales of tens to hundreds of megaparsecs. We show that the application of the speed-from-light method to a Euclid-like survey is not biased by environmental effects. In all models, the precision on the measured β = f/b parameter is σβ ≲ 0.17 at z = 1. This translates into errors of σf ∼ 0.22 and ${\sigma }_{(f{\sigma }_{8})}\sim 0.1$ without invoking assumptions on the mass power spectrum. These errors are in the same ballpark as recent analyses of the redshift space distortions in galaxy clustering. In agreement with previous studies, the bias factor, b, is roughly a scale-independent, constant function of the SFR for star-forming galaxies. Its value at z = 1 ranges from 1.2 to 1.5 depending on the SAM recipe. Although in all SAMs, denser environments host galaxies with higher stellar masses, the dependence of the SFR on the environment is more involved. In most models, the SFR probability distribution is skewed to larger values in denser regions. One model exhibits an inverted trend, where high SFR is suppressed in dense environments.


1. Introduction

The connection between dark matter (DM) and galaxies is essential for understanding the processes that regulate the formation and evolution of galaxies and, consequently, to derive cosmological parameters from the analysis of galaxy redshift surveys. In particular, estimation of cosmological parameters from the clustering pattern is inherently dependent on this connection.

In the standard paradigm, galaxies form by the condensation and cooling of gas inside DM-dominated halos (virialized objects; Binney 1977; Rees & Ostriker 1977; Silk 1977; White & Rees 1978). The process is hierarchical (Peebles 1980), with early-forming galaxies collapsing into and merging with other galaxies, and greatly affected by energy released from supernovae (Larson 1974; Dekel & Silk 1986) and active galactic nucleus (AGN) activities (Silk & Rees 1998).

Semianalytic galaxy formation models (SAMs; White & Frenk 1991; Kauffmann et al. 1993; Lacey et al. 1993; Somerville & Primack 1999) have been extensively employed in an attempt to understand the vast number of observational properties of galaxies and the link to the formation of supermassive black holes. The SAMs approximate the complex interconnected processes of star formation, energetic feedback, and hydrodynamics in terms of simple forms involving a large number of free parameters. The importance of each process can then be assessed by tuning the relevant parameters to match certain observational data.

State-of-the-art cosmological simulations can now follow the hydrodynamics in conjunction with elaborate (albeit poorly known) subgrid physics of galaxy formation over large dynamical scales (Genel et al. 2014; Khandai et al. 2015; Schaye et al. 2015; Dolag et al. 2016; Dubois et al. 2016). Unfortunately, the box size of this type of simulation remains insufficient to describe the structure on the large scales probed by large redshift surveys. As an example, Illustris TNG300 (Springel et al. 2018) has a box size of 205 h−1 Mpc on a side, equivalent to only 25% of the volume probed by the low-redshift Two Mass Redshift Survey (2MRS; Huchra et al. 2012) and substantially smaller than the volume coverage planned by future large surveys.

On the other hand, DM-only simulations have been done for large simulation boxes of several gigaparsecs. Kauffmann et al. (1997) pioneered the approach of incorporating SAMs in DM-only simulations. The simulation used in that work was of a box size of only 128 Mpc. The same approach was later applied by many workers in the field using much larger simulations and more elaborate SAMs (e.g., Kauffmann et al. 1999; Benson et al. 2000; Guo et al. 2011; Angulo et al. 2014; Baugh et al. 2019).

Numerical and semianalytic methods have been extensively used to study the biasing relation between the distribution of galaxies and the underlying distribution of mass (DM-dominated) as a function of the stellar mass. However, next-generation spectroscopic galaxy redshift surveys like the Dark Energy Spectroscopic Instrument (DESI) survey (DESI Collaboration et al. 2016), the Euclid space mission (Euclid Collaboration et al. 2019), and the Roman space mission (Akeson et al. 2019) will mainly select objects based on star formation rate (SFR) indicators like the Hα and [O ii] emission lines. Therefore, the biasing relation for galaxies selected based on SFRs is becoming of particular interest (Angulo et al. 2014).

In general, the biasing relation enters into any analysis based on the clustering of galaxies. Its knowledge is essential for a precise and accurate estimation of cosmological parameters. It is at the heart of methods relying on the anisotropy of clustering in redshift space (the so-called redshift space distortions, RSDs; Sargent & Turner 1977) and, to a lesser degree, in analyzing signatures of baryonic acoustic oscillations (BAOs). The standard and most convenient assumption is that of linear biasing. If 1 + δgal and 1 + δ are the galaxy number density and total mass density in units of their respective mean values, linear biasing dictates

$$\delta_{\mathrm{gal}} = b\,\delta + \epsilon, \qquad (1)$$

where b is independent of δ and position r but is a function of time as implied by continuity considerations (Nusser & Davis 1994; Tegmark & Peebles 1998). The term ε represents stochastic (random) scatter around the mean relation (e.g., Dekel & Lahav 1999). Both density contrasts, δgal and δ, are assumed to have been filtered with the same smoothing window. For Gaussian random fields, the relation is valid on sufficiently large scales (Bardeen et al. 1986). The term ε arises from an intrinsic scatter in the biasing relation, as well as the Poisson fluctuations (shot noise) associated with the finite number of galaxies. The intrinsic scatter can be attributed to several factors that are not captured solely by the local mass density field at a given time. The assembly and star formation history, details of the feedback process, and external gravitational tidal field that affects the galaxy rotation can all impact the galaxy properties. Linear biasing has been demonstrated to hold on sufficiently large scales, and its dependence on the stellar and host halo mass has been studied using simulations as well as analytic models.

Modern spectroscopic redshift surveys are designed to provide tight constraints on the growth rate, f, of linear density fluctuations at high redshift. The growth rate is related to the growing density mode D via (Peebles 1980)

$$f \equiv \frac{d\ln D}{d\ln a} \approx \Omega_{\mathrm{m}}^{\gamma}(z), \qquad (2)$$

with γ ≈ 0.55 + 0.05(1 + w) for a dark energy model with an equation-of-state parameter w (Linder 2005). Therefore, constraining f at different cosmic epochs could, in principle, yield important insight into the dark energy models responsible for the accelerating expansion of the universe. The aforementioned RSDs resulting from placing galaxies at their redshifts rather than actual distances are a traditional probe of f via the combination

$$\beta = \frac{f}{b}. \qquad (3)$$
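As a rough numerical illustration of these quantities, the sketch below evaluates the commonly used approximation f ≈ Ωm(z)^γ at z = 1 for the flat ΛCDM parameters quoted in the note to Table 1, together with β = f/b for the range of bias values quoted in the abstract. The function names are hypothetical, and w = −1 is assumed both in γ and in the evolution of Ωm(z); this is an illustration under those assumptions, not the procedure used in the paper.

```python
import math

# Flat LCDM parameters quoted in the note to Table 1.
OMEGA_M0 = 0.307115
OMEGA_L0 = 0.692885


def omega_m(z):
    """Matter density parameter at redshift z (assumes a cosmological constant)."""
    a3 = (1.0 + z) ** 3
    return OMEGA_M0 * a3 / (OMEGA_M0 * a3 + OMEGA_L0)


def growth_rate(z, w=-1.0):
    """Approximate growth rate f = Omega_m(z)^gamma, gamma = 0.55 + 0.05 (1 + w).

    Assumption: Omega_m(z) is still computed for w = -1, so values of w far
    from -1 are outside the scope of this sketch.
    """
    gamma = 0.55 + 0.05 * (1.0 + w)
    return omega_m(z) ** gamma


f_z1 = growth_rate(1.0)
# Bias range at z = 1 quoted in the abstract.
betas = {b: f_z1 / b for b in (1.2, 1.5)}
for b, beta in betas.items():
    print(f"b = {b:.1f}: f = {f_z1:.3f}, beta = f/b = {beta:.3f}")
```

With these inputs, f at z = 1 comes out close to 0.87, so that β lies roughly between 0.58 and 0.73 for the quoted range of b.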

But placing galaxies at their redshift positions rather than actual distances does not only result in RSDs. It also shifts the estimates of the galaxy intrinsic luminosities from their true values obtained using actual distances. To first order, the redshift position differs from the distance by the line-of-sight (LOS) peculiar velocity. Therefore, coherent large-scale luminosity variations in space can be used to constrain the peculiar velocity field. This idea dates back to the work of Tammann et al. (1979), who correlated the magnitudes of nearby galaxies with their redshifts to constrain the velocity of the Virgo cluster relative to the Local Group.

There are two techniques to probe the velocity field using luminosity variations. In the first one, direct constraints on the velocity are derived using the observed variations. This technique has been applied to the 2MRS at z ≲ 0.03 (Nusser et al. 2011; Abate & Feldman 2012; Branchini et al. 2012) and the Sloan Digital Sky Survey (SDSS) at z ∼ 0.1 by Feix et al. (2015, 2017) and led to interesting constraints on the amplitude of the bulk flow. This technique is susceptible to the environmental dependence of the luminosity distribution. Indeed, the dependence of the luminosity function on the large-scale density field could mimic variations due to peculiar velocities. For nearby surveys like the 2MRS, the luminosity shift from peculiar velocities dominates over environmental effects. However, at z ≈ 1, relevant to next-generation surveys, this is not the case anymore, making the application of this technique less attractive or even irrelevant in comparison to low-redshift surveys.

The second technique relies on a simultaneous estimation of the luminosity fluctuation and the peculiar velocity field from the actual galaxy distribution in redshift space (Nusser et al. 2012). The derived velocity field depends on β. The true galaxy luminosities are then estimated using the distances derived from the redshifts by subtracting the LOS peculiar velocities. The parameter β is derived by minimizing the large-scale spatial luminosity variations. Assuming the environmental dependence of the luminosity function is mostly via the large-scale density, this method will yield an unbiased estimate of β thanks to the lack of correlation between the density and the peculiar velocity at a given point in space. In general, Galilean invariance can be invoked to conclude that the peculiar motion of a galaxy cannot affect any of its internal properties. Therefore, any mechanism that affects the luminosity distribution at a given point in space must be uncorrelated with the peculiar velocity at the same point. Thus, environmental dependencies will only affect the statistical uncertainty in the derived estimates of β. We gave the name speed-from-light method (SfLM; Feix et al. 2017) to this second technique.

The main goal of this paper is to provide an assessment of the applicability of this method for constraining β from next-generation spectroscopic redshift surveys and to verify its sensitivity to environmental effects. We wish to confirm that environmental dependencies do not bias the β estimate from the SfLM and to estimate how they affect the random error assigned to β.

We will rely on mock galaxy catalogs obtained by applying different SAM recipes to large DM-only simulations. Mock catalogs are likely inaccurate at describing galaxy properties at high redshift. We therefore compare predictions from different models to achieve different goals. First, we aim at identifying common features that can be used as predictions for planned surveys. Second, the discrepancies among the models will serve to appreciate the scatter in current theoretical predictions. And third, we will assess how environmental dependencies, which can be quantified in future observations, can provide useful constraints on models of galaxy formation.

The outline of the paper is as follows. In Section 2, we present the simulations and corresponding mock galaxy catalogs that we use in this work and their relation to next-generation spectroscopic catalogs. In Section 3, we use mock catalogs to study how stellar masses and SFR depend on the environment. In Section 4, we focus on galaxy biasing and its dependence on the SFR and stellar mass. The impact of the large-scale environment on the estimate of β from the SfLM method is studied in Section 5. We discuss our results and offer our conclusions in Sections 6 and 7, respectively.

2. Mocks and Simulations

We use publicly available mock galaxy catalogs extracted from three DM-only simulations (SMDPL, MDPL2, and BigMDPL) of the MultiDark suites. The relevant parameters of the simulations are listed in Table 1. Mock galaxy catalogs from the SAG (Cora et al. 2018), SAGE (Croton et al. 2016), and Galacticus (Benson 2012) SAMs are publicly available only for the MDPL2 simulation. These mocks, referred to as MultiDark-Galaxies (Knebe et al. 2018), have been downloaded from the CosmoSim database. Mock galaxies extracted from the SMDPL simulation are available only for the UniverseMachine (UM; Behroozi et al. 2019), a self-consistent empirical galaxy formation model, and have been kindly provided by Peter Behroozi. The UM model offers a simple recipe for assigning SFRs to halos. We use that recipe to populate the MDPL2 and BigMDPL simulations with galaxies based on the UM mocks in the SMDPL.

Table 1.  Relevant Parameters of the Three MDPL Simulations Used in This Work

Simulation        SMDPL    MDPL2    BigMDPL
L [h−1 Mpc]       400      1000     2500
mp [10^8 M⊙]      1.42     22.3     348

Note. All simulations correspond to a ΛCDM cosmology with parameters h = 0.6777, ΩΛ = 0.692885, Ωm = 0.307115, and σ8 = 0.8228.


All SAMs incorporate the same basic processes of galaxy formation, gas cooling, and star formation according to the amount of cold gas, stellar winds, and AGN feedback. They include recipes for tracing the mass in the main galaxy components: disks, bulges, and black holes. The models have a large number of free parameters that are fixed by matching observations of the galaxy population. Despite their similarities, the three SAMs differ in the way baryonic physical processes are implemented. For example, all of them basically follow the gas cooling treatment presented in White & Frenk (1991) but differ in the details of metal cooling. Galacticus follows the SFR recipe in Krumholz et al. (2009), while SAG initiates star formation only once the cold gas mass in the forming disk exceeds a certain value. In addition to the radio-mode AGN feedback employed in SAG and Galacticus, SAGE also includes a quasar wind mode. Another difference between SAGE and the two other models is the treatment of galaxies that no longer have identifiable parent subhalos in the simulation. SAGE disperses the stellar content of orphan galaxies into the main halo, while Galacticus and SAG maintain them as separate entities. However, since we will focus on galaxies with high SFRs, we do not expect our results to depend on this aspect of the models. Comparison between SAG, SAGE, and Galacticus applied to MDPL2 is provided in Knebe et al. (2018). All models have been calibrated using low-redshift, z ≈ 0, key observational data, such as the black hole–bulge mass relation and the stellar mass functions, but not always to the same data compilation (see Table 1 in Knebe et al. 2018).

We are mostly interested in results at z = 1 to match the typical redshift of next-generation surveys. Therefore, we shall focus on z = 1 simulated catalogs and only consider the z = 0 case to explore the biasing of star-forming galaxies at lower redshift.

2.1. UM Mock Galaxies for MDPL2 and BigMDPL

The UM models available to us are only for the SMDPL simulation, whose volume is insufficient to probe the dependence of the SFR and luminosity distributions on the galaxy environment. Therefore, we wish to generate UM mock galaxies for the larger boxes as well. The main usage for UM in these larger boxes will be to estimate the cosmic variance. Fortunately, in the UM models, the SFR in a halo at redshift z depends mainly on ${V}_{{M}_{{\rm{p}}}}$, the maximum of the rotation curve measured at the peak mass through the history of the halo until z (Behroozi et al. 2019). This is not necessarily true for satellite galaxies, which do not play a significant role in our analysis, since we are interested in large star-forming galaxies. Also, since we are mainly interested in the SFR rather than stellar mass, we avoid running the whole UM machinery and instead assign SFRs to MDPL2 and BigMDPL halos by randomly resampling the SFR conditional probability distribution given ${V}_{{M}_{{\rm{p}}}}$, taken from the full UM in the SMDPL simulation. That is, for each halo in the larger simulations, we select a value for SFR from the distribution in the ${V}_{{M}_{{\rm{p}}}}-\mathrm{SFR}$ plane in SMDPL. The resampling will capture environmental dependencies present in the distribution of ${V}_{{M}_{{\rm{p}}}}$ but miss those associated with other parameters that can have an effect on the SFR. Nonetheless, these resampled catalogs will mainly serve for the statistical assessments of the SfLM. An added value of random resampling is that several realizations of SFRs can be generated for the same halo. This allows an assessment of randomness (stochasticity) in the SFR per halo assignment, at least in the UM models.

The top panel in Figure 1 shows the distribution of a randomly selected fraction of mock galaxies in the $\mathrm{SFR}-{V}_{{M}_{{\rm{p}}}}$ plane from the full UM model in SMDPL. Also plotted are contours of the two-dimensional (2D) probability distribution function (PDF) of logSFR and $\mathrm{log}{V}_{{M}_{{\rm{p}}}}$. The two branches of star-forming and quenched (low star formation) galaxy populations are clearly visible. As we shall see in the next section, next-generation surveys like Euclid will observe galaxies with relatively high SFR > 10 ${M}_{\odot }$ yr−1, sampling the tip of the PDF.

Figure 1. Distribution of UM galaxies in the plane logSFR and $\mathrm{log}{V}_{{M}_{{\rm{p}}}}$. Top: galaxies in the UM catalog extracted from the SMDPL simulation. In the middle (MDPL2) and bottom (BigMDPL) panels, SFRs were assigned to halos with a given ${V}_{{M}_{{\rm{p}}}}$ by resampling the distribution in the top panel. Contours designate certain values of the logarithm of the 2D PDF $P(\mathrm{logSFR},\mathrm{log}{V}_{{M}_{{\rm{p}}}})$.

The middle and bottom panels show, respectively, the same distributions for MDPL2 and BigMDPL obtained by resampling the SFR at a given ${V}_{{M}_{{\rm{p}}}}$ from the distribution represented in the top panel. In practice, we partition the SMDPL galaxies into 100 bins in $\mathrm{log}{V}_{{M}_{{\rm{p}}}}$ and match the bins to the ${V}_{{M}_{{\rm{p}}}}$ of halos in the larger simulations. Then the distribution of UM SFR values in each bin is randomly sampled to assign SFR values to the respective halos in the larger simulations. Due to the larger particle mass in the larger simulations, they do not resolve low ${V}_{{M}_{{\rm{p}}}}$ as the SMDPL does, yielding lower number densities of galaxies relative to the SMDPL. Table 2 lists the number densities of UM galaxies in SMDPL and the resampled UM in the two larger simulations, which we label UM MDPL and UM BigMDPL. This table refers to redshift z = 1 and two different M* cuts, as indicated. We emphasize that the resampling does not provide M* for the larger simulations, only SFR. The number densities in SMDPL and MDPL2 are comparable, with the latter having only an 11% lower value, while UM BigMDPL is significantly more dilute. Still, the number densities of all models in the table are consistent with each other within a factor of 2. In addition, the BigMDPL box represents a sizable fraction of the volume that will be probed by Euclid's spectroscopic survey and contains a comparable total number of galaxies (Euclid Collaboration et al. 2019).
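The binned resampling described above can be sketched as follows. This is a minimal illustration rather than the code used for the paper: the function name, the toy input arrays, and the choice of equal-width bins in $\mathrm{log}{V}_{{M}_{{\rm{p}}}}$ are assumptions, and halos falling outside the reference range are simply assigned to the nearest edge bin.

```python
import numpy as np


def resample_sfr(logv_ref, logsfr_ref, logv_target, n_bins=100, rng=None):
    """Assign logSFR to target halos by drawing from reference galaxies
    that fall in the same log V_Mp bin (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: equal-width bins spanning the reference catalog.
    edges = np.linspace(logv_ref.min(), logv_ref.max(), n_bins + 1)
    ref_bin = np.clip(np.digitize(logv_ref, edges) - 1, 0, n_bins - 1)
    tgt_bin = np.clip(np.digitize(logv_target, edges) - 1, 0, n_bins - 1)

    out = np.full(logv_target.shape, np.nan)
    for b in np.unique(tgt_bin):
        pool = logsfr_ref[ref_bin == b]   # reference SFRs in this bin
        sel = tgt_bin == b                # target halos in this bin
        if pool.size:                     # empty reference bins stay NaN
            out[sel] = rng.choice(pool, size=int(sel.sum()), replace=True)
    return out


# Toy stand-ins for the SMDPL (reference) and larger-box (target) catalogs.
rng = np.random.default_rng(1)
logv_ref = rng.uniform(2.0, 2.8, 20000)
logsfr_ref = 2.0 * (logv_ref - 2.0) + rng.normal(0.0, 0.3, logv_ref.size)
logv_big = rng.uniform(2.2, 2.8, 50000)   # poorer resolution: no low V_Mp
logsfr_big = resample_sfr(logv_ref, logsfr_ref, logv_big, rng=rng)
```

Calling the function repeatedly with different random states yields independent SFR realizations for the same halos, which is the property exploited to assess stochasticity in the SFR assignment.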

Table 2.  Number of Galaxies with SFR > 10 M⊙ yr−1 in the Mock Catalogs at z = 1

Mock Catalog                        SAG            SAGE           Galac.         UM             UM MDPL2       UM BigMDPL
Simulation                          MDPL2          MDPL2          MDPL2          SMDPL          MDPL2          BigMDPL
Ngal, M* > 5 × 10^9 M⊙              3.29 × 10^6    4.48 × 10^6    2.07 × 10^6    × 10^5         4.15 × 10^6    3.4 × 10^7
Ngal, M* > 5 × 10^10 M⊙             7.43 × 10^5    1.73 × 10^6    9.5 × 10^4     5.88 × 10^4    ...            ...
n [h3 Mpc−3], M* > 5 × 10^9 M⊙      3.29 × 10^−3   4.48 × 10^−3   2.07 × 10^−3   4.67 × 10^−3   4.15 × 10^−3   2.20 × 10^−3
n [h3 Mpc−3], M* > 5 × 10^10 M⊙     7.43 × 10^−4   1.73 × 10^−3   9.5 × 10^−5    9.12 × 10^−4   ...            ...

Note. The top two entries list the total number of galaxies in the simulation boxes for two different stellar mass thresholds. The bottom two entries list the galaxy number densities n = Ngal/L3 in the same boxes.


An important check of the resampling procedure is whether it yields consistent clustering properties among the three UM catalogs. In Figure 2, we plot the quantity Δ(k) ∝ k3Pgal(k), where Pgal(k) is the galaxy power spectrum as a function of the wavenumber k. The details of computing Pgal are described in Section 4. The figure indeed demonstrates a very good agreement between the three power spectra.

Figure 2. Consistency of the power spectra of the mock galaxies in the full UM in SMDPL and in the UM MDPL and UM BigMDPL samples.

2.2. Volumes and Galaxy Densities: Connection to Observations

The results of the present study are relevant for any spectroscopic surveys that, like DESI, Euclid, and Roman-WFIRST, will target emission line galaxies over large sky areas at intermediate redshifts. Here we will focus on the Euclid case and consider it as representative of typical next-generation spectroscopic surveys.

Euclid's survey will detect Hα galaxies with line flux larger than 2 × 10−16 erg s−1 cm−2 over 15,000 deg2 of the sky in the redshift range 0.9 < z < 1.8, corresponding to a comoving volume of ∼43 h−3 Gpc3. Based on models calibrated on available observations of Hα emitters (Pozzetti et al. 2016), the Euclid collaboration has recently provided their forecast for the number density of Hα galaxies: one expects n ∼ 6.9 × 10−4 h3 Mpc−3 at z ∼ 1, gradually decreasing to n ∼ 4.2 × 10−4 h3 Mpc−3 at z ∼ 1.4, and dropping to n ∼ 2.6 × 10−4 h3 Mpc−3 at z ∼ 1.7.

Galaxy formation models provide the SFR rather than the luminosity, LHα, of the Hα line. To link these quantities, we adopt the transformation (e.g., Domínguez Sánchez et al. 2012)

Equation (4)

The Euclid Hα flux cut then corresponds to a lower SFR threshold, ${\mathrm{SFR}}_{\mathrm{lim}}=10\,{M}_{\odot }\,{\mathrm{yr}}^{-1}$, at z = 1. For mock galaxies with ${M}_{* }\gt 5\times {10}^{9}\,{M}_{\odot }$, the number densities listed in Table 2 are higher than the official Euclid forecast. That the mocks exceed the forecast is perhaps not surprising, since the Euclid forecast accounts for instrumental effects and observational biases that are not included in the simulations. The mock number densities are, however, comparable to those of model 1 of Pozzetti et al. (2016), which predicts a number density of 2.6 × 10−3 h3 Mpc−3 for galaxies above the Euclid Hα flux threshold. This is close to the number densities of galaxies with SFR > 10 ${M}_{\odot }$ yr−1 and ${M}_{* }\gt 5\times {10}^{9}\,{M}_{\odot }$ in the mock catalogs, as seen in Table 2.

The MDPL2 box is ∼2.3% of the total volume probed by the full Euclid survey and ∼12% of the volume in the redshift bin [0.9–1.1]. In this redshift range, Euclid Collaboration et al. (2019) expected to observe more than ${ \mathcal O }({10}^{6})$ galaxies, i.e., roughly the number of objects required for a successful application of the SfLM. The mocks, however, approximately contain this number of galaxies in the MDPL2 volume alone. Therefore, it is fortunate that the number density in the simulations is higher than the expectations of Euclid Collaboration et al. (2019), as this will allow us to test the SfLM already with the mocks we have. The shortcoming is that the smaller simulated volume prevents a proper assessment of cosmic variance.

The larger simulation, BigMDPL, is ∼36% of the full spectroscopic survey, still too small to perform a cosmic variance estimation. However, it is helpful in constraining cosmic variance on smaller scales, which can point toward its magnitude for the whole Euclid survey.
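The volume fractions quoted above follow directly from the box sizes in Table 1 and the ∼43 (h−1 Gpc)3 survey volume. A short check, with variable names chosen here for illustration only:

```python
# Comoving volume of the full Euclid spectroscopic survey quoted in the text,
# in units of (h^-1 Gpc)^3.
EUCLID_VOLUME = 43.0

# Box side lengths from Table 1, converted from h^-1 Mpc to h^-1 Gpc.
box_side = {"MDPL2": 1.0, "BigMDPL": 2.5}

fractions = {name: side**3 / EUCLID_VOLUME for name, side in box_side.items()}
for name, frac in fractions.items():
    print(f"{name}: {100.0 * frac:.1f}% of the full survey volume")
```

The result is about 2.3% for MDPL2 and about 36% for BigMDPL, in line with the values given in the text.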

3. Stellar Mass and SFR

The properties of the MultiDark-Galaxies have been studied extensively by Knebe et al. (2018). Nonetheless, for completeness and as a basic check on our analysis, we compute the stellar mass and SFR distribution functions from the downloaded data and compare our results with Knebe et al. (2018) whenever relevant. For this validation test, we will consider two epochs, z = 0 and 1.

In Figures 3 and 4, we plot the distribution of a randomly selected fraction of galaxies in the plane $\mathrm{logSFR}-\mathrm{log}{M}_{* }$ at z = 1 and 0, respectively. Instead of the SFR, Knebe et al. (2018) plotted the specific SFR defined as the SFR per unit stellar mass. Since we are interested in galaxies selected according to the SFR, it is more instructive for our purposes to explore the distribution in the SFR−M* plane. A bimodal structure is recognizable at both redshifts for the UM (SMDPL) galaxies. This is not surprising, since these models impose a division into quenched and active galaxies. At z = 1, all models except Galacticus produce a tight "main sequence" of star-forming galaxies with similar slope and normalization, although it is broader in the SFR direction for SAGE, as can be seen from the plotted contours. A main sequence can be identified in Galacticus at z = 0, as the points encompassed by purple contours, but the overall distribution is much more diffuse than in the other models. These figures demonstrate the complexity of the relation between the SFR and stellar mass. The large scatter and the shape of the distribution make it hard to associate a well-defined stellar mass with a given SFR.

Figure 3. Distribution of galaxies in the M*−SFR plane at z = 1 in different mock catalogs. Contours designate certain values of the 2D PDF of logSFR and $\mathrm{log}{M}_{* }$.

Figure 4. Same as Figure 3 but for z = 0.

The left and right panels of Figure 5 show the 1D distribution functions for SFR and stellar mass, respectively. Apart from the MultiDark-Galaxies mocks, the figure also shows results from the UM (SMDPL) simulation. Models generally agree with the measured PDF of the SFR, except for SAGE, which underpredicts the counts in the high-SFR tail. The stellar mass functions at z = 0 (bottom right panel) for SAG, SAGE, and Galacticus are in agreement with the corresponding curves at z = 0.1 in Figure 1 of Knebe et al. (2018). All curves in each panel are roughly in the same ballpark, but the deviations are significant, even at z = 0. This is not surprising due to the differences in the modeling and calibration to observations. As pointed out by Knebe et al. (2018), SAGE produces the best match to the observed stellar mass distribution at z = 0 as reported by Moustakas et al. (2013; bottom right). Also, since Moustakas et al. (2013) found little evolution of the observed stellar mass distribution since z = 1, we can take observations at z = 0 as representative of those at z ≈ 1 and see that only SAG overproduces galaxies at the high-mass end. At z = 0, both the SAG and Galacticus curves are above the observations at the high-mass end. Knebe et al. (2018) attributed this z ≈ 0 excess to less efficient AGN suppression of star formation compared to SAGE.

Figure 5. Number density of SAM galaxies as a function of the SFR (left panels) and stellar mass (right panels) for redshift z = 0 (bottom) and 1 (top). Different model predictions are represented by curves with different line styles, specified in the legend. The open circles are data with error bars from Gruppioni et al. (2015; left) and Moustakas et al. (2013; right).

3.1. Environmental Dependencies

The large-scale environment can play a role in shaping the properties of galaxies (e.g., Xu et al. 2020). Minor differences in the assembly history of halos of the same mass (e.g., Gottlober et al. 2001; Sheth & Tormen 2002; Gao & White 2007) can lead to significant differences in their SFR evolution and the final stellar mass. Here we are interested in the modulation of the SFR and stellar mass distributions as a function of the large-scale density smoothed on scales of tens to hundreds of megaparsecs.

In all mocks, the mass density of the DM is provided on a cubic grid. We use a fast Fourier transform (FFT) to smooth the mass density with a top-hat (TH) window of width Rs = 20 and 100 h−1 Mpc, respectively. Densities, δi, at the galaxy positions are obtained by linear interpolation of the smoothed density fields on the grid.

We compare SFRs and M* values in a low- versus high-density environment by comparing the PDFs of logSFR and $\mathrm{log}{M}_{* }$ estimated for galaxies with the lowest versus highest 20% values of δi.
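The smoothing, interpolation, and density-ranked selection just described could be implemented along the following lines. This is a sketch under stated assumptions, not the code used for the paper: the function names are hypothetical, Rs is treated as the radius of a spherical top-hat, the grid values are taken to sit at the grid nodes of a periodic box, and a random field stands in for the simulation density.

```python
import numpy as np


def tophat_smooth(delta, box_size, radius):
    """Convolve a periodic cubic grid with a spherical top-hat using FFTs."""
    n = delta.shape[0]
    cell = box_size / n
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=cell)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=cell)
    kx, ky, kz = np.meshgrid(k_full, k_full, k_half, indexing="ij")
    x = np.sqrt(kx**2 + ky**2 + kz**2) * radius
    window = np.ones_like(x)  # W(kR) -> 1 as k -> 0
    nz = x > 0
    window[nz] = 3.0 * (np.sin(x[nz]) - x[nz] * np.cos(x[nz])) / x[nz] ** 3
    return np.fft.irfftn(np.fft.rfftn(delta) * window, s=delta.shape)


def interpolate_periodic(field, positions, box_size):
    """Trilinear interpolation of a periodic grid at arbitrary positions."""
    n = field.shape[0]
    g = (positions / box_size * n) % n   # positions in grid units
    base = np.floor(g)
    frac = g - base
    i0 = base.astype(int) % n
    i1 = (i0 + 1) % n
    result = np.zeros(len(positions))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = (np.where(dx, frac[:, 0], 1.0 - frac[:, 0])
                     * np.where(dy, frac[:, 1], 1.0 - frac[:, 1])
                     * np.where(dz, frac[:, 2], 1.0 - frac[:, 2]))
                ix = i1[:, 0] if dx else i0[:, 0]
                iy = i1[:, 1] if dy else i0[:, 1]
                iz = i1[:, 2] if dz else i0[:, 2]
                result += w * field[ix, iy, iz]
    return result


# Toy example: random field and random "galaxy" positions (illustrative only).
rng = np.random.default_rng(0)
box, n_grid = 1000.0, 32                       # h^-1 Mpc, cells per side
delta = rng.normal(0.0, 1.0, (n_grid,) * 3)
delta -= delta.mean()
smoothed = tophat_smooth(delta, box, radius=100.0)
pos = rng.uniform(0.0, box, (5000, 3))
delta_gal = interpolate_periodic(smoothed, pos, box)

# Galaxies in the 20% least and most dense environments.
lo_cut, hi_cut = np.percentile(delta_gal, [20, 80])
low_env = delta_gal <= lo_cut
high_env = delta_gal >= hi_cut
```

The boolean masks `low_env` and `high_env` can then be used to build the PDFs of logSFR and $\mathrm{log}{M}_{* }$ separately for the two environments.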

The PDF of logSFR is computed for galaxies with SFR > 10 ${M}_{\odot }$ yr−1 to match the cut of the Euclid survey and ${M}_{* }\gt 5\times {10}^{9}\,{M}_{\odot }$. Conversely, no cut in logSFR is imposed in the PDF of $\mathrm{log}{M}_{* }$. The results are plotted in Figures 6 and 7, referring to two different environment scales of Rs = 20 and 100 h−1 Mpc, respectively. In all models, there is a clear dependence on the environmental density, which is more pronounced at the high end of either logSFR or $\mathrm{log}{M}_{* }$. A reduction in the abundance of high-M* galaxies in low-density environments is evident in the right panels, where at high M*, the dashed curve (low δi) is below the solid (high δi) for all models.

Figure 6. The PDF of logSFR (left) and $\mathrm{log}{M}_{* }$ (right) as a function of the DM density smoothed with a TH window of width Rs = 20 h−1 Mpc at z = 1 (top) and 0 (bottom). Dashed and solid curves correspond to the 20% least and most dense regions, respectively. The area between these curves for each SAM is color-coded, as indicated in the figure. The figure refers to galaxies satisfying our Euclid cut of SFR greater than 10 ${M}_{\odot }$ yr−1 and stellar mass greater than $5\times {10}^{9}\,{M}_{\odot }$.

Figure 7. Same as Figure 6 but for Rs = 100 h−1 Mpc.

The SFR relation to the environment is more involved. Except for Galacticus, high-density environments are associated with higher SFRs. Galacticus exhibits an intriguing "inverted dependence" on δi; the PDF is skewed toward higher SFRs for galaxies in a low-density rather than high-density environment. This implies relatively more active star formation in galaxies in low- than high-density environments. We can reconcile this behavior in Galacticus with the trend of an increased fraction of high M* at high densities if the star formation in dense regions is preferentially intensified well before z = 1.

The curves are closer to each other for the larger Rs = 100 h−1 Mpc smoothing. The reason for that is mainly the narrower density range in the larger smoothing. Note that the density ranking is not preserved between the two smoothed density fields, otherwise the two figures would be identical.

It is also interesting to examine the mean logSFR at a given $\mathrm{log}{M}_{* }$ versus density. This is plotted in Figure 8, where the red dashed and blue solid lines, respectively, correspond to galaxies with the lowest and highest 20% densities. For this plot only, the density is smoothed on a scale Rs = 8 h−1 Mpc. The error bars represent the rms of the scatter of individual galaxies around the mean curves. The red curves do not reach as high M* as the blue, simply because of the reduction of galaxies with this high M* in low-density environments. The UM and SAGE galaxies exhibit very little dependence on the density of the environment. The SAG galaxies follow similar curves in low- and high-density environments for $\mathrm{log}{M}_{* }\lesssim 11$ at both redshifts, even for this small Rs = 8 h−1 Mpc. For Galacticus, the only signature of the environment is a boost in the SFR at low densities for $\mathrm{log}{M}_{* }\lesssim 10$.

Figure 8. Mean logSFR at a given M* for galaxies with SFR > 10 M⊙ yr−1. Red dashed and blue solid lines correspond to galaxies in the 20% least and most dense regions. Error bars represent scatter around the mean. The TH smoothing width for this plot only is Rs = 8 h−1 Mpc.

3.1.1. Parameterization of Environmental Dependencies

We focus here on the SFR, as it is relevant for the SfLM applied to emission line surveys. We parameterize the dependence of the SFR on the smoothed mass density, δ, as

Equation (5)

Equation (6)

where the index i refers to a galaxy lying at a point ri with smoothed density δi. The residuals ${ \mathcal R }$ and ${ \mathcal E }$ are random numbers with zero mean values and the parameters c2 and d2 describe how the mean and variance of logSFR vary with the environmental density. For galaxies with densities δi close to a certain δ0, the mean and variance are

Equation (7)

and

Equation (8)

The mean of logSFR over all galaxies is, however,

Equation (9)

The average, $\overline{\delta }={\sum }_{i}{\delta }_{i}/{N}_{\mathrm{gal}}$, of δi over all galaxies is close to zero, but not strictly so. To evaluate it, we work in the continuous limit ∑i(·) → ∫d3rn(r)(·), where $n({\boldsymbol{r}})=\bar{n}(1+{\delta }_{\mathrm{gal}})$ with $\bar{n}$ the mean number density of galaxies. We find

Equation (10)

where the volume average on the right-hand side is over the product of the smoothed mass density times the unsmoothed density inferred from the galaxy distribution, and we have made use of the vanishing volume average of δ. Therefore,

Equation (11)

where, in the last step, it is assumed that the density rms is σδ ≪ 1 and that c2 is sufficiently small. Similarly,

Equation (12)
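
The statement that the galaxy-weighted mean of δ equals the volume average of the product of the two density fields can be checked numerically. The following is a minimal sketch on a toy set of cells, assuming a zero-mean density, a linear galaxy contrast δgal = bδ, and, for simplicity, the same field standing in for both the smoothed and unsmoothed densities. The bias value and field amplitude are illustrative and are not taken from the mocks.

```python
import numpy as np

rng = np.random.default_rng(0)
ncell = 200_000
b = 1.4                                   # illustrative bias factor (assumption)

delta = rng.normal(0.0, 0.05, ncell)      # toy mass density contrast per cell
delta -= delta.mean()                     # enforce a vanishing volume average
delta_gal = b * delta                     # toy galaxy density contrast

# In the continuous limit the number of galaxies per cell is proportional
# to 1 + delta_gal, so sums over galaxies become weighted sums over cells.
weights = 1.0 + delta_gal

galaxy_mean_delta = np.sum(weights * delta) / np.sum(weights)
volume_mean_product = np.mean(delta * delta_gal)
```

Both numbers agree and are of order bσδ², which is why the mean over galaxies is close to, but not exactly, zero.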

In practice, we first estimate c1 and c2 from the mocks by an ordinary least-squares fit to logSFRi. The two parameters are then used to compute the residual ${{ \mathcal R }}_{i}$ for every galaxy. Finally, the parameters d1 and d2 are derived by a least-squares fit to the ${{ \mathcal R }}_{i}^{2}$ obtained in the previous step. The results are summarized in Table 3 for all mock galaxies at z = 1 and with SFR > 10 M⊙ yr−1. The entry for each mock lists the inferred parameters for the smoothing widths Rs = 20 and 100 h−1 Mpc, respectively, in the top and bottom lines. Because of the relatively small simulation box, results with only Rs = 20 h−1 Mpc are listed for UM mocks in the SMDPL.
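
The two-step fit can be sketched as follows. This is an illustration only: it assumes that Equations (5) and (6) are linear in δ, i.e., logSFR = c1 + c2δ + ℛ and ℛ² = d1 + d2δ + ℰ, as the description of c2 and d2 suggests. The function name and the synthetic input values are hypothetical.

```python
import numpy as np

def fit_environment_params(log_sfr, delta):
    """Two-step ordinary least squares for (c1, c2) and then (d1, d2)."""
    # Step 1: mean relation, logSFR = c1 + c2 * delta + R
    c2, c1 = np.polyfit(delta, log_sfr, 1)
    resid = log_sfr - (c1 + c2 * delta)
    # Step 2: variance relation, R^2 = d1 + d2 * delta + E
    d2, d1 = np.polyfit(delta, resid**2, 1)
    return c1, c2, d1, d2

# Synthetic check with made-up input parameters (not the values in Table 3)
rng = np.random.default_rng(1)
n = 400_000
delta = rng.normal(0.0, 0.25, n)
c1_in, c2_in, d1_in, d2_in = 1.29, 0.02, 0.06, 0.015
variance = d1_in + d2_in * delta          # stays positive for this toy range
log_sfr = c1_in + c2_in * delta + rng.normal(0.0, 1.0, n) * np.sqrt(variance)

c1, c2, d1, d2 = fit_environment_params(log_sfr, delta)
```

On the synthetic sample the recovered parameters match the inputs within the expected sampling noise.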

Table 3.  The Parameters of the Fitting Formulae in Equations (5) and (6) for a Threshold SFR > 10 M⊙ yr−1

| Model | Rs (h−1 Mpc) | $\overline{\mathrm{logSFR}}$ | c1 | c2 | ${\sigma }_{\mathrm{logSFR}}^{2}$ | d1 | d2 |
| SAG | 20 | 1.2926 | 1.2899 ± 0.0002 | 0.0213 ± 0.0005 | 0.0634 | 0.0614 ± 0.0001 | 0.0151 ± 0.0003 |
| | 100 | | 1.2925 ± 0.0001 | 0.0263 ± 0.0033 | | 0.0633 ± 0.0001 | 0.0175 ± 0.0019 |
| SAGE | 20 | 1.2566 | 1.2553 ± 0.0001 | 0.0132 ± 0.0003 | 0.0343 | 0.0340 ± 0.0001 | 0.0030 ± 0.0001 |
| | 100 | | 1.2566 ± 0.0001 | 0.0174 ± 0.0021 | | 0.0343 ± 0.0001 | 0.0033 ± 0.0005 |
| Galacticus | 20 | 1.3598 | 1.3644 ± 0.0002 | −0.0428 ± 0.0007 | 0.0699 | 0.0705 ± 0.0001 | −0.0066 ± 0.0002 |
| | 100 | | 1.3599 ± 0.0002 | −0.0444 ± 0.0044 | | 0.0699 ± 0.0001 | −0.0045 ± 0.0015 |
| UM SMDPL | 20 | 1.2724 | 1.2724 ± 0.0004 | −0.0005 ± 0.0015 | 0.0416 | 0.0416 ± 0.0001 | 0.0002 ± 0.0005 |
| UM MDPL2 | 20 | 1.2676 | 1.2665 ± 0.0001 | 0.0086 ± 0.0004 | 0.0414 | 0.0411 ± 0.0001 | 0.0014 ± 0.0001 |
| | 100 | | 1.2676 ± 0.0001 | 0.0049 ± 0.0024 | | 0.0413 ± 0.0000 | 0.0003 ± 0.0008 |

For each model, the parameter c1 varies very little with Rs and is very close to the corresponding $\overline{\mathrm{logSFR}}$ (second column), consistent with Equation (11). The parameter c2, which is an indicator of the modulation of the mean of logSFR versus δ, exhibits some dependence on Rs and, as expected, has a small amplitude. Also, ${d}_{1}\approx {\sigma }_{\mathrm{logSFR}}^{2}$, as expected.

For SAG and Galacticus, c2 has similar values for the two values of Rs. The remaining two models, SAGE and UM, yield a stronger dependence on Rs with a difference of more than 50% in c2.

The Galacticus mock stands out in two respects: it has the strongest sensitivity to the environment (largest ∣c2∣) and, in accordance with Figures 6 and 7, exhibits an inverted dependence on δ (negative c2). The results are a function of the SFR threshold, but we find similar numbers for SFR > 12 M⊙ yr−1. For example, the inverted dependence in Galacticus persists to SFR > 12 M⊙ yr−1 with (c2, d2) = (−0.0371 ± 0.0007, −0.005 ± 0.0002) and (−0.036 ± 0.0004, −0.0031 ± 0.0015), respectively, for Rs = 20 and 100 h−1 Mpc. This model shows the strongest change with the SFR threshold. Parameters in the other mocks change by <15%.

4. Galaxy Biasing as a Function of the SFR

Assume we have a (volume-limited) sample of Ngal galaxies with positions ri in a large volume V. Theoretically, the number density contrast δgal is expressed in terms of a sum over Dirac delta functions

Equation (13)

This form, although of little practical use, stresses the importance of shot noise resulting from the discrete nature of the distribution of galaxies. Practically, we generate a galaxy density field from each simulation output by interpolating the galaxy distribution on a cubic grid using the cloud-in-cell (CIC) scheme. The grid size is ${256}^{3}$ for SMDPL, ${512}^{3}$ for MDPL2, and ${470}^{3}$ for BigMDPL. Here also, we use galaxies with ${M}_{* }\gt 5\times {10}^{9}\,{M}_{\odot }$ and SFR > 10 M⊙ yr−1.
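
For readers unfamiliar with the CIC scheme, the following is a minimal sketch of a periodic cloud-in-cell assignment in NumPy. It is not the code used for the mocks. The function name, the toy box size, and the grid size are hypothetical, and cell centers are assumed to sit at half-integer grid coordinates.

```python
import numpy as np

def cic_density_contrast(pos, box_size, ngrid):
    """Periodic cloud-in-cell assignment; returns the density contrast grid."""
    # Position in grid units, measured from the center of cell 0
    x = np.asarray(pos, dtype=float) / box_size * ngrid - 0.5
    i0 = np.floor(x).astype(int)        # lower neighboring cell along each axis
    frac = x - i0                       # fractional offset from that cell center
    counts = np.zeros((ngrid, ngrid, ngrid))
    for dx in (0, 1):
        wx = frac[:, 0] if dx else 1.0 - frac[:, 0]
        for dy in (0, 1):
            wy = frac[:, 1] if dy else 1.0 - frac[:, 1]
            for dz in (0, 1):
                wz = frac[:, 2] if dz else 1.0 - frac[:, 2]
                idx = ((i0[:, 0] + dx) % ngrid,
                       (i0[:, 1] + dy) % ngrid,
                       (i0[:, 2] + dz) % ngrid)
                # np.add.at accumulates correctly when cells repeat
                np.add.at(counts, idx, wx * wy * wz)
    return counts / counts.mean() - 1.0

rng = np.random.default_rng(4)
positions = rng.uniform(0.0, 100.0, size=(10_000, 3))
delta_gal = cic_density_contrast(positions, box_size=100.0, ngrid=16)

# A single particle at a cell center deposits all of its weight in that cell
single = cic_density_contrast([[0.5, 0.5, 0.5]], box_size=4.0, ngrid=4)
```

Each galaxy spreads unit weight over its eight nearest cells, so the total count is conserved and the resulting contrast averages to zero over the grid.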

We make various comparisons between galaxy and mass density fields on the grid. A visual impression of the biasing relation is offered in terms of a scatter plot of δgal versus δ in Figure 9. For clarity, only a small fraction of the densities on the grid are plotted as blue points. The contour lines mark the boundaries containing 68%, 90%, and 95.4% of the points. The contours were computed by fitting a 2D Gaussian PDF to the distribution of points in the δ–δgal plane. The dashed and dashed–dotted lines represent, respectively, the linear regression of δgal on δ and vice versa. Using the expressions in Appendix B, in which we discuss the details of the regression procedure, the slope of the simple linear regression of δgal on δ is given by

Equation (14)

where the subscript α refers to grid points. If the biasing relation is indeed well described by Equation (1), then the ensemble average of this expression is approximated as $\left\langle p\right\rangle =b$ thanks to $\left\langle \delta \varepsilon \right\rangle =0$. Therefore, the slope of the dashed lines in the figure should serve as a statistically unbiased estimate of b. The statistical 1σ uncertainty on the slope is given by

Equation (15)

where N1 is the number of independent grid points. Since the densities are smoothed on a grid, we need to consider that only a fraction of the points are statistically independent. We estimate the number of independent grid points as ${N}_{1}\sim (3/4\pi ){(L/{R}_{{\rm{s}}})}^{3}$ and apply the expression in Equation (15) using density values at N1 randomly selected grid points. This gives σp ≈ 0.006 and 0.015, respectively, for the smaller and larger Rs. Thus, the bias factors, b = 1.44 and 1.43, estimated as the slopes of the dashed lines in the two panels of Figure 9 are consistent within the 1σ statistical error.
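
A short sketch of the two ingredients above, under stated assumptions: the slope is taken to be the usual least-squares estimator for zero-mean fields (the exact form of Equation (14) is not reproduced here), and the box side L = 1000 h−1 Mpc is an assumed value for MDPL2 that is not stated in this section. Function names are hypothetical.

```python
import numpy as np

def regression_slope(delta, delta_gal):
    """Least-squares slope of delta_gal on delta for zero-mean fields."""
    return np.sum(delta * delta_gal) / np.sum(delta**2)

def n_independent(box_size, r_s):
    """N1 ~ (3 / 4 pi) (L / Rs)^3: number of top-hat spheres fitting in the box."""
    return 3.0 / (4.0 * np.pi) * (box_size / r_s) ** 3

# Assumed box side of 1000 h^-1 Mpc
n1_small = n_independent(1000.0, 20.0)     # about 3.0e4
n1_large = n_independent(1000.0, 100.0)    # about 2.4e2

# Toy check of the slope estimator with a known input bias
rng = np.random.default_rng(5)
delta = rng.normal(0.0, 0.25, 50_000)
delta_gal = 1.44 * delta + rng.normal(0.0, 0.1, delta.size)
p = regression_slope(delta, delta_gal)
```

With only a few hundred independent points at Rs = 100 h−1 Mpc, a larger σp for the larger smoothing is expected.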

Figure 9. Scatter plot of the smoothed galaxy vs. DM density fields for the z = 1 SAG galaxies. The left and right panels correspond, respectively, to TH smoothing of widths Rs = 20 and 100 h−1 Mpc. Slopes obtained via linear regression of δgal on δ and vice versa are plotted as the dashed and dashed–dotted lines, respectively, and indicated in the legends. To guide the eye, a solid diagonal line δgal = δ is plotted.

Given the inferred slopes, the variance of the stochastic term ε in Equation (1) is estimated as ${\sigma }_{\varepsilon }^{2}=\mathrm{Var}({\delta }_{\mathrm{gal}}-p\delta )$. For Rs = 100 h−1 Mpc, we find ${\sigma }_{\varepsilon }^{2}=1.1\times {10}^{-4}$, which includes shot noise and intrinsic scatter in the bias relation. Following Appendix A, the shot-noise contribution in this case is ${\sigma }_{\mathrm{SN}}^{2}=7.3\times {10}^{-5}$. Since ${\sigma }_{\delta }^{2}=1.75\times {10}^{-3}$ and ${\sigma }_{{\delta }_{\mathrm{gal}}}^{2}=3.7\times {10}^{-3}$ are much larger than ${\sigma }_{\varepsilon }^{2}$, the inverse regression of δ on δgal leads to a similar slope. The same conclusion applies to Rs = 20 h−1 Mpc, where ${\sigma }_{\varepsilon }^{2}=1.18\times {10}^{-2}$, ${\sigma }_{\mathrm{SN}}^{2}=9.3\times {10}^{-3}$, ${\sigma }_{\delta }^{2}=6.2\times {10}^{-2}$, and ${\sigma }_{{\delta }_{\mathrm{gal}}}^{2}=1.4\times {10}^{-1}$.
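
The quoted variances can be cross-checked with a few lines of arithmetic. The sketch below assumes the linear model δgal = bδ + ε with ε uncorrelated with δ, and assumes that b = 1.44 and 1.43 belong to Rs = 20 and 100 h−1 Mpc, respectively. Variable names are hypothetical.

```python
# (b, var_delta, var_eps) as quoted in the text; the pairing of b with Rs is assumed
cases = {
    20: (1.44, 6.2e-2, 1.18e-2),
    100: (1.43, 1.75e-3, 1.1e-4),
}

results = {}
for r_s, (b, var_delta, var_eps) in cases.items():
    var_gal = b**2 * var_delta + var_eps   # predicted variance of delta_gal
    cov = b * var_delta                    # covariance of delta and delta_gal
    forward = cov / var_delta              # slope of delta_gal on delta
    inverse = var_gal / cov                # inverse regression, as d(delta_gal)/d(delta)
    results[r_s] = (var_gal, forward, inverse)
```

The predicted variances reproduce the quoted values of about 0.14 and 3.7 × 10⁻³, and the forward and inverse slopes differ by only a few percent, in line with the statement above.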

The relation in Equation (1) is assumed to hold between the density field on any scale as long as it is large enough. Decomposing the fields in Fourier modes, the relation yields

Equation (16)

We now examine the power spectra ${P}_{{\rm{g}}}(k)=\left\langle | {\delta }_{\mathrm{gal},{\rm{k}}}{| }^{2}\right\rangle $ of the galaxy distribution and ${P}_{\mathrm{DM}}(k)=\left\langle | {\delta }_{k}{| }^{2}\right\rangle $ of the corresponding DM density field. We FFT the unsmoothed δgal and δ on the grid into Fourier space and compute the respective power spectra. We remove the contribution, ${N}_{\mathrm{gal}}^{-1}$, of the shot noise from the galaxy power spectrum Pg (Peebles 1980). Thanks to the large number of DM particles in the simulation, the shot noise is negligible in PDM.

In Figure 10, we plot the ratio of the galaxy to the DM power spectra for the various mocks at z = 0 (blue curves) and 1 (red). The Nyquist frequency is kN = π N/L = 1.6 and 2.0 h Mpc−1 for MDPL2 and SMDPL, respectively. The UM curves are noisier than the others due to the significantly smaller number of UM mock galaxies (see Table 2). The decline of the ratio at $\mathrm{log}k\gtrsim -0.5$ for mocks from the MDPL2 simulation (i.e., all curves except UM) is due to aliases of the CIC interpolation. Since we are interested in the large-scale regime, we have not made any special effort to correct for these aliases (e.g., Jing 2005). The power spectrum ratio versus the wavenumber, k, is an indication of the dependence of the bias factor on scale. For all models, the figure clearly demonstrates a very weak dependence on k in the range $-2\lt \mathrm{log}k\lt -1$, strongly motivating linear biasing. At z = 0 (blue), the ratio is larger than unity only for UM. In the remaining models at this redshift, the ratio is less than unity, meaning that the galaxy distribution is less clustered than the DM.
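
The two Nyquist values follow from kN = πN/L with the grid sizes quoted earlier. In this sketch the box sides of 1000 and 400 h−1 Mpc for MDPL2 and SMDPL are assumptions that are not stated in this section, and the function name is hypothetical.

```python
import math

def nyquist_wavenumber(ngrid, box_size):
    """k_N = pi * N / L in h/Mpc for an N^3 grid on a box of side L (Mpc/h)."""
    return math.pi * ngrid / box_size

# Grid sizes from the text; box sides are assumed values
k_mdpl2 = nyquist_wavenumber(512, 1000.0)   # about 1.61
k_smdpl = nyquist_wavenumber(256, 400.0)    # about 2.01
```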

Figure 10. Ratio of the galaxy to the DM power spectrum at z = 1 (red) and 0 (blue). Only the full UM in SMDPL is plotted.

It is interesting to examine the biasing relation as a function of the threshold imposed on the SFR. We compute the bias factor according to

Equation (17)

where the overline refers to the average of the power spectrum over the range $-2\lt \mathrm{log}k\lt -1$. Figure 11 plots b as a function of a lower threshold imposed on the SFR. The bias factor of galaxies with very low SFRs (quenched star formation) is highest. These galaxies tend to live in massive halos and thus are strongly biased (clustered). Actively star-forming galaxies are associated with less massive halos with lower b, as seen in Figure 11 for all models. As soon as star formation is active, the bias factor is nearly constant versus the SFR threshold, with Galacticus showing the strongest dependence. This is in agreement with Angulo et al. (2014), who analyzed the biasing of SFR-selected galaxies extracted from the Millennium-XXL simulation (Angulo et al. 2012) using the L-Galaxies SAM (Springel et al. 2005). Based on the galaxy and DM correlation functions at separations 60–70 h−1 Mpc, they found that the bias factor depends weakly on the SFR (expressed in terms of number density in their case) with b ≈ 1.1 and 0.7, respectively, at z = 1 and 0. In Figure 11, these values are best matched by the SAGE galaxies, while the other SAMs predict larger b values.
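
A sketch of how a bias factor can be extracted from band-averaged power spectra. Since Equation (17) is not reproduced here, the estimator below, the square root of the ratio of the averaged, shot-noise-corrected galaxy spectrum to the averaged DM spectrum, is an assumed form. The function name and the toy spectra are hypothetical.

```python
import numpy as np

def bias_from_power(k, p_gal, p_dm, shot_noise=0.0,
                    logk_min=-2.0, logk_max=-1.0):
    """Assumed estimator: b = sqrt(mean(P_g - shot noise) / mean(P_DM))."""
    logk = np.log10(k)
    band = (logk > logk_min) & (logk < logk_max)
    return np.sqrt(np.mean(p_gal[band] - shot_noise) / np.mean(p_dm[band]))

# Toy spectra: a power law for the DM and a biased galaxy spectrum plus shot noise
k = np.logspace(-2.5, 0.0, 300)
p_dm = 1.0e4 * k**-1.5
shot = 50.0
p_gal = 1.3**2 * p_dm + shot

b_est = bias_from_power(k, p_gal, p_dm, shot_noise=shot)
```

Applying the same function to samples selected above increasing SFR thresholds would trace a curve like those described for Figure 11.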

Figure 11. Bias factor, computed using Equation (17), of galaxies with SFRs above a limiting value. Red and blue curves correspond to z = 1 and 0, respectively.

The mean stellar mass for the UM mock galaxies with SFR > 10 M⊙ yr−1 at z = 0 is ${M}_{* }=7.9\times {10}^{10}\,{M}_{\odot }$ and we find a similar value of ${M}_{* }=5\times {10}^{10}\,{M}_{\odot }$ for SAG. For these values of M*, the corresponding halo mass in the models is $\sim 2\times {10}^{12}\,{M}_{\odot }$ with a factor of 2 scatter (see Figure 8 in Knebe et al. 2018). According to Comparat et al. (2017), who analyzed halo bias in the MultiDark simulations, the relevant bias for this halo mass range is around unity with some scatter. Taking into account the range of the halo mass and the scatter in the biasing relation, we find that the difference in b measured from the slopes in Figure 9 and Pgal/PDM between UM and the other models is completely reasonable.

5. The Growth Rate from the SfLM

Redshift surveys provide "observed redshifts," z, of galaxies. The LOS peculiar velocities introduce a shift between z and the cosmological redshifts zc according to Sachs & Wolfe (1967),

Equation (18)

where V is the physical peculiar velocity of a galaxy, the speed of light is c, and we have neglected terms related to the gravitational potential and higher order in V/c. Further, we will not consider the important effect of magnification by gravitational lensing in this paper (see Appendix C for some considerations of this effect).

Cosmological redshifts can only be derived from actual distances and are, therefore, impossible to measure for most galaxies in redshift surveys. Therefore, the (luminosity) distances, dL, computed at z rather than zc, are used to derive SFRs and stellar masses of galaxies from the observed fluxes. This obviously holds for the derivation of any intrinsic luminosity of galaxies, but we will phrase the relevant relations in terms of the SFR.

Given the measured flux, F, the observed SFR is given by

Equation (19)

The true intrinsic SFR is instead (see Appendix C)

Equation (20)

where ${d}_{{\rm{A}}}({z}_{{\rm{c}}})={d}_{{\rm{L}}}({z}_{{\rm{c}}})/{(1+{z}_{{\rm{c}}})}^{2}$ is the angular diameter distance, and similarly for dA(z). Collecting up the terms, we write

Equation (21)

with

Equation (22)

In Appendix C, we show that to first order in V,

Equation (23)

where

Equation (24)

The different signs for the two terms in the square brackets reflect the fact that the corrections for relativistic beaming and the shift in dL(z) are in opposite directions. For a given V, the first term is ∼V/cz for z ≪ 1, while the second term is independent of redshift. At z = 1, this gives ${ \mathcal V }=1.4\times {10}^{-4}(V/100\,\mathrm{km}\,{{\rm{s}}}^{-1})$. Let us compare this to the modulation of logSFR due to environmental density dependence, taking the parameter c2 = 0.026 from Table 3 for SAG as an example. The density rms for Rs = 100 h−1 Mpc is σδ = 0.042. Thus, the modulation in the mean logSFR versus the environment density is typically c2σδ ≈ ${10}^{-3}$. This is almost an order of magnitude larger than ${ \mathcal V }$. Thus, as expected, at z ∼ 1 in the ΛCDM framework, the SfLM cannot be used as a direct probe of V, as was done at z ≪ 1 for 2MRS and SDSS by Branchini et al. (2012) and Feix et al. (2015, 2017). Of course, it can still be useful for constraining nonstandard models predicting an unusually large amplitude of the velocity field at high redshift.
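
The comparison between the two effects is simple arithmetic and can be written out explicitly. The numbers are those quoted in the text, the reference velocity of 100 km s−1 is the one used in the quoted scaling, and the variable names are hypothetical.

```python
c2 = 0.026                 # SAG, Rs = 100 h^-1 Mpc (Table 3, rounded)
sigma_delta = 0.042        # density rms for Rs = 100 h^-1 Mpc

env_modulation = c2 * sigma_delta                  # typical shift in mean logSFR

v_kms = 100.0                                      # reference peculiar velocity
velocity_shift = 1.4e-4 * (v_kms / 100.0)          # quoted scaling at z = 1

ratio = env_modulation / velocity_shift            # close to 8
```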

We now turn to the SfLM itself. The distribution of galaxies in the redshift survey allows a reconstruction of the peculiar velocity field. The linear theory of gravitational instability relates the 3D physical velocity field, v, to the mass density field in real (comoving distance) space as

Equation (25)

Equation (26)

where we assume a potential flow, v = −∇ϕ, and adopt a linear biasing relation (Equation (1)) with negligible scatter to arrive at the second line. The solution to this Poisson equation is obtained from the galaxy distribution as a function of a single parameter$^{9}$ β in Equation (3). The spatial derivative in Equations (25) and (26) is with respect to the comoving distance coordinate, r, and the density fields are assumed in real space. In the application to redshift surveys, Equation (26) needs to be modified to account for the difference in the comoving distance being derived from the observed redshift z rather than the cosmological redshift zc. Denoting the galaxy density in redshift space by ${\delta }_{\mathrm{gal}}^{\mathrm{red}}$, the equation becomes (e.g., Nusser & Davis 1994)

Equation (27)

We see that the degeneracy between f and b is maintained, and the solution remains only a function of β. There is an important difference, though: the solution is not linearly proportional to β, and the equation needs to be solved for every value of β of interest (see Nusser & Davis 1994, for details). Nonlinear effects are important on small scales, especially in redshift space, where incoherent motions and finger-of-god effects smear out structure in the LOS direction. However, we are interested in scales of tens of megaparsecs, where linear theory is completely satisfactory for the purpose of recovering the peculiar velocity (Keselman & Nusser 2016). The tests we perform below are tailored to the expected uncertainties from the finite number of particles and environmental density effect. Thus, for simplicity of presentation and clarity of results, we choose to work with real space density fields, and we only use velocity reconstructed from Equation (26).

The SfLM seeks β as the value that renders a minimum in the function

Equation (28)

We will assume the approximate expression ${ \mathcal V }=V{ \mathcal D }/c$ (see Equation (23)) and that the velocity model is linear in β. We also assume that we are given the velocity field V1 corresponding to a solution of Equation (26) with a fixed value for β = β1. Thus, ${ \mathcal V }(\beta )=\beta {{ \mathcal V }}_{1}/{\beta }_{1}$, and the problem reduces to a simple linear regression, as described in Appendix B. The minimum condition, $\partial {\tilde{\chi }}^{2}/\partial \beta =0$, gives

Equation (29)

We have defined

Equation (30)

where an overline indicates a mean over galaxies in the sample, and the same definition applies to logSFRobs. The 1σ error on β is given by

Equation (31)

where Ngal is the number of galaxies.
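
Because the model is linear in β, the estimate reduces to an ordinary regression slope. The sketch below is an illustration under assumptions, not a transcription of Equations (28)–(31): it assumes the sign convention logSFRobs = logSFR + 𝒱, uses the standard least-squares slope and its standard error, and works on synthetic data. The function name and all numerical inputs are hypothetical.

```python
import numpy as np

def sflm_beta(log_sfr_obs, shift_model_1, beta1=1.0):
    """Regress logSFR_obs on the model shift computed for beta = beta1."""
    dy = log_sfr_obs - log_sfr_obs.mean()
    dx = shift_model_1 - shift_model_1.mean()
    slope = np.sum(dx * dy) / np.sum(dx * dx)
    resid = dy - slope * dx
    # Standard error of a least-squares slope
    sigma_slope = resid.std() / (dx.std() * np.sqrt(dx.size))
    return beta1 * slope, beta1 * sigma_slope

rng = np.random.default_rng(2)
n = 500_000
shift_1 = rng.normal(0.0, 4.0e-4, n)          # toy model shift for beta1 = 1
beta_in = 0.6
log_sfr = 1.3 + rng.normal(0.0, 0.25, n)      # intrinsic scatter in logSFR
log_sfr_obs = log_sfr + beta_in * shift_1     # assumed sign convention

beta_hat, sigma_beta = sflm_beta(log_sfr_obs, shift_1)
beta_clean, _ = sflm_beta(1.3 + beta_in * shift_1, shift_1)   # no scatter
```

Without intrinsic scatter the input β is recovered exactly. With a realistic scatter of 0.25 dex the error per half a million galaxies is of order unity, which illustrates why very large samples are required.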

5.1. A Rough Estimate of the Error

Before we present the full results for the expected errors on β, here we give a rough estimate. This is most easily achieved if we take logSFRobs = logSFR, so that minimization of the function ${\tilde{\chi }}^{2}$ should yield β consistent with zero for a model of the form ${ \mathcal V }(\beta )=\beta {{ \mathcal V }}_{0}$, where ${{ \mathcal V }}_{0}$ is computed from the true peculiar velocities (see Equation (22)). Of course, in reality, ${ \mathcal V }$ will be obtained from the velocities recovered from δgal, but we are only interested in a rough estimate in this section.

Galilean invariance implies the absence of correlation between the properties of galaxies and their velocities. Therefore, the ensemble average of Equation (29) over all possible realizations of logSFRi yields 〈β〉 = 0. This is valid even if the star formation depends on the underlying mass density, since the velocity and density are also uncorrelated. An estimate of the variance of the scatter around 〈β〉 = 0 is

Equation (32)

where ${\sigma }_{{ \mathcal V }}^{2}=\overline{{({ \mathcal V }-\overline{{ \mathcal V }})}^{2}}$ and ${\sigma }_{\mathrm{logSFR}}^{2}=\left\langle {(\mathrm{logSFR}-\left\langle \mathrm{logSFR}\right\rangle )}^{2}\right\rangle $. In the limit of a large number of objects, ${\sigma }_{\mathrm{logSFR}}^{2}=\overline{{(\mathrm{logSFR}-\overline{\mathrm{logSFR}})}^{2}}$. At z = 1, Equation (22) gives ${ \mathcal V }=4.2\times {10}^{-4}(V/300\,\mathrm{km}\,{{\rm{s}}}^{-1})$. For the SAG mock galaxy catalogs, σlogSFR ≈ 0.25 for SFR > 10 M⊙ yr−1. Thus,

Equation (33)

At z = 1, the simulations give σV = 300 km s−1 for the 1D rms of unfiltered galaxy velocities.
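
The order of magnitude of the error can be evaluated directly. The sketch assumes that Equation (32) has the standard form for the uncertainty of a regression slope, σβ ≈ σlogSFR/(σ𝒱 √Ngal), and uses the scaling 𝒱 = 4.2 × 10⁻⁴ (V/300 km s−1) quoted above. The galaxy number is a hypothetical round value chosen for illustration and is not taken from Table 2.

```python
import math

def rough_sigma_beta(sigma_log_sfr, sigma_v_kms, n_gal,
                     shift_per_300_kms=4.2e-4):
    """Assumed form: sigma_beta ~ sigma_logSFR / (sigma_shift * sqrt(N_gal))."""
    sigma_shift = shift_per_300_kms * sigma_v_kms / 300.0
    return sigma_log_sfr / (sigma_shift * math.sqrt(n_gal))

n_gal_example = 1.0e7      # hypothetical sample size
sigma_beta_example = rough_sigma_beta(0.25, 300.0, n_gal_example)

# The error scales as 1 / sqrt(N_gal)
sigma_beta_4x = rough_sigma_beta(0.25, 300.0, 4.0 * n_gal_example)
```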

5.2. Recovered versus True Velocities: Smoothing Matters

The rough error estimate presented above is based on the true LOS velocities, while in a realistic application, only the reconstructed LOS velocities, Vrec, from the galaxy distribution smoothed on a large scale are available. Therefore, the β estimate from the SfLM is basically the slope of the regression of the true ${ \mathcal V }={ \mathcal D }V/c$ in Equation (21) on the reconstructed ${{ \mathcal V }}_{\mathrm{rec}}={ \mathcal D }{V}_{\mathrm{rec}}/c$, where ${ \mathcal V }$ is perturbed by the random spread in logSFR.

Smoothing alone causes a statistical bias in the estimate of the slope (β); i.e., even if the smoothed true velocities, Vs, were used instead of Vrec, we expect a slope of the regression of V on Vs to differ from unity due to the correlation between V − Vs and Vs for the TH smoothing.

In this section, we explore the expected statistical bias in the estimates of the slope. We begin with a basic assessment of the ability of the linear theory relation, Equation (25), to reconstruct the peculiar velocity from the DM density field, δ. In Figure 12, the Cartesian velocity components Vrec reconstructed from δ with the true value of the parameter f are plotted against the true velocities V. Both Vrec and V are provided on a grid and have been smoothed with a TH window of width Rs = 20 (left panel) and 100 (right panel) h−1 Mpc. For clarity, only a small randomly selected fraction of the grid points is plotted. This figure refers to the z = 1 output from the MDPL2 simulation. Linear theory performs extremely well in this case. The slopes of the regression of Vrec on V, as well as that of the inverse regression, are clearly very close to unity. Further, the scatter in Vrec versus V is negligible, even for the smaller smoothing.

Figure 12. Comparison of the velocity predicted from the DM density field using the linear relation (Equation (25)). Both velocity fields are given on a grid and smoothed with TH windows of width 20 (left) and 100 (right) h−1 Mpc, as indicated in the figure. For clarity, only a small fraction of the grid points are shown as blue dots. Dashed and dashed–dotted lines represent linear regression of Vpred on V and its inverse. The slopes are indicated in the plots. The diagonal solid line is plotted to guide the eye.

Next, we turn to reconstruction from the galaxy density field, using Equation (26) with β = f. The results are shown in Figure 13 for the z = 1 SAG galaxies selected above the Euclid cut SFR > 10 M⊙ yr−1 (see Table 2). The contours contain 68%, 90%, and 95.4% of the points, obtained by fitting a 2D normal PDF to the distribution of points in the Vrec–V plane. The results are quite different from the previous figure. The regression slopes clearly deviate from unity in this case. Since Vrec is derived for β = f, the slope of the regression of Vrec on V should yield the galaxy bias factor, b. Indeed, the slope is very close to the values obtained from the density scatter plot in Figure 9 and the ratio of the power spectra as seen in Figures 10 and 11 for SAG.

Figure 13. Same as Figure 12 but for Vpred from the galaxy density field in the SAG simulations. Red contours enclose 68%, 90%, and 95% of the points. The slope of the dashed line is consistent with the bias factor for SAG galaxies with SFRlim ≈ 10 in Figure 11 (dashed curve).

We already know from Figure 12 that inaccuracies associated with linear theory reconstruction on the scales considered are negligible. That implies that the origin of the scatter in Figure 13 is stochasticity in the biasing relation and the effect of shot noise on Vrec. The rms of the residual between Vrec and the best-fit lines (dashed curves) in Figure 13 is 63.4 and 26.8 km s−1 for Rs = 20 and 100 h−1 Mpc, respectively. To quantify the contribution of the shot noise, we resort to Appendix A, where we derive the following expression for the variance of the shot-noise effect on Vrec (see also Strauss et al. 1992):

Equation (34)

For SAG at z = 1 and β = f, we find σV,SN = 36.4 and 16.3 km s−1 for the smaller and larger Rs, respectively. This makes the contribution of biasing stochasticity 51.9 and 21.2 km s−1, respectively, for the two smoothing widths. Thus, the intrinsic stochasticity in the biasing relation is the dominant contribution to the scatter.
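
The stochasticity contribution follows from subtracting the shot-noise term in quadrature from the total residual rms, assuming the two contributions are independent. A minimal sketch with the numbers quoted above (the function name is hypothetical):

```python
import math

def quadrature_difference(total_rms, shot_noise_rms):
    """Residual rms left after removing an independent noise term."""
    return math.sqrt(total_rms**2 - shot_noise_rms**2)

stochastic_20 = quadrature_difference(63.4, 36.4)    # Rs = 20 h^-1 Mpc
stochastic_100 = quadrature_difference(26.8, 16.3)   # Rs = 100 h^-1 Mpc
```

Both results agree with the quoted 51.9 and 21.2 km s−1 to within rounding.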

So far, we have made comparisons between Vrec and V smoothed on the same scale. In contrast, the model velocity in the SfLM is smoothed, while the data logSFRobs involves the true galaxy velocities. In Figure 14, we plot the true unsmoothed SAG galaxy velocities, Vgal, versus Vrec from the galaxy density field. Note that in this plot, Vrec is plotted on the x-axis.

Figure 14. Actual galaxy velocities vs. the prediction from the relation using the galaxy density field smoothed on Rs = 20 and 100 h−1 Mpc. In contrast to Figure 13, the galaxy velocities here are the raw, unsmoothed galaxy velocities. The three velocity components are compared for 1/15,000 of the SAG mock galaxies.

Only a sharp k-cutoff filtering yields a vanishing correlation between the residual Vgal − Vrec and Vrec. However, an idealized sharp k-cutoff filtering is impossible to apply to real data due to complicated observational window functions. For the more practical TH smoothing, the correlation affects the regression slopes (see Appendix B). In addition, as we have seen in Figure 13, stochastic biasing and shot noise will add scatter to Vrec as recovered from the galaxy density field.
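
The difference between the two filters can be illustrated with a one-dimensional toy field. The sketch below assumes a power spectrum ∝ 1/k, fixed mode amplitudes with random phases, and a 1D top-hat (boxcar) window. None of these choices are taken from the simulations. It compares the slope of the regression of the raw field on its smoothed version for the two filters.

```python
import numpy as np

n, r_s = 4096, 50.0                                # toy grid size and smoothing radius
k = 2.0 * np.pi * np.fft.rfftfreq(n, d=1.0)

rng = np.random.default_rng(3)
amp = np.zeros_like(k)
amp[1:] = k[1:] ** -0.5                            # |a_k|^2 proportional to 1/k
phase = rng.uniform(0.0, 2.0 * np.pi, k.size)
v_raw = np.fft.irfft(amp * np.exp(1j * phase), n=n)

def smooth(field, window):
    """Multiply the Fourier modes of a real field by a window function."""
    return np.fft.irfft(np.fft.rfft(field) * window, n=field.size)

w_tophat = np.sinc(k * r_s / np.pi)                # sin(k R) / (k R)
w_sharp = (k < np.pi / r_s).astype(float)          # sharp k-cutoff

def slope_raw_on_smoothed(raw, smoothed):
    """Slope of the regression of the raw field on the smoothed one (zero mean)."""
    return np.sum(raw * smoothed) / np.sum(smoothed**2)

slope_sharp = slope_raw_on_smoothed(v_raw, smooth(v_raw, w_sharp))
slope_tophat = slope_raw_on_smoothed(v_raw, smooth(v_raw, w_tophat))
```

For the sharp cutoff the residual is orthogonal to the smoothed field and the slope equals unity. For the top-hat window the residual correlates with the smoothed field and the slope departs from unity, which is the calibration issue discussed in this section.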

The scatter will also affect the slope, but as we have seen, the effect is small, since both the forward and inverse regressions in Figure 13 have similar slopes. Given the correlation introduced by the TH smoothing, it is not surprising that the regression slopes in Figure 14 are quite different from those in Figure 13 and from each other. The relevant slope for the SfLM is that of the regression of Vgal on Vrec. For both smoothing widths, this slope is closer to unity than that obtained from the regression of V on Vrec with identical smoothing in Figure 13. Just for comparison, the slope of Vrec on Vgal is 4.22 for Rs = 100 h−1 Mpc, versus 1.43 for the identical smoothing case in the right panel of Figure 13.

Therefore, in order to infer a statistically unbiased β estimate, one should carefully calibrate the result to account for the inherently different smoothing between the models and the data. The same point has also been repeatedly emphasized in regard to velocity–velocity comparisons in local surveys by Nusser & Davis (1995) and Davis et al. (1996, 2011).

5.3. Tests of SfLM with Mock Catalogs

We are now in a position to present the results of our tests of the SfLM applied to mock catalogs and its sensitivity to SFR and environment. This exercise is to be viewed as an intermediate-phase testing of the SfLM toward a detailed forecast and full application to realistic surveys. Therefore, we do not make any attempt to run the test on mock catalogs generated in light cones matching the expected number of galaxies versus redshift. Instead, we simply place the simulation box at z = 1 and apply the SfLM to the simulated sample$^{10}$ (see Section 2.2). For the sake of conciseness, we define βSfLM as the β estimate obtained from the application of the SfLM.

Following the discussion in Section 2.2, since the MDPL2 mocks have substantially smaller volume and fewer objects than the typical corresponding sample obtained from next-generation surveys, our tests will provide upper bounds on the expected errors on β from the SfLM. Furthermore, mocks taken from the same MDPL2 parent simulations cannot be used to quantify cosmic variance. For this purpose, we resort to UM BigMDPL to quantify the uncertainty in βSfLM due to cosmic variance on the scale of the MDPL2. This is done by partitioning the BigMDPL box into cubic subvolumes, each matching the size of the MDPL2 box. The SfLM is then applied to the mock galaxies in each of these subvolumes individually. The scatter in the βSfLM values from all subvolumes serves as an indication of the cosmic variance on the scale of the MDPL2 box and should serve as an upper bound on the cosmic variance on the scale of real, upcoming surveys.
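
The partitioning step can be sketched as follows. The box sides of 2500 and 1000 h−1 Mpc for BigMDPL and MDPL2 are assumptions that are not stated in this section, galaxies outside the whole subvolumes are simply discarded, and the function name is hypothetical.

```python
import numpy as np

def subvolume_index(pos, box_size, sub_size):
    """Assign galaxies to cubic subvolumes of side sub_size.

    Returns the flat subvolume index of each retained galaxy, a mask of
    retained galaxies, and the number of whole subvolumes in the box.
    """
    nper = int(box_size // sub_size)               # whole subvolumes per axis
    idx = np.floor(np.asarray(pos) / sub_size).astype(int)
    inside = np.all(idx < nper, axis=1)            # drop the leftover slab
    flat = np.ravel_multi_index(tuple(idx[inside].T), (nper,) * 3)
    return flat, inside, nper**3

# Three toy galaxies in a box of assumed side 2500 h^-1 Mpc
pos = np.array([[100.0, 100.0, 100.0],
                [1500.0, 100.0, 100.0],
                [2400.0, 0.0, 0.0]])
flat, inside, n_sub = subvolume_index(pos, box_size=2500.0, sub_size=1000.0)

# Given one beta estimate per subvolume, their rms measures the cosmic variance
beta_values = np.array([0.62, 0.58, 0.66, 0.60])   # made-up numbers
cosmic_scatter = beta_values.std()
```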

The "observed" quantities logSFRobs are obtained by shifting the true logSFR by ${ \mathcal V }$, as in Equation (21). We work with the approximate expression in Equation (23) for ${ \mathcal V }$ and write the model ${ \mathcal V }$ as

Equation (35)

where

Equation (36)

with Vrec,1 being the reconstructed peculiar velocity Vrec obtained from the galaxy density field δgal, as a solution to Equation (26) for β = f(Ω). Solutions are provided for a TH smoothing of δgal with widths of Rs = 20 and 100 h−1 Mpc, respectively. In all cases, we impose SFR > 10 M⊙ yr−1 and consider the three Cartesian components of the true raw velocities of galaxies, vgal, to obtain three sets of logSFRobs. Thus, for each mock, we have three values of βSfLM corresponding to the three Cartesian components. In the results referring to MDPL2 mocks, we consider the mean of the three values for each mock. Since the three components of the velocity are statistically independent, we are effectively assessing the SfLM with a number of objects that is triple what we actually have in each mock. The increase in the statistical sampling should yield a more reliable error estimate to be used as an indicator of the precision that can be achieved with ${ \mathcal O }({10}^{7})$ galaxies, as in the full Euclid spectroscopic survey.

The βSfLM values obtained from each set of mocks are summarized in Figure 15 for Rs = 20 (left panel) and 100 (right panel) h−1 Mpc, as indicated in the figure. The labels SAG, SAGE, and Galacticus on the x-axis correspond to the MDPL2 mocks generated using the full galaxy formation machinery in each model. Here UM refers to UM mocks in MDPL2 obtained by resampling the UM SMDPL galaxies, as described in Section 2.1; UM Big is the same as UM but for the BigMDPL simulation; and UM CosVar shows the results obtained from the subvolumes of UM BigMDPL to assess cosmic variance on the scale of MDPL2.

Figure 15. The slope of the regression of logSFRobs on the peculiar velocity estimated using Equation (25) from the galaxy distribution is shown by crosses for various mocks. Except for the last point to the right in each panel, the error bars represent the rms uncertainty due to the scatter in the SFR distribution. The last point, UM CosVar, also includes the cosmic variance expected for a volume of 1 h−1 Gpc. The left and right panels correspond to velocities recovered from the galaxy density field smoothed on 20 and 100 h−1 Mpc TH smoothing, as indicated.

The different symbols in the panels indicate βSfLM values and their scatter obtained with different procedures.

  • 1.  
    Open red circles. These are the β values obtained by a regression of the true galaxy velocities on the model V1. Therefore, these values represent what the SfLM would theoretically yield in the limit of an infinite number of objects and should be regarded as the reference value for the β parameter.
  • 2.  
    Black crosses. These are the βSfLM values obtained from the SfLM method in each mock, averaging over the three Cartesian velocity components. For the UM mocks, they correspond to the average βSfLM obtained from 20 random resamplings of the full UM in SMDPL.
  • 3.  
    Filled red circles. These symbols represent β = f/b using the true f and b obtained via a regression of the galaxy density on the DM density (both smoothed with the same window, e.g., Figure 9). Very similar values are obtained for b estimated from the regression of the smoothed Vrec,1 on the smoothed true velocities (e.g., Figure 13).
  • 4.  
    Blue crosses. These are the SfLM β estimates including environmental dependencies using δ in Equation (37) (see Section 5.4). They are provided for the SAG, SAGE, and Galacticus cases only.
  • 5.  
    Red crosses. These β values are the same as the blue crosses but using δgal in Equation (37) (see Section 5.4).
  • 6.  
    Red vertical bands for all cases except UM CosVar. These represent the 1σ error on βSfLM, computed according to Equation (31). As stated above, this is the error on the mean of the three values obtained for the three Cartesian components or, equivalently, from a single β estimated from triple the number of galaxies in each mock. The bands are centered on the open red circles.
  • 7.  
    Red vertical bands for the UM CosVar case. These represent the rms of the βSfLM values obtained from subvolumes of the simulation BigMDPL. This scatter accounts for the combined error from cosmic variance and the scatter in logSFR.
  • 8.  
    Gray vertical bands. These bands provide a crude estimate of the error expected for a Euclid-like survey, obtained by rescaling the errors shown as red bands. Their estimates account for (a) the number density versus z given in Table 3 in Euclid Collaboration et al. (2019), (b) the monotonic decrease of ${ \mathcal D }$ with z in Equation (24), and (c) the dependence of ${\sigma }_{\mathrm{logSFR}}^{2}$ with z as a result of a change in the limiting SFR for a given observed flux limit.

The error bars (red and gray bands) for Rs = 100 h−1 Mpc (right panel) are roughly a factor of 2 larger than the corresponding errors for Rs = 20 h−1 Mpc. The smaller smoothing clearly captures more details of the structure of the velocity field and is bound to yield a tighter fit to the data. Quantitatively, the relative difference in the errors can be understood from Equation (33) taking σV = 210 and 110 km s−1 for the small and large Rs, respectively. The increase in the error is consistent with the ratio of σV.
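The factor quoted above can be checked with one line of arithmetic. This minimal sketch assumes, as the text implies, that at fixed galaxy number and SFR scatter the error on β scales inversely with σV; the function name is illustrative and not part of the original analysis.

```python
def error_ratio(sigma_v_small_rs, sigma_v_large_rs):
    """Ratio of beta errors (large Rs over small Rs), assuming error ∝ 1/sigma_V."""
    return sigma_v_small_rs / sigma_v_large_rs

# sigma_V values in km/s quoted in the text for Rs = 20 and 100 h^-1 Mpc
ratio = error_ratio(210.0, 110.0)
print(f"expected error ratio: {ratio:.2f}")
```

Under this assumption the ratio is a little below 2, in line with the roughly factor of 2 difference between the bands in the two panels.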

The reason for the differences between the open and filled red circles is explained in Section 5.2. The open red circles are regressions of raw on smoothed velocities that lead to distinct slopes from regressions done with equally smoothed velocities. This accounts for the larger difference for Rs = 100 h−1 Mpc, as well as the fact that the open red circles are always above the filled ones.

Given the red error bars, the βSfLM (black crosses) for SAG, SAGE, and Galacticus are consistent with theoretical values that would be obtained from an infinite number of data points (open red circles). The largest deviation is for SAGE with Rs = 100 h−1 Mpc. We could not identify any specific cause for this ∼2σ deviation. Recovered versus true velocities follow the same correlation as for other models, and it seems that this is just a random statistical fluctuation.

The black crosses and open red circles for the UM mocks in the fourth and fifth sets are very similar. This is expected, since the SfLM β values here are averages over many random resampling realizations of the original UM in SMDPL. Furthermore, the rms of the scatter in β around the average is consistent with the error computed with Equation (31).

The red error bar on the set UM CosVar is comparable to the red error bar for the UM set. This is important, since it implies that the contribution of the cosmic variance to the error budget is small. The main error is due to the finite number of data points.

The 1σ error on β obtained by averaging the gray bands over SAG, SAGE, Galacticus, and UM is σβ = 0.17 and 0.34 for the small and large smoothing widths, respectively. Using the bias factors computed for each mock, the corresponding errors on f are σf = 0.22 and 0.45.
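Since β = f/b, the conversion from σβ to σf at fixed bias is a simple rescaling. The sketch below uses a single representative bias b = 1.3, which is an assumption chosen within the range of values found for the mocks at z = 1; the paper uses the bias factor of each mock, so the numbers only approximately reproduce the quoted ones.

```python
def sigma_f(sigma_beta, bias):
    """Error on f = b * beta at fixed bias b (the uncertainty on b is ignored)."""
    return bias * sigma_beta

B_ASSUMED = 1.3  # hypothetical representative bias, not a value from the paper
for sigma_beta in (0.17, 0.34):
    print(f"sigma_beta = {sigma_beta:.2f} -> sigma_f = {sigma_f(sigma_beta, B_ASSUMED):.2f}")
```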

5.4. Environment in the SfLM

As already pointed out, environmental dependencies should not introduce systematic errors on βSfLM but rather only affect the random uncertainty. To prove this statement, we perform a simultaneous SfLM fitting of β and the parameters c1 and c2 characterizing environmental dependencies (see Equation (5)). Generalizing Equation (28), we write

Equation (37)

The parameters d1 and d2 introduce a dependence on the scatter of logSFR and do not affect the description here. They would, however, enter the analysis if each data point in ${\tilde{\chi }}^{2}$ were assigned a weight according to the expected scatter in logSFR. Even then, the variation of the weighting would be mild at the level of ${\sigma }_{\delta }{d}_{2}/{\sigma }_{\mathrm{logSFR}}^{2}\lt 10 \% $, taking σδ = 0.24 and d2 values from Table 3.

The minimum is obtained for

Equation (38)

Equation (39)

where the Δ symbol implies that the mean value (over the galaxies) has been subtracted. The parameter c1 accounts for the fact that the mean of δ over galaxies is not strictly zero and will not be considered further in our analysis.

In realistic applications, only the galaxy density δgal is accessible to observations; therefore, we repeat the minimization procedure using δgal instead of δ. The blue and red crosses in Figure 15 are the βSfLM values obtained by minimizing Equation (37) with δ and δgal, respectively. The results are shown for SAG, SAGE, and Galacticus only. For the larger smoothing (right panel), the results are very close to the black crosses corresponding to minimizing Equation (28). The results with δgal (red crosses) are well within the statistical error for the smaller smoothing. The same is true for the blue crosses for SAG and SAGE. However, for the small smoothing, the Galacticus estimate obtained with δgal is significantly higher than all other estimates. This is perhaps not surprising because, as mentioned above, the Galacticus galaxies are peculiar, exhibiting inverted environmental dependencies.

Since we have tripled the number of fitting parameters relative to the estimates in the previous sections, we should naively expect that the error on β is substantially increased. Let us examine this issue in detail. Since the ensemble average $\left\langle {\rm{\Delta }}\delta {\rm{\Delta }}{ \mathcal V }\right\rangle =0$, we find that the ensemble averages $\left\langle \beta \right\rangle $ and $\left\langle {c}_{2}\right\rangle $ in Equations (38) and (39) are the same as in the previous section (see Equation (29)). For the same reason, the ensemble average of the error covariance matrix is diagonal. Therefore, it suffices to examine Equation (38) alone to conclude that the added error due to the term with c2 is ${ \mathcal O }(1/{N}_{\mathrm{gal}})$ compared to ${ \mathcal O }(1/\sqrt{{N}_{\mathrm{gal}}})$ for the first term. As a result, we find that the computed errors on β estimated with and without the inclusion of c2 are very close to each other.
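The argument can be illustrated with a toy experiment. The sketch below is not the pipeline used for the mocks: it draws a synthetic velocity regressor and a synthetic density contrast that are statistically independent, builds a fake logSFR shift with assumed values of β, c2, and scatter, and compares the β obtained with and without the environmental term. All names and numbers are hypothetical.

```python
import random

def centered(values):
    """Subtract the sample mean (the Delta operation in the text)."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def fit_beta(y, v, delta=None):
    """Least-squares beta for y = beta*v, or for y = beta*v + c2*delta if delta is given."""
    dy, dv = centered(y), centered(v)
    if delta is None:
        return dot(dv, dy) / dot(dv, dv)
    dd = centered(delta)
    svv, sdd, svd = dot(dv, dv), dot(dd, dd), dot(dv, dd)
    svy, sdy = dot(dv, dy), dot(dd, dy)
    det = svv * sdd - svd * svd          # determinant of the 2x2 normal matrix
    return (svy * sdd - sdy * svd) / det

def make_mock(n, beta, c2, scatter, seed):
    """Synthetic data: velocity term and density contrast are drawn independently."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    delta = [rng.gauss(0.0, 0.24) for _ in range(n)]
    y = [beta * vi + c2 * di + rng.gauss(0.0, scatter) for vi, di in zip(v, delta)]
    return y, v, delta

y, v, delta = make_mock(n=20000, beta=0.7, c2=0.5, scatter=1.0, seed=1)
beta_without = fit_beta(y, v)
beta_with = fit_beta(y, v, delta)
print(f"beta without c2: {beta_without:.3f}, with c2: {beta_with:.3f}")
```

Because the sample covariance between the two regressors is only a small finite-sample fluctuation, the two estimates differ by far less than their statistical error, which is the behavior described above.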

6. Discussion

The effects of the large-scale environment on galaxy properties have been the focus of numerous observational studies (e.g., Bernardi et al. 2005; Hoyle et al. 2005; Park et al. 2007; Disney et al. 2008; Blanton & Moustakas 2009; Tempel et al. 2011; Hahn et al. 2015). Although the environment has a strong impact on the distribution of a single property (e.g., stellar mass), all evidence points toward it playing a meager role in shaping intrinsic relations between galaxy properties such as the fundamental plane and the Tully–Fisher relations (Bernardi et al. 2005; Disney et al. 2008; Nair et al. 2010). The environmental dependence on these scaling relations has not yet been invoked as a strong constraint on galaxy formation models (but see Mo et al. 2004). Part of the reason is that these relations involve observations that are sensitive to the distribution and kinematics of stars that are not properly followed in the SAM–DM simulation combination. Also, hydrodynamical simulations of galaxy formation may not be large enough to quantitatively explore this dependence. There are, however, certain intrinsic relations that involve properties that can be studied with SAM. Among these, the one we touched upon in this work is the mean SFR at a given stellar mass, shown in Figure 8 for galaxies with high SFR (>10 ${M}_{\odot }$ yr−1). Our analysis has revealed important differences among various galaxy formation models. The UM and SAGE models showed little evidence for dependence on environment, while in SAG, the SFR is boosted in less dense environments only for stellar masses above ${10}^{11}\,{M}_{\odot }$. Galacticus galaxies are associated with a higher SFR in less dense environments only for M* ≲ ${10}^{10}\,{M}_{\odot }$.

A central goal of this work is to make an assessment of the SfLM at constraining the parameter β with mock galaxy catalogs that incorporate as many physical effects as possible and focus on star-forming, line-emitting objects. The application of the method at z ≃ 1 relies on the availability of a large number of galaxies in the survey; therefore, we have geared the tests toward next-generation spectroscopic surveys, focusing on the Euclid satellite mission.

The SfLM has been criticized on the grounds that it could be susceptible to environmental dependencies in the SFR/luminosity distribution. The concern was already raised and addressed in previous papers by two of the current authors (Feix et al. 2014, 2015), who argued that environmental dependencies affect only a direct identification of luminosity spatial modulations with the large-scale velocity field. They pointed out that the environment cannot lead to any systematic effect on the estimation of β in terms of fitting a velocity field reconstructed using the spatial distribution of the same galaxies. This is a direct consequence of Galilean invariance, which guarantees a vanishing correlation between peculiar velocity and density field at the same point. We have demonstrated this point in Section 5.4. Environmental dependencies do not bias the β estimate but can potentially contribute to the errors. Our tests demonstrate that this effect is small. Its magnitude can be inferred by comparing the black crosses with the open red circles in Figure 15.

Ideally, the SfLM should be tested with more realistic mock catalogs that mimic the footprint and selection effects of a specific survey. In this work, we have been focusing on the paradigmatic case of the Euclid survey. Unfortunately, the available mock catalogs, though highly valuable to identify and assess environmental dependencies, do not cover a sufficiently large volume to mimic the full Euclid survey. In the absence of publicly available, realistic mock Euclid catalogs, a reasonable compromise is to generate light-cone mock data by stacking the available simulation outputs to cover a sufficiently large volume. We leave this for future work. In this paper, we have estimated the errors by rescaling those obtained from the available mocks at z = 1 (red bands in Figure 15) to the expected number of galaxies in the Euclid survey (gray band errors, same figure). We expect that the real error lies in between the gray and red bands.

It is worth emphasizing that the application of the SfLM to these spectroscopic surveys will be quite straightforward. Clustering analyses have been focusing on detecting and fitting the BAO peak to extract cosmological parameters. Most of these analyses include reconstruction procedures to enhance the signal-to-noise ratio of the BAO peak by tracing galaxy orbits back to an epoch in which linear theory applies (Eisenstein et al. 2007; Padmanabhan et al. 2012). Among these, reconstruction methods based on the cosmological application of the least-action principle (Peebles 1989; Nusser & Branchini 2000; Sarpa et al. 2019) generate as a "by-product" a model peculiar velocity field at the epoch of observation that can be readily used as input to the SfLM method.

In this paper, we have opted to apply the simplest form of the method and avoid performing a full maximum-likelihood estimation (MLE) as in Feix et al. (2015). Theoretically, an MLE analysis would exploit the details of the shape of the SFR/luminosity distribution and not only the second moment, as is done here. We leave the application of the MLE to a future analysis of more realistic galaxy mocks.

The amplitude of the velocity-induced shift is a strong function of redshift. In fact, for a Planck cosmology, the net effect vanishes at z ≈ 1.6 and then picks up again with the opposite sign, dominated by the beaming effect (the V/c term). Therefore, it can be relevant for all redshift surveys, like DESI and Roman-WFIRST, that will target galaxies beyond this redshift.
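The redshift of the sign reversal can be checked numerically. The sketch below is not the authors' calculation. It assumes that the first-order coefficient multiplying V/c is (1 + z)c/[H(z)r(z)] − 1, which follows from dA = r/(1 + z) when the observed redshift is z = zc + (1 + zc)V/c, and it assumes a flat ΛCDM background with Ωm = 0.31 as a stand-in for the Planck parameters. All function names are illustrative.

```python
import math

def e_of_z(z, omega_m):
    """Dimensionless Hubble rate H(z)/H0 for flat LCDM (radiation neglected)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def comoving_distance(z, omega_m, steps=2000):
    """Comoving distance in units of c/H0, by the trapezoidal rule."""
    h = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0, omega_m) + 1.0 / e_of_z(z, omega_m))
    for i in range(1, steps):
        total += 1.0 / e_of_z(i * h, omega_m)
    return total * h

def shift_coefficient(z, omega_m):
    """Assumed coefficient of V/c: distance term (1+z)c/(H r) minus the beaming term 1."""
    return (1.0 + z) / (e_of_z(z, omega_m) * comoving_distance(z, omega_m)) - 1.0

def zero_crossing(omega_m, lo=0.5, hi=3.0, tol=1e-6):
    """Bisection for the redshift where the two competing terms cancel."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shift_coefficient(mid, omega_m) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

OMEGA_M = 0.31  # assumed Planck-like value
z0 = zero_crossing(OMEGA_M)
print(f"competing terms cancel at z = {z0:.2f}")
```

With these assumptions the distance term dominates below the crossing and the beaming term above it, and the crossing falls close to the z ≈ 1.6 quoted in the text.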

Gravitational lensing by the foreground mass distribution causes a magnification/demagnification of galaxies in the survey. For a source at z = 1, the mass distribution at z < 1 causes a 2 × 10−3 shift in the mean logSFR in spheres of radius 20 h−1 Mpc. The amplitude of the effect is close to the environmental dependency estimated from the mocks. It is rather easy to incorporate this effect in the analysis. Suppose the redshift survey spans the redshift range from z = 1 to 2. Let us separate the lensing shift into a contribution from mass distribution at z < 1 and another from mass at z > 1. The foreground matter at z < 1 will cause a shift that varies in a specific way with the distance of galaxies in the survey. Lensing induced by matter at z > 1 is easy to model using the observed galaxy distribution to infer the density field. The spatial coherence of the total lensing contribution will be quite distinct from the shift due to velocities. We therefore expect little covariance between the lensing and peculiar velocity-induced effects, which would make the two contributions easy to discern.

As demonstrated in Figure 12, nonlinear dynamical effects are insignificant, and linear theory is quite satisfactory on the scales of interest. Larger information content is captured by the recovered velocity field for smaller smoothing scales. We have seen that Rs = 20 h−1 Mpc is adequate for linear theory reconstruction and yields much smaller errors than Rs = 100 h−1 Mpc. The only possible concern is whether Vrec smoothed on a ∼20 h−1 Mpc scale is contaminated by a shot-noise contribution when the mean number density is of the order of the one expected in next-generation surveys, i.e., $\bar{n}\sim 5\times {10}^{-4}\,{{h}}^{3}\,{\mathrm{Mpc}}^{-3}$ (Euclid Collaboration et al. 2019). Substituting this value for $\bar{n}$ with Rs = 20 h−1 Mpc in Equation (34) yields ${\sigma }_{V,\mathrm{SN}}^{2}={(93\,\mathrm{km}\,{{\rm{s}}}^{-1})}^{2}$. This is larger than the contribution of biasing stochasticity, ${(51.9\,\mathrm{km}\,{{\rm{s}}}^{-1})}^{2}$, to the variance of Vrec (see Section 5.2) but significantly smaller than the variance of the smoothed velocity at z = 1, ${(210\,\mathrm{km}\,{{\rm{s}}}^{-1})}^{2}$. Therefore, the expected shot-noise contribution will be subdominant in the total error budget that is dominated by the scatter in the SFR.
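The relative size of these contributions is easy to tabulate. The short sketch below adds the shot-noise and stochasticity terms in quadrature, which assumes that the two are statistically independent (an assumption made here for illustration), and compares the result with the variance of the smoothed velocity.

```python
import math

SIGMA_SHOT = 93.0     # km/s, shot noise for nbar ~ 5e-4 h^3 Mpc^-3 and Rs = 20 h^-1 Mpc
SIGMA_STOCH = 51.9    # km/s, biasing stochasticity
SIGMA_SIGNAL = 210.0  # km/s, rms of the smoothed velocity at z = 1

# Quadrature sum of the two noise terms (independence assumed)
sigma_noise = math.hypot(SIGMA_SHOT, SIGMA_STOCH)
variance_fraction = (sigma_noise / SIGMA_SIGNAL) ** 2

print(f"combined noise: {sigma_noise:.1f} km/s")
print(f"noise variance / signal variance: {variance_fraction:.2f}")
```

The combined noise variance comes out at roughly a quarter of the signal variance.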

As we have seen, the SfLM method provides an estimate of the β parameter. It has become customary to express the results on β via the combination σgal,8β = σ8f, thus avoiding the appearance of either b or β in the result. We find this to be inadequate. In the case of RSD on large scales, for example, according to Kaiser (1987), only β can be directly inferred from the ratio of angular moments of the observed galaxy power spectrum independently of any assumptions regarding σ8 and the shape of the DM power spectrum. The combination fσ8 can also be derived in this context, but only by matching the amplitude of the moments to an assumed shape for the DM power spectrum. Once b and β are known, fσ8 can be easily derived. Therefore, since the inference of β depends purely on the relative clustering anisotropy and b mainly on the clustering amplitude, it would be prudent to quote the values separately. Although fσ8 is the amplitude of the velocity, it is not the parameter responsible for RSD in the galaxy distribution. Quoting only fσ8 creates the false impression that it is the primary quantity that governs the RSD phenomenon.

It is instructive to compare the performance of the SfLM in constraining β to other probes. One of these is a comparison of directly observed galaxy peculiar velocities with the velocities predicted from the galaxy distribution in redshift surveys. This is possible for local data (within distances of ∼100 Mpc). Davis et al. (2011) performed such a comparison using the SFI++ velocity catalog and 2MRS gravity. They included a full error analysis and found a 1σ uncertainty in β at the level of δβ ≈ 0.05. A similar uncertainty was found by Pike & Hudson (2005) using a different analysis technique and compilation of data.

The other probe is RSD in galaxy clustering. Let us focus on the z ∼ 1 results, i.e., the same redshift we have been focusing on so far. Contreras et al. (2013) estimated β from the anisotropic two-point correlation function of about 34,000 galaxies in the WiggleZ survey at an effective redshift z = 0.76 with an error σβ in the range 0.11–0.22, depending on the model assumed for the real-space correlation function and the minimum transverse separation considered in the analysis.

Pezzotta et al. (2017) performed an RSD analysis on the final release of the VIPERS galaxy catalog. They quoted a result in terms of fσ8 at z ≈ 0.85 with an error of ∼0.11. Using the results shown in Table 2, this implies an error σβ ∼ 0.13.

More recently, eBOSS Collaboration et al. (2020) presented the final results of the clustering analyses of various extragalactic objects from the completed SDSS. For our comparison, we are interested in the RSD-only analysis of about 170,000 emission line galaxies at the effective redshift z = 0.85 (Tamone et al. 2020). They quoted a consensus result of fσ8 = 0.315 ± 0.095. Taking σ8(z = 0.85) = 0.522 from the Planck ΛCDM, we find that the error on f is ∼0.18. This is actually comparable to the error σf ∼ 0.22 that we predict by applying the SfLM to upcoming spectroscopic samples at z = 1.
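The conversion used in the last step is a division by σ8 at the effective redshift, treating σ8 as exactly known. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def sigma_f_from_fsigma8(sigma_fsigma8, sigma8_at_z):
    """Error on f implied by an error on f*sigma8, for a fixed, assumed sigma8(z)."""
    return sigma_fsigma8 / sigma8_at_z

# eBOSS emission line galaxies: f*sigma8 = 0.315 +/- 0.095, sigma8(z = 0.85) = 0.522
sigma_f_eboss = sigma_f_from_fsigma8(0.095, 0.522)
print(f"sigma_f = {sigma_f_eboss:.2f}")
```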

Next-generation surveys are designed to estimate the growth rate with higher precision. Focusing again on the Euclid case, Amendola et al. (2018) provided forecasts for the growth rate from an RSD-only analysis based, however, on somewhat outdated, optimistic assumptions on the expected number of Euclid galaxies. Their Fisher analysis indicates a relative error on f of about 1% at z = 1 for a flat ΛCDM cosmology (see Table 4 of their paper). More recent forecasts based on updated predictions of Euclid galaxy density have been provided by Euclid Collaboration et al. (2019). They, however, did not consider an RSD-only analysis. Instead, they studied the case of a full shape-clustering analysis that includes both RSD and BAO with no additional information from the cosmic microwave background, low-redshift surveys, and weak lensing. They do not explicitly provide an error on f but rather on the mass density parameter Ωm and the parameters of the dark energy equation of state. From Table 16 in their paper, we conclude that the errors depend to a large extent on the assumed background cosmology. The relative error on f from the combined RSD+BAO ranges from 5% for the standard flat ΛCDM to 21% for a nonflat ΛCDM, smaller than the SfLM case but very sensitive to the cosmological background model.

7. Conclusions

Motivated by the advent of large spectroscopic surveys designed to target line-emitting galaxies, we have investigated the connection between the SFR and the underlying mass density using a suite of publicly available mock catalogs in which galaxy properties, including SFRs and stellar masses, are predicted using different semianalytic recipes for galaxy formation and evolution. The main results of our analysis can be summarized as follows.

  • 1.  
    There are certain general properties that are qualitatively common to all mocks examined in this work: the validity of scale-independent linear bias on scales larger than ∼20 h−1 Mpc, the insensitivity of the bias factor to the SFR for star-forming galaxies with SFRs larger than ∼3 ${M}_{\odot }$ yr−1, and the trend of large stellar masses and quiescent galaxies in denser environments. This is in agreement with previous findings in the literature (e.g., Angulo et al. 2014).
  • 2.  
    Despite the above and the fact that the galaxy formation recipes incorporate the same physical processes, the corresponding mocks also exhibit distinct features that can be tested in next-generation surveys. In general, an important discriminator among models is the extent to which the large-scale environment affects intrinsic relations between galaxy properties. The PDFs of the SFR and stellar mass, their redshift evolution, and their variation with environment depend on the SAM recipe, especially for objects characterized by intense star formation activity and/or large stellar masses. The linear bias parameter versus SFR is model-dependent, especially at low redshifts. In particular, the Galacticus model predicts, unlike all the others, a decreasing SFR as a function of the stellar mass, and for this reason, it should be the easiest to test through observations.
  • 3.  
    We have shown that the SfLM is a viable method to infer the growth rate of density fluctuations. First, environmental effects do not bias the β estimate obtained from the SfLM. This point has been made before (Feix et al. 2014, 2015). Here we have demonstrated it explicitly using mock catalogs for the specific case of star-forming galaxies, implying that the SfLM can be safely applied to future surveys targeting emission line galaxies. Environmental effects do contribute to random errors, though. Second, a point in favor of the SfLM is that these errors are small compared to shot noise and stochasticity. The former dominates the total error budget for the galaxy density expected for the Euclid survey at z ≃ 1. Third, the uncertainties in the β estimate from the SfLM are comparable to those from RSD analyses of existing data sets at z ∼ 1. Moreover, the SFR/luminosity shift upon which the SfLM method relies is rather insensitive to the parameters of the cosmological background. For example, a reasonable deviation from the Planck parameters represented by a flat Λ model with Ωm = 0.2 changes the effect by about 6% at z = 1. As a result, unlike most of the RSD analyses, the SfLM is rather insensitive to the assumed background cosmology and underlying power spectrum. The SfLM constraints on the growth rate are similar across the very different SAMs considered; thus, we conclude that our results are robust and model-independent. For all of these reasons, we believe that the SfLM is an effective method to infer the growth rate.
  • 4.  
    The SfLM method relies on the SFR/luminosity shift induced by peculiar velocities V. There are two competing effects that contribute to this shift. (1) Relativistic beaming reduces the amount of light emitted by a source with positive V; this effect is independent of redshift. (2) A familiar term at low redshift results from taking the distance as cz/H0; this term is redshift-dependent and increases the observed luminosity compared to the true one. For the Planck cosmological parameters, the two effects cancel out at z ≈ 1.6, while the first one becomes dominant at higher redshifts.

We thank the anonymous referee for useful comments. We also thank Carmelita Carbone, Ben Granett, and Lucia Pozzetti for useful comments and discussion. A.N. acknowledges the hospitality of the MIT Kavli Institute for Astrophysics and Space Research, where part of this work has been done. This research was supported by the I-CORE Program of the Planning and Budgeting Committee, the Israel Science Foundation (Grant Nos. 1829/12 and 936/18). G.Y. acknowledges financial support from the Ministerio de Ciencia, Innovación y Universidades/Fondo Europeo de Desarrollo Regional under research grant PGC2018-094975-C21. E.B. is supported by MUIR/PRIN 2017, "From Darklight to Dark Matter: understanding the galaxy-matter connection to measure the Universe"; ASI/INAF agreement no. 2018-23-HH.0, "Scientific activity for Euclid mission, Phase D"; ASI/INAF Agreement No. 2017-14-H.O, "Unveiling Dark Matter and Missing Baryons in the high-energy sky"; and INFN project "INDARK." This research was supported by the Munich Institute for Astro- and Particle Physics (MIAPP), which is funded by the Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) under Germany's Excellence Strategy—EXC-2094—390783311. The CosmoSim database (www.cosmosim.org) used in this paper is a service provided by the Leibniz–Institute for Astrophysics Potsdam (AIP).

Appendix A: Shot Noise

The variance of the shot noise in a TH-smoothed galaxy density field δgal is

$${\sigma }_{\mathrm{SN}}^{2}=\frac{1}{\bar{n}\,{V}_{{\rm{s}}}},\qquad (\mathrm{A1})$$

where ${V}_{{\rm{s}}}=4\pi {R}_{{\rm{s}}}^{3}/3$. For the SAG mocks with the Euclid cut, $\bar{n}=3.29\times {10}^{-3}{\left[{{h}}^{-1}\mathrm{Mpc}\right]}^{-3}$, giving ${\sigma }_{\mathrm{SN}}^{2}=9\times {10}^{-3}$ and 7.2 × 10−5 for Rs = 20 and 100 h−1 Mpc, respectively.
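The two numbers quoted above can be reproduced directly. The sketch below assumes the Poisson form of the shot-noise variance, 1/(n̄Vs), which is the expression that matches the quoted values; the function name is illustrative.

```python
import math

def shot_noise_variance(nbar, r_s):
    """Poisson shot-noise variance of the density contrast in a top-hat sphere.

    nbar: mean number density in (h^-1 Mpc)^-3; r_s: smoothing radius in h^-1 Mpc.
    """
    v_s = 4.0 * math.pi * r_s ** 3 / 3.0
    return 1.0 / (nbar * v_s)

NBAR_SAG = 3.29e-3  # SAG mocks with the Euclid cut
for r_s in (20.0, 100.0):
    print(f"Rs = {r_s:5.0f} h^-1 Mpc: sigma_SN^2 = {shot_noise_variance(NBAR_SAG, r_s):.2e}")
```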

The shot noise in the velocity can be derived as follows (see Strauss et al. 1992). For simplicity, we choose to calculate the shot-noise variance for the velocity recovered at the origin, r = 0. By homogeneity, the result will be valid at any other point. The linear theory relation gives

Equation (A2)

where, in the TH smoothing, each galaxy is represented as a sphere of radius Rs. To estimate the shot noise in this quantity, we consider bootstrap realizations in which each galaxy is replaced by Ni particles, where Ni is a random integer drawn from a Poisson distribution with a mean of unity. The difference between the velocity reconstructed from one of these realizations and v(0) is

Equation (A3)

leading to a 1D variance

Equation (A4)

Transforming the summation into a volume integral using ${\sum }_{i}\to 4\pi \bar{n}\int {r}^{2}{dr}$ yields

Equation (A5)

Appendix B: Regressions

The aim here is to clarify the difference between various regressions. We seek a parameter (slope) p that renders a minimum in the expression

Equation (B1)

where x and y, respectively, have vanishing mean values. This yields the regression slope of yi on xi as

Equation (B2)

with a 1σ uncertainty

Equation (B3)

This implies that the slope of the regression of x on y is

Equation (B4)

  • I.  
    Statistically unbiased estimate of b via density–density regressions: let x = δs be the mass density contrast (smoothed or otherwise) and $y={\delta }_{\mathrm{gal}}^{s}$ the smoothed number density contrast of galaxies. In this case, the regression of y on x yields
    Equation (B5)
    where we have assumed linear biasing ${\delta }_{\mathrm{gal}}^{s}=b{\delta }^{s}+{\epsilon }^{s}$ and that the smoothed random noise term ${\epsilon }^{s}$ satisfies $\left\langle {\epsilon }^{s}{\delta }^{s}\right\rangle =0$. This regression yields a statistically unbiased estimate for b. Now, take $x={\delta }_{\mathrm{gal}}^{s}$ and y = δs. This is the inverse regression to the above, and it gives
    Equation (B6)
    which equals b only in the limit of vanishing ${\sigma }_{{\epsilon }^{s}}$.
  • II.  
    Let x = δ be the unsmoothed mass density contrast, and, as before, $y={\delta }_{\mathrm{gal}}^{s}$. The regression of y on x yields
    Equation (B7)
    Note that for sharp k-cutoff smoothing, $\langle \delta {\delta }^{s}\rangle =\langle {({\delta }^{s})}^{2}\rangle $, since δ − δs is composed of Fourier modes that are entirely independent of δs.
  • III.  
    Consider the regression of y = δ on $x={\delta }_{\mathrm{gal}}^{s}$. Then the slope of this regression is
    Equation (B8)
    Only for ${\epsilon }^{s}=0$ and sharp k-cutoff smoothing does this regression yield 1/b.
  • IV.  
    The form of the SfLM is basically a regression of true velocities on predicted velocities. We identify x with the radial peculiar velocities, Vrec, predicted from the distribution of galaxies with β = f. We write
    Equation (B9)
    where we take the predicted velocity Vrec as obtained from the smoothed galaxy distribution in linear theory with the true value of β. Further, we have assumed that the smoothing is on sufficiently large scales (see Figure 12) such that Vrec differs from the true smoothed velocities ${V}_{\mathrm{gal}}^{s}$ only due to shot noise and scatter in the biasing relation, as represented by the term ${\epsilon }_{V}$. As for y, we take
    Equation (B10)
    where ${\epsilon }_{\mathrm{SFR}}$ represents the scatter due to the spread of the SFR. In this case, the slope is
    Equation (B11)
    Therefore, only if ${\epsilon }_{V}=0$, i.e., no scatter, and Vrec is obtained with k-cutoff smoothing (or without any smoothing), we find p = 1/b. In the application to real catalogs, k-cutoff smoothing is unrealistic, and it is impossible to recover the galaxy velocities without smoothing, but it is rather easy to model the expectation values of velocity products in linear theory. Therefore, one needs to carefully calibrate the results in order to obtain a statistical estimate of β. Fortunately, this is easy to do.
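The difference between the forward and inverse regressions of case I can be seen in a small synthetic experiment. The sketch below draws a Gaussian mass density contrast and a linearly biased galaxy density with uncorrelated noise; the bias, the two rms values, and the sample size are arbitrary assumptions chosen for illustration and are not taken from the mocks.

```python
import random

def slope(x, y):
    """Slope of the regression of y on x for zero-mean samples: sum(x*y)/sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

rng = random.Random(7)
b_true, sigma_delta, sigma_eps, n = 1.3, 0.24, 0.2, 50000  # hypothetical values

delta = [rng.gauss(0.0, sigma_delta) for _ in range(n)]              # mass density contrast
delta_gal = [b_true * d + rng.gauss(0.0, sigma_eps) for d in delta]  # biased galaxies + noise

forward = slope(delta, delta_gal)   # delta_gal on delta: estimates b
inverse = slope(delta_gal, delta)   # delta on delta_gal: below 1/b when noise is present
b_from_inverse = 1.0 / inverse

print(f"forward slope (estimate of b): {forward:.3f}")
print(f"b implied by inverse regression: {b_from_inverse:.3f}")
```

For these assumed values the forward slope recovers b, while the reciprocal of the inverse slope overestimates it by approximately σε²/(bσδ²), the kind of offset that vanishes only in the limit of zero scatter.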

Appendix C: Luminosity Modulation

Neglecting terms proportional to the gravitational potential, we have

Equation (C1)

where c is the speed of light. A galaxy with measured redshift z and an observed flux, F, in units of energy (time)−1 (area)−1 (e.g., erg s−1 cm−2) is assigned a luminosity, Lobs, according to

$${L}_{\mathrm{obs}}=4\pi {d}_{{\rm{L}}}^{2}(z)\,F,\qquad (\mathrm{C2})$$

where dL(z) is the luminosity distance evaluated at redshift z.

Let us now explore how Lobs is related to the true intrinsic luminosity, Lint, of the galaxy. Let the galaxy cover an area A perpendicular to the LOS, and let Iν be the specific intensity of light emitted by this area in units of energy (s)−1 (area)−1 (frequency)−1 (solid angle)−1. We assume a uniform Iν across A. We assume that the galaxy emits at a single frequency with a very narrow line such that

Equation (C3)

where δD is the Dirac δ function. Now, IνdAdΩ is the energy emitted per second per frequency from a small patch dA into a solid angle dΩ. Therefore, we have

Equation (C4)

The observer at redshift zero measures a specific intensity ${I}_{{\nu }_{0}}^{0}$ that is related to Iν by the invariant

Equation (C5)

where

Equation (C6)

is the observed frequency. The flux measured by the observer's detector is

Equation (C7)

where ${{\rm{\Omega }}}_{A}=A/{d}_{{\rm{A}}}^{2}({z}_{{\rm{c}}})$ is the solid angle subtended by the area A. The angular diameter is evaluated at zc, since the area is perpendicular to the LOS; thus, to first order, it is unaffected by the peculiar velocity of the galaxy. Now, using Equations (C3), (C5), and (C6), we obtain

Equation (C8)

This is Tolman's surface brightness law. Therefore,

Equation (C9)

Using Equation (C2) and remembering that ${d}_{{\rm{L}}}(z)={(1+z)}^{2}{d}_{{\rm{A}}}(z)$, we find

Equation (C10)

We have arrived at the peculiar result that the modulation of the luminosity is actually via the angular diameter distance rather than the luminosity distance.

Let us expand the distance ratio to first order in V. We start with

Equation (C11)

where

Equation (C12)

is the comoving distance.

First-order Taylor expansion of dA(z) in the vicinity of z ≈ zc is

Equation (C13)

Equation (C14)

The last line is consistent with previous findings in the literature (Bonvin et al. 2005; Hui & Greene 2006). In the square brackets, V/c reflects relativistic beaming, which goes in the opposite direction of the second term arising from associating the distance with the redshift. To first order, we can replace zc with z in the term in brackets. Therefore, given the velocities (via a model), Lint can be estimated as

Equation (C15)

We can easily include all other relativistic effects related to the gravitational potential. Over the scales considered here, gravitational lensing is dominant. The modification to Equations (C9) and (C10) due to lensing is simple: dA(z) remains the same and given by Equations (C11) and (C12), while

Equation (C16)

where κ is the convergence given in terms of an LOS integral over the density contrast. This reflects the fact that objects appear to occupy larger solid angles for positive κ. Therefore,

Equation (C17)

Let us consider a small patch of the survey at a given redshift, over which κ and V can be assumed constant. Let Fl be the limiting flux of the survey. The minimum threshold observed luminosity for this patch is

Equation (C18)

The threshold intrinsic luminosity is (see Equation (C10))

Equation (C19)

Thus, the actual threshold intrinsic luminosity of galaxies in a given range of observed redshift z in the patch will depend on κ and the LOS velocity, V. The number density will therefore change, and this will affect the variance of $\mathrm{log}{L}_{\mathrm{int}}^{{\rm{e}}}$ estimated from Lobs using Equation (C15), in addition to the modulation of ${L}_{\mathrm{int}}^{{\rm{e}}}$ by V. In the tests provided in this paper, we do not include the effect related to the change in the threshold luminosity.
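To give a sense of scale for the threshold luminosity, the sketch below evaluates the standard unperturbed relation L = 4π dL²(z) Fl, with no κ or V correction; this is not necessarily the exact form used in the equations above. The flat ΛCDM parameters, the flux limit of 2 × 10−16 erg s−1 cm−2, and the function names are illustrative assumptions, not values taken from this appendix.

```python
import math

C_KM_S = 299792.458    # speed of light in km/s
MPC_IN_CM = 3.0857e24  # one megaparsec in centimeters

def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, n=2000):
    """Flat LCDM luminosity distance in Mpc (trapezoidal integration of c/H)."""
    dz = z / n
    inv_h = [
        1.0 / (h0 * math.sqrt(omega_m * (1.0 + i * dz) ** 3 + 1.0 - omega_m))
        for i in range(n + 1)
    ]
    integral = dz * (sum(inv_h) - 0.5 * (inv_h[0] + inv_h[-1]))
    return (1.0 + z) * C_KM_S * integral

def threshold_luminosity(z, flux_limit_cgs, **cosmo):
    """L_min = 4 pi d_L(z)^2 F_l in erg/s, ignoring kappa and V corrections."""
    d_l_cm = luminosity_distance_mpc(z, **cosmo) * MPC_IN_CM
    return 4.0 * math.pi * d_l_cm ** 2 * flux_limit_cgs

FLUX_LIMIT = 2e-16  # erg/s/cm^2, illustrative value only (an assumption)
l_min = threshold_luminosity(1.0, FLUX_LIMIT)
print(f"L_min(z=1) ~ {l_min:.2e} erg/s")
```

Because the threshold scales linearly with Fl and quadratically with the distance, small fractional shifts in the effective distance from κ and V translate directly into fractional shifts of the threshold.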

In the above, it may seem that there is a degeneracy between κ and the velocity corrections. The convergence κ is an LOS integrated quantity and can actually be estimated from the distribution of galaxies assuming a biasing relation.

Since we are considering line luminosities, the k-correction issues that arise when measuring magnitudes in a given band do not apply.

Appendix D: Gravitational Lensing

Gravitational lensing by the foreground mass distribution modifies the observed luminosity/SFR of a galaxy by a multiplicative factor 1 + 2κ, where (Bartelmann & Schneider 2001)

Equation (D1)

where the galaxy is at r and $g(r^{\prime} ,r)=(r^{\prime} -{r^{\prime} }^{2}/r)/a(r^{\prime} )$. Following the derivation in Nusser et al. (2013), we obtain

Equation (D2)

Working with Limber's approximation for the spherical Bessel functions ${j}_{l}({kr})\sim \sqrt{\tfrac{\pi }{l+1/2}}{\delta }^{{\rm{D}}}(l+1/2-{kr})$, the relation becomes

Equation (D3)

The variance of κ is then

Equation (D4)

where Wl represents averaging over a solid angle $\pi {\theta }_{{\rm{s}}}^{2}$, with θs = Rs/r. In the second part of this equation, the effect of this window function is approximated by a sharp l-cutoff at ${l}_{\max }=2r/{R}_{{\rm{s}}}$. For Rs = 20 and 100 ${h}^{-1}$ Mpc, the expression yields, respectively, ${\sigma }_{\kappa }\approx 2\times {10}^{-3}$ and $3\times {10}^{-4}$. This translates into logSFR shifts of $\mathrm{log}(1+2{\sigma }_{\kappa })\approx 1.6\times {10}^{-3}$ and $2.5\times {10}^{-4}$ for the smaller and larger Rs, respectively.
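The conversion from σκ to a logSFR shift is simple arithmetic and can be reproduced as follows. This is a minimal sketch that assumes the logarithm is base 10 and uses the rounded σκ values quoted above; the function name is hypothetical.

```python
import math

def log_sfr_shift(sigma_kappa):
    """Shift in log10(SFR) from a magnification factor 1 + 2*sigma_kappa."""
    return math.log10(1.0 + 2.0 * sigma_kappa)

# Smoothing scale R_s in h^-1 Mpc mapped to the rounded rms convergence.
sigma_by_scale = {20.0: 2e-3, 100.0: 3e-4}

for r_s, sigma in sigma_by_scale.items():
    exact = log_sfr_shift(sigma)
    linear = 2.0 * sigma / math.log(10.0)  # small-kappa approximation
    print(f"R_s = {r_s:5.0f}: exact = {exact:.2e}, linearized = {linear:.2e}")
```

With the rounded inputs, the shifts come out near 1.7 × 10−3 and 2.6 × 10−4, close to the quoted values; the small difference presumably reflects the rounding of σκ. In either case the shifts are small compared with the velocity-induced modulation discussed in this paper.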

Footnotes

  • The quantity ${V}_{{M}_{{\rm{p}}}}$ is available from the Rockstar catalogs for all halos in all simulations.

  • The parameter H disappears from the equation if distances are expressed in kilometers per second.

  • For MDPL2, the volume of (comoving) $1{({h}^{-1}\mathrm{Gpc})}^{3}$ is equivalent to the volume of 5% of the sky between z = 1 and 1.2. For reference, the sky coverage of Euclid is 35% of the sky.
