A MEASUREMENT OF GRAVITATIONAL LENSING OF THE COSMIC MICROWAVE BACKGROUND BY GALAXY CLUSTERS USING DATA FROM THE SOUTH POLE TELESCOPE

E. J. Baxter; R. Keisler; S. Dodelson; K. A. Aird; S. W. Allen; M. L. N. Ashby; M. Bautz; M. Bayliss; B. A. Benson; L. E. Bleem; S. Bocquet; M. Brodwin; J. E. Carlstrom; C. L. Chang; I. Chiu; H-M. Cho; A. Clocchiatti; T. M. Crawford; A. T. Crites; S. Desai; J. P. Dietrich; T. de Haan; M. A. Dobbs; R. J. Foley; W. R. Forman; E. M. George; M. D. Gladders; A. H. Gonzalez; N. W. Halverson; N. L. Harrington; C. Hennig; H. Hoekstra; G. P. Holder; W. L. Holzapfel; Z. Hou; J. D. Hrubes; C. Jones; L. Knox; A. T. Lee; E. M. Leitch; J. Liu; M. Lueker; D. Luong-Van; A. Mantz; D. P. Marrone; M. McDonald; J. J. McMahon; S. S. Meyer; M. Millea; L. M. Mocanu; S. S. Murray; S. Padin; C. Pryke; C. L. Reichardt; A. Rest; J. E. Ruhl; B. R. Saliwanchik; A. Saro; J. T. Sayre; K. K. Schaffer; E. Shirokoff; J. Song; H. G. Spieler; B. Stalder; S. A. Stanford; Z. Staniszewski; A. A. Stark; K. T. Story; A. van Engelen; K. Vanderlinde; J. D. Vieira; A. Vikhlinin; R. Williamson; O. Zahn; A. Zenteno

doi:10.1088/0004-637X/806/2/247

1. INTRODUCTION

Gravitational lensing of the cosmic microwave background (CMB) by large-scale structure (LSS) has recently emerged as a powerful cosmological probe. The first detection of this effect relied on measuring the cross-correlation between CMB lensing maps and radio galaxy counts (Smith et al. 2007). Subsequent studies have correlated CMB lensing maps with several different galaxy populations (e.g., Hirata et al. 2008; Bleem et al. 2012; Planck Collaboration et al. 2014b), quasars (e.g., Hirata et al. 2008; Sherwin et al. 2012; Planck Collaboration et al. 2014b), and maps of the cosmic infrared background (CIB; Holder et al. 2013; Planck Collaboration et al. 2014c), to give just a few examples. These measurements of the correlation between CMB lensing and intervening structure have used massive objects as effectively point-like tracers of LSS and have thus been sensitive to the clustering of the dark matter halos these objects inhabit. In the context of the halo model, this clustering signal is the "two-halo term" (for a review of the halo model see Cooray & Sheth 2002).

The lensing of the CMB due to the galaxies or clusters themselves is sensitive to the structure of the individual halos, i.e., the "one-halo" term. Madhavacheril et al. (2014) have recently reported a measurement of the lensing of the CMB by dark matter halos with masses M ∼ 10¹³ M_⊙ using CMB data from the Atacama Cosmology Telescope Polarimeter stacked on the locations of roughly 12,000 CMASS galaxies from the SDSS-III/BOSS survey. Galaxy clusters, with halo masses M ≳ 10¹⁴ M_⊙, offer another promising target for measuring lensing of the CMB by individual halos.

Seljak & Zaldarriaga (2000) showed that lensing by galaxy clusters induces a dipole-like distortion in the CMB that is proportional to and aligned with the CMB gradient behind the cluster. Consider a galaxy cluster lying along the line of sight to a pure gradient in the CMB. Photon trajectories on either side of the cluster are bent toward the cluster, causing these photons to appear to have originated farther away from the cluster. The net result is that the CMB temperature appears decreased on the hot side of the cluster and increased on the opposite side. In the absence of a CMB temperature gradient behind the cluster, gravitational lensing does not lead to a measurable distortion (this can be seen as a consequence of the fact that gravitational lensing conserves surface brightness). The magnitude of the CMB cluster lensing distortion is therefore sensitive to the mass distribution of the cluster, its redshift, and also the pattern of the CMB on the last scattering surface in the direction of the cluster. For a typical CMB gradient of 13 μK arcmin⁻¹ and a cluster with mass M ∼ 10¹⁵ M_⊙ located at z ∼ 1 (a high mass, high redshift cluster), the lensing distortion in the CMB peaks at ∼10 μK roughly 1 arcmin from the cluster center.

Current CMB experiments do not have the sensitivity to obtain high significance detections of the lensing effect around single clusters. To detect this effect, then, we must combine the constraints from many clusters to increase the signal-to-noise. Since the lensing distortion induced by a cluster is sensitive to the mass of the cluster, the combined lensing constraint can be translated into a constraint on the weighted average of the cluster masses in the sample. For the time being, CMB lensing constraints on cluster mass are unlikely to be competitive with other means of measuring cluster masses, such as lensing of the light from background galaxies (e.g., Johnston et al. 2007; Okabe et al. 2010; High et al. 2012; Hoekstra et al. 2012; von der Linden et al. 2012). Still, such measurements provide a useful cross-check on other techniques for measuring cluster mass because they are sensitive to different sources of systematic error. Future CMB experiments with higher sensitivity will dramatically improve the signal-to-noise of CMB cluster lensing measurements. If sources of systematic error can be controlled, high signal-to-noise measurements of CMB cluster lensing can provide cosmologically useful cluster mass constraints, especially at z ≳ 1 (Lewis & King 2006). Furthermore, if both CMB lensing and galaxy lensing constraints can be obtained on a set of clusters, these measurements can be combined to yield interesting constraints on e.g., dark energy (Hu et al. 2007b).

Several authors have considered the detectability of the effect and how well CMB cluster lensing can constrain cluster masses (e.g., Seljak & Zaldarriaga 2000; Holder & Kosowsky 2004; Dodelson 2004; Vale et al. 2004; Lewis & King 2006; Lewis & Challinor 2006). Various approaches to extract the signal have also been investigated: Seljak & Zaldarriaga (2000) and Vale et al. (2004) considered fitting out the gradient in the CMB to extract the cluster signal; Holder & Kosowsky (2004) considered an approach based on Wiener filtering; Lewis & Challinor (2006) and Yoo & Zaldarriaga (2008) developed a maximum likelihood approach; and Hu et al. (2007a) and Melin & Bartlett (2014) considered approaches based on the optimal quadratic estimator of Hu (2001) and Hu & Okamoto (2002). Many of these techniques rely on a separation of scales inherent to the problem: the distortions caused by cluster lensing are a few arcminutes in angular size, while the primordial CMB has little structure on these scales as a result of diffusion damping. This simple picture is complicated by the fact that instrumental noise and foreground emission may lead to arcminute size structure in the observed temperature field. Furthermore, any method to extract the CMB cluster lensing signal must be robust to contamination from the thermal and kinematic Sunyaev–Zel'dovich (SZ) effects (Sunyaev & Zel'dovich 1972, 1980), as well as other foregrounds.

In this paper we present a 3.1σ measurement of the arcminute scale gravitational lensing of the CMB by galaxy clusters using data from the full 2500 deg² South Pole Telescope (SPT)-SZ survey (e.g., Story et al. 2013). We develop a maximum likelihood approach to extract the CMB cluster lensing signal based on a model for the lensing-induced distortion. Our approach differs somewhat from those mentioned above in that it is inherently parametric: we directly constrain the parameters of an assumed mass profile rather than generating a map of the lensing mass. The method is validated via application to mock data and is then applied to observations of the CMB around 513 clusters identified in the SPT-SZ survey via their SZ effect signature (Bleem et al. 2015). The mass constraints from each cluster are combined to constrain the weighted average of the cluster masses in our sample. As a null test, we also analyze many sets of off-cluster observations and find no significant detection.

The paper is organized as follows: in Section 2 we describe the data set used in this work and in Section 3 we develop a maximum likelihood approach to extract the CMB cluster lensing signal from this data set. The results of our analysis applied to mock data and our estimation of systematic effects are presented in Section 4. The analysis is applied to SPT data in Section 5, and conclusions are given in Section 6.

2. DATA

2.1. CMB Data

The data used in this work were collected with the South Pole Telescope (SPT; Carlstrom et al. 2011) as part of the SPT-SZ survey. The SPT-SZ survey covered roughly 2500 deg² of the southern sky to an approximate depth of 40, 18, and 80 μK arcmin in frequency bands centered at 95, 150, and 220 GHz, respectively. The SPT-SZ maps used in this analysis are identical to those described in George et al. (2014). The maps are projected using the oblique Lambert azimuthal equal-area projection and are divided into square pixels measuring 0.5 arcmin on a side.

The 2500 deg² SPT-SZ survey area was subdivided into 19 contiguous fields, each of which was observed to full survey depth before moving on to the next. The fields were observed using a sequence of left-going and right-going scans. Each pair of scans is at a constant elevation, and the elevation is increased in a discrete step between pairs. Denoting left-going and right-going scans as L and R, the sky map is the sum $\frac{1}{2}(L+R)$ of maps generated from these two scan directions. The difference map formed via the combination $\frac{1}{2}(L-R)$ should have no sky signal and can be used as a statistically representative estimate of the instrumental and atmospheric noise (henceforth, we will sometimes refer to these two noise sources simply as "instrumental noise," since the distinction is irrelevant for our purposes). Because the observing strategy varies somewhat between different fields, so does the level of instrumental noise. Below, we will estimate the instrumental noise levels in a field-dependent fashion. More detailed descriptions of the SPT observation strategy may be found in George et al. (2014) and references therein.
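The sum and difference combinations described above can be illustrated with a minimal sketch. The pixel count, signal amplitude, and per-direction noise level below are hypothetical toy values, not SPT quantities; the point is only that $\frac{1}{2}(L-R)$ cancels the sky signal while retaining noise with the same rms as the noise in $\frac{1}{2}(L+R)$, assuming the noise in the two scan directions is independent.

```python
import random

def sum_and_difference(left, right):
    """Return ((L+R)/2, (L-R)/2) pixel by pixel."""
    sky = [0.5 * (l + r) for l, r in zip(left, right)]
    diff = [0.5 * (l - r) for l, r in zip(left, right)]
    return sky, diff

def rms(values):
    return (sum(v * v for v in values) / len(values)) ** 0.5

rng = random.Random(0)
n_pix = 2000
sigma_scan = 30.0  # hypothetical noise per scan direction (uK)
signal = [rng.gauss(0.0, 50.0) for _ in range(n_pix)]  # hypothetical sky signal (uK)

# Each scan direction sees the same sky plus independent noise (an assumption).
left = [s + rng.gauss(0.0, sigma_scan) for s in signal]
right = [s + rng.gauss(0.0, sigma_scan) for s in signal]

sky_map, diff_map = sum_and_difference(left, right)

# Noise actually present in the sum map, for comparison with the difference map.
noise_in_sky = [m - s for m, s in zip(sky_map, signal)]
print("rms of difference map:", rms(diff_map))
print("rms of noise in sum map:", rms(noise_in_sky))
```

Both printed values should be close to `sigma_scan` divided by the square root of two, which is why the difference map can stand in for the noise in the sky map.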

Each sky map used in this work is the sum of signal from the sky and instrumental noise. The signal contribution to the maps can be expressed as the convolution of the true sky with an instrumental-plus-analysis response function. The response function characterizes how astrophysical objects would appear in the SPT-SZ maps and consists of two components: a "beam function" that accounts for the SPT beam shape, and a "transfer function" that accounts for the time-stream filtering of the SPT data. As with the instrumental noise, variations in the observation strategy between different fields cause the transfer function of the maps to also vary between fields. The characterizations of the SPT transfer and beam functions are described in George et al. (2014) and references therein. We treat the transfer function in a field-dependent fashion below. In Section 3 we use the measured beam and transfer functions to fit for the CMB cluster lensing signal in the SPT-SZ data.

2.2. tSZ-free Maps

The SZ effect is the distortion of the CMB induced by inverse-Compton scattering of CMB photons and energetic electrons (for a review see Birkinshaw 1999). This effect is especially pronounced in the directions of massive galaxy clusters as these objects are reservoirs of hot, ionized gas. The SZ effect from clusters can be divided into two parts: the thermal SZ effect (tSZ) and the kinematic SZ effect (kSZ). The tSZ effect is due to inverse-Compton scattering of CMB photons with hot intra-cluster electrons. The effect has a distinct spectral signature that makes a cluster appear as a cold spot in the CMB at low frequencies and a hot spot at high frequencies, with a null at 217 GHz. If the cluster also has a peculiar velocity relative to the CMB rest frame, the CMB will appear anisotropic to the cluster, and an additional Doppler shift will be imprinted on the scattered CMB photons. This distortion, known as the kSZ effect, is frequency independent when expressed as a brightness temperature fluctuation.

The magnitude of the tSZ effect around galaxy clusters can be significantly greater than the magnitude of the CMB cluster lensing signal. A cluster with mass M ∼ 5 × 10¹⁴ M_⊙ introduces a tSZ signal of roughly −400 μK (as compared to roughly 5 μK from lensing) at the cluster center when observed at 150 GHz. Introducing this level of SZ contamination into our mock analysis (see Section 3.6) biases the lensing mass constraints to such an extreme degree that we lose the ability to measure CMB cluster lensing. Eliminating the tSZ is therefore essential to our analysis.

We exploit the frequency dependence of the tSZ to remove it from our data. Since SPT observes at 95, 150, and 220 GHz, we form a linear combination of the data at these three frequencies that nulls the tSZ effect, but preserves the CMB signal. This tSZ-free linear combination is created as follows. First, all three maps are smoothed to the resolution of the 95 GHz map since that map has the lowest angular resolution (∼1.6 arcmin). Next, a linear combination of the 95 and 150 GHz maps that cancels the tSZ while preserving the primordial CMB is generated. Lastly, this linear combination map is added to the 220 GHz map (which is assumed to be tSZ-free since 220 GHz corresponds roughly to the null in the tSZ) with inverse variance weighting to minimize the noise in the final map. We note that this last step, the combination of the 95/150 GHz linear combination data with the 220 GHz data, could benefit from an optimal weighting of the two data sets as a function of angular multipole. The analysis presented here effectively uses a different, sub-optimal weighting. We also ignore relativistic corrections to the tSZ spectrum (Itoh et al. 1998), which negligibly affect the construction of the tSZ-free linear combination.

The noise level of the resulting tSZ-free map is roughly 55 μK arcmin, significantly higher than the 18 μK arcmin noise in the 150 GHz data: we have sacrificed statistical sensitivity to remove the tSZ-induced bias. We use only this tSZ-free linear combination in the analysis presented here. Because the kSZ is not frequency dependent, it is not eliminated with this approach; we will return to its effects in Section 4.3.1.
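As a rough cross-check of the noise level quoted above, the sketch below solves for the two-band weights that null the tSZ while preserving the CMB, then combines the result with the 220 GHz band using inverse variance weights. It is an illustration under simplifying assumptions, not the pipeline used in the analysis: it uses the non-relativistic tSZ spectrum $x\coth(x/2)-4$ evaluated at the nominal band centers, ignores band-pass integration and the smoothing to the 95 GHz resolution, and treats the survey depths quoted in Section 2.1 as white noise.

```python
import math

H_OVER_K = 4.799243e-11  # Planck constant over Boltzmann constant (K s)
T_CMB = 2.7255           # CMB temperature (K)

def tsz_spectrum(nu_ghz):
    """Non-relativistic tSZ spectral function x*coth(x/2) - 4 (CMB temperature units)."""
    x = H_OVER_K * nu_ghz * 1e9 / T_CMB
    return x / math.tanh(0.5 * x) - 4.0

def tsz_free_weights(nu_a, nu_b):
    """Weights with w_a + w_b = 1 (unit CMB response) and zero tSZ response."""
    f_a, f_b = tsz_spectrum(nu_a), tsz_spectrum(nu_b)
    w_a = f_b / (f_b - f_a)
    w_b = -f_a / (f_b - f_a)
    return w_a, w_b

w95, w150 = tsz_free_weights(95.0, 150.0)

# Survey depths in uK arcmin as quoted in Section 2.1.
noise_95, noise_150, noise_220 = 40.0, 18.0, 80.0

# Noise of the 95/150 GHz tSZ-free combination (independent bands assumed).
noise_lc = math.hypot(w95 * noise_95, w150 * noise_150)

# Inverse-variance combination with the 220 GHz map.
noise_final = (noise_lc ** -2 + noise_220 ** -2) ** -0.5

print("weights (95, 150 GHz):", w95, w150)
print("tSZ-free noise estimate (uK arcmin):", noise_final)
```

Under these assumptions the estimate lands close to the roughly 55 μK arcmin quoted above. The 95 GHz weight is negative and the 150 GHz weight exceeds unity, which is the origin of the noise penalty.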

2.3. Galaxy Cluster Catalog

The galaxy clusters used in this analysis were selected via their tSZ signatures in the 2500 deg² SPT-SZ survey as described in Bleem et al. (2015). We select all clusters with signal-to-noise ξ > 4.5 and with measured optical redshifts, resulting in 513 clusters. The clusters analyzed in this work have a median redshift of z = 0.55, and 95% of the clusters lie in the 0.14 < z < 1.25 redshift interval. Bleem et al. (2015) derived cluster mass estimates for this sample using a scaling relation between M₅₀₀ and the SZ detection significance. As described there, the calibration of this scaling relationship is somewhat sensitive to the assumed cosmology: adopting the best-fit ΛCDM model from Reichardt et al. (2013) lowers the cluster mass estimates by 8% on average, while adopting the best-fit parameters from WMAP9 (Hinshaw et al. 2013) or Planck (Planck Collaboration et al. 2014a) increases the cluster mass estimates by 4% and 17%, respectively. For the cosmological parameters adopted in Bleem et al. (2015; flat ΛCDM with Ω_m = 0.3, h = 0.7, σ₈ = 0.8), the median SZ-derived mass of the cluster sample is M₅₀₀ = 3.6 × 10¹⁴ M_⊙ and 95% of the clusters lie in the range 2.5 × 10¹⁴ M_⊙ < M₅₀₀ < 9.6 × 10¹⁴ M_⊙. We make use of these mass estimates to generate mock data in Section 3.6, and in Section 5 we compare these SZ-derived masses to the cluster masses derived from our measurement of CMB cluster lensing.

2.4. Map Cutouts and the Noise Mask

The lensing analysis presented here is performed on "cutouts" from the tSZ-free maps. Each cutout measures 5.5 arcmin on a side. These cutouts are centered on the galaxy clusters' positions determined in Bleem et al. (2015), and we refer to these as "on-cluster" cutouts.

For the purposes of null tests (i.e., confirming that we observe no signal when no CMB cluster lensing is occurring), we have produced many sets of "off-cluster" cutouts centered on random positions in the maps. To ensure that these off-cluster cutouts have noise properties representative of the on-cluster cutouts, we draw these random points from a sub-region of the map that we refer to as the "noise mask," defined as follows. First, for each field we define the weight map, w, which is approximately proportional to the inverse variance of the instrumental noise at each position in the map. Given the weight map of a particular field, we select positions that have weights between 0.95 w_min and 1.05 w_max, where w_min and w_max are the minimum and maximum weights at all cluster locations in the field, respectively. Finally, we exclude from the noise mask any portion of the map that is within 10 arcmin of an identified point source or cluster. The point source catalog used for this purpose is taken from George et al. (2014) and includes all point sources detected at greater than 5σ (∼6.4 mJy at 150 GHz). For each cluster, we randomly draw 50 off-cluster cutouts from the noise mask region of the field in which the cluster resides. This procedure gives us 50 sets of 513 off-cluster cutouts that have the same noise properties as our 513 on-cluster cutouts. To be robust, our lensing analysis should not detect any cluster lensing on these off-cluster cutouts, and we confirm this fact explicitly below.
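The noise mask construction described above can be sketched as follows. This is a toy illustration, not the code used in the analysis: the function names, the small weight map, and the cluster and source positions are hypothetical, and distances are measured in a flat pixel grid with 0.5 arcmin pixels.

```python
import math
import random

def build_noise_mask(weights, cluster_pix, source_pix,
                     pix_arcmin=0.5, exclude_arcmin=10.0):
    """Boolean map of pixels eligible as off-cluster cutout centers.

    weights: 2D list of map weights; cluster_pix, source_pix: lists of (row, col).
    """
    w_at_clusters = [weights[r][c] for r, c in cluster_pix]
    lo = 0.95 * min(w_at_clusters)
    hi = 1.05 * max(w_at_clusters)
    radius_pix = exclude_arcmin / pix_arcmin
    avoid = list(cluster_pix) + list(source_pix)
    mask = []
    for r, row in enumerate(weights):
        mask_row = []
        for c, w in enumerate(row):
            ok = lo <= w <= hi
            if ok:
                # Exclude pixels near any identified cluster or point source.
                ok = all(math.hypot(r - rr, c - cc) > radius_pix for rr, cc in avoid)
            mask_row.append(ok)
        mask.append(mask_row)
    return mask

def draw_off_cluster_centers(mask, n_draws, seed=0):
    """Randomly draw cutout centers from the eligible pixels."""
    rng = random.Random(seed)
    eligible = [(r, c) for r, row in enumerate(mask) for c, ok in enumerate(row) if ok]
    return [rng.choice(eligible) for _ in range(n_draws)]

# Hypothetical 100 x 100 pixel field whose weight varies smoothly with column.
weights = [[1.0 + 0.002 * c for c in range(100)] for _ in range(100)]
clusters = [(50, 50)]
sources = [(10, 10)]

mask = build_noise_mask(weights, clusters, sources)
centers = draw_off_cluster_centers(mask, n_draws=50)
```

Drawing 50 centers per cluster from the mask of the field in which that cluster resides reproduces the structure of the 50 off-cluster sets described above.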

3. ANALYSIS

We have developed a maximum likelihood technique for constraining the CMB cluster lensing signal. This approach relies on computing the full pixel-space likelihood of the data given a model for the lensing deflection angles sourced by a cluster. The likelihood function extracts all the information contained in the data about the model parameters.

The unlensed CMB is known to be very close to a Gaussian random field (e.g., Planck Collaboration et al. 2014d). As such, the likelihood of observing a particular set of pixelized temperature values, ${\boldsymbol{d}}$ , can be computed given a model for the covariance between these pixels, ${\boldsymbol{C}}$ . The Gaussian likelihood is:

$\begin{eqnarray}&&{\mathcal{L}}({\boldsymbol{C}}| {\boldsymbol{d}})=\displaystyle \frac{1}{\sqrt{{(2\pi )}^{{N}_{\mathrm{pix}}}\mathrm{det}{\boldsymbol{C}}}}\mathrm{exp}\left[-\displaystyle \frac{1}{2}{{\boldsymbol{d}}}^{T}{{\boldsymbol{C}}}^{-1}{\boldsymbol{d}}\right],\end{eqnarray} \tag{ 1 }$

where N_pix is the number of pixels in ${\boldsymbol{d}}$ . Our model for the data includes contributions from three sources:

$\begin{eqnarray}&&{\boldsymbol{C}}={{\boldsymbol{C}}}_{\mathrm{CMB}}+{{\boldsymbol{C}}}_{\mathrm{foregrounds}}+{{\boldsymbol{C}}}_{\mathrm{noise}},\end{eqnarray} \tag{ 2 }$

where ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ is the covariance due to the CMB, ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ is the covariance due to signals on the sky that are not CMB, and ${{\boldsymbol{C}}}_{\mathrm{noise}}$ is the covariance due to instrumental noise. In Equation (1) we have defined the data vector to be the deviation from the mean CMB temperature so that $\langle {\boldsymbol{d}}\rangle =0$ .
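For concreteness, the logarithm of the likelihood in Equation (1) can be evaluated with a Cholesky factorization, which yields both the determinant and the quadratic form without forming ${{\boldsymbol{C}}}^{-1}$ explicitly. The sketch below is a minimal pure-Python illustration with hypothetical function names; a production analysis would use an optimized linear algebra library.

```python
import math

def cholesky(c):
    """Lower-triangular L with L L^T = c for a symmetric positive definite matrix."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def gaussian_log_likelihood(d, c):
    """ln L for a zero-mean Gaussian data vector d with covariance c."""
    n = len(d)
    low = cholesky(c)
    # Forward substitution: solve L y = d, so that d^T C^-1 d = y^T y.
    y = []
    for i in range(n):
        y.append((d[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    chi2 = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (chi2 + log_det + n * math.log(2.0 * math.pi))

# Two correlated pixels as a toy example.
example = gaussian_log_likelihood([1.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
```

Working with the logarithm avoids numerical underflow when N_pix is large.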

We model the foreground and noise covariances as Gaussian. The dominant foreground in our measurement is due to the CIB. Although non-Gaussianity is present in the CIB (Crawford et al. 2014; Planck Collaboration et al. 2014e), the level of non-Gaussianity is small. For example, Crawford et al. (2014) measured the bispectrum of the 220 GHz CIB Poisson term to be B ∼ 1.7 × 10⁻¹⁰ μK³. This contributes approximately $B^{2/3}$ = 3.1 × 10⁻⁷ μK² to the power spectrum, which is only ∼1% of the 220 GHz CIB Poisson power spectrum measured by George et al. (2014), C = 4.6 × 10⁻⁵ μK².

3.1. The Lensed CMB Covariance Matrix

Gravitational lensing is a surface brightness-preserving remapping of the unlensed CMB. This means that a photon that is observed at direction $\hat{n}$ originated from the direction ${\hat{n}}_{\mathrm{unlensed}}=\hat{n}+{\boldsymbol{\delta }}(\hat{n})$ , where ${\boldsymbol{\delta }}(\hat{n})$ is the gravitational lensing deflection field. Lensing thus changes the covariance structure of ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ .⁴⁵ Since the cluster position is uncorrelated with the CMB temperature, the mean of the data will remain zero. In principle, ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ can also change as a result of gravitational lensing if, for instance, some of the foreground emission is sourced from behind the cluster. This issue warrants careful consideration and we will return to it in more detail below. ${{\boldsymbol{C}}}_{\mathrm{noise}}$ is, of course, unaffected by gravitational lensing since it is not cosmological.

Because we are interested in the behavior of the CMB on small angular scales comparable to the sizes of galaxy clusters, a flat sky approximation is appropriate here and we can replace $\hat{n}$ with the planar ${\boldsymbol{x}}$. The calculation of the lensed CMB covariance matrix, ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ (M), for a cluster of mass M then proceeds exactly as in the unlensed case (e.g., Dodelson 2003), except ${\boldsymbol{x}}$ must be replaced with ${{\boldsymbol{x}}}_{\mathrm{unlensed}}={\boldsymbol{x}}+{{\boldsymbol{\delta }}}^{M}({\boldsymbol{x}})$ (the superscript M here is used to indicate that the deflection field is a function of the cluster mass). We find that the elements of the lensed covariance matrix can be written as

$\begin{eqnarray}{{\boldsymbol{C}}}_{\mathrm{CMB},ij}(M) & = & \displaystyle \int {d}^{2}x\displaystyle \int {d}^{2}{x}^{\prime }\;{B}_{i}({\boldsymbol{x}}){B}_{j}({{\boldsymbol{x}}}^{\prime })\\ & & \times g({\boldsymbol{x}}+{{\boldsymbol{\delta }}}^{M}({\boldsymbol{x}}),{{\boldsymbol{x}}}^{\prime }+{{\boldsymbol{\delta }}}^{M}({{\boldsymbol{x}}}^{\prime })),\end{eqnarray} \tag{ 3 }$

where

$\begin{eqnarray}&&g({\boldsymbol{x}}+{{\boldsymbol{\delta }}}^{M}({\boldsymbol{x}}),{{\boldsymbol{x}}}^{\prime }+{{\boldsymbol{\delta }}}^{M}({{\boldsymbol{x}}}^{\prime }))\\ &&\qquad \approx \displaystyle \sum _{l}{C}_{l}\displaystyle \frac{(2l+1)}{4\pi }{J}_{0}\left(l|({\boldsymbol{x}}+{{\boldsymbol{\delta }}}^{M}({\boldsymbol{x}}))\right.\\ &&\qquad \quad \left.-({{\boldsymbol{x}}}^{\prime }+{{\boldsymbol{\delta }}}^{M}({{\boldsymbol{x}}}^{\prime }))|\right),\end{eqnarray} \tag{ 4 }$

and J₀ is the zeroth order Bessel function of the first kind. Here, ${B}_{i}({\boldsymbol{x}})$ is the pixelized beam and transfer function for pixel i; i.e., given a true sky signal $f({\boldsymbol{x}})$, a noiseless experiment would measure a signal in pixel i equal to ${s}_{i}=\int {d}^{2}x\,{B}_{i}({\boldsymbol{x}})f({\boldsymbol{x}})$. For ease of notation, we lump the telescope beam and transfer functions into a single object; in reality, these two functions are sourced by very different mechanisms as was discussed in Section 2. C_l is the power spectrum of the CMB, which we obtain from CAMB⁴⁶ (Lewis et al. 2000; Howlett et al. 2012) using the best-fit WMAP7+SPT cosmology from Story et al. (2013). Here we use the lensed CMB power spectrum to account for the LSS present at redshifts below and above the cluster redshift.⁴⁷
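The role of the deflection field in Equation (4) can be made concrete with a small numerical sketch. Everything specific below is an assumption made for illustration: `toy_cl` is a hypothetical damped spectrum in arbitrary units standing in for the CAMB power spectrum, `toy_deflect` is a hypothetical deflection of fixed magnitude pointing toward a lens at the origin, and the beam and transfer function integrals of Equation (3) are omitted. J₀ is evaluated from its integral representation so that the example needs only the standard library.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arcminute

def bessel_j0(x, n=64):
    """J0(x) = (1/pi) int_0^pi cos(x sin t) dt, midpoint rule (fine for |x| well below n)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n

def toy_cl(l):
    """Hypothetical damped power spectrum (arbitrary units), not the CAMB spectrum."""
    return math.exp(-(l / 1500.0) ** 2) / (l * (l + 1.0))

def g(sep_rad, l_max=3000):
    """Flat-sky correlation function of Equation (4) at angular separation sep_rad."""
    return sum(toy_cl(l) * (2 * l + 1) / (4.0 * math.pi) * bessel_j0(l * sep_rad)
               for l in range(2, l_max + 1))

def lensed_g(x, xp, deflect):
    """Correlation between observed positions x and xp, evaluated at the deflected positions."""
    dx, dy = deflect(x)
    dxp, dyp = deflect(xp)
    sep = math.hypot((x[0] + dx) - (xp[0] + dxp), (x[1] + dy) - (xp[1] + dyp))
    return g(sep)

def toy_deflect(pos, amp=0.3 * ARCMIN):
    """Hypothetical deflection of fixed magnitude pointing toward a lens at the origin."""
    r = math.hypot(pos[0], pos[1])
    if r == 0.0:
        return (0.0, 0.0)
    return (-amp * pos[0] / r, -amp * pos[1] / r)

# Two points 1 arcmin either side of the lens.
a = (1.0 * ARCMIN, 0.0)
b = (-1.0 * ARCMIN, 0.0)
g_unlensed = lensed_g(a, b, lambda p: (0.0, 0.0))
g_lensed = lensed_g(a, b, toy_deflect)
```

Because the two observed points map back to unlensed positions that are closer together, the lensed correlation exceeds the unlensed one; this change in covariance structure is what the likelihood is sensitive to.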

3.2. The Deflection Angle Template

The lensed CMB covariance matrix can be computed from Equation (3) given a model for the deflection field sourced by the cluster. The deflection field can in turn be computed from a model for the cluster mass distribution if the cluster redshift is known. In this analysis, we assume a Navarro–Frenk–White (NFW) profile for the cluster mass distribution, parameterized in terms of M₂₀₀ and the concentration, c (Navarro et al. 1996). Written in this way, the NFW profile is

$\begin{eqnarray}&&\rho (r)=\displaystyle \frac{(200/3){c}^{3}}{\mathrm{ln}(1+c)-\frac{c}{1+c}}\displaystyle \frac{{\rho }_{\mathrm{crit}}(z)}{\left(\frac{{rc}}{{r}_{200}}\right){\left(1+\frac{{rc}}{{r}_{200}}\right)}^{2}},\end{eqnarray} \tag{ 5 }$

where $\rho (r)$ is the mass density a distance r from the center of the cluster; ${\rho }_{\mathrm{crit}}(z)=3{H}^{2}(z)/(8\pi G)$ is the critical density for closure of the Universe at redshift z; and r₂₀₀ is defined to be the radius at which the mean enclosed density is 200ρ_crit(z). The mass enclosed within this radius is ${M}_{200}=(800\pi /3){\rho }_{\mathrm{crit}}(z){r}_{200}^{3}$ . Henceforth, when referring to the cluster mass we will use M₂₀₀ rather than the more generic M. The concentration parameter, c, controls how centrally concentrated the density profile is, with higher values of c resulting in a more centrally peaked mass distribution. Simulations suggest that c is a slowly varying function of the cluster mass and redshift; for a M₂₀₀ = 5 × 10¹⁴ M_⊙ cluster, the expected concentration is c ∼ 2.7 (Duffy et al. 2008). Since we are concerned with halos of mass M₂₀₀ ∼ 5 × 10¹⁴ M_⊙ here and because our likelihood constraints are only weakly sensitive to the concentration, we fix c = 3 throughout. The results obtained by varying c from 2 to 5 are essentially identical, as we discuss in Section 4.3.4.
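A short numerical check of Equation (5) is sketched below: integrating the NFW density out to r₂₀₀ should recover M₂₀₀. The flat ΛCDM parameters used for H(z) (Ω_m = 0.3, h = 0.7) and the example redshift are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

G_NEWTON = 4.30091e-9  # gravitational constant in Mpc (km/s)^2 / Msun

def rho_crit(z, h=0.7, omega_m=0.3):
    """Critical density 3 H(z)^2 / (8 pi G) in Msun / Mpc^3, flat LCDM assumed."""
    hz2 = (100.0 * h) ** 2 * (omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return 3.0 * hz2 / (8.0 * math.pi * G_NEWTON)

def r200_mpc(m200, z):
    """Radius enclosing a mean density of 200 rho_crit(z)."""
    return (3.0 * m200 / (800.0 * math.pi * rho_crit(z))) ** (1.0 / 3.0)

def nfw_density(r, m200, z, c=3.0):
    """NFW density of Equation (5) in Msun / Mpc^3 at radius r (Mpc)."""
    r200 = r200_mpc(m200, z)
    delta_c = (200.0 / 3.0) * c ** 3 / (math.log(1.0 + c) - c / (1.0 + c))
    x = r * c / r200
    return delta_c * rho_crit(z) / (x * (1.0 + x) ** 2)

def enclosed_mass(r_max, m200, z, c=3.0, n=2000):
    """Mass within r_max from a midpoint-rule integral of 4 pi r^2 rho(r)."""
    step = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * step
        total += 4.0 * math.pi * r * r * nfw_density(r, m200, z, c)
    return total * step

m200 = 5.0e14   # Msun
z = 0.55        # example redshift (the sample median)
r200 = r200_mpc(m200, z)
ratio = enclosed_mass(r200, m200, z) / m200
print("r200 (Mpc):", r200, " M(<r200)/M200:", ratio)
```

The ratio should equal unity to within the accuracy of the quadrature, independent of the value chosen for c.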

While the NFW profile is a common choice for parameterizing the density profiles of galaxy clusters, true cluster density profiles may exhibit significant deviations from this form. High resolution dark matter-only simulations, for instance, suggest that the density profiles of the inner cores of clusters are flatter than predicted by the NFW formula (which diverges as ${r}^{-1}$ for small r; e.g., Merritt et al. 2006; Navarro et al. 2010). The introduction of baryonic effects into such simulations has also been shown to significantly impact the cluster density profile at small r, causing departures from the NFW form (e.g., Gnedin et al. 2004; Duffy et al. 2010; Gnedin et al. 2011; Schaller et al. 2014). Simulations also suggest that for massive or rapidly accreting halos, the outer density profile (r ≳ 0.5 r₂₀₀) declines more rapidly than predicted by the NFW formula (e.g., Diemer & Kravtsov 2014). Finally, halos of galaxy clusters are not expected to be perfectly spherical, but rather triaxial (e.g., Jing & Suto 2002). Still, despite these caveats, the NFW profile has proven an excellent fit to weak lensing observations of galaxy clusters. Although the density profile of an individual galaxy cluster may exhibit significant deviations from the NFW form, the profile averaged over many clusters, such as the 513 clusters considered here, has been shown to be very well described by an NFW mass distribution (e.g., Johnston et al. 2007; Okabe et al. 2010; Newman et al. 2013). Furthermore, departures from the NFW profile in the central part of the cluster are unlikely to have much effect on our results because of the low resolution (roughly 1 arcmin) of our data, and because the mass of the core is a small fraction of the total cluster mass. Ultimately, the NFW profile is more than adequate for our purposes since the current data set does not have the resolution or sensitivity to distinguish between different profiles. We constrain the potential systematic effects introduced into our analysis by departures from the NFW profile in Section 4.2.

For an NFW profile, the deflection vector at angular position ${\boldsymbol{\theta }}$ away from the cluster is

$\begin{eqnarray}&&{{\boldsymbol{\delta }}}^{M}({\boldsymbol{\theta }})=-\displaystyle \frac{16\pi G A}{c\,{r}_{200}}\displaystyle \frac{{\boldsymbol{\theta }}}{\theta }\displaystyle \frac{{d}_{\mathrm{SL}}}{{d}_{\mathrm{S}}}f({d}_{\mathrm{L}}\theta c/{r}_{200}),\end{eqnarray} \tag{ 6 }$

where d_L, d_S and d_SL are the angular diameter distances to the lens, to the source, and between the source and the lens, respectively, and $\theta =| {\boldsymbol{\theta }}|$ (Bartelmann 1996; Dodelson 2004). The function f(x) is given by

$\begin{eqnarray}f(x)=\displaystyle \frac{1}{x}\left\{\begin{array}{ll}\mathrm{ln}(x/2)+\displaystyle \frac{\mathrm{ln}\left(x/[1-\sqrt{1-{x}^{2}}]\right)}{\sqrt{1-{x}^{2}}}, & \mathrm{if}\ x\lt 1\\ \mathrm{ln}(x/2)+\displaystyle \frac{\pi /2-\mathrm{arcsin}(1/x)}{\sqrt{{x}^{2}-1}}, & \mathrm{if}\ x\gt 1\end{array}\right.\end{eqnarray} \tag{ 7 }$

and the constant A is related to M₂₀₀ and c via

$\begin{eqnarray}&&A=\displaystyle \frac{{M}_{200}{c}^{2}}{4\pi [\mathrm{ln}(1+c)-c/(1+c)]}.\end{eqnarray} \tag{ 8 }$

In our analysis we allow the cluster mass to be negative; a negative cluster mass simply means that the deflection vector is pointed in the opposite direction of that predicted for a positive cluster mass of equal magnitude.
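Equations (6)-(8) can be evaluated directly, as in the sketch below. Several choices here are assumptions made for illustration: Equation (6) is taken to be in units where the speed of light is unity, so a factor of G divided by the speed of light squared is inserted to obtain an angle; the values of r₂₀₀, d_L, and d_SL/d_S are hypothetical round numbers for a cluster near z ∼ 0.55; and the x < 1 branch of f(x) uses the algebraically equivalent argument $(1+\sqrt{1-{x}^{2}})/x$ inside the logarithm to avoid loss of precision at small x.

```python
import math

G_OVER_C2 = 4.7854e-20  # G / (speed of light)^2 in Mpc / Msun

def nfw_f(x):
    """The function f(x) of Equation (7), with the x -> 1 limit handled explicitly."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if abs(x - 1.0) < 1e-8:
        return math.log(0.5) + 1.0
    if x < 1.0:
        s = math.sqrt(1.0 - x * x)
        # ln(x / [1 - s]) equals ln([1 + s] / x); the latter is numerically safer.
        term = math.log((1.0 + s) / x) / s
    else:
        s = math.sqrt(x * x - 1.0)
        term = (math.pi / 2.0 - math.asin(1.0 / x)) / s
    return (math.log(x / 2.0) + term) / x

def deflection_arcmin(theta_arcmin, m200, conc, r200_mpc, d_l_mpc, dsl_over_ds):
    """Signed radial deflection of Equation (6) in arcmin (negative: toward the cluster)."""
    a_const = m200 * conc ** 2 / (
        4.0 * math.pi * (math.log(1.0 + conc) - conc / (1.0 + conc)))
    theta = math.radians(theta_arcmin / 60.0)
    x = d_l_mpc * theta * conc / r200_mpc
    delta_rad = (-16.0 * math.pi * G_OVER_C2 * a_const / (conc * r200_mpc)
                 * dsl_over_ds * nfw_f(x))
    return math.degrees(delta_rad) * 60.0

# Hypothetical inputs: M200 = 5e14 Msun, c = 3, r200 = 1.34 Mpc, d_L = 1300 Mpc, d_SL/d_S = 0.85.
defl = deflection_arcmin(1.0, 5.0e14, 3.0, 1.34, 1300.0, 0.85)
defl_negative_mass = deflection_arcmin(1.0, -5.0e14, 3.0, 1.34, 1300.0, 0.85)
print("deflection at 1 arcmin (arcmin):", defl)
```

For these inputs the deflection is a fraction of an arcminute, and reversing the sign of the mass reverses the direction of the deflection, as described above.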

3.3. Numerical Implementation

With the measured beam and transfer functions of SPT and the deflection angle template of Equation (6), the predicted ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ (M₂₀₀) can be computed by direct integration of Equation (3). Unfortunately, evaluating the 4D integral in Equation (3) is computationally expensive and the full covariance matrix must be computed many times. Consequently, we instead rely on Monte Carlo simulations to calculate the lensed CMB covariance matrix.

The unlensed covariance matrix is first computed at 1.0 arcmin resolution across an angular window 70.5 arcmin on a side (this wide range relative to the cluster cutouts, which are only 5.5 arcmin on a side, ensures that we capture the full effects of the SPT beam and transfer function). In the absence of lensing, Equation (3) can be simplified significantly, and the unlensed covariance elements can be quickly calculated (e.g., Dodelson 2003). Many Gaussian realizations of this unlensed covariance matrix (i.e., realizations of the unlensed CMB) are then generated. Next, a high resolution (0.1 arcmin) map of the deflection field is generated for a particular M₂₀₀ and z. The unlensed CMB maps are then interpolated at the positions of the deflected high-resolution pixels. Since the primordial CMB is smooth on scales below a few arcminutes, this interpolation is very accurate. The resultant maps are then degraded to the resolution of the tabulated beam and transfer functions, which are applied to the mock maps using Fast Fourier Transforms. Finally, the mean of the product of the lensed temperatures in pairs of pixels, $d_i d_j$, is computed across the many simulated realizations of the lensed CMB. This mean serves as our estimate of ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ (M₂₀₀).
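The logic of the Monte Carlo estimate can be demonstrated in a one-dimensional toy problem. The sketch below departs from the procedure described above in ways that should be stated plainly: the Gaussian field is built from a handful of random Fourier modes (`MODES`, hypothetical values) so that it can be evaluated exactly at the deflected positions rather than interpolated from a gridded map, the deflection `toy_deflection` is a hypothetical smooth function, and no beam or transfer function is applied. The purpose is only to show that the mean of the pixel products over lensed realizations converges to the covariance evaluated at the deflected positions.

```python
import math
import random

# Hypothetical (wavenumber, power) pairs defining a toy Gaussian random field.
MODES = [(0.2 * k, 1.0 / k ** 2) for k in range(1, 9)]

def model_cov(x, y):
    """Exact covariance of the toy field between positions x and y."""
    return sum(p * math.cos(k * (x - y)) for k, p in MODES)

def draw_field(rng):
    """One Gaussian realization, returned as a function of position."""
    coeffs = [(k, rng.gauss(0.0, math.sqrt(p)), rng.gauss(0.0, math.sqrt(p)))
              for k, p in MODES]
    return lambda x: sum(a * math.cos(k * x) + b * math.sin(k * x)
                         for k, a, b in coeffs)

def toy_deflection(x, strength):
    """Hypothetical smooth deflection toward a lens at x = 0."""
    return -strength * x / (1.0 + x * x)

def monte_carlo_cov(pixels, strength, n_sims, seed=0):
    """Estimate the lensed covariance as the mean of d_i d_j over realizations."""
    rng = random.Random(seed)
    sources = [x + toy_deflection(x, strength) for x in pixels]
    n = len(pixels)
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_sims):
        field = draw_field(rng)
        d = [field(s) for s in sources]  # lensed data: field at deflected positions
        for i in range(n):
            for j in range(n):
                acc[i][j] += d[i] * d[j]
    return [[v / n_sims for v in row] for row in acc], sources

pixels = [-2.0, -1.0, 0.0, 1.0, 2.0]
cov_mc, sources = monte_carlo_cov(pixels, strength=0.5, n_sims=4000)
max_err = max(abs(cov_mc[i][j] - model_cov(sources[i], sources[j]))
              for i in range(len(pixels)) for j in range(len(pixels)))
print("largest deviation from the exact lensed covariance:", max_err)
```

The deviation shrinks roughly as the inverse square root of the number of realizations, which is why a large number of simulations is needed for a precise covariance estimate.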

Our baseline analysis uses 20,000 simulated realizations of the lensed CMB to form an estimate of the lensed CMB covariance matrix. To ensure that this procedure has reached the precision required for our analysis, we repeat the covariance estimation using fewer and lower-resolution simulations. We find that decreasing the number of simulations by a factor of two, increasing the pixel size at which the lensing operation is performed by a factor of 2.5, and decreasing the window size from 70.5 to 60.5 arcmin all lead to small changes in the estimated covariance matrices (on the order of a few percent). We also repeat the full likelihood analysis using the degraded covariance estimates and find that the change in the likelihood is entirely negligible (less than a percent in most cases). We are therefore confident that our covariance estimation procedure has achieved sufficient precision for the analysis presented here.

Even when performed in the Monte Carlo fashion described above, the computation of the lensed CMB covariance matrix is still computationally expensive. To speed up the analysis of the data even more, we compute the lensed covariance matrix across a grid of ${M}_{200}$ and z; the lensed covariance matrix at the desired mass and redshift can then be computed via interpolation. Our baseline analysis uses 31 evenly spaced M₂₀₀ values and 7 evenly spaced z values. To determine whether the accuracy of the covariance interpolation is sufficient for our measurement, we have increased the resolution of the M₂₀₀ and z grid across which the covariance matrix is evaluated and have found the impact on our likelihood results to be negligible.
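One simple way to interpolate a covariance matrix over a precomputed grid is bilinear interpolation in M₂₀₀ and z, sketched below. The interpolation scheme is an assumption made for illustration, since the text above does not specify one, and the function name and the tiny single-pixel grid are hypothetical.

```python
import bisect

def interp_cov(m_grid, z_grid, cov_grid, m, z):
    """Bilinear interpolation of matrices tabulated on a grid.

    cov_grid[a][b] is the covariance matrix at (m_grid[a], z_grid[b]);
    both grids must be sorted in increasing order.
    """
    def bracket(grid, value):
        if not grid[0] <= value <= grid[-1]:
            raise ValueError("requested point lies outside the grid")
        hi = min(max(bisect.bisect_right(grid, value), 1), len(grid) - 1)
        lo = hi - 1
        frac = (value - grid[lo]) / (grid[hi] - grid[lo])
        return lo, hi, frac

    a0, a1, tm = bracket(m_grid, m)
    b0, b1, tz = bracket(z_grid, z)
    n = len(cov_grid[0][0])
    out = [[0.0] * n for _ in range(n)]
    for a, wa in ((a0, 1.0 - tm), (a1, tm)):
        for b, wb in ((b0, 1.0 - tz), (b1, tz)):
            mat = cov_grid[a][b]
            for i in range(n):
                for j in range(n):
                    out[i][j] += wa * wb * mat[i][j]
    return out

# Toy grid: a single-pixel "covariance" that is linear in mass and redshift.
m_grid = [0.0, 1.0e15, 2.0e15]
z_grid = [0.2, 0.6, 1.0]
cov_grid = [[[[2.0 + 3.0 * m / 1.0e15 + 0.5 * z]] for z in z_grid] for m in m_grid]
cov_interp = interp_cov(m_grid, z_grid, cov_grid, 0.5e15, 0.4)
```

Refining the grid and confirming that the likelihood is unchanged, as described above, is the natural test of whether any such scheme is accurate enough.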

3.4. Noise and Foreground Covariance

To compute the likelihood in Equation (1) we must also estimate ${{\boldsymbol{C}}}_{\mathrm{nf}}\equiv {{\boldsymbol{C}}}_{\mathrm{noise}}+{{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ . We take the approach of computing this combination of covariances directly from the data. Since the noise level varies somewhat from field to field, the estimation of ${{\boldsymbol{C}}}_{\mathrm{nf}}$ must be performed separately for each field. To do this, we randomly sample cutouts from the SPT maps of each field to measure the covariance of the observed data, ${{\boldsymbol{C}}}_{\mathrm{obs}}$ . These samples are drawn from the noise mask region defined in Section 2.4. ${{\boldsymbol{C}}}_{\mathrm{nf}}$ is then estimated by subtracting the predicted CMB-only covariance from the measured CMB+noise+foreground covariance, i.e., ${{\boldsymbol{C}}}_{\mathrm{nf}}$ = ${{\boldsymbol{C}}}_{\mathrm{obs}}$ − ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ (M₂₀₀ = 0).

If the foregrounds are lensed by the cluster it is possible for ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ to vary with M₂₀₀. Modeling foreground lensing, however, would require knowledge of the redshift distribution of the foregrounds; for the sake of simplicity we assume that the foregrounds remain unlensed in our analysis. We quantify the bias introduced into our analysis by this assumption using mock data, as described in Section 4.2. For the purposes of generating this mock data, it is useful to have estimates of both ${{\boldsymbol{C}}}_{\mathrm{noise}}$ and ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ (rather than only the sum ${{\boldsymbol{C}}}_{\mathrm{noise}}$ + ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ ). To estimate ${{\boldsymbol{C}}}_{\mathrm{noise}}$ we sample cutouts from the $L-R$ difference maps described in Section 2.1. This sampling procedure is done using the same noise masks as above so that ${{\boldsymbol{C}}}_{\mathrm{noise}}$ accurately reflects the noise at the cluster locations.

${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ , on the other hand, is estimated using previous constraints on the power spectra of the dominant foreground sources. For the tSZ-free maps that we use in this analysis, the dominant foregrounds are the "Poisson" and "clustered" components constrained in Reichardt et al. (2012). The Poisson foreground results from point sources below the detection threshold that are randomly distributed on the sky and has C_l = C₀, independent of l. The amplitude of the Poisson component is estimated from the data. The clustered foreground model accounts for the clustering of point sources and is modeled as D_l ≡ C_l l(l + 1)/(2π) = D₀ independent of l for l < 1500, and D_l ∝ l^0.8 for l > 1500. The amplitude of the clustered component is taken from Reichardt et al. (2012), adjusted to account for the fact that our maps are constructed from a weighted combination of observations at three frequencies. With the foreground power spectra determined, ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ can be calculated in the same way as the unlensed CMB covariance matrix.
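The two foreground templates can be written down compactly, as in the sketch below. The amplitudes are left as free arguments because no values are quoted here, and matching the two branches of the clustered template continuously at l = 1500 is an assumption; the function names are hypothetical.

```python
import math

L_BREAK = 1500  # multipole where the clustered template changes slope

def poisson_cl(ell, c0):
    """Poisson term: C_l = C_0 at every multipole."""
    return c0

def clustered_dl(ell, d0):
    """Clustered term in D_l: flat below L_BREAK, rising as l^0.8 above it.

    The two branches are matched at L_BREAK (an assumption of this sketch).
    """
    if ell < L_BREAK:
        return d0
    return d0 * (ell / L_BREAK) ** 0.8

def dl_to_cl(ell, dl):
    """Invert D_l = l (l + 1) C_l / (2 pi)."""
    return 2.0 * math.pi * dl / (ell * (ell + 1.0))
```

The resulting C_l values can then be passed to the same covariance calculation used for the unlensed CMB.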

We emphasize that the main analysis estimates ${{\boldsymbol{C}}}_{\mathrm{nf}}$ ≡ ${{\boldsymbol{C}}}_{\mathrm{noise}}$ + ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ directly from the data, and that the individual estimates of ${{\boldsymbol{C}}}_{\mathrm{noise}}$ and ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ are used only to test for certain systematic effects using mock data.

3.5. Combining the Likelihoods

With our estimates of ${{\boldsymbol{C}}}_{\mathrm{CMB}}$ (M₂₀₀) and ${{\boldsymbol{C}}}_{\mathrm{noise}}$ + ${{\boldsymbol{C}}}_{\mathrm{foregrounds}}$ , we now have all the ingredients necessary to evaluate the likelihood in Equation (1). For a cutout around the ith cluster, we evaluate the likelihood, ${{\mathcal{L}}}_{i}({M}_{200})$ , as a function of M₂₀₀ to constrain the effects of CMB lensing by that cluster. However, since the instrumental noise is large relative to the CMB cluster lensing signal, we do not expect to obtain a detection of the lensing effect around a single cluster. Instead, we must combine constraints from multiple clusters. One way to accomplish this is to compute the likelihood ${{\mathcal{L}}}_{\mathrm{total}}({M}_{200})={\prod }_{i}^{{N}_{\mathrm{clusters}}}{{\mathcal{L}}}_{i}({M}_{200})$ , where N_clusters = 513 is the number of clusters in our sample. This method of combining likelihoods is appealing because it is simple and because it depends only on the lensing information.

Not all the masses in the sample are the same, so the above treatment—which assumes all clusters share a common mass—provides more of an estimate of the detection significance than any useful information on the masses of the clusters in the sample. Furthermore, the spread in masses will likely lead to a spread in the width of the likelihood function, i.e., a degradation in the signal-to-noise. Some of this can be recaptured by scaling the ${M}_{200}$ parameter for each cluster by an external mass estimator for that cluster, and indeed estimates of each cluster's mass can be obtained from the strength of the SZ signal at the cluster location. Here we use the SZ-determined cluster masses from Bleem et al. (2015) that were discussed in Section 2.3. We convert the M_500,SZ measured in Bleem et al. (2015) into M_200,SZ using the Duffy et al. (2008) mass-concentration relation. So an improved likelihood that includes this information is written not as a function of M₂₀₀, but rather as

$\begin{eqnarray}&&{{\mathcal{L}}}_{i}\to {{\mathcal{L}}}_{i}\left(\displaystyle \frac{{M}_{200}}{{M}_{200,\mathrm{SZ}}}\;{M}_{200,\mathrm{SZ},i}\right),\end{eqnarray} \tag{ 9 }$

with a new free global parameter ${M}_{200}/{M}_{200,\mathrm{SZ}}$ . The individual cluster likelihoods expressed as functions of ${M}_{200}/{M}_{200,\mathrm{SZ}}$ can then be combined as before:

$\begin{eqnarray}&&{{\mathcal{L}}}_{\mathrm{total}}\left(\displaystyle \frac{{M}_{200}}{{M}_{200,\mathrm{SZ}}}\right)=\displaystyle \prod _{i}^{{N}_{\mathrm{clusters}}}{{\mathcal{L}}}_{i}\left(\displaystyle \frac{{M}_{200}}{{M}_{200,\mathrm{SZ}}}\;{M}_{200,\mathrm{SZ},i}\right).\end{eqnarray} \tag{ 10 }$

Note, however, that any intrinsic scatter in the relationship between the lensing-derived M₂₀₀ and the SZ-derived M_200,SZ will lead to additional broadening of the combined multi-cluster likelihood as a function of M₂₀₀/M_200,SZ. We will employ both methods of combining individual cluster likelihoods in Section 5.
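The combination in Equation (10) amounts to summing per-cluster log-likelihoods, each evaluated at the global ratio times that cluster's SZ mass. The sketch below illustrates this with hypothetical Gaussian per-cluster likelihoods and made-up masses; the real likelihoods need not be Gaussian, and the function names are assumptions.

```python
def combined_log_like(ratio, cluster_log_likes, m_sz):
    """ln L_total(ratio) = sum_i ln L_i(ratio * M_200,SZ,i), as in Equation (10)."""
    return sum(lnl(ratio * m) for lnl, m in zip(cluster_log_likes, m_sz))

def make_toy_log_like(m_peak, sigma):
    """Hypothetical Gaussian ln L_i(M), used only to exercise the combination."""
    return lambda m: -0.5 * ((m - m_peak) / sigma) ** 2

# Toy SZ masses (solar masses); each toy lensing likelihood peaks at 0.8 times the SZ mass.
m_sz = [4.0e14, 5.6e14, 9.0e14]
log_likes = [make_toy_log_like(0.8 * m, 5.0e14) for m in m_sz]

ratios = [0.05 * k for k in range(41)]  # grid from 0 to 2
total = [combined_log_like(r, log_likes, m_sz) for r in ratios]
best_ratio = ratios[total.index(max(total))]
```

Setting every entry of `m_sz` to one recovers the simpler combination in which all clusters share a common M₂₀₀.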

3.6. Mock Data

In order to test our analysis pipeline and study possible sources of systematic error we generate and analyze mock data. The mock data sets include contributions from the lensed (and unlensed) CMB, foregrounds and noise. The mock cluster redshift distribution is identical to the redshift distribution of the real clusters. To generate cluster masses for our mock catalog, we convert the SZ-derived M₅₀₀ values described in Section 2.3 to M₂₀₀ assuming that the clusters are described by NFW profiles with the Duffy et al. (2008) mass-concentration relation. The resultant sample has a median mass of M₂₀₀ = 5.6 × 10¹⁴ M_⊙ and 95% of the clusters have 4.0 × 10¹⁴ M_⊙ < M₂₀₀ < 1.37 × 10¹⁵ M_⊙.

For each mock cluster, a realization of the lensed and unlensed CMB was generated in the same manner described in Section 3.3. The clusters were distributed among the SPT fields identically to the real clusters, and the appropriate beam and transfer functions for each field were applied. Gaussian realizations of the measured noise and foreground covariance matrix, ${{\boldsymbol{C}}}_{\mathrm{nf}}$ , were added to the mock data in a field-dependent fashion. The process of generating a mock cluster catalog was repeated 50 times to build statistics. Each mock catalog includes entirely new realizations of the CMB, foregrounds and noise.

4. RESULTS ON MOCK CATALOGS

4.1. Projections

The results of our analysis of the mock cluster cutouts are shown in Figure 1. The top panel shows the results of analyzing the mock data when CMB lensing is turned on, while the bottom panel shows the results when CMB lensing is turned off (i.e., a null test). Each gray curve represents the combined likelihood constraints from an SPT-like survey with 513 clusters generated in the manner described above; the blue curves show the combined constraints from 50 mock data sets of 513 clusters. The vertical red line in the top panel indicates the true mean cluster mass in the mock survey. Each mock data set strongly prefers a positive cluster mass over M₂₀₀ ≤ 0. The combined constraint from 50 mock data sets in Figure 1 illustrates that the likelihood prefers the mean cluster mass of the sample. When the analysis is performed on the unlensed mock data (bottom panel), none of the 50 mock data sets yield a significant detection, and the mean is centered at the (correct) value of M₂₀₀ = 0.

To quantify the significance of our measurement of CMB cluster lensing (for both mock and real data) we use a likelihood ratio test. Since we are interested in whether or not the data prefer lensing over the null hypothesis of no lensing (i.e., M₂₀₀ = 0), we define the likelihood ratio

$\begin{eqnarray}&&{{\rm \Lambda }}=\displaystyle \frac{{\mathcal{L}}({M}_{200}=0)}{\mathrm{max}{\mathcal{L}}({M}_{200})}.\end{eqnarray} \tag{ 11 }$

In the large sample size limit (i.e., many clusters), $-2\mathrm{ln}{{\rm \Lambda }}$ should be χ²-distributed with k = 1 degree of freedom. Note that this statement does not assume that the likelihood for each cluster is Gaussian as a function of M₂₀₀. The p-value for the measurement is then found by integrating the χ²(k = 1) distribution above $-2\mathrm{ln}{{\rm \Lambda }}$ . Our reported detection significance is calculated by converting this p-value into a standard, two-sided Gaussian significance and is exactly equal to $\sqrt{-2\mathrm{ln}{{\rm \Lambda }}}$ . All detection significances are reported in this way below. Averaging across the 50 mocks discussed above, we find that the mean detection significance for an SPT-like survey (i.e., 513 mock clusters) is 3.4σ.
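The conversion from likelihood ratio to detection significance can be sketched as follows. For one degree of freedom the χ² tail probability has a closed form in terms of the complementary error function, which is what the sketch uses; the function name and the example log-likelihood values are hypothetical.

```python
import math

def detection_significance(lnl_null, lnl_max):
    """Return (test statistic, p-value, Gaussian sigma) for the ratio in Equation (11)."""
    stat = -2.0 * (lnl_null - lnl_max)          # -2 ln(Lambda), non-negative
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi^2 (k = 1) tail probability
    sigma = math.sqrt(stat)                     # two-sided Gaussian equivalent
    return stat, p_value, sigma

# Example: a null hypothesis disfavored by 4.5 in ln L relative to the maximum.
stat, p_value, sigma = detection_significance(-4.5, 0.0)
```

The two-sided Gaussian tail probability at n sigma is erfc(n/√2), so equating it to the χ² tail probability gives n equal to the square root of the test statistic, as stated above.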

4.2. Systematics Tests

Several sources of systematic error can potentially affect our CMB cluster lensing measurement. We quantify the impact of these systematic effects on our analysis by modeling them in mock data. For the purposes of these systematic tests we generate new mock data consisting of 500 realizations of the CMB, noise, and foregrounds for a single cluster with z = 0.55 and M₂₀₀ = 5.6 × 10¹⁴ M_⊙, corresponding to the median redshift and SZ-derived mass for clusters in our sample. Various systematic effects are introduced to this mock data set as described below. We then analyze the mock data neglecting the presence of the systematic effects and measure how the likelihood changes.

We express the bias introduced by each systematic as the fractional shift in the maximum likelihood mass: $({M}_{\mathrm{sys}}^{\mathrm{ML}}-{M}^{\mathrm{ML}})/{M}^{\mathrm{true}}$ , where ${M}_{\mathrm{sys}}^{\mathrm{ML}}$ is the maximum likelihood mass in the presence of the systematic, M^ML is the maximum likelihood mass without the systematic, and M^true = 5.6 × 10¹⁴ M_⊙ is the true mass of the mock clusters. This process is repeated 50 times and we report the mean value of the bias across these trials. We caution that this procedure is not meant to rigorously quantify the systematic error budget of our lensing constraints; we have, after all, assumed a single mass and redshift for all of the mock clusters. Instead, these estimates are provided for two purposes. First, they suggest that the individual systematic errors associated with our cluster mass measurement are likely small compared to the statistical error bars on this measurement. Second, the estimates provided below highlight the relative importance of each of the systematic effects that we consider here.

4.2.1. Monopole Contamination

The first systematic that we consider is anything that leads to a signal at the cluster center (a "monopole"). The CMB cluster lensing signal vanishes at the cluster center and therefore has no monopole component. Since our model includes no other signals correlated with the cluster, any residual monopole-like signal at the cluster location is not included in our model and could therefore bias our analysis. One important potential source of monopole contamination is residual tSZ in our tSZ-free maps. Although the linear combination map used is nominally independent of tSZ, the finite width of the observing bands and relativistic corrections to the tSZ (Itoh et al. 1998) can produce a small residual component. Other potential sources of monopole contamination include the integrated dusty emission or radio emission from cluster member galaxies much too faint to be individually detected in SPT maps. Strong emission from individual cluster members is treated in the next section.

We determine the amplitude of such contamination directly from our data. Stacking all of the cluster cutouts reveals that the level of monopole contamination is consistent with a β profile (Cavaliere & Fusco-Femiano 1976, 1978) with β = 1, θ_c = 0.5 arcmin, and an amplitude of −3 μK for each cluster. We introduce this level of contamination into our 50 sets of 500 mock cutouts and repeat the likelihood analysis (just as before, without accounting for the monopole contamination) to determine how our likelihood constraints are affected. Across 50 sets of mock cutouts, we find that monopole contamination of the measured amplitude leads to a shift in the maximum likelihood mass that is ≲1%, well below the statistical precision of our cluster mass constraint.

4.3. Emission from Individual Cluster Members

The contamination of our measurement by a single bright cluster galaxy does not in general behave like the monopole contamination considered above. In particular, a single source could fill in one side of the cluster lensing dipole if its projected position relative to the cluster is at a particular radius and orientation. At 150 GHz and a resolution of 1.6 arcmin, a 1 mJy source will have an equivalent CMB fluctuation temperature of 10 μK and, assuming a spectral index of α = −0.5, will have a temperature fluctuation of roughly −10 μK in our tSZ-free maps. We simulate the effects of such sources on our analysis by introducing a single point source with beam-smoothed amplitude of −10 μK into each of our mock cutouts. We choose the location of the point source randomly across a disk of radius 1.5 arcmin centered on the cluster. Since the CMB cluster lensing dipole is expected to peak at ∼1 arcmin away from the cluster center, sources located much farther than this should have little effect on our measurement.

We find that introducing this level of point source contamination into our mock data causes the inferred cluster mass to be biased low by ∼7% on average across our 50 sets of 500 mock cluster cutouts. In reality, however, not every cluster is expected to have an associated point source of this magnitude and proximity to the cluster. Using the De Zotti et al. (2010) model for radio source counts at 150 GHz and the results of Coble et al. (2007), we estimate that only ∼5% of SPT-SZ clusters will have a 1 mJy or greater source within 1.5 arcmin of the cluster center. We only consider radio sources in this calculation because models of dusty sources predict fewer bright sources (e.g., Negrello et al. 2007), and because star formation is suppressed in cluster environments (e.g., Bai et al. 2007). The resulting bias on the mean mass of our cluster sample would thus be <1%, well below our statistical precision.
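The arithmetic behind the sample-averaged bias is sketched below. It assumes that unaffected clusters are unbiased and that the bias on the mean mass scales linearly with the fraction of affected clusters; both are simplifications, and the function name is hypothetical.

```python
def sample_averaged_bias(bias_per_affected_cluster, affected_fraction):
    """Bias on the mean mass when only a fraction of clusters host a contaminating source."""
    return bias_per_affected_cluster * affected_fraction

# About 7% bias per affected cluster, with about 5% of clusters affected.
bias = sample_averaged_bias(0.07, 0.05)
```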

4.3.1. kSZ

The next systematic that we consider is the kSZ effect. The kSZ effect results from scattering of CMB photons with electrons that have bulk velocities relative to the Hubble flow. Motions of cluster electrons could be due, for instance, to the cluster falling toward nearby superstructures or because the cluster is rotating. While typically much smaller than the tSZ effect, the kSZ effect is frequency independent when expressed as a change in brightness temperature, so the tSZ-free linear combination map contains a kSZ component.

The diffuse kSZ caused by linear or quasi-linear structure will act only as a source of noise in this analysis, and, because its amplitude is much smaller than the instrumental noise (George et al. 2014), it can be safely ignored here. Instead we turn our attention to the kSZ due to the galaxy clusters themselves. This cluster kSZ signal will have two components: a component due to the bulk motion of the cluster, and a component due to internal velocities.

To include the effects of the bulk component of the kSZ in our mock data we rely on the work of Sehgal et al. (2010), which used N-body simulations and models for the gas physics at different redshifts to generate maps of the kSZ effect. The Sehgal et al. (2010) kSZ maps are generated by assigning a single velocity to all gas associated with each cluster, and thus provide an estimate of the kSZ signal due to the bulk velocity of each cluster. The simulated kSZ signal is introduced into our mock cutouts by extracting cutouts from the Sehgal et al. (2010) kSZ maps around clusters with M₂₀₀ between 5.0 × 10¹⁴ M_⊙ and 6.0 × 10¹⁴ M_⊙. This selection ensures that the kSZ signal is reasonably well matched to our mock clusters, which have masses of 5.6 × 10¹⁴ M_⊙. The likelihood analysis of the mock cutouts with kSZ is then performed as before, ignoring the presence of the kSZ.

Across 50 realizations of the mock data, the introduction of a bulk-velocity kSZ component causes the maximum likelihood mass to be biased low by 9% on average, below the statistical precision of this work. We note that our analysis of mock data with kSZ suggests that the size of the bias introduced by the presence of the kSZ depends on the level of instrumental noise and foregrounds in the data. If the foreground or instrumental noise contributions are very small, the bias introduced by the kSZ can become significant. Future experiments with higher sensitivity may need to take a more careful approach to accounting for the kSZ.

The mock kSZ signal considered above does not include the effects of a kSZ signal due to internal motions of gas within the cluster. Of particular concern is the kSZ signal resulting from cluster rotation, which we call rkSZ. A cluster that is rotating will induce a dipole-like kSZ signal since one side of the cluster will be moving toward us while the other will be moving away. Consequently, even though the rkSZ is expected to be small, it is a potentially serious contaminant for the CMB cluster lensing measurement because of its similar morphology on the sky. Unlike the CMB cluster lensing signal, though, the rkSZ dipole is not preferentially aligned with the gradient of the CMB temperature field.

Our model for the rkSZ signal is based on the model of Chluba & Mannheim (2002), where it is assumed that a galaxy cluster rotates as a solid body, motivated in part by the work of Bullock et al. (2001) and Cooray & Chen (2002). Modeling the electron number density as a β-profile, Chluba & Mannheim (2002) derive an expression for the rkSZ signal:

$\begin{eqnarray}&&\displaystyle \frac{{{\rm \Delta }}{T}_{\mathrm{rkSZ}}}{{T}_{\mathrm{CMB}}}(\theta ,\phi )={A}_{\mathrm{rkSZ}}\theta \mathrm{sin}i\mathrm{sin}\phi {\left(1+\displaystyle \frac{{\theta }^{2}}{{\theta }_{\mathrm{core}}^{2}}\right)}^{1/2-3\beta /2},\end{eqnarray} \tag{ 12 }$

where A_rkSZ is a parameter that controls the amplitude of the signal, θ is the angular distance from the cluster center, ϕ is the transverse angular coordinate and i is the inclination angle of the cluster. We set β = 1 and θ_core = 1 arcmin as these values are fairly typical for the clusters in our sample.
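Equation (12) can be evaluated directly, as in the sketch below. The function name and default arguments are assumptions for illustration; θ and θ_core are taken in the same units (arcmin here), so A_rkSZ carries units of inverse arcmin.

```python
import math

def rksz_template(theta, phi, inclination, a_rksz=1.0, theta_core=1.0, beta=1.0):
    """Equation (12): Delta T_rkSZ / T_CMB at (theta, phi).

    theta and theta_core share the same units; phi and inclination are in radians.
    """
    radial = (1.0 + (theta / theta_core) ** 2) ** (0.5 - 1.5 * beta)
    return a_rksz * theta * math.sin(inclination) * math.sin(phi) * radial

# Edge-on cluster (sin i = 1), evaluated along the direction of maximum signal.
peak = rksz_template(1.0, math.pi / 2, math.pi / 2)
opposite = rksz_template(1.0, 3 * math.pi / 2, math.pi / 2)
```

For β = 1 the radial factor θ/(1 + θ²/θ_core²) is largest at θ = θ_core, where it equals θ_core/2, so the peak signal of an optimally aligned cluster in this model is A_rkSZ θ_core/2. The opposite side of the cluster has the same magnitude and the opposite sign, giving the dipole morphology described above.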

The amplitude of the rkSZ signal, A_rkSZ, is not very well constrained at present. Simulations (e.g., Nagai et al. 2003; Fang et al. 2009; Bianconi et al. 2013) suggest that the rotational velocities of clusters are typically small compared to the cluster velocity dispersion. However, in clusters that have recently experienced mergers, the rotational velocities may be significantly larger. Chluba & Mannheim (2002) argue that typical peak rkSZ signals are in the range 0.1–10 μK, but could be as high as 100 μK for a recent merger.

The model rkSZ signal is introduced into our 50 sets of 500 mock cutouts assuming a constant value of A_rkSZ for all mock clusters. Each cluster's inclination angle and orientation on the sky are chosen randomly, however, so the mock rkSZ signal varies from cluster to cluster. We explore several values of A_rkSZ, chosen such that the maximum amplitude of the rkSZ signal (i.e., for an optimally aligned cluster) varies between 1 and 20 μK. We find that the presence of rkSZ in the mock data acts to reduce our measured signal. At a maximum amplitude of 1 μK the rkSZ introduces a mass bias of less than 1% to our mass constraints, at 5 μK the peak of the likelihood is biased to lower masses by roughly 8%, at 10 μK the bias is roughly 28% and at 20 μK the bias is 93%. Therefore, it appears that as long as the rkSZ signal is ≲10 μK, the bias introduced into our mass constraints by such a signal is less than the statistical precision of this work. Since most clusters are expected to have rkSZ signals less than 10 μK, we do not attempt to correct for this effect here. Although clusters that have experienced recent mergers may have rkSZ signals that are higher than 10 μK, the number of such clusters in our sample is likely small.

4.3.2. Foreground Lensing

As discussed above, the degree to which foreground emission is lensed by the cluster is not very well constrained. The CIB—which constitutes the dominant source of foreground emission—is thought to originate from redshifts z ∼ 0.5 to 4. Since our cluster sample is drawn from 0.05 ≲ z ≲ 1.5, the amount by which the foregrounds are lensed will likely vary from cluster to cluster. Our analysis, however, assumes that foregrounds remain unlensed. To investigate the effects of this assumption on our analysis, we generate mock cutouts with lensed foregrounds assuming that foreground emission originates from z = 4. Since the CIB is known to originate from z ≲ 4, setting z = 4 gives an approximate upper bound to the effects of gravitational lensing on the foregrounds, and therefore an upper limit to the systematic error introduced into our analysis by assuming no foreground lensing.

Realizations of the lensed clustered foreground can be generated using the procedure described in Section 3.6. Lensing the Poisson foreground is more difficult as this foreground has power extending to arbitrarily small scales, including scales below that at which we generate map realizations. To get around this, we calculate the lensed Poisson covariance matrix directly from the integral in Equation (3) and use this covariance matrix to generate realizations of the lensed Poisson foreground. The mock cutouts with lensed foregrounds are then analyzed as before, assuming that both foregrounds remain unlensed.

Across the 50 sets of 500 mock clusters that we have generated, we find that lensing of the foregrounds causes our M₂₀₀ constraint to be biased low. Lensing of the Poisson foreground contributes the dominant part of this bias, owing to its large contribution to the total covariance relative to that of the clustered foreground. The average mass bias introduced into our mock analysis by lensing of the foregrounds is 7%. We do not correct for this bias, as doing so would require a detailed modeling of the redshift distribution of the CIB. We emphasize, though, that the bias measured here is necessarily an overestimate of the true bias introduced by foreground lensing because we have placed the foregrounds at z = 4 when the true foreground emission results from z ≤ 4.

4.3.3. Cluster Miscentering

The cluster centers used in our analysis are derived from SPT measurements of the cluster SZ signal and will generally differ from the centers of mass of the clusters. A similar miscentering problem arises in the context of galaxy shear measurements, where the cluster center is typically defined as the location of the brightest cluster galaxy (BCG), even though the BCG may not correspond to the true center of mass of the cluster (e.g., von der Linden et al. 2012). In that context, cluster miscentering can be a significant source of systematic error in cluster mass measurements, causing the masses of miscentered clusters to be underestimated.

We model the effects of imperfect knowledge of the cluster center by applying random positional shifts to our mock cluster data. These offsets are drawn from a two-dimensional Gaussian with σ = 30 arcsec. In this model, 68% of the offsets are smaller than 45 arcsec. As a point of reference, Song et al. (2012) found that 68% of the offsets between SPT-estimated centers and BCGs were smaller than 38 arcsec, so the miscentering error introduced here is likely an overestimate. Analyzing the miscentered mock data reveals that the peak likelihood is biased to lower mass by roughly 6% on average, below the statistical precision of our lensing mass constraint. Accurately modeling the size of the miscentering systematic error would require an understanding of how the miscentering error varies with cluster mass and redshift, and we do not attempt such a detailed analysis here.
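The 68% containment radius quoted for the miscentering model follows from the radial distribution of a two-dimensional Gaussian. The sketch below assumes that σ = 30 arcsec is the standard deviation along each axis, in which case the radial offsets follow a Rayleigh distribution; the function name is hypothetical.

```python
import math

def containment_radius(sigma, fraction):
    """Radius enclosing `fraction` of offsets for an isotropic 2D Gaussian.

    sigma is the per-axis standard deviation; the radial CDF is
    1 - exp(-r^2 / (2 sigma^2)).
    """
    return sigma * math.sqrt(-2.0 * math.log(1.0 - fraction))

r68 = containment_radius(30.0, 0.68)  # arcsec, approximately 45
```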

4.3.4. Uncertainty in the Cluster Mass Profile

Our analysis assumes an NFW profile for each cluster with concentration c = 3. In reality, the halo concentration is known to vary with cluster mass and redshift, and to exhibit significant scatter. To explore the effects on our analysis of changing the halo concentration, we regenerate the mock cluster data using halos of concentration c = 2.5 and c = 5. These two values of the concentration should bracket the expected range of concentrations allowed for the clusters in our sample, including effects of uncertainty in the assumed cosmological parameters (Dutton & Macciò 2014). The data are then analyzed as before, assuming c = 3. We find that changing the concentration has an essentially negligible effect on our analysis, which is not surprising given that our constraints are not sensitive enough to distinguish between slightly different behaviors of the inner mass profile. Across 50 realizations of 500 mock clusters, we find that the maximum likelihood mass increases on average by less than 1% when c = 5, well below the statistical precision of our measurements. When c = 2.5 we find that the maximum likelihood mass decreases by about 1%. The effects of changing halo concentration can therefore be safely ignored in this analysis.

A related source of potential bias is halo triaxiality. It is well known from simulations (e.g., Jing & Suto 2002; Kasun & Evrard 2005) that halo density profiles are not perfectly spherical. Deviations from sphericity could introduce a bias into our analysis because we have assumed a perfectly spherical NFW profile. Corless & King (2007) have found that in the context of traditional galaxy shear measurements, fitting a spherical NFW profile to the extreme case of a halo elongated along the line of sight can lead to a 50% mass bias. Averaged over all possible halo orientations, however, Corless & King (2007) find that the mean recovered mass is very close to the true mass. Given the low sensitivity of our mass constraints and the findings of Corless & King (2007), it is unlikely that halo triaxiality has a significant impact on our results. A detailed modeling of the effects of halo triaxiality is beyond the scope of this work.

Finally, we consider deviations of the halo profile from the NFW form itself. While large deviations from the NFW profile are expected in the central region of the dark matter halo, the roughly 1.6 arcmin resolution of the tSZ-free maps means that we are not very sensitive to the behavior of the density profile in this regime. Deviations from the NFW form are also expected for massive clusters in the outskirts of the halo, r ≳ 0.5r₂₀₀ (Diemer & Kravtsov 2014). Assuming the SZ-derived masses described in Section 2.3, the median θ₂₀₀ = r₂₀₀/d_A(z) for the clusters in our sample is 5.2 arcmin, where d_A(z) is the angular diameter distance to the cluster. This means that our angular window of 5.5 arcmin around each cluster is probing r ∼ 0.5r₂₀₀. Consequently, deviations from the NFW form in the r ≳ 0.5r₂₀₀ regime could potentially introduce a systematic error into our mass constraints.

We model the effects of deviations from the NFW density profile by approximating the results of Diemer & Kravtsov (2014). Mock data with a non-NFW deflection profile are generated and analyzed assuming the usual NFW deflection formula. We find that modifying the form of the deflection profile in this way biases the best-fit mass low by roughly $10\%$ on average across our 50 sets of 500 mock cluster cutouts.

4.3.5. Large-scale Structure

Our NFW lensing template (Equation (6)) accounts only for deflections of CMB photons caused by the cluster itself. It therefore ignores deflections that could be caused by the presence of LSS near the line of sight to the cluster. Lensing by LSS unassociated with the cluster changes the covariance properties of the CMB in a well-known way (e.g., Seljak 1996). This effect is approximated in our model through the use of the LSS-lensed C_l's in computing the model covariance matrix (Equation (3)). However, it is well known that clusters live in overdense environments. Lensing induced by LSS that is associated with the cluster is not included in our model and could therefore bias our analysis.

In the language of the halo model, we have effectively ignored the two-halo contribution to the lensing signal. However, weak lensing data (e.g., Johnston et al. 2007) suggest that within a few virial radii of the cluster center, the one-halo term dominates the lensing signal. Since the analysis presented here considers a small angular region around each cluster that extends to only ≤1 virial radius, it is safe to neglect the two-halo term in this analysis.

4.3.6. Cluster Selection

One remaining potential systematic is related to the SZ-selection method. The SPT clusters have been selected at the locations of decrements in the 95 and 150 GHz maps. Simulations show that clusters selected in this fashion will preferentially sit on decrements in the CMB, and this effect could potentially bias the mass inferred from CMB lensing. However, the bias in the background CMB is small, on the order of −1 μK, and the resulting effect on the CMB lensing mass is likely to be small compared to our statistical error.

4.3.7. Combined Systematic Effects

The above discussion has considered how several different systematic effects can individually bias our lensing constraints. We now attempt to estimate the total bias resulting from the combination of multiple systematic effects. Our combined systematic model includes the five most significant biases considered above. We include the bulk motion kSZ, the rkSZ with peak amplitude of 5 μK, and foreground lensing as described in Section 4.3.2. The clusters are miscentered as described in Section 4.3.3 and the cluster density profile used is the Diemer & Kravtsov (2014) profile described in Section 4.3.4. We find that the mean bias introduced by this combined systematic model is a 39% bias to lower cluster mass. The measured bias is consistent with the product of the individual biases (34%), given the scatter among the 50 simulation realizations. A 39% bias to lower cluster mass amounts to a roughly 0.85σ shift in units of the statistical uncertainty. This should be interpreted as an approximate upper limit on the bias to lower cluster mass, as we have placed all of the foreground emission at z = 4 and have likely over-estimated the effect of miscentering. We do not attempt to correct for this systematic bias, although doing so would not alter the main conclusions of this work, as discussed in Section 6.
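The quoted product of the individual biases can be reproduced from the values reported in the preceding subsections, treating each bias as an independent multiplicative suppression of the recovered mass. That independence is an assumption of this sketch, and the dictionary keys are labels chosen for illustration.

```python
# Individual fractional biases toward lower mass, as reported above.
individual_biases = {
    "bulk-velocity kSZ": 0.09,
    "rkSZ (5 uK peak)": 0.08,
    "foreground lensing": 0.07,
    "miscentering": 0.06,
    "non-NFW profile": 0.10,
}

def combined_bias(biases):
    """Total fractional bias if each effect independently scales the mass by (1 - b)."""
    retained = 1.0
    for b in biases:
        retained *= 1.0 - b
    return 1.0 - retained

total_bias = combined_bias(individual_biases.values())  # about 0.34
```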

5. RESULTS

Figure 2 shows the results of our likelihood analysis applied to the data described in Section 2. The red curve represents our constraint from the analysis of 513 on-cluster cutouts. Each of the 50 gray curves represents the constraint from 513 off-cluster cutouts chosen from the same fields as the on-cluster cutouts in the manner described in Section 2. The thick blue curve is the combined constraint from the 50 sets of off-cluster cutouts.

The on-cluster likelihood in Figure 2 shows a preference for positive mass. We find that M₂₀₀ = 0 is ruled out at 3.1σ using the likelihood ratio test described above. We assume a flat prior on M₂₀₀ so that the posterior probability of M₂₀₀ is directly proportional to the likelihood. Integrating the posterior on M₂₀₀ yields a 68% confidence band of ${M}_{200}={5.1}_{-2.1}^{+2.5}\times {10}^{14}$ M_⊙. The results of our analysis of the SPT clusters are also consistent with the projections from mock data, which had a mean detection significance of 3.4σ. The off-cluster likelihoods shown in Figure 2 (gray curves) are consistent with M₂₀₀ = 0. The worst null likelihood has a detection significance of 2.2σ, which is reasonable since we have considered 50 null likelihoods. The combined constraint from all 50 null likelihoods is also consistent with M₂₀₀ = 0 at 0.84 σ₅₀, where σ₅₀ is the standard deviation computed from the 50 stacked likelihoods. There is therefore no evidence of any bias in our off-cluster analysis.
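The two numerical statements in the preceding paragraph can be illustrated with a short Python sketch. It is not the analysis code used for this work. It shows one way to extract a 68% band from a likelihood tabulated on a uniform mass grid under a flat prior, and it estimates how often the largest of 50 null significances would reach 2.2σ. The function names, the highest-density definition of the band, the Gaussian toy likelihood, and the treatment of the null sets as independent, one-sided Gaussian draws are all assumptions made for illustration.

```python
import numpy as np
from math import erfc, sqrt


def highest_density_band(grid, log_like, level=0.68):
    """Return (lower, peak, upper) of the highest-density band.

    Assumes a uniform grid and a flat prior, so the posterior is
    proportional to the likelihood.
    """
    post = np.exp(log_like - np.max(log_like))
    post /= post.sum()                      # normalize on the uniform grid
    order = np.argsort(post)[::-1]          # densest grid points first
    cumulative = np.cumsum(post[order])
    n_keep = np.searchsorted(cumulative, level) + 1
    kept = order[:n_keep]
    return grid[kept.min()], grid[np.argmax(post)], grid[kept.max()]


def prob_any_null_exceeds(n_sigma, n_trials):
    """Chance that at least one of n_trials independent nulls reaches n_sigma.

    Assumes one-sided Gaussian significances (an illustrative choice).
    """
    p_single = 0.5 * erfc(n_sigma / sqrt(2.0))
    return 1.0 - (1.0 - p_single) ** n_trials


# Toy likelihood (hypothetical): Gaussian in mass, in units of 1e14 Msun.
mass_grid = np.linspace(0.0, 10.0, 10001)
toy_log_like = -0.5 * ((mass_grid - 5.0) / 1.0) ** 2
band = highest_density_band(mass_grid, toy_log_like, level=0.6827)
p_worst_null = prob_any_null_exceeds(2.2, 50)
print(band, p_worst_null)
```

Under these assumptions, at least one of 50 null sets reaches 2.2σ roughly half of the time, which is the sense in which the worst null likelihood is unremarkable.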

As described in Section 3.5, the constraints on the lensing mass M₂₀₀ of our cluster sample can be translated into constraints on the ratio between the lensing mass, M_200,lens, and the cluster mass estimated from the tSZ effect, M_200,SZ. The likelihood curve of the ratio M_200,lens/M_200,SZ is calculated per cluster, and the combined constraint (assuming a flat prior on the ratio) is

$$\frac{{M}_{200,\mathrm{lens}}}{{M}_{200,\mathrm{SZ}}}={0.83}_{-0.37}^{+0.38}\qquad (68\%\ \mathrm{C.L.}) \tag{13}$$

The mean mass inferred from CMB cluster lensing is consistent with the mean mass inferred from the tSZ signal at 0.5σ. This constraint and the corresponding off-cluster likelihoods are shown in Figure 3. Using the likelihood ratio test described above, we find that M_200,lens/M_200,SZ = 0 is ruled out at 3.1σ.

**Figure 3.** Same as Figure 2 except the likelihood has been computed as a function of M₂₀₀/M_200,SZ, where M_200,SZ is the cluster mass computed from the measured tSZ signal as described in Bleem et al. (2015).

As pointed out in Section 2.3, depending on the assumed cosmological model, the mean SZ-derived cluster mass can vary by as much as 17%. Our constraint on M_200,lens/M_200,SZ should therefore be viewed in the context of the cosmological model assumed in Bleem et al. (2015), from which our SZ-derived cluster masses are taken. Additionally, intrinsic scatter in the relationship between M_200,SZ and M_200,lens will lead to broadening of the likelihood as a function of M_200,lens/M_200,SZ. However, the expected level of intrinsic scatter between the true cluster mass and M_200,SZ is only ∼15% per cluster (Benson et al. 2013). Given our 3.1σ detection significance across all clusters, the per-cluster constraint on the lensing mass is roughly $\sqrt{513}/3.1\sim 730\%$. Adding the intrinsic scatter in quadrature therefore broadens the per-cluster uncertainty by only $1-730/\sqrt{{730}^{2}+{15}^{2}}\sim 0.02\%$, which is negligible here.
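The arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This sketch assumes, as the estimate implicitly does, that all 513 clusters contribute equally to the 3.1σ detection and that the statistical uncertainty and the intrinsic scatter add in quadrature. The variable names are illustrative.

```python
import math

n_clusters = 513          # number of on-cluster cutouts
detection_sigma = 3.1     # combined detection significance
intrinsic_scatter = 0.15  # ~15% scatter between true mass and M_200,SZ

# Equal weights: the combined S/N scales as sqrt(N), so the fractional
# uncertainty on a single cluster is sqrt(N) / (combined S/N).
per_cluster_frac = math.sqrt(n_clusters) / detection_sigma

# Quadrature sum of statistical and intrinsic scatter; the broadening is
# the fractional change in the total per-cluster uncertainty.
total_frac = math.hypot(per_cluster_frac, intrinsic_scatter)
broadening = 1.0 - per_cluster_frac / total_frac

print(f"per-cluster uncertainty: {100 * per_cluster_frac:.0f}%")
print(f"broadening from intrinsic scatter: {100 * broadening:.3f}%")
```

The per-cluster uncertainty is so much larger than the intrinsic scatter that the quadrature sum is dominated almost entirely by the statistical term.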

6. CONCLUSIONS

We have presented a measurement of CMB cluster lensing using data from the SPT. Our data rule out the null hypothesis (that cluster lensing is not occurring) at 3.1σ and constrain the weighted average cluster mass of our sample to be ${M}_{200}={5.1}_{-2.1}^{+2.5}\times {10}^{14}$ M_⊙ (68% confidence limit). Our cluster mass constraint—obtained by measurement of the CMB cluster lensing effect—is less precise than other cluster mass estimates, but it does offer a confirmation of SZ-derived mass estimates with completely independent sources of systematic errors: ${M}_{200,\mathrm{lens}}/{M}_{200,\mathrm{SZ}}={0.83}_{-0.37}^{+0.38}$ (68% C.L.). Our lensing mass constraint is consistent with M_200,lens/M_200,SZ = 1 at 0.5σ.

We have investigated several potential sources of systematic error and have found that their individual effects are significantly less than the statistical uncertainties of our mass constraints. We find that the most important systematic effects are the bulk velocity kSZ, the kSZ due to a rotating cluster, lensing of foregrounds by the clusters, cluster miscentering and deviation of the cluster density profile from the NFW form in the outskirts of the cluster. These findings are in agreement with other investigations into CMB cluster lensing systematic effects (e.g., Holder & Kosowsky 2004; Lewis & King 2006), although the contaminating effects of foreground lensing appear to be underappreciated in the literature.

All of the five most important systematic effects listed above bias our lensing constraint to lower masses. In our mock analysis, the presence of these five systematic effects results in an average bias of 39% to lower cluster mass. This level of bias amounts to roughly 0.85σ in units of the statistical error bar. We emphasize, though, that there are several uncertainties involved in the calculation of this bias. For one, we have almost certainly overestimated the effects of foreground lensing on our analysis by placing all foreground emission at z = 4. Furthermore, our estimate of the bias caused by cluster miscentering is likely an overestimate as well because of our simplified treatment of this effect. Finally, the amplitude of the rotating-cluster kSZ signal is poorly constrained at present, and its effects on our analysis are therefore somewhat uncertain. Because of the large uncertainties associated with our estimates of systematic effects, we have chosen not to include corrections for these effects in our reported detection significance, and instead compute the detection significance from the statistical error bar alone.

Correcting for the measured 39% bias to lower cluster mass would cause the likelihood to prefer higher cluster mass and would therefore yield a higher detection significance as well as a higher M_200,lens/M_200,SZ. If the same bias is assumed for each cluster, a 39% shift to higher cluster mass would cause the best fit M_200,lens/M_200,SZ to increase to roughly 1.15, still consistent with M_200,lens/M_200,SZ = 1 to within the error bars. There is therefore no evidence from this analysis of tension with the SZ-derived cluster masses, even accounting for potentially large systematic biases.
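As a worked check of the numbers in the preceding paragraph, the following Python sketch applies a uniform 39% upward shift to the best-fit ratio. It follows the convention of the text, in which the correction multiplies the ratio by 1.39, and it assumes that the 68% error bars rescale by the same factor. Both the rescaling of the error bars and the variable names are assumptions of this sketch.

```python
best_fit = 0.83   # best-fit M_200,lens / M_200,SZ
err_lower = 0.37  # lower 68% error bar
bias = 0.39       # combined systematic bias to lower mass (Section 4.3.7)

# Shift the best fit to higher mass by the same factor for every cluster.
scale = 1.0 + bias
corrected = best_fit * scale

# Distance of the corrected best fit from unity, in units of the
# rescaled lower error bar (assumption: errors scale with the ratio).
tension_sigma = (corrected - 1.0) / (err_lower * scale)

print(f"corrected ratio: {corrected:.2f}")
print(f"offset from unity: {tension_sigma:.2f} sigma")
```

The corrected ratio sits above unity by well under one error bar, which is why the correction does not introduce tension with the SZ-derived masses.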

Additionally, as discussed in Section 5, systematic uncertainty on M_200,SZ may affect our constraint on M_200,lens/M_200,SZ. In particular, the SZ-derived masses used in this work could potentially be overestimated by as much as 8% or underestimated by as much as 17%, depending on the assumed cosmological parameters. Our constraint on M_200,lens/M_200,SZ is derived assuming the same cosmological parameters used in Bleem et al. (2015) and should be considered in that context. Even if the maximal bias is assumed for the SZ-derived cluster masses, our analysis does not yield tension with M_200,lens/M_200,SZ = 1 at greater than 1σ. This statement remains true even if the lensing-derived masses are increased by 39% to account for the systematic biases discussed above.
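The statement that no combination of these shifts produces more than 1σ of tension can be explored with a small Python sketch. It rests on several assumptions that are not taken from the analysis: an SZ mass overestimated by 8% is modeled as multiplying the ratio by 1.08, an SZ mass underestimated by 17% as multiplying it by 0.83, the lensing correction as multiplying it by 1.39 (the combined bias of Section 4.3.7), and the 68% error bars as rescaling with the ratio. The helper name `tension_with_unity` is hypothetical.

```python
RATIO, ERR_LOWER, ERR_UPPER = 0.83, 0.37, 0.38  # Equation (13)


def tension_with_unity(scale):
    """Offset of the rescaled ratio from 1, in units of the rescaled error."""
    ratio = RATIO * scale
    if ratio >= 1.0:
        return (ratio - 1.0) / (ERR_LOWER * scale)
    return (1.0 - ratio) / (ERR_UPPER * scale)


# Multiplicative shifts applied to M_200,lens / M_200,SZ (assumed forms).
SZ_SCALE = {"SZ masses 8% too high": 1.08, "SZ masses 17% too low": 0.83}
LENS_SCALE = {"no lensing correction": 1.00, "lensing masses +39%": 1.39}

tensions = {
    (sz_label, lens_label): tension_with_unity(sz * lens)
    for sz_label, sz in SZ_SCALE.items()
    for lens_label, lens in LENS_SCALE.items()
}

for labels, sigma in tensions.items():
    print(labels, f"{sigma:.2f} sigma")
```

Under these assumptions all four combinations remain below 1σ, with the case of underestimated SZ masses and no lensing correction lying closest to that threshold.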

Upcoming data sets offer the exciting possibility of significantly improved measurements of CMB cluster lensing. The measurement presented here using data from the SPT-SZ survey is noise limited: the lensing signal is at the few μK level and appears on scales of a few arcminutes, while the noise in the tSZ-free linear combination is roughly 55 μK arcmin. Ongoing experiments such as SPTpol (Austermann et al. 2012) and ACTPol (Naess et al. 2014), and future experiments such as SPT-3G (Benson et al. 2014), Advanced ACTPol (Calabrese et al. 2014), the Simons Array (Arnold et al. 2014), and so-called Stage IV CMB experiments (e.g., Abazajian et al. 2015) will have significantly lower noise levels than the SPT-SZ survey, allowing them to obtain significantly stronger detections of the CMB cluster lensing signal. Furthermore, these experiments will include additional information about lensing in the form of polarization data. In the primordial CMB, the odd-parity (B-mode) component of the CMB polarization field is expected to be uncorrelated with both the temperature field and the even-parity (E-mode) component of the polarization field. Consequently, lensing-induced correlations between B modes and either temperature modes or E modes can be used as a relatively clean probe of CMB lensing (e.g., Hu & Okamoto 2002). Polarization also offers another handle on eliminating contamination from the SZ effect. The polarized SZ effect (both thermal and kinematic) from clusters is expected to be significantly smaller (i.e., less than 10–100 nK, Carlstrom et al. 2002) than the unpolarized effect, so polarization observations should offer a less-contaminated window into CMB cluster lensing (e.g., Holder & Kosowsky 2004).

With higher sensitivity data than that employed here, CMB cluster lensing has the potential to provide powerful constraints on cluster masses. In principle, these mass constraints can be used to improve cluster mass-observable relationships that are essential for using clusters as cosmological probes. However, our analysis of potential contaminating effects in Section 4.2 suggests that there is still much work to be done in reducing systematic errors associated with measurements of CMB cluster lensing. Particularly important are contamination from the kSZ effect, lensing of foregrounds, and departure from the NFW profile at large radii. In principle, both the kSZ and lensing of the foregrounds can be modeled and incorporated into the analysis to eliminate any bias that these effects introduce. However, uncertainty on the amplitude of the kSZ and uncertainty on the foreground redshift distribution limit our ability to accurately perform this modeling at present.

The South Pole Telescope is supported by the National Science Foundation through grant PLR-1248097. Partial support is also provided by the NSF Physics Frontier Center grant PHY-1125897 to the Kavli Institute for Cosmological Physics at the University of Chicago, the Kavli Foundation and the Gordon and Betty Moore Foundation grant GBMF 947. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. We acknowledge the use of the Legacy Archive for Microwave Background Data Analysis (LAMBDA). Support for LAMBDA is provided by the NASA Office of Space Science. This work was supported in part by the Kavli Institute for Cosmological Physics at the University of Chicago through grant NSF PHY-1125897 and an endowment from the Kavli Foundation and its founder Fred Kavli. The McGill group acknowledges funding from the Natural Sciences and Engineering Research Council of Canada, Canada Research Chairs program, and the Canadian Institute for Advanced Research. M. Dobbs acknowledges support from an Alfred P. Sloan Research Fellowship. S. Dodelson is supported by the U.S. Department of Energy, including grant DE-FG02-95ER40896. T. de Haan is supported by a Miller Research Fellowship. Cluster studies at SAO are supported by NSF grant AST-1009649.
