Cluster profiles from beyond-the-QE CMB lensing mass maps

Clusters of galaxies, being the largest collapsed structures in the universe, offer valuable insights into the nature of cosmic evolution. Precise calibration of the mass of clusters can be obtained by extracting their gravitational lensing signal on the Cosmic Microwave Background (CMB) fluctuations. We extend and test here the performance achieved on cluster scales by the parameter-free, maximum a posteriori (MAP) CMB lensing reconstruction method, which has been shown to be optimal in the broader context of CMB lensing mass map and power spectrum estimation. In the context of cluster lensing, the lensing signal of other large-scale structures acts as an additional source of noise. We show here that by delensing the CMB fluctuations around each and every cluster, this noise variance is reduced according to expectations. We also demonstrate that the well-known bias in the temperature quadratic estimator in this regime, sourced by the strong non-Gaussianity of the signal, is almost entirely mitigated without any scale cuts. Being statistically speaking an optimal and blind lensing mass map reconstruction, the MAP estimator is a promising tool for the calibration of the masses of clusters.


I. INTRODUCTION
Galaxy clusters are the largest gravitationally bound structures in the Universe.Their abundance as a function of redshift and mass is a direct probe of the growth of structures, and thus provides constraints on the matter density Ω m , on the amplitude of matter fluctuations σ 8 , as well as on the dark energy equation of state and the sum of neutrino masses [1][2][3][4][5][6][7][8][9][10][11].
The effect of gravitational lensing provides a direct observable of the total mass, free from assumptions on the dynamical state of the gas.Gravitational lensing of background galaxies offers precise calibration of the scaling relation [26][27][28][29][30][31][32][33], but is subject to systematics such as the intrinsic alignment and redshift uncertainties of the sources [34], and is limited by the absence of background galaxies for clusters at high redshift.
The gravitational lensing of the CMB, on which we focus here, is free from the systematics of galaxy weak lensing, and most useful for high redshift clusters.The CMB acts as an extended source at a redshift of 1100, with well understood statistics [35].Note that CMB is not free from systematics, mainly due to astrophysical contaminations, such as the own SZ emission of the cluster [36][37][38].However, at the small scales relevant for CMB lensing, polarized foreground emission is expected to be negligible.
The standard tool to reconstruct the lensing deflection field from the CMB is the quadratic estimator (QE) [39][40][41][42], which combines pairs of CMB maps to estimate the statistical anisotropies created by lensing.It has been shown that the standard temperature QE underestimates the strong deflection field of massive clusters due to a bias in the estimated gradient (unlensed) CMB field [43].Filtering out the small scales on the gradient leg of the QE greatly reduces this bias [44], at the expense of losing some moderate amount of signal to noise.
For current and upcoming CMB surveys, the QE reconstructed cluster lensing field is noise dominated.One possibility is to stack the reconstructed lensing map and measure the average cluster mass [37,[45][46][47].Similarly, one can obtain the mean cluster mass with matched-filtering [48][49][50][51].
Other estimators have been developed to reconstruct the lensing field around galaxy clusters, such as stacking CMB maps along the gradient direction to extract the lensing dipole [52,53], the gradient inversion estimator [54,55], or the maximum likelihood estimator (MLE) [56][57][58].The MLE directly fits the few parameters of a lensing template on observed CMB maps, and thus estimates the cluster mass without reconstructing the actual lensing map signal.This estimator is adequate for stacking analyses, provided the template shape accurately describes the mass profile and the mass distribution around the cluster, including potential sources of contamination such as the SZ effect.Recently, machine learning also provided a tool to estimate the cluster mass from lensed CMB maps, using neural networks trained on simulations [59][60][61].
First introduced in [62,63], likelihood-based CMB lensing mass map estimators aim to optimally reconstruct the large scale deflection field.Refs.[64,65] introduced similar technology in the context of cluster lensing.Their approach is built on assuming a cluster convergence profile, and iteratively delensing the observed maps to obtain the cluster mass.The lensing template at each iteration is estimated by stacking a set of reconstructed cluster convergence profiles.Some approximations are made for a tractable implementation, due to issues sourced by the beam of the instrument.
Recent years saw further development of the CMB lensing likelihood-based estimator of Ref. [62,63] (often called now maximum a posteriori estimator (MAP)), and other Bayesian attempts, [66][67][68][69][70][71][72][73].It has been demonstrated that the MAP estimator outperforms the QE, in particular for deep polarization surveys such as CMB-S4 [25], where most of the observed Bmodes are created by lensing.Indeed, while the QE is limited by the cosmic variance of the lensed B-modes, the likelihood estimator is able to delens parts of these B-modes (provided the noise is below the lensing level of ∼ 5µK arcmin), and thus decreases the variance.The MAP lensing mass map estimator essentially achieves what is expected to be the lowest possible reconstruction noise.
The scope of the current paper is to test the performance of the MAP estimator (as implemented in Ref. [66]) in the vicinity of galaxy clusters.Contrary to the likelihood approximations of Refs.[64,65], or to the MLE of [56][57][58], our MAP estimator does not assume a cluster profile, or the presence of a cluster signal at all, and is thus agnostic in the true deflection field.We take into account the large scale structures (LSS) deflection field together with the one from the cluster specifically.LSS lensing increases the reconstruction noise substantially a low noise levels, making the analysis more realistic.
Our paper is organised as follow.In Section II we briefly review the lensing of the CMB by a dark matter halo.In Section III we present the QE and the MAP lensing reconstruction, as well as our matched-filtering to estimate the cluster mass.In Section IV we compare the performance of both QE and MAP estimators applied to CMB simulations, and we conclude in Section V.
Our results, including the CMB simulations and the lensing reconstructions, were obtained with the publicly available code LensIt1 .

A. NFW profile
We use the Navarro-Frenk-White profile [NFW,74] to model the cluster mass distribution in our analysis.In this model, the density of the galaxy cluster is where ρ 0 and r s are the characteristic cluster density and scale radius respectively.We use M 200 to characterize the mass of the cluster.M 200 is the mass enclosed in a sphere of radius R 200 , within which the average density of the cluster is 200 times the critical density of the Universe ρ crit at the cluster redshift z.We truncate the NFW profile at a radius R trunc = 3 × R 200 to obtain realistic mass profiles.One can write the following relation between ρ 0 and ρ crit , where c 200 = R 200 /r s is the concentration parameter.It has the following empirical dependence on the cluster mass and redshift [46,75] c 200 (M 200 , z) = 5.71 (1+z (2.3) Hence, the NFW density profile is quantifiable with these two parameters, the mass2 M 200 and redshift z.

B. Lensing by NFW profile
The gravitational field created by the mass distribution in the late-time Universe causes deflection of CMB photons.This phenomenon is referred to as weak lensing of the CMB [35].As a result of weak lensing, the original CMB signal is remapped on the sky.We denote the quantities for the unlensed and lensed CMB as X(n) and X(n), respectively.Here, X can represent temperature (T ) or polarization Stokes parameters (Q and U ). Assuming the flat-sky approximation, the remapping of the unlensed quantity can be expressed as where α(n) is the deflection field, which can be expressed as the gradient of the lensing potential: α(n) = ∇ nϕ(n), neglecting the LSS lensing rotational component (second order in the scalar perturbations).The convergence κ(n) is the most relevant quantity we work with (2.5) Since we assume a spherically symmetric profile for the cluster (see Eq. 2.1), the cluster convergence is circularly symmetric, and depends only on the radial distance from the cluster center, denoted as r = |r| where Σ cl (r) is the projected surface density of the cluster along the line of sight and Σ crit (z) represents the critical surface density of the universe at the cluster redshift where c is the speed of light, G is the gravitational constant, and d A,CMB , d A,cl , and d A,cl-CMB represent the angular diameter distances to the CMB, the cluster, and between the cluster and CMB, respectively.In the small angle approximation we relate the angular separation θ and the radial distance from the cluster center with θ = r d A,cl .The expression for the cluster convergence is then where x = r/r s = θ/θ s and g(x) is a circularly symmetric function given in Appendix A. We rewrite the cluster convergence as a product of a normalization κ 0 and a template function κ t , which depends on the chosen density profile of the dark-matter halo (and hence on its mass and redshift), so that κ cl (x) = κ 0 κ t (x).The normalization κ 0 is chosen such that κ t (x = 1) = 1.Given the definition of κ 0 , it has direct proportionality relation with the mass of the cluster M 200 [50], Hence, κ 0 works as tracer of the cluster mass and constraining the signal to noise ratio (SNR) for M 200 boils down to constraining the same quantity for κ 0 , σ κ0 κ 0 = σ M200 M 200 .

III. CLUSTER MASS ESTIMATORS
We describe the MAP reconstruction methodology in III A. We follow Ref. [66] very closely, but had to adapt some details to make the algorithm work reliably at high resolution and high multipoles in the presence of clusters.In III B we describe the simple matched filter we use to recover the mass of the cluster from the MAP lensing maps.

A. Lensing Reconstruction
We work in the flat sky approximation, and identify multipoles to the plane wave vectors ℓ = (ℓ x , ℓ y ).As standard in literature, we use ℓ for the CMB multipoles and L for the lensing multipoles.
Let X be the unlensed CMB temperature or polarisation field, X ∈ {T, Q, U }, expressed in Fourier space.The covariance of these unlensed fields are the primordial power spectra The observed CMB field in pixel space can be expressed as where D is the operator that maps the unlensed CMB modes to the lensed CMB in real space (and thus contains the Fourier transform and lensing remapping operations).The operator B is the linear response matrix of the instrument, including beam and pixel window functions, and n is an independent noise, expressed in pixel space.When considering the temperature estimator, we have X dat = T , polarization only estimators have X dat = (Q, U ), while the minimum variance estimators have For a fixed deflection field the covariance of the observed CMB fields is where N is the noise covariance matrix in pixel space.While if averaging on the deflection fields as well we can write this covariance as where C len is the covariance of the lensed CMB fields, given by the fiducial lensed CMB power spectra, and Y is the Fourier transform operator.
In pixel space, the unnormalized quadratic estimator of the lensing deflection field can be expressed as the product of two filtered CMB maps where the inverse variance filtered and Wiener filtered legs are The normalisation of the QE is chosen to obtain an unbiased estimator, and is expressed in Fourier space as the isotropic response function R QE (L) [40].This gives the normalised QE convergence field in Fourier space, The maximum a posteriori estimator finds the deflection field that maximizes the likelihood of the lensed CMB fields, assuming a Gaussian prior on the lensing potential, with power spectrum C κκ,fid where the lensed CMB likelihood is assumed to be Gaussian and given by  11).Green, orange and pink show temperature-only, polarization-only and combined reconstruction respectively.Dashed shows the quadratic estimator case and solid the CMB lensing maximum a posteriori method of this work.Dotted shows the QE with a high-ℓ cut on the gradient leg [44].Shown is the case of a 2 × 10 14 M ⊙ cluster at z = 0.7, for the CMB-S4-like configuration given in the text.While the MAP approach is exactly the same than for optimal CMB lensing power spectrum reconstruction, this is probing much smaller scales (the spectrum SNR is almost entirely confined to L < 1500 for this configuration).respect to the α) as search direction.The main term of the gradient can be obtained exactly by running a QE with modified weights on partially delensed CMB maps [see 66, for more details].This is the gradient of the first term on the right-hand side in Eq. 3.8, quadratic in the data for fixed α, whose role is to capture the residual lensing signal.The gradient of the second term on the right-hand side, independent of the data for fixed α, is the 'mean-field', that removes from the quadratic piece signature of anisotropies unrelated to lensing.In our analysis, there is no masking and the noise is statistically isotropic, making the traditional main sources of meanfield vanish.In principle there is a noise contribution during the iterative procedure: the delensed data noise during the iterative process is not isotropic anymore, because at each step the data is delensed to reduce the CMB statistical anisotropy.This compresses or dilates regions of the sky according to the local estimate of the convergence field, changing the local effective noise levels.This contribution is small and localized at large lensing multipoles, which contributes little to the signal.Hence we neglect the mean field altogether in this paper.
Unfortunately, the normalization of this estimator is not tractable analytically.We follow [70,72] and obtain an empirical normalization of our MAP estimator from a set of simulations.We generate 1000 CMB flat sky patches, and lens them with random realizations of the large scale structures deflection field.Note that in these simulations we do not include any cluster signal.The empirical normalization is assumed to be isotropic, and is obtained from the cross correlation between the reconstructed κMAP and the input convergence κ in , averaged over our set of simulations As shown in [70,72], this empirical normalization is not sensitive to the cosmology, input lensing or data noise, allowing for a correct normalization also when the fiducial ingredients entering the reconstruction process are a poor match to those of the data.
In the posterior maximization process, delensing of the CMB occurs via the operator D † .Our first implementations of the MAP solver in Ref. [66] made the additional assumption of the invertibility of the deflection field, and performed remappings of the maps using a standard bicubic spline algorithm.To improve stability and performance, we updated our flat-sky tools with the lensing method of Ref. [73], based on non-uniform FFTs techniques [76,77].This method is extremely accurate and at the same time removes this unnecessary assumption of an invertible deflection.With this, we found that the search for the MAP point converges without issues on all relevant scales, after only 10-20 iterations.

B. Cluster Signal Reconstruction
Once we have reconstructed the lensing map κ(L) from the simulated CMB data, we fit our theoretical template κ 0 κ t (L).In practice the template would be obtained with a first guess of the cluster angular scale θ s .This can be estimated for instance from the tSZ emission of the cluster as in [50].Here we take the template which corresponds to the angular scale of the simulated clusters.Our estimator assumes an isotropic Gaussian noise spectrum N (L), such that ⟨κ(L)κ(L ′ )⟩ − ⟨κ(L)⟩⟨κ(L ′ )⟩ = δ(L − L ′ )N (L).This allows us to construct a minimum variance estimator for κ 0 , given by Owing to spherical symmetry, all quantities but κ(L) in the aforementioned expression are functions of L = |L|.The theoretical error σ 2,th κ0 for the estimator is as follows: for which we often use the curved-sky expression In these equations, the noise of the κ(L) estimate is given by where C κκ L is the power spectrum of the background convergence profile of the large-scale structure exclusive of the cluster, which contributes to the Gaussian noise on the cluster lensing signal.In the case that κ is the QE estimate, N (0) L is the leading CMB-lensing reconstruction noise (the disconnected four-point function), and N L the connected part originating from the secondary trispectrum contractions, proportional to C κκ L [35,78].In the MAP case, the same Eq.(3.13) provides a good fit to the spectrum, with both noise terms calculated with partially delensed spectra, following Ref.[70].On hypothetical maps where the cluster lensing signal were the only lensing-like signal, only N (0) L would be present.On small scales the non diagonal terms in the covariance N (L, L ′ ) of κ(L) could become important and reduce the constraining power, as discussed in [54].We checked from our set of simulations (described in the next section) that for all the estimators we considered, and up to L = 5000, the covariance N (L, L ′ ) of κ(L) is very close to diagonal, so it is close to optimal to use N (L) for our matched filtering.
In Fig. 1, we show the contribution of each lensing multipole to the theoretical signal to noise of Eq. 3.11.These curves are obtained with the theoretical N (0) L and N (1) L of each estimator, considering a CMB-S4 like experiment with white noise level of 1µK-arcmin in temperature and √ 2µKarcmin in polarization and a beam full width at half-maximum (FWHM) of 1 arcmin.We considered a cluster of mass of M 200 = 2 × 10 14 M ⊙ , at z = 0.7, which matches roughly the expected mean mass and redshift of the CMB-S4 clusters detected with thermal SZ effect [25].
We included there multipoles ℓ CMB min = 100 to ℓ CMB max = 4000 of the CMB temperature and polarization maps for the lensing reconstruction.We compare both the QE and the MAP estimators.We also show the results for a modified QE with a scale cut on the gradient leg of temperature map at ℓ cut = 2000, noted as Hu-Dedeo-Vale QE (HDV QE) [44] , which is introduced in more details below.
We see that for both QE and MAP, the lensing scales that dominates the signal to noise are around L ∼ 2000 in the temperature and around L ∼ 1100 in the polarization.The HDV estimator mainly loose information on the large scales, for L ≲ 2000, while smaller scales contain similar information as compared to the QE.This shows that for this CMB-S4 configuration and cluster convergence profile, temperature and polarization channels are bringing similar level of information, albeit from different scales.Finally we clearly see that the MAP estimator outperforms the QE at all lensing scales, and this improvement is mainly due to increased signal from the polarization channel.

IV. RESULTS
In this section, we present results on the cluster convergence profile and on the cluster mass from reconstructions of the lensing signal from a set of simulations.Our simulations reproduce a CMB S4-like experiment [25], with white noise level of 1µK-arcmin in temperature and √ 2µK-arcmin  [44] (green, HDV QE), and the maximum a posteriori (MAP) lensing reconstruction of this work (red, without any cuts), and the input truncated to the same L range.The bias in the MAP reconstruction is much reduced compared to the QE but still visible at this cluster mass.In polarization reconstruction no such bias is visible.Lower panel: The same profiles in harmonic space, weighted at each multipole by their contribution to the cluster mass estimate (see Eq. (3.10)).An estimate of the map-level reconstruction noise was subtracted in each realization, in order to accelerate convergence, so that the scatter is not representative of an actual data analysis.Analytic predictions are given for the QE and HDV QE cases as the dotted and dashed black lines.The MAP case is slightly tilted in this configuration compared to the input profile (solid black). in polarization.We assume a beam with a full width at halfmaximum (FWHM) of 1 arcmin.Our flat sky patches have a pixel size of 0.3 arcmin with 1024 pixels on a side.We simulate the lensing by both the large scale structures and the dark matter halo, assuming (slightly wrongly) they are uncorrelated.The dark matter haloes are in the center of the patches.
We test both the quadratic estimator and the MAP estimator, as introduced in Section III A, to obtain the estimated  [44], including CMB multipoles 100 ≤ ℓ ≤ 5000 and lensing multipoles 100 ≤ L ≤ 5000 (here all clusters are set at z = 0.7).The points show the corresponding results for our simulated reconstructions with M = 4, 7 and 10 × 10 14 M ⊙ , together with the result of our MAP method (red).The error bar is our empirical estimate for a sample of a thousand cluster.The bias is a strong function of halo mass and can be very significant for massive haloes.
κ(L).We then estimate the mass of the cluster with the matched filtering, as described in Section III B. We compare our results for the temperature only (T), the polarization only (QU) and the minimum variance (TQU) estimators.

A. Bias in Temperature QE
As discussed in the literature [43,44] the temperature QE is biased low due to strong to moderate lensing close to the cluster center.As lensing by the cluster magnifies the background CMB, the gradient of the observed CMB is smaller than the unlensed CMB gradient.Since the weak lensing signature, i.e. the dipole-like structure in the lensed-unlensed sky, is sensitive to both the strength of the gradient and the mass of the cluster, it decreases that signature as well.As a consequence, when we estimate cluster signal using weak lensing temperature QE, the estimator is biased low.
The temperature quadratic estimator is nothing but a product of two filtered maps as given in Eq. 3.4.The bias comes mainly from the small scales in the gradient leg.A solution to this bias, as demonstrated in [44], is to use a low-pass filter for the gradient leg and only include multipoles below ℓ = 2000.We denote this modified QE with a scale cut, as Hu-Dedeo-Vale QE (HDV QE) from now on.
Since the bias is more prominent for massive clusters and small CMB scales, we choose here the clusters parameters to be M 200 = 4 × 10 14 M ⊙ , z = 0.7 and use CMB multipoles from ℓ CMB min = 100 to ℓ CMB max = 5000 for the reconstruction.We run all the estimators (QE, HDV QE and MAP) with the temperature only channel, on 1000 simulations, and stack the reconstructed maps.To assess the noise in each reconstruction, we performed additional runs of the estimators on the same simulation, with the same CMB and LSS realization, but without the cluster present.To reduce the variance during the stacking process, for each reconstructed map we first subtract the convergence estimated from the simulation without the cluster, before stacking them.
In Figure 2, we provide the comparison for the real-space deflection angle profile (α(θ), upper panel) and the harmonic space convergence profile (κ(L), lower panel).The deflection α, which is the gradient of the lensing potential ϕ is a vector quantity, but only its radial component is non-zero on average, since our clusters are circularly symmetric.Hence we only stacked the radial component of α.
The lower panel shows which is the unnormalized contribution of each and every multipole to the mass estimate (empirically, errors are independent to a good approximation).As expected the QE is biased low for such massive clusters, whereas the HDV QE [44] does almost unbiased reconstruction of the cluster signal on all scales, at the cost of a 20% increase in the standard deviation of the mass estimation.Our iterative estimator in other hand, does get rid of most of the bias, without wasting any information in the small scale regime.For this high cluster mass, our recovered profile still shows some discrepancy to the input, with recovered signal lower than expected on large lensing scales and higher on smaller scales.On the lower panel, we also plot as the dotted and dashed analytic prediction for the QE bias.We do this essentially by brute-force calculation of the QE and HDV QE expectation values, varying CMB and LSS lensing fluctuations, while keeping the cluster lensing deflection fixed, as described in some more details in appendix B. The predictions match well, but not exactly, with the empirical findings.We could not identify at this moment the origin of the (small) discrepancies.One difference is that the analytical predictions are calculated using curved-sky geometry, while the simulations use small boxes in the flat-sky approximation.Fig. 3 shows the prediction with this tool for the halo mass bias as a function of halo mass, together with our findings on this simulation set.We see that the predicted mass bias for the HDV QE is quite accurate compared to the simulations, while there seems to be a small offset for the QE.It is interesting to note that the MAP bias is very close to the HDV QE bias, while having lower variance.
For comparison, we show the quantity in Eq. 4.1 for the polarization only (QU) estimators in Fig. 4. We see that the bias in the QE, due to the magnification of the background gradient, is almost negligible, contrary to the temperature estimator.We argue that this might come from the fact that the polarization QE is dominated by the EB combination, which is mostly sensitive to the shear and not the magnification.In practice, we also tested that for the EE only QE the bias is smaller than for the TT estimator.
We now discuss how the iterative estimator also suppresses the noise of the mass estimation compared to QE and HDV QE.

B. Cluster mass constraints
We now consider 1000 simulations with a cluster mass of M 200 = 2×10 14 M ⊙ and z = 0.7, closer to the expected mean values of the CMB-S4 clusters.We reconstruct the lensing signal with scales between ℓ CMB min = 100 and ℓ CMB max = 4000, for both polarization and temperature maps.We reconstruct the lensing convergence fields with the three estimators: QE, HDV QE and MAP, and with the temperature only (T), polarization only (QU) and minimum variance (TQU) estimators.We then run a matched filter on these reconstructed lensing map to estimate the convergence amplitude κ0 , assuming a template κ t with the characteristic angular size θ s corresponding to the input mass and redshift of the cluster profiles.We use the lensing multipoles between L min = 100 and L max = 6000 to obtain κ0 from Eq. 3.10.We employ the empirical noise, obtained from 1000 simulations, as the variance N (L) of κL in that equation.
We compute the variance of κ0 from our set of 1000 simulations, and compare to the theoretical expectations of the variance from Eq. 3.11.Table I summarizes our results.In Fig. 5 we show the theoretical relative error on the mass for a set of 1000 clusters, as a function of white noise level, for our set of estimators.We also plot the standard deviation we recovered from our simulation set (points with error bars).We employ the bootstrap method, by resampling our simulation set, to estimate the error on this error.We see that at the CMB-S4 noise level, the performance of the estimators on simulations are within the expectations.Moreover, we see that the MAP estimator allows to recover the loss in constraining power from the HDV QE estimator.
It is clear that as we go to lower noise level, the polarization estimators eventually dominates the constraints.In this  11) with a cut at 2000 in the gradient leg [44] to remove the QE bias (see Fig. 2), dashed lines the QE case without any cuts, and solid the maximum a posterior lensing map reconstruction method of this work (MAP).The improvement in constraining power comes from the partial removal from the CMB maps of the lensing signal not directly related to the cluster, and is more prominent in polarization, as expected.Markers with error bars are estimated from our set of simulations, with the x coordinate slightly shifted for clarity.The error bars are obtained from the bootstrapping method (for 1000 clusters), showing good consistency.In the QE case, we applied a simple multiplicative correction to remove the bias in the estimate.
regime, the lensing signal unrelated to the cluster plays an increasingly important role in the error budget.Care must be taken comparing these numbers to the literature, since many works (for example MLE's [58], using Gaussian CMB's generated with lensed CMB spectra instead of a non-Gaussian CMB) do not consider this source of noise.At the CMB S4 noise level, the MAP estimator detects the cluster signal with an approximate 12%, 13% and 20% enhancement in significance compared to HDV QE with TQU, QU and T channels respectively.While our results are presented based on 1000 simulations, they can be easily extrapolated to the projected number of clusters in CMB S4, esti-TABLE I: Summary of results on 1000 simulations with input κ 0 = 0.1285, corresponding to a cluster mass of M 200 = 2 × 10 14 M ⊙ at z = 0.7.The table displays the mean κ0 evaluated from 1000 simulations along with the corresponding errors.For the T and TQU estimators, HDV QE results are provided, while the QE result is presented for the QU estimator.Additionally, MAP estimated values are shown in the second row.σ th κ0 depicts the theoretical error prediction given by Eq. 3.11.See also Fig. 5 HDV 0.0062 mated to be 10 5 .By employing the MAP estimator for 10 5 clusters, the mass can be constrained, achieving an accuracy down to 0.47% for our ideal scenario where no mis-centering or foreground effects are present.

V. CONCLUSIONS
In this work we demonstrated the use of the maximum a posteriori CMB lensing mass map estimator to reconstruct the mass profile of galaxy clusters.We showed that the constraints on the cluster masses reach the forecasted ones on simple simulations.The MAP estimator improves the constraints on the cluster mass by 12% compared to the HDV QE [44] with joint temperature and polarization reconstruction, and by 20% from polarization only.This is due to the absence of scale cuts in temperature, and iterative reduction of the CMB fluctuations sourced by cluster and LSS lensing (in polarization predominantly).
In temperature, the MAP estimator suffers from a much smaller bias than the well-known QE bias, that arises from the misestimate of the background gradient CMB modes.This may be interpreted by the fact that the MAP iteratively estimates the delensed CMB modes, using higher order correlation functions of the CMB maps, to get more accurately the unlensed gradient.This permits usage of all CMB scales in the reconstruction, without the necessity of scale cuts as in [44] (provided, of course, that foregrounds or other contaminating signals are under control on the relevant scales).
The MAP reconstruction used here is the exact same that is being developed for generic CMB lensing reconstruction, on wide areas of the sky.It makes no assumption on what sources the deflections, and a reconstruction on some area of the sky will reconstruct at once all cluster signals in the area in a 'optimal', statistically speaking, manner.On the practical side, the curved sky MAP implementation received recently con-siderable increase in speed and accuracy [73], making full-sky analyses perfectly doable in very reasonable time, and converging just as well on realistic, N-body simulations inputs.Results will be presented elsewhere.
In this first work, we used for simplicity flat-sky simulations, each time with one cluster in a small patch.Our simulations went a step further than some other analyses by including the lensing from the background large scale structures together with the cluster.This increases the variance of the QE, and somewhat less for the MAP.We assumed there the background LSS lensing independent from the cluster.In reality, clusters are located at the nodes of the cosmic web, and are highly correlated with overdensity regions.Dark matter N-body simulations such as the Websky [79] and DEMNUni [80] can be used to quantify this.We also neglected many practical issues, such as the impact of mis-centering, which can degrade the constraints on the mass by about ∼ 10% [23,50].To this and other issues independent from the mass reconstruction process can be applied the same techniques developed for the QEs.
The normalization of the MAP lensing mass map (which is analogous to a Wiener-filter) requires careful handling.Indeed it cannot be obtained accurately analytically and we relied on a set of (cluster-free, with Gaussian input lensing fields) fiducial simulations to obtain it.We showed in [70] that this normalization is independent of the true cosmology, input lensing and data noise statistics of the CMB, hence we expect this procedure to be sufficiently robust in practice.
Much work remains to be done.We did not consider contamination from foregrounds signal, such as the SZ effect, radio point sources or the Cosmic Infrared Background.The thermal SZ effect has a frequency dependence which allows to remove its contribution from the observed CMB maps, at the expense of increased variance.It is possible to reduce the loss of signal in the lensing reconstruction by tSZ cleaning only the gradient leg of the QE [36][37][38].The kinematic SZ however cannot be distinguished from the lensing signal.Efforts have been undertaken to enhance the robustness of MLE-based estimators against tSZ contamination [53,81].However, it is expected that the polarization foregrounds are small on the relevant scales, so our MAP estimator forecast in polarization should in principle be robust.
Here, 1 ĝ(n) is the unnormalized deflection vector (spin-1) field estimate.We want to evaluate this quantity which may be used to predict the result of stacking QEs from CMB maps on identical clusters.
Let n′ be the undeflected position that corresponds to the observed location n.Working at fixed n, by Fourier transforming the T -maps, we may write this as a cross-spectrum, with F lm = d 2 n 1 F (n, n1 )Y * lm (n ′ 1 ).

(B5)
Here F and G have structure similar to that of a spin-weighted correlation function (B6) For an arbitrary deflection field, this is a tough calculation, requiring several spherical harmonics transforms on non-regular grid per each point of interest n In the case of cluster lensing things simplify quite a bit owing to • spherical symmetry, ⟨g(n)⟩ fixed deflection will depend on the co-latitude θ only, and the deflection field does not change the longitude coordinates, • and the fact that clusters are small.With the coordinates such that the cluster lies at the pole, only a small number of m's will be necessary.The circle at latitude θ has length sin θ, hence there is an effective m max ∼ l max sin θ ∼ 5 for θ ∼ 10 ′ and an analysis with l max = 5000.Since it is spin-1, ⟨ 1 g(θ)⟩ will be heavily dominated by the m = 1 component near the pole.
We use this to perform the calculation in Eq. (B3) by brute force, using the efficient general spherical harmonic transforms of [73].

FIG. 1 :
FIG.1: Contribution per lensing multipole to the cluster mass SNR (the integrand of Eq. 3.11).Green, orange and pink show temperature-only, polarization-only and combined reconstruction respectively.Dashed shows the quadratic estimator case and solid the CMB lensing maximum a posteriori method of this work.Dotted shows the QE with a high-ℓ cut on the gradient leg[44].Shown is the case of a 2 × 10 14 M ⊙ cluster at z = 0.7, for the CMB-S4-like configuration given in the text.While the MAP approach is exactly the same than for optimal CMB lensing power spectrum reconstruction, this is probing much smaller scales (the spectrum SNR is almost entirely confined to L < 1500 for this configuration).

FIG. 2 :
FIG.2: Top panel: Reconstructed deflection angle profiles α(θ) from stacking a set of 1000 temperature-only reconstructions (M = 4 × 10 14 M ⊙ , z = 0.7).CMB multipoles 100 ≤ ℓ ≤ 5000 are used to reconstruct the lensing multipoles 100 ≤ L ≤ 5000.Shown are the standard QE (blue), QE with a high-ℓ cut at 2000 on the gradient leg[44] (green, HDV QE), and the maximum a posteriori (MAP) lensing reconstruction of this work (red, without any cuts), and the input truncated to the same L range.The bias in the MAP reconstruction is much reduced compared to the QE but still visible at this cluster mass.In polarization reconstruction no such bias is visible.Lower panel: The same profiles in harmonic space, weighted at each multipole by their contribution to the cluster mass estimate (see Eq.(3.10)).An estimate of the map-level reconstruction noise was subtracted in each realization, in order to accelerate convergence, so that the scatter is not representative of an actual data analysis.Analytic predictions are given for the QE and HDV QE cases as the dotted and dashed black lines.The MAP case is slightly tilted in this configuration compared to the input profile (solid black).

FIG. 4 :
FIG.4: Same as the lower panel of Fig.2, for the polarizationonly (QU) quadratic and MAP estimators.The QU channel does not suffer from the bias due to strong to moderate lensing close to the cluster center.

FIG. 5 :
FIG.5: Constraints on cluster mass for a sample of 1000 identical clusters (M 200 = 2 × 10 14 M ⊙ , at redshift 0.7) as a function of white noise levels.The beam FWHM is 1 arcmin.Dotted lines (HDV QE) show the QE forecast (Eq.3.11) with a cut at 2000 in the gradient leg[44] to remove the QE bias (see Fig.2), dashed lines the QE case without any cuts, and solid the maximum a posterior lensing map reconstruction method of this work (MAP).The improvement in constraining power comes from the partial removal from the CMB maps of the lensing signal not directly related to the cluster, and is more prominent in polarization, as expected.Markers with error bars are estimated from our set of simulations, with the x coordinate slightly shifted for clarity.The error bars are obtained from the bootstrapping method (for 1000 clusters), showing good consistency.In the QE case, we applied a simple multiplicative correction to remove the bias in the estimate.