Making the leap. Part I. Modelling the reconstructed lensing convergence PDF from cosmic shear with survey masks and systematics

The last few years have seen the development of a promising theoretical framework for statistics of the cosmic large-scale structure — the theory of large deviations (LDT) for modelling weak-lensing one-point statistics in the mildly nonlinear regime. The goal of this series of papers is to make the leap and lay out the steps to perform an actual data analysis with this theoretical tool. Building upon the LDT framework, in this work (paper I) we demonstrate how to accurately model the Probability Distribution Function (PDF) of a reconstructed Kaiser-Squires convergence field under a realistic mask, that of the third data release of the Dark Energy Survey (DES). We also present how weak lensing systematics and higher-order lensing corrections due to intrinsic alignments, shear biases, photo-z errors and baryonic feedback can be incorporated in the modelling of the reconstructed convergence PDF. In an upcoming work (paper II) we will then demonstrate the robustness of our modelling through simulated likelihood analyses, the final step required before applying our method to actual data.


Introduction
In weak gravitational lensing (WL), light rays from background source galaxies propagate through the foreground inhomogeneous distribution of baryonic and dark matter.This induces (de)magnification of the brightness of galaxies and a coherent distortion pattern in their observed shapes [1,2].The statistics of the projected WL fields depend on the three-dimensional large-scale structure (LSS) and hence provide a powerful way to probe and address critical questions in cosmology such as structure formation history, the nature JCAP03(2024)060 of dark energy and dark matter, and the laws of gravity.WL cosmology is therefore an active area of research in currently ongoing wide-area galaxy imaging surveys such as DES [3], KiDS [4] and HSC-SSP [5].The promising results from these surveys have also motivated the development of new generation surveys such as Euclid [6] and Vera Rubin's LSST [7] which will soon provide data with unprecedented quality.
One standard approach to extract cosmological information from WL fields is to analyse their power spectra (in Fourier space) or the real space 2-point correlation function (2PCF).These statistics can completely characterise the information contained in Gaussian random fields which describe our Universe accurately at its earliest stage of evolution [8] and on linear (large) scales.However, at late times, the evolution of matter density fluctuations becomes nonlinear due to gravitational instability and develops non-Gaussian features on nonlinear (small) scales.As a consequence, to probe the rich information contained in this non-Gaussian late-time cosmic density field one needs to continue standard 2PCF analyses as well as go beyond and investigate non-Gaussian statistical tools.We follow this path in our work.
There are various kinds of non-Gaussian statistics that have already been or that will be applied to weak-lensing survey data.We mention as examples the cosmic shear 3-point correlation function (3PCF) [9,10], the integrated 3PCFs [11][12][13][14], aperture-mass moments [15][16][17], lensing peaks [18,19] and density-split statistics [20][21][22] (to name a few).All of these have distinct advantages and disadvantages based on both their measurement and modelling strategies.A typical WL survey analysis strategy would rely on a baseline 2PCF analysis and further complement that with non-Gaussian statistics [13,23] to obtain stringent constraints on cosmological parameters.Extracting non-Gaussian information is a promising avenue as field-level approaches have been shown to enhance the constraining power for σ 8 and Ω m by a factor of 3 and 5, respectively and to break the weak lensing degeneracy [24,25].The Probability Distribution Function (PDF) of the weighted projection of the matter density fluctuation field along the line-of-sight, the weak lensing convergence field (κ), was shown to contain a significant amount of cosmological information [23] and to hold the potential to break cosmological parameter degeneracies that are exhibited in standard weak lensing 2PCF analyses [26][27][28].By measuring the smoothed κ field value inside apertures (or cells), the one-point κ-PDF statistic has the advantage of being straightforward to measure compared to other non-Gaussian WL convergence field probes such as bispectrum (counting triangular configurations) or Minkowski functionals (topological measurement).An emulation approach to constrain cosmological parameters using the κ/σ-PDF (σ is the κ standard deviation) jointly with convergence power spectrum was applied to HSC Year 1 (Y1) data [29].A similar analysis was carried out for KiDS-1000 and LSST mock data as well [30].Both these works relied on small patches obtained from N -body simulations as models for the κ-PDF to perform their cosmological analyses.In this series of work we instead use a from-firstprinciples theoretical modelling framework based on the large deviation theory (LDT) for the κ-PDF in the mildly nonlinear regime, giving access to the cosmological signal [28] as well as the covariance [31].This is complementary to the halo-model approach aimed to describe the more nonlinear regime [32].
However, one major difficulty in applying κ-PDF to real data analyses is that κ itself is not a direct observable.Reference [33] showed that the κ and the more directly observable weak lensing cosmic shear γ are related to each other through a convolution and pointed out -2 -

JCAP03(2024)060
We show the modification of our original LDT theoretical model of the κ-PDF to account for this reconstruction with the realistic mask in section 4 and test it against measurements from N -body simulations in section 5. Furthermore, in section 6 we present the strategy for modelling higher-order lensing, astrophysical and survey systematic effects in the κ-PDF.We summarise and conclude in section 7. Appendix A recaps details about the treatment of the masking effect on the cosmic shear power spectrum in the pseudo-C l formalism.In appendix B we quantify the effect of the KS reconstruction on the moments/cumulants of the reconstructed κ-field.Finally, in appendix C we present a simplistic study of the impact of a square mask (without holes) on the KS reconstructed κ-PDF.

Modelling the (observationally inaccessible) true convergence PDF
In this section we review how one can model the "true" but observationally inaccessible 1-point convergence κ-PDF based on large deviation theory (LDT).Readers already familiar with this model originally described in e.g.[40,45] can skip to section 3.

Statistical definitions
We use different statistical quantities that we briefly introduce here for clarity.From the PDF P X of some continuous random variable X, one can define the Moment Generating Function (MGF) as the Laplace transform of the PDF or equivalently as the expectation value2 of the random variable e λX .Note however that the existence of an MGF is not guaranteed for all possible random fields.For example, the MGF of a strictly lognormal field is undefined for real positive λ.For the MGF to exist (in cases where the PDF also exists), the PDF needs to decay faster than the exponential e λx for the integral in equation (2.1) to exist.In the case of the one-point statistics of the cosmic density field, as computed within the large deviation framework (see section 2.2), there actually exists a critical positive real value λ c -hereafter dubbed critical pointbeyond which the MGF is not defined.In practice and for a field sampled within a finite volume, the MGF along the real axis will always exist and will simply tend towards e λXmax for λ ≥ λ c , where X max is the maximum value of X in the finite field.
The moment generating function, as its name implies, can be used to compute the moments of the distribution as can be seen from the series expansion of the expectation of e λX ,

JCAP03(2024)060
so that the nth derivative of the MGF at λ = 0 is equal to the nth-order moment, E (X n ).The logarithm of the MGF is the Cumulant Generating Function (CGF) where k n are the cumulants (i.e. the connected moments) of the distribution and where we have dropped the subscript X for clarity.
The reduced cumulants are defined as where k 2 is the variance.These are important in the context of cosmological structure formation because the S n of the cosmic matter density field have been shown to be independent of redshift down to mildly nonlinear scales [46,47] and thus introduce relevant scaling relations between the cumulants.We thus also define the scaled cumulant generating function (SCGF hereafter) as which we also extrapolate to non-zero values of the variance (i.e.evaluate at finite k 2 on our chosen smoothing scale).One can reconstruct the PDF from the CGF using an inverse Laplace transform (inverting equation (2.1)) (2.6)

Large deviation theory of the matter density field
The large deviation theory (LDT) framework in large-scale structure has mainly been used to model the one-point statistics of the smoothed 3D matter density field [see, for example, 42,43,48] and also of projected 2D quantities such as weak lensing convergence and aperturemass fields [15,28,40].It has also been extended to the joint distribution between densities measured at some distance [49] and projected quantities between different source redshift bins [50] (i.e.n-point PDF).The results are most simply applied for highly symmetrical window functions such as two-or three-dimensional top-hats, but can be generalised to other smoothing schemes [15,42,51].We begin here by recalling some of the results of LDT for the one-point statistics of the matter density contrast smoothed in two-dimensional disks (which replicates the dynamics within long cylinders), which in turn will allow us to compute the one-point statistics of projected 2D quantities like the convergence field.A set of random variables {ρ ϵ } ϵ with PDF P ϵ (ρ ϵ ) is said to satisfy a large deviation principle if the limit exists, where ϵ is the driving parameter.Ψ is known as the rate function of ρ ϵ and describes the exponential decay of its PDF.The driving parameter ϵ indexes the random variables

JCAP03(2024)060
with respect to some evolution, for example an evolution in time.In the case of the matter density field smoothed on a single scale R, this driving parameter is the variance, which acts as a clock from initial to late times (ϵ ≡ σ 2 R ).We now omit the ϵ sub/superscripts in our notation for simplicity.
The existence of a large deviation principle for the random variable ρ implies that its SCGF φ ρ is given through Varadhan's theorem as the Legendre-Fenchel transform of the rate function Ψ ρ [52][53][54] (2.8) where the Legendre-Fenchel transform reduces to a simple Legendre transform when Ψ ρ is convex.In that case, where ρ is a function of λ through the stationary condition Another consequence of the large-deviation principle is the so-called contraction principle.This principle states that for a random variable τ satisfying a large deviation principle and related to ρ through the continuous mapping f , the rate function of ρ can be computed as This is called the contraction principle because f can be many-to-one, in which case we are contracting information about the rate function of one random variable down to the other.
In physical terms, this states that an improbable fluctuation of ρ is brought about by the most probable of all improbable fluctuations of τ .
For the case of the normalised 3D matter density field ρ ≡ ρ/ρ, the rate function of the late-time normalised density field at different scales can be computed from initial conditions if the most likely mapping between the two is known -that is, if one is able to identify the leading field configuration that will contribute to the infimum of equation (2.11).In cylindrically symmetric configurations, as for a disk of radius R at redshift z (in 2D space) or alternatively a very long 3D cylinder centered on this disk, the most likely mapping between final and initial conditions should preserve the symmetry [55,56]. 4This in turn leads to initial conditions also being cylindrically symmetric and the dynamics between the two being that of cylindrical collapse.

JCAP03(2024)060
Thus, starting from Gaussian initial conditions,5 the rate function Ψ τ of the most probable initial (linear) density field τ is simply a quadratic term.From the dynamics of cylindrical collapse that maps the most probable initial and late-time fields, the rate function Ψ cyl of the normalised late-time density field ρ in a disk of radius R is then given by where σ 2 R -our driving parameter -is the variance of the nonlinear density field in the disk, σ 2 r is the variance of the linear density field inside the initial disk (before collapse) of radius r = R ρ 1/2 (from mass conservation), and τ is the linear density contrast obtained through the most probable mapping between the linear and late-time density fields.This mapping is given by 2D spherical (cylindrical) collapse, for which an accurate parametrisation is given by [59] This parametrisation is in the spirit of previous works involving the density filtered in spherical cells, while the value of ν in this parametrisation of ζ is chosen to be ν = 1.4,so as to reproduce the value of the leading-order (tree-level) skewness in cylinders as computed from Eulerian perturbation theory [48].
Then, as a straightforward consequence of the contraction principle, the rate function given by equation (2.12) is also the rate function of any monotonic transformation of ρ, so that for the density contrast δ = ρ − 1, we have Ψ δ (δ) = Ψ ρ (ρ(δ)).Thus, plugging equation (2.12) into equation (2.9) gives us the SCGF φ cyl of the matter density contrast in a disk at redshift z.
Finally, one of the key aspects of the large deviation formalism in the cosmological context is that we apply the result for the SCGF beyond the σ 2 R → 0 limit.This extrapolation of the exact result allows us to obtain a realistic CGF ϕ cyl of the real density field for non-vanishing σ 2 R by rescaling the SCGF by the driving parameter (the nonlinear variance) at the scale and redshift being considered.This yields: This is physically meaningful because we thus construct a CGF naturally matching its quasi-linear limit and since the reduced cumulants S n from the cylindrical/spherical collapse dynamics have been shown to be very robust over a large range of scales and redshifts down to mildly nonlinear scales (≳ 5 Mpc/h at z ≳ 0, see for example figure 2 in [43] and figure A1 in [48]) so that rescaling by the nonlinear variance σ 2 R allows access to the full one-point statistics of the nonlinear density field.In LDT terms, the SCGF given by the large deviation principle is the well-defined asymptotic form taken by the reduced cumulants of the field in the regime where the variance goes to zero, and we simply keep this form for our predictions of the nonlinear CGF.
Finally, note that though equations (2.9) and (2.12) have been known and used for three decades in the context of count-in-cells statistics, that is the density field filtered in JCAP03(2024)060 3D top-hat windows [see for example 60,61], their re-derivation through large deviation statistics is more general and allows to set up a framework for the computation of different probabilities in the cosmological context.

From matter density to convergence
Let us recall that for a flat cosmology, the convergence κ can be interpreted as a line-ofsight projection of the matter density contrast between the observer and the source and can be written as [62] where χ is the comoving radial distance and the generalised lensing kernel ω n(z) for a wide distribution of sources following the normalised distribution n(z) is where the Heaviside H ensures that the integrand vanishes for z ≥ z s .Hereafter we only note the lensing kernel ω for simplicity.Under the small-angle/Limber approximation, it can be shown that correlators of the (smoothed) convergence field can be seen as a juxtaposition (by which we mean an integral along the line-of-sight) of the 2D correlators of the underlying density field, as if each 2D slice along the line-of-sight is statistically independent of the others [40,45].In terms of the one-point statistics of the smoothed κ field within a top-hat window function of angular radius θ, this translates to saying that the CGF of κ is a sum along the line-of-sight of the CGF of independent 2D slices of the matter density contrast: where ϕ <χ(z)θ cyl is the CGF of the density contrast filtered within a disk of radius χ(z)θ so as to reproduce the geometry of the light cone.
Equation (2.17) thus reduces the complexity of the problem down to computing the one-point statistics of the 2D matter density in each two-dimensional slice (or equivalently within long 3D cylinders at the same redshift up to some factor depending only on the length of the cylinder) along the line-of-sight, which we have already done in section 2.2.Using these results, we can then build the nonlinear CGF ϕ κ,θ of the convergence field.Note that equation (2.17) highlights the nice property of the projected CGF being expressible simply as a sum of independent redshift slices.This is an important property when considering more complicated joint distributions (e.g.joint convergence CGF of multiple source tomographic redshift bins), in which the only modification in this integral would be the replacement of

JCAP03(2024)060
the ω n(z) (z)λ term by a term depending on more than one λ variable.This form is much simpler than that of the corresponding multi-variate PDF [50].
Finally, it is important when working with the CGF (albeit to compute the PDF) to know the approximate location of the (theoretical) critical point of the convergence field, λ c .First, the critical points δ c in each redshift slice are calculated by finding where the second derivative of the rate function becomes zero.The corresponding critical λ values can be obtained by applying the stationary condition (equation (2.10)).The minimum λ c along the line-of-sight is then taken as the critical point of the convergence CGF.
The convergence PDF P(κ) is then computed from its CGF ϕ κ,θ using the inverse Laplace transform of equation (2.6) as and thus explicitly depends on the lensing kernel ω n(z) from equation (2.16).This computation involves a continuation of the CGF in the complex plane that can be performed using different techniques.We refer to [40] and [15] for technical discussions on the two possible methods that have been used in the literature to perform this complex continuation in the LDT context.
Here, we use the one of [15] that relies on an informed fitting function of the CGF.

Measuring the KS reconstructed convergence PDF
In this section we describe how one performs a Kaiser-Squires inversion on an observed masked shear field in order to create a reconstructed convergence κ map as well as how one can measure a meaningful κ-PDF from this map.

From galaxy ellipticities to shear field
In real-life weak-lensing experiments, the convergence field is not a direct observable (except maybe through magnification, see for example [63]).What we actually observe are the shapes of source galaxies, their ellipticities, which are a noisy estimate of the reduced shear g: with g = γ/(1 − κ), ϵ IA the intrinsic shape of the galaxy and ϵ n the shape measurement noise.
In the weak-lensing regime, both the shear γ and the convergence κ are ≪ 1 so that the ellipticities can serve as a noisy estimate of the shear field through In the following we will ignore intrinsic alignments whose introduction in the modelling will be described later in section 6.1.On the other hand, the contribution of noise is estimated in the literature by randomly rotating the shape of the galaxies to erase the cosmological contribution which would lead to pure-noise ellipticity, shear and then convergence fields (whose reconstruction we detail below) of zero average but non-negligible variance.By virtue of the central limit theorem, this noise is expected to become Gaussian for large numbers of JCAP03(2024)060 galaxies and is also expected to be closely independent from the reconstructed field.Since the simulations we use in this work contain no intrinsic shape noise, we will assume that those two hypotheses are valid and shape noise can thus be trivially taken into account in the theoretical modelling of the convergence PDF by simple convolution of the clean PDF with a Gaussian of the correct shape noise amplitude (we will test this in paper II of the series).For a survey with varying source galaxy density across the sky, this would lead to a mean shape noise variance which would still convolve the clean PDF.Note however that the Gaussian hypothesis can be easily lifted if needed -simply by replacing the Gaussian convolution by another one -as well as the hypothesis of zero correlation between the noise and reconstructed field which can be tested by computing the joint PDF between the noise map (obtained through random rotations) and the observed convergence.A denoised estimator of the observed convergence would then be obtained by marginalisation over the noise in this joint PDF.

From shear to convergence
Crucially, for a footprint of approximately 5000 deg 2 such as the one from the DES Y3 -and up to three times that for a Euclid-like experiment -we need a full-sky, spherical harmonics approach to estimate the convergence field from the shear [36,64].In this formalism, the (Born-)projected Poisson equation reads [65] with ψ the usual projected (under Born approximation) gravitational potential and ð, ð are respectively the raising and lowering operators acting on the spin-weighted spherical harmonics [see appendix A of 65].Similarly, the usual complex shear field equations translate into Finally, expanding the projected gravitational potential and convergence (scalar, spin-0 fields) as well as the spin-2 complex shear on the spin-weighted spherical harmonics basis 0 Y lm and 2 Y lm respectively, we get ) Both the convergence and the shear have been decomposed into curl-free E-modes and divergence-free B-modes, and we can relate the shear to the convergence as

JCAP03(2024)060
Equation (3.10) is the harmonic space spherical-sky generalisation of the Kaiser-Squires (KS) inversion formula (originally proposed by [33] for flat-sky convergence field reconstruction from the observed shear field).An inverse spherical-harmonic transform on the full-sky allows one to then obtain the KS reconstructed κ E and κ B fields on the celestial sphere.Strictly speaking, the equations above are correct assuming no couplings between the lenses that would appear when relaxing the Born approximation and taking into account corrections of the same order in the Sachs' equation of gravitational lensing [66,67].In this paradigm, the only allowed B-modes are those produced by the effect of masking the full-sky or observational systematics.On the other hand, allowing for couplings between lenses along the line-of-sight would slightly change the definitions of the shear and convergence and as a consequence their relationship.This is not a fundamental issue since those higher-order corrections can be modelled respectively for the shear and the convergence in cases where one or the other can be accessed [see for example 67-69], and since these corrections are small, one could still reconstruct a field from equation (3.10) and call it convergence.Alternatively, the knowledge of what are both the true shear and convergence has also permitted to build Bayesian reconstruction schemes of the convergence based on an observed shear field with the help of informed priors [see for example [34][35][36] as well as machine learning reconstruction algorithms [37] and more recently clever mix of the two [38].In this paper, we reconstruct a convergence field by applying the KS inversion equation (3.10) on a masked full-sky shear field which contains higher-order lensing corrections and thus admits B-modes.We then only work with the reconstructed κ E field and theoretically model the E-modes one-point PDF assuming that the main difference between the reconstructed and "true" PDFs are sourced only by masking the full-sky shear field.7

Reconstructing the convergence PDF from simulated shear maps
In actual observations, one would start from a discrete shear catalogue that has to be interpolated (pixelised) to create a shear map.We here skip that step and directly use a set of publicly available full-sky weak lensing shear maps in Healpix format [70] 8 generated from the ray-traced N -body simulation suite from Takahashi et al. [71]. 9We have 108 realisations of several fixed source redshifts up to z s ∼ 5 and combine them to mimic a realistic source distribution from an input n(z) (see section 5 for the description of how we do that in practice to mimic the 4 th source redshift bin of the DES Y3).For each set of full-sky shear maps, we multiply those by a binary mask, shown in figure 1, mocking observed maps from DES Y3 and reconstruct the E-modes full-sky convergence field using KS reconstruction equation (3.10).This reconstruction makes use of the functions map2alm and alm2map of the Healpy package [72].We then smooth the maps with top-hat filters of radius θ in JCAP03(2024)060 harmonic space and the only remaining step is to select the pixels on the full-sky map to then construct a meaningful/physical convergence PDF.Indeed, since large parts of the shear field were set to zero due to the binary survey mask (and though the reconstruction of the convergence is in principle non-local) the reconstruction of κ E in the previously masked areas yields unphysically small values compared to the fluctuations of the true convergence κ true values (obtained directly from the simulation suite) in those regions.We illustrate this point in figure 2 where we display a full-sky (FS) reconstructed field κ E,FS smoothed with a top-hat of radius 20 arcmin.We thus only take into account pixels whose smoothed convergence values κ E,unmasked results from unmasked regions on the sky. 10 That way, we can expect a loss of power of κ E-modes to B-modes in the reconstruction due to the presence of the mask and the non-locality of the transform, but sufficiently mitigated such that physical scaling relations for example between higher-order cumulants of the matter density field still hold.
We illustrate this fact more quantitatively in the next section in equation (4.1).In practice, this can be done by smoothing the binary mask with the same top-hat kernel as the field, and only keeping pixels where the smoothed masked has values sufficiently close to 1.For the DES Y3 mask, we find in practice that keeping all pixels where the smoothed mask is higher than 0.98 is a good equilibrium between keeping the maximum possible portion of the survey volume and mitigating the influence of masked pixels.This approach might be improved by upweighting the pixels that are partially masked as done in the density-split statistics context [20,21] which is similar to ours.It might allow to consider more pixels but we do not implement here such a strategy.

Modelling the KS reconstructed convergence PDF
Having described the LDT based framework for modelling the observationally inaccessible κ-PDF in section 2 and then the actual procedure to reconstruct and measure the κ-PDF under the presence of a realistic survey mask using KS inversion in section 3, we now describe how to modify our original framework to incorporate the effect of the survey mask and develop a model for the KS reconstructed κ-PDF.

Schematic view
Let us first have a look at how the reconstruction scheme presented in the previous section affects the amplitude of convergence fluctuations.Schematically, masking the field reduces the overall shear power spectrum and thus the reconstructed convergence power spectrum by a factor f mix , 11 due to mode-mixing which is at first order comparable to the observable fraction of the full-sky f sky .On the other hand, the reconstructed convergence in the masked regions (e.g.far away from the survey footprint) is mostly 0 so that we could model the full-sky reconstructed PDF as a sum of the distribution of κ E values in the unmasked regions and a JCAP03(2024)060 -0.0292103 0.0292103 -13 -

JCAP03(2024)060
Dirac-delta distribution, both weighted by f sky and (1 − f sky ) respectively.As a consequence, the variance of the reconstructed κ E in the unmasked region of the sky σ 2 κ E ,unmasked is roughly the variance σ 2 κ E ,FS of the full-sky κ E,FS reconstructed from the masked shear field divided by f sky and we have This is further illustrated with actual values of the κ E variance under the DES Y3 mask in appendix B. Additionally, the core idea behind the large deviation approach to the statistics of the cosmic matter density field is to identify asymptotic scaling relations between the cumulants of the field such that specifying the value of the variance serves both as a dial controlling the proximity to the asymptotic limit and as a closure relation allowing to compute all cumulants of the field from the variance.For now neglecting the projection along the line-of-sight (and since the convergence reconstruction scheme only affects the amplitude of fluctuation mildly), it could seem natural to assume that at first order the reconstruction affects the successive cumulants of the convergence field in a manner consistent with their scaling relations with the variance based on constant reduced cumulants S n (see equation (2.4)).This would typically mean that the loss of power from E to B modes in the region of the sky where it is mitigated the most (which we call the "unmasked" regions, see section 3.3), preserves core physical properties of the field.As such, neglecting the projection along the line-of-sight, the n th cumulants of the reconstructed κ E field in the unmasked regions can schematically be related to the cumulants of the true but observationally inaccessible κ field as In the above equation (4.2), the subscript c denotes the connected parts of the moments, i.e. the cumulants.

Reconstructed cumulant generating function
More precisely, the large deviation principle applied to the cosmic matter density field implies that the scaling relations (constant S n in equation (2.4)) between cumulants are correct for the matter density field, but not for the projected convergence field itself.However, under the Born and Limber approximations, the total lensing effect that we observe can be treated as the independent combination of successive lensing events along the line-of-sight.As a consequence, taking into account the convergence reconstructing scheme while preserving the scaling relations between cumulants must be understood at the level of each lensing event, that is for the cumulants of the matter density contrast along the line-of-sight.In terms of the reconstructed CGF ϕ κ E , this can be written by modifying equation (2.17) as

JCAP03(2024)060
where ⟨δ 2 true ⟩ is the true variance of slices of the density field at redshift z and of radius χ(z)θ, while ⟨δ 2 E ⟩ is to be understood as the variance that results from the reconstruction scheme (as schematically done in equation (4.1) for the whole projection) and whose modelling we describe in the next subsection.

Reconstructed variance along the line-of-sight
We now take a more detailed look at the ⟨δ 2 E ⟩ term that appears in equation (4.3).In the Limber approximation, it corresponds to a variance measure 12 of a 2D matter density contrast slice of radius χ(z)θ at redshift z which acts locally as the (reconstructed) convergence from a single lens plane.In the absence of a mask, it can be written as where P NL is the nonlinear matter power spectrum and W l is the angular top-hat window function in harmonic space given by where P l are Legendre polynomial of degree l.
In the presence of masks, equation (4.4) is modified as follows [39] δ where is the element of the mode-mixing matrix that accounts for the transfer of power in the power spectrum from the unmasked shear E-modes to the masked shear E-modes.This term is explained in more detail in appendix A. Here, f l = [(l + 2)(l − 1)]/[l(l + 1)] and accounts for the passage from convergence to the shear power spectrum in order to apply the mode-mixing formalism to the shear field directly.Finally, the f sky factor is not necessarily to be understood as the true sky fraction observed by the survey.Instead, it actually comes from the parametrisation of the full-sky reconstructed convergence PDF as a sum of the fluctuations within the mask weighted by f sky and a Dirac-delta (for the unobserved sky regions) weighted by (1 − f sky ).However, this parametrisation, though very accurate in practice, is not exact at the boundaries and holes of the mask, and even more so after smoothing the reconstructed convergence field.Having said that, this f sky term is not free either as it can directly -without any theoretical input -be estimated from data as the ratio of the reconstructed smoothed full-sky variance to the one within the considered pixels.Note moreover that the reconstructed κ E unmasked variance values measured after masking from the Takahashi simulation maps tend to be equal to the κ E full-sky variance divided by the true fraction of the full-sky observed in the mask up to less than a percent for the DES Y3 mask and for the range of smoothing scales that we tested (up to ∼ 30 arcmin), hinting to the fact that our parametrisation is very accurate.

JCAP03(2024)060 5 Testing the model
In this section we implement the theoretical formalism for the Kaiser-Squires reconstructed convergence PDF described in the previous section and compare it to measurements made in the Takahashi simulations.Given that the simulated shear maps also contain higher-order lensing corrections, the comparison presented here allows us to test that the mask modelling is accurate even when neglecting other sources of B-modes than the ones created by the mask.We study two test cases and a third in appendix C. In subsection 5.2 we test the reconstruction scheme in a regime where the large deviation theory is known to be very accurate, that of the Bernardeau-Nishimichi-Taryuya (BNT) transform [73] which allows us to construct lensing observables only sensitive to the dynamics of the quasi-linear regime of the matter density field.It allows us to understand how well we are probing the Kaiser-Squires reconstruction without mixing additional inaccuracies that the theory could present.In subsection 5.3, we test our theoretical κ E PDF for a source distribution mimicking that of the 4 th bin from DES Y3 analysis and where we applied the real DES Y3 mask to the full-sky shear field.We present in appendix B supplementary information to subsection 5.3 but at the level of the cumulants and in appendix C the effect of a reconstruction similar to subsection 5.3 but where we replace the DES Y3 mask by a square patch with no holes and of the same area (∼ 70 × 70 deg 2 ).

Mock data preparation and simulation-specific corrections
As mentioned previously, the simulated true convergence and cosmic shear maps used in this work are obtained from the publicly available N -body simulations by Takahashi et al. [71].The simulation suite provides weak lensing maps for N = 38 fixed source redshift planes between z s = 0.05 and z s = 5.3 in Healpix format.We combine these maps according to a source distribution n(z) inspired by the 4 th tomographic bin of the DES Y3 analysis to simulate a DES Y3-like lensing map in its 4 th redshift bin.The procedure to do that is as follows: where s i is a specific weight for a given full-sky lensing map κ z i s at source plane z i s .The weights used for combining the discrete source planes from the simulations are shown in figure 3.This final simulated convergence map can then be expressed as a line-of-sight projection of the matter density contrast through equation (2.15) with a lensing kernel w n(z) : The w z i s is in turn the lensing kernel for a given discrete source plane at redshift z i s and reads  To fairly compare the theory and our measurement from the simulations, we further include a few corrections to the theoretical modelling described in section 4.These corrections are specific to the Takahashi simulation and take into account inaccuracies in the simulation itself rather than additional effects present in a real survey.Firstly, we take into account the fact that the simulation has discrete spherical lens shells of thickness 150 Mpc/h.This is done both by replacing the continuous integrations along the line-of-sight by discrete sums dχf (χ) → i 150 × f [150(i − 0.5)], (5.4) and at the level of the nonlinear power spectrum by correcting it following (equation (28) in appendix B of ref. [71]): with c 1 = 9.5171 × 10 −4 , c 2 = 5.1543 × 10 −3 , α 1 = 1.3063, α 2 = 1.1475 and α 3 = 0.62793.Note that the wave-modes k are in units of h/Mpc and the power spectrum in units of (Mpc/h) 3 .Finally, in a manner analogous to taking into account the pixel window function at the map level, the resolution effects of the Takahashi maps can be taken into account by multiplying the nonlinear power-spectrum by a damping factor that depends on the nside of the Healpix map (equation (5) in ref. [71]): (5.6) Note that taking into account corrections (5.5) and (5.6) at the level of the nonlinear power spectrum has some effects on all other higher-order (density contrast) cumulants through their scaling relations with the variance while keeps the S n ratios constant in equation (2.4).

DES Y3 mask and BNT transform
One issue faced by theoretical approaches that aim at describing quantities projected along the line-of-sight is the mixing of both very nonlinear scales not accurately probed by standard from-first-principles approaches such as ours, and reasonably larger (quasi-linear) scales more accessible to the theory.As such, usual weak lensing statistical probes are (i) modelled accurately only on sufficiently large angular scales with scale-cuts on small angular scales, so as to mitigate the influence of small nonlinear physical scales at the tip of the light-cone, (ii) modelled by more phenomenological approaches such as halo models that can also take into account nonlinear and baryonic physics which becomes important on small scales [74], (iii) or by making use of numerical recipes and simulations for e.g.specifically incorporating baryonic feedback effects [75] which have unfortunately not been tested in great detail, especially for higher-order non-Gaussian statistics.
Alternatively, a theoretical strategy to disentangle quasi-and non-linear scales in lensing quantities known as the Bernardeau-Nishimichi-Taruya (BNT) transform or nulling strategy was proposed by [73].It allows for very accurate theoretical predictions in the context of power spectrum analysis and has recently been extended to the convergence and aperture-mass PDFs [15,40].This nulling strategy was used recently in [76] to remove the sensitivity to the poorly modelled small scales for the cosmic shear 2PCF, and therefore improve cosmological constraints using the DES Y1 shear data.This strategy could become even more relevant for future lensing experiments with better knowledge of redshifts and the division of sources in more redshift bins.
This BNT transform can only be used in the context of a tomographic analysis of at least 3 source redshifts (or redshift bins, although not treated here) and is a linear transformation M applied to the set of lensing kernels ω i ≡ ω(χ, χ s,i ), giving rise to a new set of re-weighted kernels ωj = M ij ω i . (5.7) For a set of 3 source planes labeled from j = i − 2 to j = i arranged by ascending order, it was shown in [73] that M must satisfy the system (5.8) This system of equations is under-constrained and hence we also impose by convention M ii = 1.
The elements of M can thus be computed considering sequential triplets of tomographic bins, going from the lowest to the highest redshift, such that .9) We display in figure 4 an example for a set of 3 source planes located at z s = 1.0, 1.2, 1.4.This is the kernel we use in this subsection.The green, yellow and blue dashed lines are the kernels up to z s = 1.0, 1.2, 1.4 respectively re-weighted by their appropriate BNT coefficients while the thick red line is the sum of the 3 re-weighted kernels.Note that the blue dashed line is also the original kernel since its BNT coefficient is set to 1.One can thus clearly see that the effect of nulling is to set to zero the contribution of all lenses below the closest source plane and therefore remove the contribution of small scales (at the tip of our light-cone) which are very nonlinear and where the effect of baryonic physics becomes non-negligible.
For our purpose, the BNT transform -which boils down to a simple linear combination of the maps -is applied at the level of the masked shear field 14 and is also straightforward to implement in our theoretical approach to the convergence PDF since we only need to replace the original kernel with its nulled counterpart.
We then show in figure 5 -in the regime where the traditional BNT transformed PDF is perfectly described by the LDT formalism -how the theoretical integration of the convergence reconstruction scheme performs.The BNT convergence field is smoothed with a top-hat window of radius θ = 10 arcmin.By looking at the top panels, one can appreciate that 14 Since the BNT transform is a linear transformation, it can without distinction be applied at the level of the (masked) shear or the convergence.The consideration of nonlinear high-order lensing or reduced shear corrections would formally break the exactness of the nulling but would still mitigate the influence of small scales so that it could still be used on real data.The same comment applies to the inexact knowledge of the underlying cosmology: an inexact nulling still mitigates the influence of small scales.4.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 realisations.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction masking effect.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals between the KS-theory and the simulated reconstructed KS PDF (blue points).Neglecting the KS reconstruction effects (grey points) significantly worsens the quality of the theoretical prediction while taking them into account does not improve nor worsen the predictive power of our LDT formalism.
the exponential cut-off in the tails of the PDF, a prediction of our formalism, is well-observed once one reduces the lensing kernel down to scales accessible to from-first-principles theoretical modelling (such as ours).More interestingly, this behaviour is still observed through the Kaiser-Squires reconstruction scheme presented in section 3, and our theoretical modelling of this reconstruction does not reduce our ability to capture the shape of the PDF in this regime.This hints at the fact that our model is accurate enough to reproduce the reconstruction effects, at least in the regime where our initial large-deviation formalism is accurate.This is further illustrated in the bottom panel of figure 5 where blue points denote the residuals of our KS theory with respect to the measurement from the KS reconstructed simulations.The green line describes the residual between the original LDT theory without reconstruction and the measurement of the PDF from the true κ map (i.e.not reconstructed from the shear field using KS method).Here we observe that the two residuals are highly compatible.Moreover, we also find that taking the reconstruction into account does not reduce in the -20 -

JCAP03(2024)060
slightest our ability to describe the PDF in this regime, while neglecting it (grey points) would seriously damage our predictive power.

DES Y3 mask for sources in the fourth DES Y3 redshift bin
For the more traditional case of a source distribution mimicking the 4 th source bin of the DES Y3 analysis (that we replicate using the source redshift distribution shown in figure 3), and for a shear field under the DES Y3 mask, we show in figure 6 the results of our theoretical prediction for the reconstructed κ E PDF.The field is smoothed with a top-hat window of radius θ = 20 arcmin.By looking at the top-left panel, one can appreciate that the theoretical formalism seems to capture well the bulk of the PDF when considering the 1σ fluctuations across the 108 Takahashi realisations estimating the diagonal of the PDF covariance matrix.This is to be contrasted with the theoretical formalism that neglects the KS reconstruction effects and thus performs poorly which justifies the need for its modelling.Looking at the PDF on a log-scale (top-right panel), one can however observe an increasingly larger departure of the theory from the simulation as one enters the high and low-density tails.This is expected since cumulants from the large-deviations formalism are only valid in the quasi-linear regime and the mixing of scales (that the BNT transform would prevent as shown in subsection 5.2) worsens the quality of the theoretical prediction in the tails.This is clearly seen in the residuals between the theory and the simulation (bottom panel) where a typical mismatched skewness modulation 15 is visible, showing that higher order correction to the skewness would be needed if we had smaller error bars, especially since only the cosmic variance is here used to obtain those error bars.The alternatives include looking at larger angular scales, considering sources at higher redshifts, incorporating modelling errors in the error budget or making use of the BNT transform shown in subsection 5.2 which we consider the most principled approach.
Focusing on the accuracy of the modelling of the KS reconstruction, we now look at the green line in the bottom panel which describes the residual of the original LDT theory without KS reconstruction to the measurement of the PDF from the true κ map (i.e.not reconstructed from the shear field using KS method).As in the previous subsection, we once again observe that the two residuals are highly compatible, highlighting the fact that the reconstruction does not weaken nor improve our ability to describe the PDF in this regime.This again hints at the fact that our implementation of the KS effects on the PDF is accurate enough.

Incorporating observational systematics in the modelling
In this section we discuss the modelling of additional effects in the reconstructed κ-PDF, including galaxy intrinsic alignments, baryonic feedback effects, additive and multiplicative shear biases, higher-order lensing corrections and photometric redshift uncertainties. 15Let us remind here that the skewness enters the Edgeworth expansion of the PDF at the first non-Gaussian correction order and multiplies a third order Hermite polynomial of the convergence field as follows where H3(x) = x 3 − x.We emphasise that the predictions shown in this paper do not use or assume an Edgeworth expansion.It is however useful to interpret the shape of the residuals between the theory and the simulation.4.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 masked Takahashi maps.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals between the KS-theory and the simulated reconstructed KS PDF (blue points).Neglecting the KS reconstruction effects (grey points) significantly worsens the quality of the theoretical prediction while taking them into account preserves the predictive power of our LDT formalism.We note that the slight increase in the size of residuals is due to the complicated shape of the mask and presence of holes as demonstrated in figure 9 for the case of the square mask of similar area.

Intrinsic alignments
As noted in equation (3.2), the observed source galaxy ellipticity is a combination of the gravitational lensing shear component γ, the intrinsic ellipticity of the galaxies ϵ IA induced by correlations with local gravitational tidal fields at the source, and the random stochastic component that contributes as shape noise.In this section we include the ϵ IA (also known as intrinsic alignment) component in our framework using the popular non-linear tidal alignment (NLA) model [77,78].Indeed, recent work based on a perturbative field-level forward modelling of weak lensing fields [25] indicates that on scales of about 15 arcmin, the NLA model is adequate and leads to conservative and unbiased cosmology constraints even when analysing data generated through a more complex tidal alignment and tidal torquing (TATT) intrinsic alignment model [79].

JCAP03(2024)060
As shown in section 3.2, through second derivatives of the projected gravitational potential ψ, the lensing shear γ is related to the lensing convergence κ, the latter being expressed as a line-of-sight integration of the matter density contrast δ (see equation (2.15)).In analogy, for the intrinsic ellipticity component ϵ IA we can define a quantity κ IA κ IA (ϑ) where δ IA is a three-dimensional field that effectively determines the intrinsic alignment (IA) of the source galaxies with their local gravitational tidal fields.Note that the line-of-sight projection kernel in equation (6.1) is the source galaxy redshift distribution n(z), and not the lensing kernel ω n(z) (z) as in equation (2.15).In the NLA model, δ IA can be expressed as a first order bias expansion with the nonlinear matter density contrast field16 [77,78]: The f IA (z) term is the amplitude of the intrinsic alignment of the specific source galaxies and is given by where ρ crit is the critical energy density, D(z) is the growth factor normalised to unity today, and z 0 is a pivot redshift.The fact that c 1 was calibrated in h 2 units allows us to fix c 1 ρ crit = 0.0134.Therefore, the intrinsic alignment of source galaxies can thus be readily included in the overall convergence signal as an additive component on top of the lensing κ κ → κ + κ IA .( In practice, the inclusion of the intrinsic alignment effect in the modelling of the PDF P (κ) within the NLA framework is straightforward and can be achieved by replacing the lensing kernel in equation (2.15) with One can treat these intrinsic alignment terms as nuisance parameters which can be marginalised over in order to constrain cosmological parameters of interest when analysing P(κ).Note that even though marginalized constraints over the IA parameters in the DES Y3 shear 2PCF analysis are compatible with no IA (see for example [81] or [82]), neglecting the IA effect would bias the constraints on other parameters of interest.Based on those analyses, the typical order of magnitude of the IA effect would be around A IA ∼ 0.4 and α IA ∼ 1.7 but again one should not fix those values in an analysis, also because the uncertainty on  1. Impact of bayronic effects on the standard deviation of the weak lensing convergence at scale θ = 10 ′ for a DES Y1 n(z) as measured from the BAHAMAS simulations.
the IA nuisance parameters is one of the limiting aspect of cosmic shear analysis that one might want to mitigate through the use of high-order statistics.For the PDF, the interplay with cosmological parameters will be studied in more detail in our upcoming work (paper II in the series).One potentially important follow-up line of work would consist in extending the current bias-like IA expansions to the PDF so that higher-order terms in models such like TATT to alleviate for potential biases in the IA treatment.Current DES Y3 analysis of convergence moments [39] or third-order aperture mass in KiDS-1000 [83] which both contain NLA modelling are nevertheless respectively compatible with the main 3 × 2 points analysis from DES Y3 which is made with TATT [82] and the KiDS-1000 2-point analysis [83].

Baryonic feedback
We determine the effect of baryonic feedback on the mildly nonlinear convergence PDF by relying on simulated DES Y1-like lensing maps from the BAHAMAS hydrodynamical simulation suite [84], where the strength of AGN feedback has been varied in various simulation runs.The convergence PDF for the full DES Y1 n(z) and the separate redshift bins is illustrated in the upper panel of figure 7. Note that for our purpose, the distribution of sources between DES Y1 and Y3 is similar enough, specifically, the four bins roughly peak respectively at redshift z s = 0.3, 0.5, 0.8 and 1.In the lower panel we show the residual between the PDFs including baryonic feedback and the dark-matter only runs, which shows a clear signature of a changed standard deviation, whose values we summarise in table 1.
Conjecturing that the main impact is on the variance, we obtain the PDFs of the weak lensing convergence divided by its variance ν κ = κ/σ κ and show exemplary results for the whole n(z) and two redshift bins in figure 8.The excellent agreement of the PDFs for the standard deviation-normalised convergence field demonstrates that modelling the impact of baryons at the level of the variance is likely sufficient, as assumed in the recent analysis of HSC Y1 data [29].Using conservation of probability, this implies that the κ-PDF in the presence of baryonic feedback is given by For the purpose of obtaining the variance correction factor, the baryonic feedback model within HMcode can be used as illustrated in figure 5 of [74] showing a simple single-parameter model reproducing the 3D matter power spectrum in BAHAMAS.
We have further checked (not shown) that our the results are fully compatible with similar measurements from kappaTNG maps for single source redshifts [85]  the IllustrisTNG hydrodynamical simulations [86], and for tomographic bins following the Euclid n(z) as adopted by the Magneticum simulations [18,87].

Additive and multiplicative shear biases
We adopt the modelling of any biases coming from the shear measurement pipeline, such as noise bias (e.g.[88]), model-fitting bias (e.g.[89]), selection bias (e.g.[90]) and bias from the imperfect correction of the image point spread function (PSF; e.g.[91]), with a multiplicative factor 1 + m to any instance of the estimated shear as is common in literature.As the weak lensing shear and convergence fields are connected to each other via linear transformations, we can propagate the shear measurement biases to convergence with the following transformation using a multiplicative 1 + m and an additive bias c term This linear relation holds in the weak lensing regime where κ is small.It is common in the literature [39,82,92] to consider that these biases are redshift and scale independent within a given source redshift distribution n(z), and thus fixed at the map level.In that  case, since the KS reconstruction of the convergence forces us to fix the l = 0 wave mode to zero as a straightforward consequence of the mass-sheet degeneracy, we are forcing the mean κ (E/B) to be zero which renders inconsistent the consideration of an additive bias c.This is consistent with the assumption that any additive bias component can be perfectly removed from the measurement pipeline for compactness through calibration with image simulations on average [93].The PDF of the multiplicative biased shear κ = (1 + m)κ is straightforwardly obtained by conservation of probability as (6.8) Previous works have indicated that the presence of shear biases enhances the complementarity of the shear 2PCF and the convergence PDF [26] which may help to further lift parameter degeneracies appearing at the 2-point level.

Higher-order lensing corrections
At the scales and redshift where the large-deviations formalism can be considered accurate enough to be applied to real data analysis, the corresponding weak-lensing PDFs of either the convergence or the aperture mass can be generated from a finite set of cumulants in the sense that a correct variance, skewness, and a consistent manner to generate higher-order cumulants -26 -

JCAP03(2024)060
yield good results [15,40,67].In that context, studying the corrections to the PDF induced by higher-order lensing corrections amounts to computing the leading-order corrections on the variance and skewness which can then be incorporated straightforwardly to the formalism for the non-linear variance, and by slight modification of the spherical collapse ν parameter in equation (2.13) to match the new values of the skewness along the line-of-sight [see for example 67, that does it for post-Born corrections in the LDT context].We detail in the following the corrections that could be considered in our theoretical formalism and explain why we could discard them for our DES Y3 analysis of the reconstructed κ E PDF.This is consistent with what was done for the weak-lensing moments analysis in DES [39,94].Among all possible corrections, most of the traditional ones have been estimated in past works and in the LDT context.More precisely, the relaxation of the Born approximation and accounting for the coupling between lenses was studied in detail in [67] and in appendix F1 of [15].These terms will tend to Gaussianise the convergence field since they characterise the introduction of random deflections along the light path which will in turn tend to diminish the impact of the non-linear clustering of matter.The heuristic picture one could form is that of clustered chunks of matter blurred by these lensing terms.It was shown that the effect matters for higher source redshifts (e.g.CMB lensing) but is totally negligible for cosmic shear experiments.
Reduced shear corrections which at first order account to replacing the shear field by γ → γ + γκ was studied in the LDT context in [51] and in appendix F2 of [15] and was shown to induce only a percent-level change in the skewness at our scales and source-redshifts of interest.This tiny effect can be ignored for the analysis we propose here since the gravityinduced skewness directly implemented in the LDT formalism is itself not accurate to the percent-level because of the mixing of scales discussed in section 5.
Finally, individual galaxies can be (de)magnified through lensing effects and thus their fluxes are de-/increased.At the flux limit of a survey, this can cause fainter sources to be included in the observed sample while they would, in the absence of lensing, be excluded.At the same time, the density of galaxies in the small region around this source appears (increased) reduced since the area of the region is also (de)magnified.As such, the net effect depends on the slope of the intrinsic, unlensed, galaxy luminosity function at the survey's flux limit.This is known as the magnification bias and it also induces a correction on the overall measured skewness of the convergence field.However, as for the reduced shear correction, it was shown in appendix F3 of [15] to matter very little and we hence could discard its implementation in the theoretical formalism used to analyse DES Y3-like survey data in our upcoming works.

Photometric redshift uncertainties
In cosmic shear surveys the redshifts of source galaxies are determined using photometric methods.Any systematic error in the photometric estimates of the galaxies can lead to biases in the redshift distribution of the source galaxies which can in turn lead to biased cosmological parameter inference.In order to include the effect of such a systematic uncertainty on the source redshift distribution, one can propagate this through the theory by updating the weak lensing kernel equation (2.16).We could either compute this for a set of redshift distributions JCAP03(2024)060 (obtained through different methods, for example the hyperrank software developed in the context of DES [95]) or parametrise it through a single shift parameter ∆z via [39,82,95] n(z) = n ′ (z + ∆z) , (6.9) where n(z) denotes the shifted redshift distribution of the source galaxies and n ′ (z) the default redshift distribution estimate.This simple parametrization of a single mean redshift uncertainty (one for each tomographic redshift bin) is reported to be sufficient within the statistical power of surveys such as DES Y3 2-point main analysis (see figure 10 of [81] and figure 12 of [95]), and will be considered sufficient for this series of papers, in coherence with the previous high-order statistics analysis of current data sets [39,83].

Discussion and conclusion
The weak lensing convergence κ denotes a weighted projection of the three-dimensional matter density fluctuations.Intuitively, it quantifies a projected 'mass' of all the late-time, non-Gaussian distributed foreground structures which contribute to the deformation of a light beam emanating from a background source galaxy on its way to us.Therefore, studying the full 1-point Probability Distribution Function (PDF) of the smoothed κ field inside apertures (see [32,41]) is a promising way to access the non-Gaussian cosmological information of the foreground matter density field beyond the variance (or the widely used 2PCF) and holds the potential to tighten constraints on cosmological parameters.Unfortunately, the κ field itself is not a direct observable as what is actually seen in lensing surveys is the cosmic shear field -the weak coherent distortions imprinted in the shapes of the source galaxies.Nevertheless, one can apply the widely known Kaiser-Squires (KS) inversion technique on the shear field to reconstruct the "true" inaccessible convergence field.However, the KS inversion is exact and recovers the "true" convergence only when one has information of the shear field over the entire celestial sphere.17This is of course not the case in practice as one has access only to the shear information inside a given footprint on the sky due to survey masks (consisting of holes, complicated survey geometry and boundaries).Applying the full-sky KS inversion on the masked shear field thus results in a reconstructed κ field which differs from the "true" inaccessible κ field.Hence, for any κ-statistic which one desires to measure and analyse using a KS reconstructed κ map from observed masked shear data, it is of paramount importance to correctly quantify and account for the effect of the survey mask in the theoretical modelling of the corresponding κ-statistic.In this paper we have presented for the first time how to do that in a from-first-principles theoretical modelling of the KS reconstructed κ-PDF.Our main achievements on this front can be summarised as follows: • Our reconstructed full-sky convergence PDF is obtained from the "true" one [15,40] inside the survey footprint and purely geometric factors which take into account the effect of the survey window and the survey area on the variance (4.6) and hence the series of cumulants (4.3).Explicitly, this is achieved by both an accurate parametrisation JCAP03(2024)060 (without free parameters) of the reconstructed full-sky convergence PDF as a function of the one inside the survey footprint and modifying the scaling relations of the matter density contrast field cumulants (which are needed to compute the line-of-sight projected κ cumulants that are in turn required in the modelling of the KS reconstructed κ-PDF) by purely geometric factors which take into account the effect of the survey window and the survey area.
• We have applied the recipe for the full-sky KS κ-map reconstruction under the presence of a realistic survey mask (in our case, DES Year 3 survey mask) to the simulated cosmic shear data from the Takahashi suite of weak lensing N -body simulations and measured the reconstructed κ-PDF.We have tested our theoretical modelling of the same against the measurements and found excellent agreement between them within cosmic variance (figure 6).We further find that using the baseline theoretical model for the "true" κ-PDF without accounting for the masking effect significantly deviates from simulation measurements of the reconstructed PDF.These conclusions are valid for scales and source redshifts relevant for the baseline theoretical model which is accurate on quasi-linear scales as we have also demonstrated in figure 5 where we have applied a nulling strategy to construct lensing observables less-sensitive to very small non-linear scales normally probed through scale-mixing in the projection along the line-of-sight.
• In preparation for an upcoming real data analysis of the KS reconstructed κ-PDF we have also discussed and laid down the strategy to include several effects such as astrophysical and measurement systematics as well as higher-order lensing corrections to the theoretical model for the reconstructed κ-PDF.We included a modelling for intrinsic alignments based on an adaptation of the weak lensing kernel (6.5) and tested in simulations that baryonic feedback can be included through a rescaling of the variance (6.6).We describe how the lensing PDF is affected by a multiplicative shear biases (6.8) and how to propagate photometric redshift uncertainties through our theoretical model.Thus, our work not only presents a proper theory-based modelling framework for a real analysis of the KS reconstructed κ-PDF under the presence of realistic survey masks but it also underlines the susceptible errors when analysing any statistic from a KS reconstructed κ-map with a theoretical model that does not correctly include the E/B mode mixing due to the presence of survey masks.
Overall, our results indicate that the κ-PDF measured from the straightforward to implement spherical-sky KS reconstructed κ-map on the observed shear field (in the presence of masks) can be treated accurately within a theoretical framework without the need for any forward-modelling simulation-based approach.In particular, though some of the systematics modelling presented in this paper might need to be improved for Stage-IV surveys, the modelling of masks in combination with LDT in the context of the BNT transform will remain valid.This paves the way for us to explore the power of the κ-PDF in probing higher-order information in current lensing surveys such as DES, and in the future using Euclid and Vera Rubin's LSST data.We will perform cosmological inference analyses on simulated and real data in the following papers of this 'Making the leap' series.6.We give values for the reconstructed full-sky κ E variance and the variance and skewness values across the pixels under the mask that we take into account as described at the end of section 3.3.
⟨κ 3 ⟩ c,true and ⟨κ 3 E ⟩ c,unmasked values).As expected, the reconstruction does not significantly improve nor reduce the accuracy of the large-deviations formalism describing the underlying dynamics of the matter density field.The values given in table 2 are supplementary to figure 6 and are thus taken from a reconstructed field under the DES Y3 mask and for sources in the Takahashi simulation mimicking the DES Y3 4 th bin displayed in figure 3.

C Square mask for sources in the DES Y3 4 th bin
We here replicate the results of section 5.3, that is a source distribution (shown in figure 3) mimicking the 4 th source bin of the DES Y3 analysis but this time using a square mask without holes of roughly the same area as the DES Y3 mask.Since by construction the KS effects on the variance are perfectly taken into account, any discrepancy in the modelling of the KS effects comes at the level of the skewness of the reconstructed κ E field.As before, we quantify the quality of the KS modelling by comparing the residuals of the theoretical KS PDF to the simulated KS reconstruction PDF and the residuals of those same PDFs without any reconstruction.That way we visually do not include the intrinsic quality of the LDT prediction in our assessment.
Looking at the bottom panel of figure 9 we observe that the green line matches better the mean residuals of the theoretical KS PDF to the simulated KS reconstruction PDF (blue points) than what we could observe in the case of the true DES Y3 mask in figure 6.At the same time the impact of neglecting the KS effects in the modelling are less pronounced for this idealised mask (grey points).This is understandable to a certain extent.Indeed, the presence of multiple holes of various sizes within the DES Y3 mask makes the mixing of wave-modes in the E to E/B modes leakage more complicated to a degree where our mitigation (performed by restricting ourselves to the less affected pixels, cf.section 3.3) is less effective than in this simpler square-mask case.Note that similarly, finding it harder to mitigate the E/B mode leakage, would happen if the area of the square patch is small since the boundary effects would then affect more seriously the entire available area under the mask.The next step to improve the modelling would be to compute explicitly the convolution kernel on the shear bispectrum induced by the mask and include the new correction on the skewness in a manner analogous to how we would include higher-order weak lensing correction, i.e. by modification of the spherical collapse parameter ν in equation (2.13) [67].This however does not seem to be necessary at this stage for a DES Y3 analysis.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 Takahashi maps.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals of the KS-theory to the simulated reconstructed KS PDF (blue points).Though their inclusion allows to recover the full predictive power of our LDT formalism, neglecting the KS reconstruction effects is less imperative for this simple idealised square-mask, as opposed to the more realistic DES Y3 mask.This is understandable as discussed in appendix C.

0 1 Figure 1 .
Figure 1.Illustrative Mollweide projection of the binary mask applied to the full-sky shear field to mimic DES Y3 observations.The mask is as close as possible to the real one, notably keeping the multiple holes of different sizes across the field of view.The observed fraction of the full-sky is f sky = 0.1149.

Figure 2 .
Figure 2. Illustrative Mollweide projection of the full-sky reconstructed κ E,FS field for sources mimicking the DES Y3 fourth redshift bin and from a simulated shear field under the DES Y3 mask shown in figure 1.The reconstructed κ E field has been further smoothed by a top-hat window of radius 20 arcmin.Observed carefully, one can spot the non-locality of the KS reconstruction through the blurring of the mask boundaries and holes.It should also be noted that the majority of the pixel values far outside the survey footprint fluctuate very closely around zero (but are not exactly zero), as expected.

Figure 3 .
Figure3.Weights applied to the discrete source planes of the Takahashi simulation to mimic the DES Y3 source distribution in its fourth redshift bin.The weights are normalised so that their sum is equal to one.The thin line is only there to guide the eye and does not have any meaning.

Figure 4 .
Figure 4. Illustration of the effect of the BNT transform on lensing kernels.The green, yellow and blue dashed lines are the kernels up to z s = 1.0, 1.2, 1.4 respectively re-weighted by their appropriate BNT coefficients, M ij = [0.80,−1.80, 1] where j is fixed and equal to 3 if the blue kernel is the third of a tomographic analysis.The thick red line is the sum of the 3 re-weighted kernel.The effect of nulling is to set to zero the contribution of all lenses below the closest source plane.

Figure 5 .
Figure5.Top panels: PDF of the Kaiser-Squires reconstructed nulled convergence map using source planes located at z s = 1.0, 1.2 and 1.4 under the DES Y3 mask.The red line is the theory described by equation (4.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 realisations.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction masking effect.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals between the KS-theory and the simulated reconstructed KS PDF (blue points).Neglecting the KS reconstruction effects (grey points) significantly worsens the quality of the theoretical prediction while taking them into account does not improve nor worsen the predictive power of our LDT formalism.

Figure 6 .
Figure 6.Top panels: PDF of the Kaiser-Squires reconstructed convergence map for the simulated DES Y3-like 4 th bin in the Takahashi simulation and under the DES Y3 mask.The red line is the theory described by equation (4.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 masked Takahashi maps.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals between the KS-theory and the simulated reconstructed KS PDF (blue points).Neglecting the KS reconstruction effects (grey points) significantly worsens the quality of the theoretical prediction while taking them into account preserves the predictive power of our LDT formalism.We note that the slight increase in the size of residuals is due to the complicated shape of the mask and presence of holes as demonstrated in figure9for the case of the square mask of similar area.

Figure 7 .
Figure 7. Top panel: smoothed convergence PDF from the BAHAMAS runs for the full DES Y1 n(z) (black) and the 4 tomographic bins (colours) for the DM-only run (solid) and the high AGN run (dashed) averaged over 25 lensing cones.The dotted vertical line at κ = 0 highlights the non-Gaussian shape of the κ-PDF.Bottom Panel: residual of convergence PDF comparing the hydro BAHAMAS runs with fiducial (blue), high (red) and low (green) AGN feedback to the DM-only result for the whole n(z).The solid lines indicate the residual between the means (averaged over 25 runs) while the shaded bands indicate the mean and standard deviation of the residual ratio in individual runs (obtained from 25 runs).The horizontal black dashed and dotted lines indicate a residual of 2% and 1%, respectively.

Figure 8 .
Figure 8. Residual of ν = κ/σ lensing PDF comparing the hydro BAHAMAS runs with fiducial (blue), high (red) or low (green) AGN feedback to the DM-only result averaged over 25 lensing cones smoothed for the whole n(z) and in two different tomographic bins corresponding to DES Y1 n(z).Bin 2 corresponds to the range z ∈ [0.43, 0.63] while bin 4 corresponds to z ∈ [0.9, 1.3].The solid lines indicate the residual between the means (averaged over 25 runs) while the shaded bands indicate the mean and standard deviation of the residual ratio in individual runs.

Figure 9 .
Figure 9. Top panels: PDF of the Kaiser-Squires weak-lensing convergence map for the DES Y3-like 4 th bin in the Takahashi simulation and under an idealised square-patch mask of 70 × 70 deg 2 .The red line is the theory described by equation (4.3) and the blue points with error bars are the mean and 1σ fluctuations of the measured PDF in 108 Takahashi maps.The grey solid line is the theory described by equation (2.17) that is without taking into account the KS reconstruction.Bottom panel: residuals between the theory and the simulation.The green line is the residual of the original non-KS theory to the true, non-reconstructed simulated convergence PDF.It is highly compatible with the residuals of the KS-theory to the simulated reconstructed KS PDF (blue points).Though their inclusion allows to recover the full predictive power of our LDT formalism, neglecting the KS reconstruction effects is less imperative for this simple idealised square-mask, as opposed to the more realistic DES Y3 mask.This is understandable as discussed in appendix C.