The DECam Ecliptic Exploration Project (DEEP): V. The Absolute Magnitude Distribution of the Cold Classical Kuiper Belt

The DECam Ecliptic Exploration Project (DEEP) is a deep survey of the trans-Neptunian solar system being carried out on the 4-meter Blanco telescope at Cerro Tololo Inter-American Observatory in Chile using the Dark Energy Camera (DECam). By using a shift-and-stack technique to achieve a mean limiting magnitude of $r \sim 26.2$, DEEP achieves an unprecedented combination of survey area and depth, enabling quantitative leaps forward in our understanding of the Kuiper Belt populations. This work reports results from an analysis of twenty 3 sq.\ deg.\ DECam fields along the invariable plane. We characterize the efficiency and false-positive rates for our moving-object detection pipeline, and use this information to construct a Bayesian signal probability for each detected source. This procedure allows us to treat all of our Kuiper Belt Object (KBO) detections statistically, simultaneously accounting for efficiency and false positives. We detect approximately 2300 candidate sources with KBO-like motion at S/N $>6.5$. We use a subset of these objects to compute the luminosity function of the Kuiper Belt as a whole, as well as the Cold Classical (CC) population. We also investigate the absolute magnitude ($H$) distribution of the CCs, and find consistency with both an exponentially tapered power-law, which is predicted by streaming instability models of planetesimal formation, and a rolling power law. Finally, we provide an updated mass estimate for the Cold Classical Kuiper Belt of $M_{CC}(H_r<12) = 0.0017^{+0.0010}_{-0.0004} M_{\oplus}$, assuming albedo $p = 0.15$ and density $\rho = 1$ g cm$^{-3}$.

Beyond the orbits of the major planets, our solar system hosts a large population of minor bodies known as Kuiper Belt Objects (KBOs).In the 30 years since the observational establishment of the Kuiper Belt (Jewitt & Luu 1993), several surveys (e.g., Millis et al. 2002;Bernstein et al. 2004;Petit et al. 2011;Bannister et al. 2018;Bernardinelli et al. 2022) have pushed the inventory of known objects to nearly 4000.These bodies, which are left over from the birth of our planetary system, provide constraints on its formation and dynamical evolution.When taken in aggregate, their dynamics, compositions, and sizes enable us to infer details about the dynamical evolution of the planets, the composition of our solar system's protoplanetary disk, and even the physical processes by which planetesimal formation occurred (see, e.g., Nesvorný 2018, Gladman & Volk 2021).
In particular, the size distribution of the so-called Cold Classicals (CCs)-which are thought to be relics of the birth of the solar system, relatively untouched and uncontaminated in the ∼ 4.5 Gyr since their formation-is a sensitive probe of the process of planetesimal formation.If the CCs are truly a quiescent population, a measurement of their size distribution can provide us with a unique opportunity to directly compare a primordial size distribution with predictions made by planetesimal formation models.Such a comparison will enable us to hone our formation models, and better understand the details of the physical processes at play in planetesimal formation, as well as the specific conditions of our own protoplanetary disk.
Recently, the streaming instability (SI; Abod et al. 2019) has begun to emerge as a leading theory of planetesimal formation.Numerical SI simulations predict an exponentially tapered power law absolute magnitude (H) distribution, enabling a direct comparison between theory and observation.Kavelaars et al. (2021) found that the absolute magnitude distribution of the CCs detected by the Outer Solar System Origins Survey (OSSOS; Bannister et al. 2018) is consistent with an exponentially tapered power law.However, the CCs used by Kavelaars et al. (2021) only went as faint as H r ∼ 8.3, leaving faint-end consistency with the SI as an open question.Existing literature on measurements of the faint end of the CC absolute magnitude distribution seems to be in weak tension with SI models of planetesimal formation.In particular, Fraser et al. (2014) find a marginally steeper faint-end size distribution than is predicted by SI simulations.However, a dearth of observed objects at the faint end of the distribution, along with the fact that Fraser et al. (2014) did not fit to an exponentially tapered power law, limits the usefulness of such comparisons.We require a larger, deeper set of CC detections by a survey with well-understood biases to thoroughly test any planetesimal formation theory.
The DECam Ecliptic Exploration Project (DEEP) is the first survey with sufficient depth and areal coverage to settle the tension in the shape of the faint end of the H distribution of the CCs.In this paper, we analyze data from 20 DECam fields (comprising an area of approximately 60 sq.deg.), reaching a mean limiting magnitude of m r ∼ 26.2.We use a subset of our data to reconstruct the luminosity function of the Kuiper Belt as a whole, as well as the luminosity function of the CC population.As the main scientific result of this paper, we use our results to reconstruct the underlying absolute magnitude distribution of the CCs, and find consistency with models of planetesimal formation via the streaming instability (Abod et al. 2019;Kavelaars et al. 2021).
This paper is organized as follows.In Section 2 we outline our observational strategy and discuss the data used in the subsequent analysis.Next we describe our image pre-processing pipeline (Section 3) and the pipeline used to carry out the object search (Section 4).We present an overview of our detections in Section 5.In Section 6 we calculate our detection efficiency using implanted synthetic objects.We compute the luminosity function for our KBOs as a whole in Section 7. In Section 8 we isolate a sample of CCs from our detections.Our main scientific result, presented in Section 9, is a calculation of the absolute magnitude distribution for the CC population.In Section 10 we calculate an estimate of the total mass of the CCs.In Section 11 we test the consistency of our absolute magnitude distributions with the results from Bernstein et al. (2004).The paper concludes in Section 12 with a summary of our results and a discussion of their implications.

DEEP SURVEY STRATEGY AND DATA
DEEP was carried out with the Dark Energy Camera (DECam) on the 4-meter Blanco telescope located at Cerro Tololo Inter-American Observatory in Chile from 2019-2023, targeting four patches of sky along the invariable plane (see Trilling et al. 2023 andTrujillo et al. 2023, hereafter Papers I andII, for more details).This paper focuses on the data taken in one of those four patches, our so-called B1 fields, from 2019-2021.These data consist of 20 individual DECam pointings targeting a progressively larger area of sky from 2019-2021, with significant overlap between years in order to enable the tracking of KBOs.Table 1 gives the pointing for each night of data, as well as the detection efficiency parameters, calculated in Section 6.Note that in order to avoid double-counting of single-epoch detections we only use a subset of our fields (indicated in Table 1 to derive the constraints in Sections 7 and 9. A DEEP exposure sequence, designed with a cadence ideal for a technique called shift-and-stack (Tyson et al. 1992;Gladman & Kavelaars 1997;Allen et al. 2001;Bernstein et al. 2004;Holman et al. 2004;Parker & Kavelaars 2010;Heinze et al. 2015;Whidden et al. 2019), typically consists of ∼100 consecutive 120-second VR-band exposures of the target field.1In the shift-and-stack approach, single-epoch images are shifted at the rate of a moving object (rather than at the sidereal rate) so that a moving object appears as a point source in the co-added stack.
There are two primary reasons why the shift-and-stack technique is preferable to long exposures for the discovery of moving objects.First, since the rate and direction of an object's motion are not known a priori, we are able to stack our images in a grid of velocity and direction that spans the space of possible KBO motions.The second benefit is the preservation of the S/N of moving objects.For stationary sources in astronomical images, S/N goes like t 1/2 .A source is effectively stationary if its apparent position changes by less than the size of the PSF over the course of the exposure.This sets an upper limit on the useful exposure time when searching for moving objects, which we will call t max .Given DECam's typical VR-band seeing of ∼0.9" and the typical KBO rate of motion of a few arcseconds per hour, our value of t max is on the order of several minutes.When t ≥ t max , the S/N of stationary sources continues to go like t 1/2 , while smearing causes the S/N of moving sources to go like t 0 .Thus moving objects fade into the background while the S/N of background sources continues to grow.Because DECam's CCDs have negligible read noise, we lose no sensitivity by adding together many short images, and thus the S/N of the moving objects continues to increase like t 1/2 .
The DEEP "B1" TNO search fields used in this analysis.Each hexagonal shaded area represents the DECam focal plane with its 61 active CCDs.The B1a-c fields were observed with integrated exposure times of ∼ 3.5 hours in August 2019 (left), and re-observed at suitably shifted positions (not plotted) in September 2019.In October 2020 we observed the B1a-f fields (center).In October 2021 we observed the B1b-f, h, and i fields.The larger areas in 2020 and 2021 account for diffusion of the 2019 detections.The plotted points represent our putative KBO detections.Note that Neptune is near the B1 fields, meaning that there should be relatively few resonant objects among our detections.1. DEEP telescope pointings used for the long-stare image sequences described in this analysis.The positions of the fields at each epoch aim to track as many objects as possible by accounting for the effects of Earth reflex motion and TNO shear.Each exposure is 120 seconds long and is taken in the VR band.The next three columns show the best fit of each night to Equation ( 5).The final two columns indicate whether a field was used to derive constraints for either the full KBO population or the CC population.

IMAGE PRE-PROCESSING
In this section we describe how our images are processed in preparation for our shift-and-stack pipeline (described in Section 4).The following steps take place after the images have gone through preliminary reductions with the DECam community pipeline (Valdes et al. 2014).

Synthetic TNOs
To enable studies of our efficiencies, we generated a population of several thousand synthetic sources to plant in our images.These synthetic sources were not meant to emulate a realistic population, but rather to test efficiency across the space of all possible bound orbits in the Kuiper Belt.They span distances from ∼ 20 au to a few hundred au, and include fully retrograde orbits.To enable studies of efficiency as a function of brightness, we have given our synthetic sources apparent magnitudes as bright as 20, and as faint as 27.2, as well as sinusoidal rotation curves with amplitudes as large as 0.5 mag, and rotation periods between a few hours and a few days.

Flux Calibration and Synthetic Source Injection
To calibrate the flux of our synthetic sources we calculate the photometric zero-point for each individual CCD image by cross-matching the non-streaked sources (ellipticity < 0.82 ) against Pan-STARRS sources (Magnier et al. 2013) with r SDSS magnitude3 between 15 and 21.We then use the Python package SpaceRocks (Napier 2020) to calculate the sky position, sky motion, and brightness of each synthetic TNO, including a rotation lightcurve.With the sky motion of each object and the PSF of the image, we generate a streak model for each synthetic TNO.With the brightness, streak model, and photometric zero-point specified, we inject the synthetic TNOs into the image. 4Along with the synthetic TNOs, we also inject 12 stationary synthetic point sources with an r-band magnitude of 21 into each CCD image, in order to enable calibration after difference imaging.We add these 12 stationary sources at fixed pixel locations in the images (i.e., not fixed sky positions), so they do not appear in the template, and thus remain in difference images.

Difference Imaging
After we implant synthetic sources, we prepare the images for the shift-and-stack pipeline.To do this, we must remove every stationary source-even the faintest sources that are not visible in single exposures.We apply the High Order Transform of Psf ANd Template Subtraction code hotpants (Becker 2015), which implements and improves upon the method of Alard & Lupton (1998) to create difference images.This code formed the basis for the Dark Energy Survey's supernova search pipeline (Kessler et al. 2015), and has consequently been thoroughly exercised on DECam data.
We first assemble the collection of exposures in the same observing run, including images from both the long and short stares (see Papers I and II for details).We generate three different templates by median-combining the single epoch images with seeing (by measuring the FWHM of the in-frame stars) in the top, middle, and bottom 1/3 of the ensemble.We require the minimum time separation between observations to be longer than 0.01 days (14.4 minutes) to ensure that the templates contain minimal flux from the slow movers.The hotpants algorithm then performs seeing matches between science images and the template with the closest match to the image's seeing to generate difference images.The better-seeing images (either single epoch or template) are convolved to match the images with the worst seeing, and the bright sources (pixel counts > 3000) are masked before performing image subtraction.The final step is masking the bright Gaia sources (G > 18) and regions where there is a contiguous group of at least 5 pixels above/under the ±2σ level.This step usually masks less than 1% of the total pixels and not only cleans out most of the artifacts generated by the difference imaging process, but also removes streaks from artificial satellites, thus greatly reducing the false detection rate in shift-and-stack images.However, this masking comes with the caveat of masking bright sources.In practice, we find that sources brighter than VR∼ 24-24.5 are masked.To ensure that we recover the bright objects, we also write out difference images in which we do not mask the pixels above/under the ±2σ level.While the un-masked images produce significantly more spurious sources after shifting and stacking, we simply ignore the faint sources produced by these images, opting to only consider sources with S/N≥ 30, thus finding all of the bright sources with minimal additional effort.
Finally we use the stationary magnitude 21 fakes to re-scale each difference image such that it has a zero-point of 30.After re-scaling, we compute weight images as the inverse of the variance in the difference images.Using weight images enables us to optimize the S/N of our stacks without manually rejecting images.5

The Grid
To carry out our search, we developed a novel method of generating the shift-and-stack grid.We begin by computing the grid bounds.A unique grid is required for each field, as its exact shape depends on the epoch and sky position of a pointing.We must also select the range of topocentric distances (∆) of interest.For the search described in this paper, we consider ∆ ∈ [35, 1000] au.6With RA, Dec, and ∆ fixed, an object's position vector in the topocentric frame, ⃗ x T is uniquely determined.We then change the origin from the topocentric to the barycentric frame so that we have ⃗ x B .7 After changing the origin to the barycentric frame, we assign a velocity v bound such that an object at the position ⃗ x B would be barely bound to the solar system (specifically, we use the speed appropriate for a semi-major axis a = 200,000 au).We then uniformly sample a collection of velocity vectors ⃗ v B on the surface of the sphere of radius v bound .With the velocity vectors specified, the state vectors are fully determined, allowing for the computation of the corresponding RA rates and Dec rates as observed in the topocentric frame. 8We repeat this process at several discrete values of ∆, and then use the Python package alphashape to draw a concave hull bounding the computed rates.This hull encompasses the full region of physically possible sky motions for the distances of interest.
Once we have computed the boundary of the concave hull for a given pointing, we choose a finite set of rates at which to stack our data.Toward this end, we employ a new method in which we fill the hull with a large sample of random points, and then use a K-means clustering algorithm to divide the region into N clusters.One can use dimensional analysis to determine that N ∝ A hull /ϵ 2 , where A hull is the area of the hull, and ϵ is the desired grid spacing to minimize the maximum source trailing (roughly determined by the PSF width divided by the duration of the exposure sequence).We take the centroid of each cluster as a grid point.When compared to a rectangular grid, this method allows us to use ∼20% fewer grid points, and simultaneously reduce both the mean and maximum distance of any given point in the hull to the nearest grid point.As a result, we achieve more even coverage of physically possible rates, while minimizing the computational cost and opportunities for false positives.Although the process is random by nature, a sufficient number of random samples makes it nearly deterministic.We show an example of a grid for a 4-hour exposure sequence with 1" seeing in Figure 2.

The Shift-and-Stack Procedure
After computing the shift-and-stack rates, we proceed with the stacking.We designate the first exposure in an exposure sequence as a reference image.We then compute the RA and Dec at the center of the reference image, and use the RA rate and Dec rate to compute the amount by which we have to shift each image to match with the reference image's center.In other words, we stack the pixels along the path taken by the center of the reference image for given RA and Dec rates.We do not consider variations in focal plane geometry across the chip, as the solid angle of a single chip is rather small (we thus assume that the chips are locally flat), and all of the images are SWarped (Bertin et al. 2002) for the difference imaging process.We perform separate stacks for both the weighted signal (i.e. the signal multiplied by the weight) and weight images, and then obtain the full stack by dividing the stacked weighted signal image by the stacked weight image.
After each stack we use the Python package sep (Barbary 2016;Bertin & Arnouts 1996) to extract all sources with at least 3 pixels with values above 1.5σ.9Each stack is contaminated with of order a few ×10 3 spurious sources mostly consisting of cosmic rays, dead pixels, over-saturated pixels close to bright stars, and residuals from poorly subtracted stars and galaxies.Since we use approximately 100 stacks per chip, the total number of spurious sources per chip is close to 10 5 .Because the vast majority of spurious sources are not PSF-like, we have trained convolutional neural nets (CNNs) using tensorflow (Abadi et al. 2015) to reject them automatically.We trained one CNN on synthetic sources superimposed on background from DEEP difference images, and another on the autoscan training set that was used to train a random forest algorithm for background rejection in the Dark Energy Survey (DES) supernova search (Goldstein et al. 2015).Both CNNs retain nearly all of the signal and fail to reject different types of background, thus enabling a significant performance gain by requiring a source to be classified as good by both CNNs.This procedure cuts the number of sources per stack by three orders of magnitude, down to a few ×10 2 .After all of the stacks are completed for a given chip, we consider the complication that most objects are bright enough to be detected in adjacent stack rates.To eliminate this redundancy, we employ a DBSCAN (Ester et al. 1996) clustering algorithm in pixel space to group detections associated with the same object.
The grid spacing in the initial shift-and-stack is good enough for source detection, but is too coarse to provide the best values for the position and rate for a given source.To refine the parameters, we use a Markov Chain Monte Carlo (MCMC) approach in which we perform targeted stacks on our candidates to maximize S/N.These targeted stacks are still restricted to the parameter space of bound orbits, but are now continuous in RA and Dec rates.This procedure enables us to obtain refined RA and Dec rates with uncertainties, while simultaneously optimizing the measured RA and Dec of the source.We use the uncertainties in RA rate and Dec rate (which are typically about 0.1"/hour in each dimension) to probabilistically classify our detections in Section 8.After refining the parameters of our detections, we discard all sources with rates slower than 3 pixels per hour (0.79"/hour, or distance ≳ 150 au), as such slow rates tend to accumulate false positives due to subtraction artifacts much more quickly than faster rates.We feed our remaining candidates through a final CNN that reduces the number of sources per chip to ∼ 10.The images we show to the CNN are similar to the right panel in Figure 3, and contain more information than the cutouts we show to the first CNN.Good sources tend to show a characteristic radial pattern, while false sources do not.Once we have refined our sources' rates and positions, we compute their flux and flux uncertainty using sep.We use these values to calibrate the magnitudes of our detections against the known magnitudes of our implanted sources, as well as obtaining a magnitude uncertainty.
Finally we do a reverse stack on our data, in which we repeat the above procedure with negated RA and Dec rates.Because no physical KBO would appear as a point source when stacked at these rates, all sources that result from this stack are false positives.This reverse stack enables the critical step of accounting for false positives in our detections.In Figure 4 we show differential histograms of the number of sources resulting from the forward and reverse shift-andstack as a function of S/N, both before and after applying weights and various cuts.In Figure 5 we show a scatter plot of the RA and Dec rates of all detections from the forward and reverse shift-and-stack, prior to human vetting.

Candidate Vetting
The final step in the discovery pipeline is the visual inspection of the grids that were labeled as good by the final CNN.This step includes the implanted synthetic sources and the sources from the reverse shift-and-stack.To do this visual inspection we developed a website where users vote yes, no, or maybe on a candidate and have their vote recorded to a database.The images are presented to the vetters in a blind manner, meaning that the vetter has no indication of whether an object is an implanted source, a source from the reverse shift-and-stack, or a true candidate.By blindly vetting all sources, we can reliably compute a voter's true and false positive rates for yes, no, and maybe votes.We require votes from three unique vetters for each object, and then combine the votes into a probability that a source is "real" using the following framework.
The odds (i.e., betting odds) of a source being good given a vote are where O represents odds and P represents probability.The + symbol means that the object is truly a good source, and the − symbol means that the object is truly a bad source.We calculate the prior O(+) using the excess in the number of sources in the forward shift-and-stack over the number of sources in the reverse shift-and-stack.The fraction on the right-hand-side of Equation ( 1) is the Bayes factor, The quantity P (vote|+) is calculated as the probability of assigning a given vote to an implanted source.Similarly, the quantity P (vote|−) is calculated as the probability assigning a given vote to a source from the reverse shift-and-stack.
Given multiple votes, we can simply update the information by taking the product of Bayes factors as where B i is the Bayes factor of the ith voter.We can then convert the odds from Equation (3) into the probability that a source is real as We assign the values calculated by Equation ( 4) as a weight (w) for each of our detections.This treatment allows us to take a probabilistic approach in studying our detections in Sections 6-9.10In principle, P (vote|+), P (vote|−), and O(+) can all vary with magnitude and rate of motion.In practice, however, we found that P (vote|+) and P (vote|−) did not vary much over the range of brightness and rate that we considered for our fits in Sections 7 and 9, so we considered them to be constants.Similarly, parameterizing O(+) had little effect on our detections' weights after human inspection, so we chose to treat it as constant.11See Table 2 for a tabulation of our vetters' Bayes factors, and see Figure 6 for correlations between vetters' votes.

DETECTIONS
In this section we qualitatively analyze our detections; we do more thorough quantitative analyses in Sections 6-9.In our 20 nights of data we detected a weighted sum of 2297.9 objects with weight greater than 0.01, corresponding to 2896 unique sources.We have elected to omit 3698 sources with weight less than 0.01, as such detections are rather unlikely to be real, and their omission does not change the results of our analysis.While the majority of our remaining sources have weight close to 1, there are some more ambiguous cases.We show a mosaic of all detections with weight ≥ 0.4 in Figure 7. Based on our distribution of observed magnitudes (shown in Figure 8), it appears that our detection efficiency begins to fall off at magnitudes fainter than r ∼ 26 (though we explore this in detail in Section 6).It is also informative to examine the sky moving rates of our detections, which we display in Figure 9.In this figure, the size of each marker is proportional to its weight.The most apparent feature here is a large population of objects moving at approximately 3" per hour, mostly corresponding to CCs (see Section 8).Based on the increased density of small points at slow rates of motion, we note a propensity for slow-moving false positives to pass our CNNs, but to be later given low weights after human inspection (there is a noticeable overdensity of points at slow rates in Figure 5 which is absent in Figure 9).At faster rates, points with low weights appear to be evenly spread.The presence of such features provides further evidence that a great deal of care in avoiding false positives is required when making statistical use of single-night detections.When such detections can be linked to several epochs, single-night false positives will be less problematic.

DETECTION EFFICIENCY
In order to make use of our detections, we must understand the efficiency with which we recover our implanted synthetic sources.We parameterize the detection efficiency as a function of apparent magnitude using a single hyperbolic tangent function, given the following equation where η 0 is the peak detection efficiency, m 50 is the magnitude at which the detection efficiency drops to η 0 /2, and σ is the width of the hyperbolic tangent function. 12We weight each of the detections in our fit using Equation (4), and maximize the likelihood given by ln L(θ) where θ is the vector of function parameters, and undetected fakes receive w = 0. We display the best fit for each night in Figure 10, and list the fit parameters in Table 1.The combined efficiency for all 20 nights of data is given by the dashed line, while individual nights' efficiencies are given by the grey lines.The average m50 for our entire survey is mr ∼ 26.2, with individual nights ranging from 25.92-26.65.Our average peak efficiency, η0, is ≳ 0.92, with individual nights ranging from 0.85-0.95.For reference, we also show limits from the Dark Energy Survey (DES) (Bernardinelli et al. 2022), OSSOS (Bannister et al. 2018) and LSST (Ivezić et al. 2019).

THE LUMINOSITY FUNCTION
We use our characterized detections to compute the differential sky density Σ of the Kuiper Belt as a function of apparent magnitude. 13We reiterate that for this analysis we are using only single-epoch data (i.e., we have not linked these detections across multiple epochs), so we must treat each night as an independent survey.However, because our survey was designed to detect objects multiple times, our nights are not all statistically independent.As such, we selected a subset of our data consisting of the statistically independent set of nights {B1b 20201015, B1c 20201016, B1e 20201017, B1a 20201018, B1f 20201020, B1d 20201021} to do our fits.This subset offers the best combination of survey area and depth among all possible subsets of our data.14Note that when fitting our data, we truncate our detection efficiency at m 50 , and ignore all fainter detections.
For a given probability distribution Σ(m), the expected number of detections by a survey is given by where Ω is the survey's areal coverage and η(m) is its detection efficiency.The variable θ is a vector of function parameters.Next, the probability of randomly drawing an object with magnitude m from Σ(m) is given by where ϵ is a functional representation of the magnitude uncertainty for which we have adopted a Gaussian centered at m, with a width of δm. 15e calculate the underlying luminosity function (Σ) of the Kuiper Belt by maximizing the likelihood (L) given by (see, e.g., Loredo 2004, Fraser et al. 2014) where the index k runs through each night of data, and the index j runs through a survey's detections.The value w j,k denotes the weight of the jth object detected by the kth survey, as calculated by Equation (4).16 Several previous works have studied the form of the luminosity function for KBOs (Bernstein et al. 2004;Petit et al. 2006;Fraser et al. 2008;Fraser & Kavelaars 2009;Fuentes et al. 2009).Following the example of these studies, we fit our data with functional forms of varying complexity.We first try a single power law given by Σ single (m) = 10 α(m−m0) , (10) where m 0 is the magnitude at which the density of objects is one per square degree, and α is the power law slope.We next try a rolling power law given by where Σ 23 is the number of objects with m r = 23 per square degree, while α 1 and α 2 control the shape of the function.We fit a broken power law given by were m 0 is a normalization parameter, m B is the magnitude at which the break occurs, and α 1 and α 2 are the bright-and-faint slopes, respectively.Finally we fit an exponentially tapered power law with the functional form where α is the faint-end power law slope, β is the strength of the exponential taper, m 0 is a normalization parameter, and m B is the magnitude at which the exponential taper begins to dominate.
In each case, we obtain a best fit using an optimizer, and then estimate our uncertainties by running an MCMC.We then use a survey simulation technique to test the quality of our fits.We randomly sample a population of objects from our best-fit Σ(m|θ), and then impose the detection criteria of our survey to simulate detections.After we simulate our detections, we construct a simulated empirical distribution, E(m).By repeating this process many times, we end up with an ensemble of simulated empirical distribution functions, {E i (m)}.This ensemble is representative of the distribution of possible outcomes for our survey under the assumption that our best fit Σ(m|θ) is the underlying truth.Next we calculate the upper and lower limits for the central 95th percentile of {E i (m)} as a function of m, which we call u(m) and l(m), respectively.We use these values to define a test statistic that measures the fraction of the interval m over which E i (m) is an outlier at the 95 percent level.The statistic, which we call S, is given by where We calculate S for each simulated survey to assemble a set {S i }, and then for our actual detections, S d .We then compute the quantile of S d among {S i }, which we call Q outlier .Values of Q outlier > 0.95 indicate a poor fit.Finally, we compute the Bayes Information Criterion (BIC) for each of our fits.In general, lower values of BIC indicate a preferred model.We demonstrate our results in Figure 11, and summarize them in Table 3.First we note that like several studies before us, we strongly rule out the single power law.The rolling, broken, and exponentially tapered power laws all provide acceptable fits.Judging by the BIC, the rolling power law is marginally preferable to the broken and exponentially tapered power laws, but we contend that a dearth of detections brightward of m r = 24 limits the usefulness of such comparisons.There is also no overwhelming physical motivation for choosing between the distributions.While Kavelaars et al. (2021) showed that the absolute magnitude distribution of the CCs is well-described by an exponentially tapered power law, there is no reason a priori that it should fit the full Kuiper Belt luminosity function.In fact, we might expect the full Kuiper Belt luminosity function to be fit poorly by an exponentially tapered power law, because it is likely a mix of multiple distributions.However, as we show in Section 8, our detections are dominated by CCs.It is therefore unsurprising that the exponentially tapered power law yields a reasonable fit.Given these considerations, we opt not to choose a preferred model, and instead claim for the time being that all three distributions provide acceptable fits to the DEEP detections. .Best-fit differential distributions for the single, rolling, broken, and exponentially tapered power laws in purple, blue, yellow, and red respectively.The points and 1-σ error bars represent the differential distribution of our detections, corrected for efficiency and weight.The points are meant only as a visual aid; we always use the maximum likelihood technique to fit our data.Note that our fits are not normalized with respect to latitude, but rather they represent the average sky density among the fields in our pointing mosaic.3. Best fit parameters and statistics for each of the distributions we tested on the full KBO sample.The BICs have been normalized such that the minimum value among the distributions is 0.

ISOLATING A SAMPLE OF COLD CLASSICALS
While our single-night detections do not provide nearly enough information for secure dynamical classification, we can use the detections' rates of motion to make assumptions about their dynamical classes.Consider the space of allowable sky motions for bound orbits shown in Figure 12.As before, the large black outline contains the space of possible rates of motion for objects on bound orbits at topocentric distances of 35-1000 au.The blue region is the allowable parameter space for CCs (42.4 au < a < 47.7 au, 0 < e < 0.2, 0 • < i < 4 • ), and the red region is the allowable parameter space for a somewhat arbitrary definition of Hot Classicals (HCs) that we are only using for the purpose of demonstration (42.4 au < a < 47.7 au, 0 < e < 0.2, 4 • < i < 45 • ).Note that the exact shape and orientation of these regions varies as a function of sky position and epoch.
In practice, there is some overlap in the rates of motion between different dynamical classes, which is accentuated by the uncertainty in our rate measurements.We use survey simulation to isolate the CCs in our detections.Using the OSSOS++ Kuiper Belt model17 , we calculate which objects our survey would have detected, and apply a smear in the RA and Dec rates consistent with our real detections.For each simulation we use a kernel density estimator to draw a region containing 95% of CCs.We find that we can isolate a fairly pure sample (on average ∼ 70% purity; see Table 4) of CCs from our single-night data.This purity is somewhat serendipitous, as the DEEP B1 fields happen to be in a patch of sky that is relatively uncontaminated by objects in resonance with Neptune.We speculate that the uncertainties inherent to the distribution of objects in the OSSOS++ model are larger than the purely statistical uncertainties listed in Table 4. Nevertheless, these values should provide a realistic estimate of the purity of the CC samples that we attempt to isolate for the analysis in Section 9. Finally, we obtain a distance estimate and uncertainty for each of our CC detections.This estimate relies on the fact that in the regime of the CCs, the relationship between an object's heliocentric distance and the inverse of its apparent rate of motion can be well-approximated as linear.For each night of data, we project our CC population model (obtained from the OSSOS++ model) into the space of RA rate and Dec rate, and fit a line relating distance to the inverse of the apparent sky motion.We then use the resulting relationship to compute the heliocentric distance of each of our detections.By sampling from the covariance of our detections' rates, we obtain Gaussian distance uncertainties of at most 1-2 au, which end up being precise enough to fit an absolute magnitude distribution (Section 9).

THE ABSOLUTE MAGNITUDE DISTRIBUTION OF THE COLD CLASSICALS
Determining the absolute magnitude (H) of an object requires simultaneous knowledge of its apparent magnitude and its heliocentric distance.While our detections have reasonably well-constrained apparent magnitudes (∼ ±0.1), their distances are not as well-constrained from single-night detections a priori.However, as we found in Section 8, any true CCs are within a narrow range of r, with uncertainties of only 1-2 au, leading to uncertainty of only 0.1-0.2mag in H.
For a given probability distribution Σ(H), the expected number of detections by a survey is given by where Ω is the survey's areal coverage and Γ is the underlying radial distribution of objects in the survey's field of view.Note that we use the OSSOS++ Kuiper Belt model (Kavelaars et al. 2021) to compute a kernel density for Γ.  4. (Second column) Weighted number of detections in each of our 20 fields, along with the weighted number of objects consistent with being CCs.Note that we have ignored all objects fainter than the m50 of the night in which they were detected.
(Third column) Fraction by dynamical class of the objects with rates consistent with being CCs, as determined by survey simulation.The final column indicates whether a field was used to derive constraints for the CC population.
Since our fields are all observed near opposition, it is a good approximation to use H = m − 5 log 10 (r(r − 1)).( 17) Next, the probability of randomly drawing an object with magnitude m and heliocentric distance r from the distribution Σ(H) is given by where ϵ is a functional representation of the magnitude uncertainty, and γ is a functional representation of the uncertainty in heliocentric distance.Note that we have taken ϵ to be a Gaussian centered at m, with a width of δm, and γ to be a Gaussian centered at r, with a width of δr.
We again use a modified version of the method described by Fraser et al. (2014) in which we maximize the likelihood given by where the index k runs through the surveys (in our case individual nights of data), and the index j runs through a survey's detections.
We again fit single, rolling, broken, and exponentially tapered power laws.Note, however, that we have changed the definition of the rolling power law to where Σ 8 is the number of objects per square degree per magnitude at H r = 8.For these fits, we must also account for the contamination in our sample from non-CCs.We do this by sampling from each night's detections that have rates consistent with being a CC, accepting each detection with a probability given by the CC fraction column in Table 4.Note that we omit the B1f field due to the projected low purity of the isolated with that of Kavelaars et al. (2021).The two results are consistent (i.e., the OSSOS result is within our 95% confidence region), and both surveys are consistent with the result of Bernstein et al. (2004), which is the deepest KBO survey to date. Figure 14.Absolute magnitude distribution of the CCs in our sample compared to those in Kavelaars et al. (2021).The black line is our best fit exponentially tapered power law, and the dark grey region represents a 95% confidence interval.Note that we have no detections brighter than Hr ∼ 6.3.The red line is the best fit from Kavelaars et al. (2021), and the white and green circles were taken from the same work.The white circles represent where the inventory of the CCs is considered complete, and the green circle is computed using the detections from Bernstein et al. (2004).
The consistency of our detections with a rolling power law warrants careful consideration.First, we note that for appropriate parameters, a rolling power law is functionally equivalent to a Gaussian.All rolling power law fits we have presented in this work satisfy these criteria, and can therefore be equally well-represented as normal distributions.If the CCs are normally distributed in H, it would imply that they have a characteristic size, after which the H distribution turns over.While we currently have no reason to suspect that the size distribution turns over, our data cannot rule it out as a possibility.Furthermore, the only survey in the literature with data beyond our limit (Bernstein et al. 2004) cannot distinguish between the models, as it is consistent with extrapolations of both the rolling and exponentially tapered power laws.While more precise measurements of the CC H distribution from forthcoming DEEP data will help to bring the true distribution into clearer focus, an even deeper targeted CC survey may be necessary to distinguish between models.

MASS OF THE COLD CLASSICAL KUIPER BELT
Given our absolute magnitude distribution, we can calculate the total mass of the CCs as where F is an estimate of the average fraction of the total CC population per sq.deg. in one of our fields (we calculate F ≈ 3.27 × 10 −4 using the OSSOS++ model), and M is the mass of a body as a function of H given a geometric albedo, p, and a mass density, ρ.If we assume that ρ and p are constant among the population of CCs, we can manipulate Equation ( 21) to pull the ρ and p dependence out of the integral, as Note that the quantity 1329 3 π 6p 3/2 Σ(H|θ)10 −0.6H dH gives the volume of CCs per square degree in units of km 3 deg −2 .
We find that M CC (H r < 10.5) (i.e., the mass of the CCs up to our detection limit) is (at 95% CL) If we extrapolate our fits our to H r = 12, we find To facilitate a comparison, we can modify the form of the mass estimate reported by Bernstein et al. (2004), to where r is the average heliocentric distance of a CC. 18While the mass estimate from Bernstein et al. (2004) goes slightly deeper than our extrapolation, the extra mass is negligible, so the two mass estimates are in excellent agreement.Finally, we note that since our H distribution is consistent with that found by Kavelaars et al. (2021), our mass estimates must also be in agreement.

CONSISTENCY WITH DEEPER SURVEYS
The only survey in the literature that is significantly deeper than the present work is that of Bernstein et al. (2004), which reached m 50 = 29.0219over a search area of 0.019 deg 2 .Although the survey area is quite small, we can use it as a powerful lever arm to determine whether our fits remain valid down to H r ∼ 12.To do so, we simulate the survey of Bernstein et al. (2004), using our Σ(H) fits (and their uncertainties) true underlying CC H distribution.For each H form, we simulate the survey 10 3 times, and then ask whether the true number of detections by the survey (N = 3) is commensurate with the suite of simulations.In particular, we calculate P (≤ N ), the probability that the survey would have made fewer than or exactly 3 detections.For the broken power law we find P (<= 3) < 0.02, for the exponentially tapered power law we find P (<= 3) = 0.16, and for the rolling power law we find P (<= 3) = 0.65.
First, we note that our best-fit exponentially tapered and rolling power laws are both consistent with the B04 detections.We are hesitant to rule out the broken power law because we find that dropping the m 50 of B04 from r ∼ 29.02 to r ∼ 28.82 yields P (<= 3) = 0.05 for the broken power law.While this result is still indicative of a marginal fit, it is not strong enough to rule out the broken power law altogether.Although the B04 survey had sensitivity as faint as r ∼ 29.02, its faintest detection was r ∼ 28.23, so it is possible that the reported m 50 was overestimated.If, on the other hand, the reported efficiency of B04 is accurate, it is possible that the H distribution of the CCs flattens out (and possibly turns over) somewhere between the limits of DEEP and B04.While DEEP will not be able to resolve this issue, a very deep targeted CC survey should be able to answer the question (Stansberry et al. 2021).
12. DISCUSSION AND CONCLUSIONS In this paper we have presented our single-night detections from 20 nights of data in the DEEP B1 field.By using a shift-and-stack technique we were able to achieve an r-band depth of ∼ 26.2 over approximately 60 square degrees of sky.Our data yielded 2297.9 single-epoch candidate detections, including 1849.8 detections fainter than m r ∼ 25-the most detections fainter than m r ∼ 25 ever reported in a single survey by more than an order of magnitude.
Our claim of fractional discoveries is a first for KBO science, and as such may seem peculiar.However, as we have shown in Section 4.2, a weighted treatment of our detections allows us to properly account for false positives.By accounting for false positives, we are able to make full use of the data from even our deepest nights, whose faintest detections cannot be recovered in another epoch.Additionally, our statistics remain reliable near the detection limit where false positives tend to accumulate.
Using 554.4 single-night detections from 6 unique DECam pointings, we computed the luminosity function of the Kuiper Belt as a whole (Section 7).Then using ∼ 280.0 CC detections from 5 unique DECam pointings we calculated the luminosity function of the CC population (Appendix A), down to m r ≳ 26.5.In both cases, we were able to confirm that a single power law is not suitable to describe the underlying distribution.We found that rolling, broken, and exponentially tapered power laws yielded acceptable fits, though low discrimination power at the bright end prevented us from giving overwhelming preference to any model.
The most significant scientific result of this work is a measurement of the absolute magnitude distribution of the CCs down to H r ∼ 10.5.While a dearth of bright objects limits our constraining power at the bright end, our plethora of faint detections enable us to tightly constrain the faint end of the distribution.Our detections are consistent with an exponentially tapered power law with a faint-end slope α = 0.26 +0.08 −0.07 .This faint-end slope is marginally shallower than previous measurements in the literature, but is consistent with simulations of planetesimal formation via the streaming instability.This is of particular interest because the theory predicts a physically motivated size distribution (as opposed to other size distributions which have simply been engineered to adequately describe observations) of CCs that matches observations over the full range of sizes probed to date.However, we urge that these conclusions must be approached with caution.Of particular consequence to the conclusions from this work, limited resolution in SI simulations leaves the theoretical faint-end slope somewhat uncertain (see Kavelaars et al. (2021) for an in-depth discussion of the outstanding problems with the streaming instability as a complete theory of planetesimal formation).While the exponentially tapered power law H distribution is a good fit to our CC detections, we cannot rule out rolling or broken power laws.Both the exponentially tapered and rolling power law distributions are consistent with the results of Bernstein et al. (2004), implying that the H distribution continues to flatten out beyond the DEEP limit.In contrast, our broken power law may be in tension with the the results of Bernstein et al. (2004).Interestingly, our rolling power law fit would imply a characteristic size for the CCs, beyond which the probability density rolls over.
Finally, we note some limitations of this work that can be improved upon in future studies.The most obvious limitation is our use of single-night detections.Although we developed robust new techniques to account for false positives, linked detections with well-constrained orbits are still preferable.Since we used single-night detections in this work, our selection of a CC subsample of our detections was rather rough, relying heavily on the OSSOS++ solar system model and the serendipity that Neptune was near these fields, meaning that they were necessarily relatively devoid of resonant objects.Linked orbits will enable proper dynamical classification, thus reducing the uncertainties in our studies of individual dynamical classes.In forthcoming work, we will analyze three additional fields similar to the B1 field.We will link our detections, yielding well-determined orbits for the smallest known KBOs.Our catalog of discoveries will enable studies of Kuiper Belt populations to unprecedented depth, providing deep insight to the formation and evolution of our planetary system.

Figure 2 .
Figure2.Sample grid for a 4-hour exposure sequence with 1" seeing.The black teardrop-shaped boundary encompasses the space of possible bound orbits with topocentric distance between 35 and 1000 au.The red points are the sample rates computed using the procedure described above.

Figure 3 .
Figure 3. Example of the image format used for visual inspection of our candidates.The detection shown in this figure is a synthetic source with an r-band magnitude of 27.0.The left panel is what we call an MCMC plot.The black teardrop-shaped region is the boundary of the space of possible KBO rates of motion.The grey region at slow rates represents the rate cut we made on our detections.The points are sampled from our MCMC, with colors corresponding to the S/N of the object in the stack (this object had a peak S/N of 8).Each point represents a targeted stack at the given rate.As a stack approaches the correct rate, causing the source to appear optimally point-like, the S/N increases in this characteristic manner.The inset labeled Diff Coadd is the stationary stack; the image shows no discernible signal.The inset labeled Best Stack is the stack at the best rate, as determined by the MCMC.In this image the KBO is quite apparent.The right panel shows a grid of stacks centered at the best rate, and offset in increments of 1 pixel per hour in RA rate and Dec rate.
for m r < m 50

Figure 4 .Figure 5 .
Figure4.Number of sources resulting from the forward and reverse shift-and-stack as a function of S/N, both prior to human vetting, and after applying weights from human vetting and various cuts.It is clear that false positives drop off quickly with increasing S/N, and that our vetting procedure was highly effective at removing false positives.

Figure 6 .
Figure 6.Correlations between vetters' votes on detections from the reverse stack (top left), implanted sources in the forward stack (top right), and candidate sources in the forward stack (bottom).The numbers in each square tabulate the number of times the square's outcome occurred.Note that yes votes on sources from the reverse stack are not strongly correlated.

Figure 8 .
Figure 8. Weighted distribution of the apparent magnitudes of our detections (where the weights are given by Equation [4]).

Figure 9 .
Figure9.Sky moving rates of our candidate detections.The size of each marker is proportional to its weight, as calculated by Equation (4).The dense cloud of points corresponds mostly to CC detections (see Section 8).

Figure 10 .
Figure10.Recovery efficiency for implanted sources as a function of r-band magnitude.The combined efficiency for all 20 nights of data is given by the dashed line, while individual nights' efficiencies are given by the grey lines.The average m50 for our entire survey is mr ∼ 26.2, with individual nights ranging from 25.92-26.65.Our average peak efficiency, η0, is ≳ 0.92, with individual nights ranging from 0.85-0.95.For reference, we also show limits from the Dark Energy Survey (DES)(Bernardinelli et al. 2022), OSSOS(Bannister et al. 2018) and LSST(Ivezić et al. 2019).
Figure11.Best-fit differential distributions for the single, rolling, broken, and exponentially tapered power laws in purple, blue, yellow, and red respectively.The points and 1-σ error bars represent the differential distribution of our detections, corrected for efficiency and weight.The points are meant only as a visual aid; we always use the maximum likelihood technique to fit our data.Note that our fits are not normalized with respect to latitude, but rather they represent the average sky density among the fields in our pointing mosaic.

Figure 12 .
Figure 12.Sky motion parameter space for simulated CCs and HCs in B1a on 2020-10-18.The dashed line shows the CC selection region for this night.

Table 2 .
Anonymized Bayes factors of each of our vetters.