
How to COADD Images. I. Optimal Source Detection and Photometry of Point Sources Using Ensembles of Images


Published 2017 February 21 © 2017. The American Astronomical Society. All rights reserved.
Citation: Barak Zackay and Eran O. Ofek 2017 ApJ 836 187. DOI: 10.3847/1538-4357/836/2/187


Abstract

Stacks of digital astronomical images are combined in order to increase image depth. The variable seeing conditions, sky background, and transparency of ground-based observations make the coaddition process nontrivial. We present image coaddition methods that maximize the signal-to-noise ratio (S/N) and are optimized for source detection and flux measurement. We show that for these purposes, the best way to combine images is to apply a matched filter to each image using its own point-spread function (PSF) and only then to sum the images with the appropriate weights. Methods that either match the filter after coaddition or perform PSF homogenization prior to coaddition will result in loss of sensitivity. We argue that our method provides an increase of between a few percent and 25% in the survey speed of deep ground-based imaging surveys compared with weighted coaddition techniques. We demonstrate this claim using simulated data as well as data from the Palomar Transient Factory data release 2. We present a variant of this coaddition method, which is optimal for PSF or aperture photometry. We also provide an analytic formula for calculating the S/N for PSF photometry on single or multiple observations. In the next paper in this series, we present a method for image coaddition in the limit of background-dominated noise, which is optimal for any statistical test or measurement on the constant-in-time image (e.g., source detection, shape or flux measurement, or star–galaxy separation), making the original data redundant. We provide an implementation of these algorithms in MATLAB.


1. Introduction

Coaddition of observations is one of the most basic operations on astronomical images. The main objectives of coaddition are to increase the sensitivity (depth) of observations, to remove signal from high-energy particle hits on the detector (sometimes called cosmic-rays), and to decrease the point-spread function (PSF) width (e.g., deconvolution from wavefront sensing, Primot et al. 1990; lucky imaging, Law et al. 2006). One question that is common to all these tasks is what is the best method to coadd the images. This of course requires a definition of what "best" means.

Here, we assume that all observations are registered and resampled to the same grid (e.g., using Swarp; Bertin et al. 2002), and focus on the combination operation of the images. There are several common complications for image coaddition. Ground-based images are often taken under variable seeing, background, and transparency conditions. Even some space-based observations may suffer from such problems. For example, X-ray images commonly have non-uniform PSF across the field of view. These complications make the coaddition operation less trivial than simple (weighted) scalar addition.

Coaddition is playing, and will play, a major role in ongoing and future surveys (e.g., SDSS, York et al. 2000; Pan-STARRS, Kaiser et al. 2002; PTF, Law et al. 2009; DES, The Dark Energy Survey Collaboration 2005; LSST, Ivezic et al. 2008). There are many approaches to image coaddition after the registration step. One pre-step common to many methods is to reject some of the images with the worst seeing, background, or transparency. There is no sensible prescription for which images one should reject, and it is straightforward to show that this approach leads to a loss of sensitivity compared with the maximum possible. One extreme example of this process is lucky imaging (Law et al. 2006), where one often discards up to 99% of the data frames. The next coaddition step involves either weighted summation of the images or applying partial filtering (or deconvolution) to each image in order to homogenize the PSFs of all the images, followed by coaddition.

For example, Annis et al. (2014) recommended coadding the SDSS Stripe 82 images by weighted summation of the images, where the weights are of the form

Equation (1)

and k = 1, m = 2, and n = 1. Here wj is the weight applied to the image j, Fj is proportional to the product of the telescope effective area, detector sensitivity, and atmospheric transparency (i.e., flux-based photometric zero point), sj is the width of the PSF, and Vj is the variance of all of the background noise sources (e.g., background, readout noise). On the other hand, Jiang et al. (2014) adopted a similar formula but with k = 1, m = 1, n = 1. Another example of such a weighted summation was given by Fischer & Kochanski (1994), who solved a numerical optimization scheme for the optimal weights with which the images should be added.
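
The display form of Equation (1) is not reproduced here; from the parameter definitions above it is assumed to be of the form wj = Fj^k/(sj^m Vj^n). A minimal MATLAB sketch of the two weighting schemes, using hypothetical image properties:

F = [1.0 0.8 0.9];          % flux-based zero points (transparency) of three images
s = [2.0 3.5 2.5];          % PSF widths (pixels)
V = [600 900 750];          % background noise variance
wAnnis = F.^1 ./ (s.^2 .* V.^1);   % k = 1, m = 2, n = 1 (Annis et al. 2014)
wJiang = F.^1 ./ (s.^1 .* V.^1);   % k = 1, m = 1, n = 1 (Jiang et al. 2014)
wAnnis = wAnnis/sum(wAnnis);  wJiang = wJiang/sum(wJiang);   % normalize the weights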

Other approaches are also used. For example, some authors perform PSF homogenization on all the images prior to coaddition. This is done by convolving the frames with some kernel in order to ensure all the observations have the same PSF. For example, Desai et al. (2012) define a median-seeing PSF, then for each image find the convolution kernel that transforms it (via convolution) into a median-seeing image, apply these kernels, and then sum the images. Such a convolution kernel can be found using techniques outlined in Phillips & Davis (1995), Alard & Lupton (1998), and Bramich (2008). However, for images with seeing that is worse than the target seeing, the PSF homogenization operation becomes a deconvolution operation, which amplifies the noise in high spatial frequencies and creates long-range correlations in the noise. Another issue of the PSF homogenization approach is the artificial insertion of correlations between neighboring pixels. As a result, any measurement on the data becomes more involved. For example, applying a naive matched filter with the coadd-image PSF will result in additional loss of information.

In a series of several papers we will tackle the problem of optimal image coaddition. In this paper (Paper I), we will lay down the statistical formalism and assumptions we use for the coaddition problem. We construct a coaddition method that achieves the maximum possible signal-to-noise ratio (S/N) for detection of nonblended faint point-like sources in the coadded image under the assumption of constant variance Gaussian noise. In addition, we present a coaddition method that is optimal for photometric measurements of both faint and bright objects (i.e., even for sources in which the source noise is non-negligible). These coaddition products are the analogs of the linear matched filter for ensembles of images. We note that this coaddition method, optimized for source detection, was used by Masci & Fowler (2009) to reduce the WISE satellite data. However, in order to recover the sharpness of the original images, they applied deconvolution after the coaddition process.

It is noteworthy that linear matched filter images are smeared and their pixels are correlated. Therefore, general hypothesis testing and statistical measurements on them require knowledge and consideration of the spatial covariance matrix. This makes them unattractive for visualization and non-intuitive for performing measurements other than source detection and flux measurement (e.g., resolving binary stars or measuring galaxy shapes).

In Paper II (Zackay & Ofek 2017), we build on the results of this paper to construct a coaddition method for the special (but common) case of background-dominated noise. Equipped with the background-dominated noise assumption, we find a simple transformation that removes the pixel correlation induced by the matched-filtering step, and makes the additive noise in all pixels uncorrelated and of equal variance. Most importantly, this technique provides a sufficient statistic for any statistical measurement or decision on the data (e.g., source detection, flux measurement, shape measurement). Moreover, this method conserves the information of all the spatial frequencies, and provides a PSF that is typically sharper than the PSF of the best image in the stack we coadd. In addition, this image is practically indistinguishable from a regular astronomical observation, and any image analysis code can be applied to it unchanged. Combining all these properties together allows a major reduction in the amount of data products users will need in order to perform any further processing on the data (e.g., galaxy shape measurements for weak lensing). Finally, this coaddition algorithm is fast and its implementation is trivial.

In future papers (Zackay et al. 2017, in preparation), we will apply the algorithm from Paper II to the case of rapid imaging through the turbulent atmosphere, where the PSF is either known (through observing a reference star or using a wavefront sensor) or unknown. We show that this method potentially provides better results than existing approaches for the coaddition of speckle images (e.g., speckle interferometry, Labeyrie 1970; lucky imaging, Law et al. 2006, and its suggested improvements, e.g., Garrel et al. 2012).

The structure of this paper is as follows. In Section 2, we review the statistical formalism of image coaddition and refer to some basic lemmas on how to combine random variables. In Section 3, we discuss the optimal coaddition image for faint source detection. In Section 4 we describe how to obtain the optimal coaddition image for photometric measurements, not restricted to faint sources. In Section 5, we give a simple-to-use formula to calculate the expected S/N for source detection and PSF photometry for both single images and ensembles of images. In Section 6, we suggest possible variants of this method, which, albeit sub-optimal for source detection and photometry, are resilient to particle hits, and generalize the concept of median for images. In Section 7, we demonstrate our detection algorithm on simulated and real images and show it surpasses the sensitivity of current popular methods. In Section 8 we briefly describe the code we use. Finally, we summarize the important formulae and conclude in Section 9.

2. Statistical Formalism and Background

Denote the jth measured image, in a series of background-subtracted observations of the same position, by Mj. Further, denote by Pj the PSF of the jth observation.

The statistical model for the jth image is given by

Equation (2)

where T denotes the true (background-subtracted) sky image, Fj is a scalar, representing the transparency (same as in Section 1), ⨂ represents convolution, and the noise term ${\epsilon }_{j}$ has variance

Equation (3)

where Bj is the variance of all the position-independent noise—the sum of the sky background variance, the read noise variance, and the dark current variance. For simplicity, we assume that the gain is 1 and that the counts are measured in electrons.
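
For concreteness, the following MATLAB sketch simulates a single frame under this measurement model; it is a hedged illustration with made-up numbers, assuming Equations (2) and (3) read Mj = Fj (T ⊗ Pj) + εj with per-pixel variance Bj + Fj (T ⊗ Pj), as described above.

Nx = 64;  Fj = 0.9;  Bj = 800;                    % image size, transparency, background variance
T  = zeros(Nx);  T(33,33) = 1e4;                  % true image T: a single point source (electrons)
[X,Y] = meshgrid(-8:8);  sj = 2.0;                % PSF stamp coordinates and Gaussian width
Pj = exp(-(X.^2 + Y.^2)/(2*sj^2));  Pj = Pj/sum(Pj(:));
model = Fj*conv2(T, Pj, 'same');                  % noiseless expectation F_j (T convolved with P_j)
Mj    = model + sqrt(Bj + model).*randn(Nx);      % Gaussian noise with the Equation (3) variance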

In the following subsections we derive several important well-known results of signal detection using hypothesis testing. The optimal detection of an attenuated signal in the presence of varying noise is reviewed in Section 2.1. In Section 2.2 we derive a general rule for weighted addition of random variables, while in Section 2.3 we derive the well-known linear matched filter solution for source detection.

2.1. Detection and Measurement of an Attenuated Signal in the Presence of Varying Noise

In this section, we review the simple problem of how to detect or measure a signal, with a known template, in data given by an ensemble of observations. We strictly assume that the signal in all observations is attenuated linearly with known attenuation coefficients, and that the noise in all the observations is Gaussian. Albeit simple, the solution to this problem is the key toward constructing an optimal algorithm for image coaddition.

Let Xj be a set of independent Gaussian variables with variance

Equation (4)

Given the null hypothesis, which states that there is no signal and is denoted by ${{\mathscr{H}}}_{0}$, we assume

Equation (5)

where $E[X| {\mathscr{H}}]$ denotes the expectancy of the variable X given that the hypothesis ${\mathscr{H}}$ is true. Given the alternative hypothesis, which states there is a signal, and is denoted by ${{\mathscr{H}}}_{1}(T)$, we assume

Equation (6)

Here ${\mu }_{j}$ are the known scaling factors of each observation. The log-likelihood ratio test statistic (which is proven to be the most powerful test at any given size, and therefore optimal; see Neyman & Pearson 1933) is simply given by

Equation (7)

For detection purposes, i.e., to reject the null hypothesis, the term $\tfrac{{\mu }_{j}^{2}{T}^{2}}{2{\sigma }_{j}^{2}}$ can be ignored, as it does not depend on the data. However, this term is still relevant for flux measurements. Simplifying and absorbing the factor of 2 into the likelihood, we get

Equation (8)

The log-likelihood ratio is linear in T, hence we can identify a sufficient statistic

Equation (9)

with a distribution that is independent of T, and therefore can be used to reject the null hypothesis against the alternative hypotheses ${{\mathscr{H}}}_{1}(T)$ for all values of T simultaneously. Therefore, the optimal strategy for detection of a statistically significant signal is to calculate S and check if it is above or below a certain threshold Sγ. The value of ${S}_{\gamma }$ is fixed by the desired false alarm probability γ. To evaluate this threshold, we need to calculate the expectancy and variance of S given ${{\mathscr{H}}}_{0}$:

Equation (10)

Given the independence of Xj and the fact that $V[{X}_{j}| {{\mathscr{H}}}_{0}]={\sigma }_{j}^{2}$, the variance of S given ${{\mathscr{H}}}_{0}$ is

Equation (11)

and the threshold is

Equation (12)

Under the same definitions, if we wish to get the most accurate estimator of T, we can use the fact that the likelihood of the data can be calculated for all T using S (Equation (9)) and I (Equation (11)). This is because we can express the log likelihood of M given T by

Equation (13)

where $g(X)={\sum }_{j}\tfrac{{X}_{j}^{2}}{2{\sigma }_{j}^{2}}$ does not depend on T and therefore cannot influence any measurement. Regardless of whether the Bayesian or frequentist approaches are used, the likelihood of the data as a function of T is sufficient for any further measurement or detection process. Therefore, S and I are sufficient for any measurement or statistical decision on the set of statistics $\{{X}_{j}\}$.

2.2. Weighted Addition of Random Variables

Another well-known result is that the statistics S and I we have derived in Equations (9) and (11) naturally arise in a much more general setup, which is known as the weighted addition of random variables. Assume we have an arbitrary set of independent random variables Xj and two hypotheses ${{\mathscr{H}}}_{0},{{\mathscr{H}}}_{1}$ that satisfy

Equation (14)

We can define the signal-to-noise ratio (S/N) of a statistic S with regard to the decision between ${{\mathscr{H}}}_{0}$ and ${{\mathscr{H}}}_{1}$ by

Equation (15)

Next, we can ask: what is the statistic, linear in Xj, that maximizes this S/N? The solution to this problem is the following weighted mean:

Equation (16)

which coincides with the previously defined S (Equation (9)). Moreover, the maximal S/N is $\sqrt{I}$. The detailed solution of this problem is given in Appendices A and C.

Weighted addition extends also to the case of adding estimators. If we are given a sequence of independent measurements

Equation (17)

in which we assume only that $V[{\epsilon }_{j}]={\sigma }_{j}^{2}$, the same set of statistics S and I arise, and the maximum S/N estimator for T is

Equation (18)

This simple and rigorous fact is derived in Appendix B.
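
A minimal MATLAB sketch of these weighted-addition statistics follows, assuming the standard inverse-variance forms for S (Equation (9)), I (Equation (11)), and the estimator of Equation (18); the input numbers are hypothetical.

mu    = [0.9 1.1 0.7];          % known scaling factors mu_j
sigma = [25 40 30];             % standard deviations of the measurements
X     = [45 60 28];             % measured values (theta_j in Equation (17))
S     = sum(mu.*X./sigma.^2);   % sufficient statistic, assumed form of Equation (9)/(16)
I     = sum(mu.^2./sigma.^2);   % its variance under the null hypothesis, Equation (11)
snr   = S/sqrt(I);              % detection significance, in units of standard deviations
That  = S/I;                    % maximum-S/N estimator of T, assumed form of Equation (18)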

The fact that this same solution repeats under many different general contexts and assumptions allows us to derive optimal statistics for real-world astronomical problems. The first of them is the well-known matched filter for source detection in single astronomical images.

2.3. The Matched Filter Solution for Source Detection

Now that we are equipped with the simple yet robust tools for adding random variables, we can derive the standard matched filter solution for point source detection in a single image. To allow rigorous statistical analysis of source detection, we assume (throughout the paper) that the source we are trying to detect is well separated from other sources, which might interfere with its detection. Let M be an image such that

Equation (19)

where T is the true background-subtracted sky image, P is the image PSF, and $\epsilon $ is an additive white Gaussian noise term.

We would like to detect, or measure, a single point source at position (x0, y0). Each pixel coordinate (x, y) in the image is an independent information source, where the PSF scales the amount of flux each pixel would get. That is,

Equation (20)

Using Equation (9), we can write the optimal statistic for source detection at position $({x}_{0},{y}_{0})$ for a single image:

Equation (21)

Equation (22)

where $\overleftarrow{P}(x,y)=P(-x,-y)$. Thus, we obtain the well-known optimal statistic for the detection of a single source in an image with additive white Gaussian noise. The solution is to convolve the image with its reversed PSF (i.e., cross-correlation of the image with its PSF). This statistic is called the linear matched filter image.

Due to its optimality, robustness, and ease of implementation and interpretation, the linear matched filter is the method of choice for many source detection tools (e.g., SExtractor; Bertin & Arnouts 1996; see a review in Masias et al. 2012).
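
The following MATLAB sketch illustrates the single-image matched filter of Equations (21)-(22) on a toy image (all inputs are made up); the cross-correlation is evaluated with FFTs, and the score map is expressed in units of its standard deviation.

sigma = 30;  [X,Y] = meshgrid(-8:8);              % background std and PSF stamp grid
P = exp(-(X.^2+Y.^2)/(2*2^2));  P = P/sum(P(:));  % toy Gaussian PSF (width 2 pixels)
T = zeros(64);  T(32,32) = 2e4;                   % one point source (electrons)
M = conv2(T, P, 'same') + sigma*randn(64);        % toy background-subtracted image

Ppad = zeros(size(M));  Ppad(1:17,1:17) = P;                % pad the PSF to the image size
Porg = circshift(Ppad, [-8 -8]);                            % place the PSF peak at pixel (1,1)
S    = real(ifft2(fft2(M).*conj(fft2(Porg))))/sigma^2;      % cross-correlation with the PSF
Ssig = S/sqrt(sum(P(:).^2)/sigma^2);                        % score in units of its standard deviation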

3. Image Coaddition for Source Detection

Now that we have shown that using the tools outlined in Section 2 we get the optimal statistic for source detection in single images, we can apply these methods for source detection to an ensemble of images. Before doing that, it is worthwhile to develop an intuition on how the solution should look. Expanding the linear matched filter solution of one image to multiple images, we expect the solution to be somewhat like a three-dimensional linear matched filter summed over the image index axis, to get a statistic with the dimensions of an image. Therefore, we expect that the solution will be simply matched filtering each image with its own PSF followed by weighted summation.

For source detection, we assume that the noise is approximately Gaussian and that the variance in each pixel, excluding the variance induced by the source itself, is independent of pixel position. Since our null hypothesis is that there is no source, the noise induced by the source should be ignored. However, noise induced by neighboring sources cannot be ignored. Therefore, our assumption that the variance is constant (at least locally) in the image is true only if the image is not crowded (i.e., confusion noise can be neglected). Using these assumptions, we denote the variance of the jth image by

Equation (23)

Here Bj is the background (or more formally the spatially invariant variance component) of the Mj image. We can now apply the machinery for coaddition of random variables developed in Section 2 to the problem of image coaddition.

Let us assume that we would like to detect a point source at position $(0,0)$. In this case the model for such a point source is $T={T}_{0}{\delta }_{(\mathrm{0,0})}$, where T0 is the point source's flux and ${\delta }_{(\mathrm{0,0})}$ is a two-dimensional array with 1 at position $(0,0)$ and zero otherwise. Inserting this to the model of Mj we get

Equation (24)

Using the result from Appendix A, we can combine all the pixels from all the images as a list of independent statistics. Therefore, the optimal detection statistic for a source at position $(0,0)$ is

Equation (25)

For simplicity, x represents the (x, y) coordinate. Using the fact that searching for a source at position $p\equiv ({p}_{1},{p}_{2})$ on a set of images is exactly the same as searching for a source at position $(0,0)$ on the input images shifted by $-p$, we get

Equation (26)

This can be written in vectorial notation as

Equation (27)

where $\overleftarrow{{P}_{j}}(x)={P}_{j}(-x)$. We note that this could be easily computed (using the fact that ${\sigma }_{j}^{2}$ is a scalar) in Fourier space using the fast Fourier transform:

Equation (28)

where the $\widehat{\,\,}$ sign represents the Fourier transform, and the $\overline{\,\,\,}$ sign denotes the complex conjugate operation. We note that by putting the bar sign over the hat sign we mean that the complex conjugate operation follows the Fourier transform operation.

It is worthwhile to note that S, the coaddition of filtered images, is a log-likelihood ratio image, where $S({p}_{1},{p}_{2})$ is the optimal test statistic for the existence of a source at $({p}_{1},{p}_{2})$. In order to find sources using S, we therefore have to search for local maxima. To decide if the source candidate is statistically significant, it is more convenient to normalize S by its standard deviation:

Equation (29)

The value of each local maximum divided by the standard deviation of the noise (Equation (29)) is the source significance in units of standard deviations. To get S in units of flux (this is the best flux estimate only for a background-noise-dominated source), we need to normalize differently (see Equation (18)) to get

Equation (30)

Analyzing our solution, we can see that if we have only one image then the well-known linear matched filter is recovered. However, when we have more than one image, the optimal detection statistic is equivalent to applying a matched filter to each image separately, weighting the images by $\tfrac{{F}_{j}}{{\sigma }_{j}^{2}}$, and accumulating. This is different from the other solutions found in the literature, where one first coadds and then match filters, or performs partial filtering for homogenization, then coadds, and then match filters. The method we developed is optimal by construction, hence it will have better performance relative to other coaddition schemes. We note that with this method in hand, any additional image, no matter how poor the conditions are, will increase the sensitivity. Therefore, the rejection of images becomes a damaging operation. Because of the robustness of our optimality arguments, whenever the assumptions that we have made on the noise are true, we expect this method to surpass the performance of all popular methods found in the literature.
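
A minimal MATLAB sketch of this recipe (Equations (27)-(29)) on a toy ensemble of background-dominated frames follows; the simulated parameters are hypothetical, and the per-image variance is taken as sigma_j^2 = B_j.

nIm = 5;  Nx = 64;  T = zeros(Nx);  T(32,32) = 3e3;    % true image: one faint point source
[X,Y] = meshgrid((0:Nx-1) - Nx/2);                     % pixel grid centered on zero
Shat = zeros(Nx);  Vs = 0;                             % Fourier accumulator and score variance
for j = 1:nIm
    sj = 1.5 + 2*rand;  Bj = 500 + 1400*rand;  Fj = 1;               % seeing, background, transparency
    Pj = exp(-(X.^2+Y.^2)/(2*sj^2));  Pj = ifftshift(Pj/sum(Pj(:))); % PSF with origin at pixel (1,1)
    Mj = Fj*real(ifft2(fft2(T).*fft2(Pj))) + sqrt(Bj)*randn(Nx);     % simulated frame, as in Equation (2)
    Shat = Shat + (Fj/Bj)*conj(fft2(Pj)).*fft2(Mj);                  % Equation (28), with sigma_j^2 = B_j
    Vs   = Vs + (Fj^2/Bj)*sum(Pj(:).^2);                             % variance of S (a scalar)
end
S    = real(ifft2(Shat));     % Equation (27): score map; local maxima are source candidates
Ssig = S/sqrt(Vs);            % Equation (29): significance in units of standard deviations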

An important question is by how much our method improves source detection relative to other methods. The answer to this question depends on the details, like the distribution of observing conditions (e.g., seeing) and the exact method we would like to compare with. We can quantify the gain in using our method by answering a simple question: how much survey speed is being lost when a non-optimal filter is used in the case of a single image? A single matched filter cannot be adequate for multiple images with different PSFs. No matter what weights are later applied in the combination of the different matched-filtered images (all with the same filter), the ${({\rm{S}}/{\rm{N}})}^{2}$ of the other methods cannot surpass the sum of the ${({\rm{S}}/{\rm{N}})}^{2}$ in their individual pixels.

Given a single image with a Gaussian PSF with width s, the best filter is a Gaussian with width s. In Appendix D we calculate the S/N loss factor, when one uses a Gaussian filter with a width ks. We find that the sensitivity ratio of these two cases is

Equation (31)

This means that the reduction in survey speed (or the detection information) is

Equation (32)

Figure 1 shows this survey speed reduction as a function of k. As expected, this function has a maximum of unity at k = 1. Using a width which is a factor of 1.2 (2) different from the actual PSF width will result in a survey speed loss of about 6% (40%). Therefore, given the typical distribution of the seeing conditions, one should expect to obtain an improvement of between a few percent and about 25% in survey speed using our method, relative to, e.g., the Annis et al. (2014) weighted summation.

Figure 1. Survey speed loss due to the use of a matched filter, the width of which is k times that of the optimal filter (Equation (32)).
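
The survey speed loss can also be checked numerically, directly from the definition of the matched-filter S/N, without relying on the closed forms of Equations (31)-(32). A MATLAB sketch for a single mismatch factor k (the values of s and k are hypothetical):

s = 2.0;  k = 1.5;  [X,Y] = meshgrid(-40:40);              % true PSF width and mismatch factor
P = exp(-(X.^2+Y.^2)/(2*s^2));      P = P/sum(P(:));       % true PSF, width s
Q = exp(-(X.^2+Y.^2)/(2*(k*s)^2));  Q = Q/sum(Q(:));       % mismatched filter, width k*s
snrRatio  = (sum(sum(P.*Q))/sqrt(sum(Q(:).^2))) / ...
            (sum(sum(P.*P))/sqrt(sum(P(:).^2)));           % (S/N) with filter Q over the optimal (S/N)
speedLoss = 1 - snrRatio^2;                                % fractional survey speed loss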

4. Optimal Photometric Measurements from a Set of Observations

Another important problem in the context of astronomical image processing is measuring the flux of sources. The two leading methods for performing flux measurements are aperture photometry and PSF photometry. Both methods are well-defined on single images. When the PSF is exactly known, the method of PSF photometry is the most exact since it applies an appropriate weight to each pixel before summation. Aperture photometry is a good approximation that is often used when the star is bright, the PSF is unknown, or the measurement is done for sources that are not point-like. In this section, we present the correct way to extend photometric measurements to ensembles of observations. In Section 4.1 we discuss the extension of PSF photometry to ensembles of images, and in Section 4.2 we cover aperture photometry. We note that in the case of spatially homogeneous noise variance (i.e., faint source) and known PSF, it is both easier and more accurate to use the coaddition method presented in Paper II, and then perform aperture/PSF photometry.

An important note is that the flux measurement of a source from the images that were used for the source's first detection is biased because of the Malmquist effect. Basically, if there is a population of sources with mean flux below the detection threshold, some of them will have upward noise fluctuations, making them detectable. Therefore, the naive flux measurement of these sources from their discovery image may be biased. For further information, see Hogg & Turner (1998).

4.1. Extending PSF Photometry to Ensembles of Images

We define the optimal photometric measurement as a statistic S such that the expectancy of the statistic is the true flux of the source and its variance is minimal. That is, $E[S]=T,$ and $V[S]$ is minimal. This is equivalent to finding a statistic that maximizes the S/N and normalizing it to units of flux. Because in the case of photometry we need to take into account the source noise, in this section we will work under the assumption that

Equation (33)

where the distribution of ${\epsilon }_{j}$ is Gaussian with variance

Equation (34)

which is not uniform across the image, because bright stars have larger variance components. We note that we still assume that the measured source is well separated from other sources.

Following the same procedure we used for source detection, we get the maximum S/N for the flux measurement. When measuring a point source at position $(0,0)$ with true flux T0, we can write

Equation (35)

Again, we assume every pixel is statistically independent from the others, and therefore we can apply the previously developed machinery (Equation (18)) pixel by pixel and get the maximal S/N estimator:

Equation (36)

In the above formula, we ignored the small, but important, bias due to the exact subpixel location of the source. Such a bias can be dealt with by replacing the denominator of Equation (36) with

Equation (37)

where ${P}_{x,j}^{{\rm{\Delta }}q}$ is Pj, shifted by the fractional pixel shift ${\rm{\Delta }}q$. To apply this correction, it suffices to either store the PSFs or simply calculate ahead of time the correction incurred by each subpixel shift (in a gridded table) by each PSF. In addition, in order to compute the statistic in Equation (36), we need to know the true flux T0 that we are looking for. This problem could be avoided by one of the following.

  • 1.  
    Solving numerically for the true T0 in iterations: guess ${T}_{0}=0$, compute ${\tilde{T}}_{\mathrm{PSF}\mathrm{photometry}}$, and then update ${T}_{0}\,={\tilde{T}}_{\mathrm{PSF}\mathrm{photometry}}$, and compute ${\tilde{T}}_{\mathrm{PSF}\mathrm{photometry}}$ again with the resulting flux from the first iteration. Since the first estimate of T0 is an excellent first guess, the latter is practically converged, and more iterations are unnecessary. This solution is source specific, and therefore should be implemented only if photometry of a single source is needed.
  • 2.  
    Scanning the T0 space logarithmically: using a slightly wrong T0 does not lead to significant errors in the measurement. Since the guess of T0 influences only weights that are independent of the data itself, it does not introduce a bias. The standard deviation of a statistic that is calculated with a slightly wrong T0 is generally very close to the optimal value. For example, using Gaussian PSFs with width $1\lt s\lt 10$ pixels, $1\lt B\lt 1000$, and $1\lt T\lt {10}^{6}$, we find that using a T0 that is wrong by a factor of 2 leads to a reduction of at most 1% in survey speed. This solution requires the coaddition process to produce several images (∼8, as the relevant range spans from the background level up to the maximal dynamic range value, and these numbers are typically within a factor of $\sim {2}^{8}$ of one another).
  • 3.  
    Aiming for a specific T0. If one knows that the targets have a specific magnitude range, it is possible to use the most relevant T0.

We want to stress that the whole discussion on T0 is only relevant for flux estimation of bright sources. For the detection of sources, we are not required to know T0 a priori, as demonstrated in Section 3.

The calculation of this measurement statistic can be done quickly by noting that once a T0 is assumed, the calculation of the measurement statistic on all image positions can be done by convolution using the fast Fourier transform (FFT).
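
A hedged MATLAB sketch of this measurement follows, assuming Equation (36) takes the form obtained by applying Equation (18) pixel by pixel (with mu = Fj Pj(x) and variance Bj + Fj T0 Pj(x)), together with the iterative T0 update of option 1 above; the stamps are simulated toy data.

nIm = 4;  [X,Y] = meshgrid(-8:8);  Ttrue = 5e4;  T0 = 0;   % toy ensemble and initial flux guess
for j = 1:nIm                                              % build stamps centered on the source
    s = 1.5 + j/2;  F(j) = 1;  B(j) = 800;                 % per-image seeing, transparency, background
    cutP{j} = exp(-(X.^2+Y.^2)/(2*s^2));  cutP{j} = cutP{j}/sum(sum(cutP{j}));
    cutM{j} = F(j)*Ttrue*cutP{j} + sqrt(B(j) + F(j)*Ttrue*cutP{j}).*randn(17);
end
for iter = 1:2                                             % option 1: iterate the T0 guess
    num = 0;  den = 0;
    for j = 1:nIm
        v   = B(j) + F(j)*T0*cutP{j};                      % per-pixel variance for the current T0
        num = num + sum(sum(F(j)*cutP{j}.*cutM{j}./v));
        den = den + sum(sum((F(j)*cutP{j}).^2./v));
    end
    T0 = num/den;                                          % updated flux estimate (Equation (36))
end
Tpsf = T0;                                                 % ensemble PSF-photometry measurement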

To gain intuitive interpretation of the PSF photometry measurement statistic, it is useful to inspect the behavior of the statistic on one image. In the expression for ${\tilde{T}}_{\mathrm{PSF}\mathrm{photometry}}$, we can identify the two general approximations astronomers often make in the extreme limits of the relation between the brightness of the target and the background. On the one hand, if a target is brighter than the background on many pixels, we can see that the optimal measurement statistic we obtained is similar to aperture photometry with the optimal aperture, as in the center the weights of the pixels converge to 1. On the other hand, we see that if a target is much weaker than the background variance, the statistic converges to a matched filter on the image. When extending this to multiple images we can see that in the statistic ${\tilde{T}}_{\mathrm{PSF}\mathrm{photometry}}$ the combination of different images is rather nontrivial.

4.2. Extending Aperture Photometry to Ensembles of Images

Even when the PSF is unknown, or when the object shape for which we perform flux measurements is poorly known (e.g., a galaxy), it is still important to combine the flux measurements from different images correctly. This is because observations in the ensemble will generally have different noise properties. For each image, depending on the seeing conditions, source flux, and its PSF, one can choose the best aperture radius rj that maximizes the S/N in the image (see below).

We can calculate from each image the aperture photometry measurement

Equation (38)

where ${C}_{{r}_{j}}$ is given by

Equation (39)

Substituting Equation (2) into Equation (38),

Equation (40)

Denote the source intrinsic light distribution by Q. Q satisfies

Equation (41)

and $T={T}_{0}Q$. Next, we define

Equation (42)

to be the fraction of the source flux within the measurement aperture. Now, we can write

Equation (43)

The variance of ${\epsilon }_{j}\otimes {C}_{{r}_{j}}$ is

Equation (44)

Now that the problem is cast into the same form as that solved in Section 2, we can use the maximum S/N measurement statistic (Equation (18)) to find the best estimator for aperture photometry:

Equation (45)

Note that if the source is very strong, the term ${\mu }_{j}{T}_{0}$ might dominate the noise variance $V[{\epsilon }_{j}\otimes {C}_{{r}_{j}}]$. In that case, one needs to have a reasonable guess of the source flux T0 prior to the coaddition process (see Section 4.1).

Moreover, one needs to solve for the best aperture radius rj (that depends on T0). This solution can be obtained separately for each image by maximizing the squared S/N of ${\theta }_{j}$:

Equation (46)

where ${\mathrm{argmax}}_{r}(f(r))$ denotes the value of the parameter r for which the function f(r) obtains its maximum. Maximizing this could be done if one has a model for $\mu (r)$, which could be deduced from the intrinsic size of the object measured and a rough PSF model.
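
A minimal MATLAB sketch of the combination step of Equation (45) follows, with hypothetical per-image aperture sums; the variance of Equation (44) is approximated as the background term plus the in-aperture source noise.

theta  = [4.1e4 3.6e4 4.4e4];    % aperture sums theta_j (electrons), one per image
F      = [1.0 0.8 0.9];          % transparencies F_j
muFrac = [0.90 0.85 0.92];       % fraction of the source flux inside each aperture (Equation (42))
npix   = [50 80 60];             % number of pixels inside each aperture
B      = [700 900 800];          % background variance per pixel
T0     = 4e4;                    % rough prior guess of the flux (see Section 4.1)

mu  = F.*muFrac;                              % scaling factors mu_j of Equation (43)
V   = B.*npix + mu*T0;                        % approximate variance of each aperture sum (Equation (44))
Tap = sum(mu.*theta./V)/sum(mu.^2./V);        % combined estimator, Equation (45) via Equation (18)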

5. Estimating the Limiting Flux and the Expected S/N of Sources

For several reasons, including S/N calculation and optimizations, it is worthwhile to have a formula for the S/N of PSF photometry over multiple images. Although this formula is easy to derive, we are not aware of a simple analytic expression for the S/N of PSF photometry in the literature for either single or multiple observations.

Following the previous section, we do not necessarily assume that the sources are weak compared to the sky, and therefore our measurement model is

Equation (47)

where ${\epsilon }_{j}$ is a Gaussian with mean zero and variance:

Equation (48)

where Bj is the background noise variance and $T\otimes {P}_{j}$ is the Poisson noise of the source.

In order to calculate the S/N we refer to the fact that ${({\rm{S}}/{\rm{N}})}^{2}$ is an additive quantity. That is, for any two uncorrelated statistics ${S}_{1},{S}_{2}$, we can build a statistic

Equation (49)

such that the S/N of ${S}_{\mathrm{1,2}}$ satisfies

Equation (50)

This fact is proven in Appendix C. Using this, all that we need to do in order to calculate the total ${({\rm{S}}/{\rm{N}})}^{2}$ of the optimal coaddition of a source with flux T0 in an ensemble of images is to sum the ${({\rm{S}}/{\rm{N}})}^{2}$ over all the pixels in all the images.

Therefore, the ${({\rm{S}}/{\rm{N}})}^{2}$ for flux measurement (the source noise term in the variance is irrelevant for detection purposes) is given by

Equation (51)

For the sake of S/N calculation, it is a good enough approximation to assume that the PSF is a two-dimensional symmetric Gaussian. Rewriting the above with a Gaussian PSF, with width sj, and replacing the x and y coordinates by the radial coordinate r, this can be written as

Equation (52)

This integral has an analytic solution given by

Equation (53)

In the source-noise-dominated case (${F}_{j}{T}_{0}\gg 2\pi {s}_{j}^{2}{B}_{j}$), the formula simplifies to

Equation (54)

In the background-noise-dominated case (${F}_{j}{T}_{0}\ll 2\pi {s}_{j}^{2}{B}_{j}$), the ${({\rm{S}}/{\rm{N}})}^{2}$ becomes

Equation (55)

We stress that this equation is correct only if the numbers T0 and Bj are in units of electrons. Another remark is that Equation (55) is actually the exact (no approximation) ${({\rm{S}}/{\rm{N}})}^{2}$ for the detection of a source with flux T0, because in this situation, the relevant variance should be exactly $V[{M}_{j}| {{\mathscr{H}}}_{0}]$. Therefore, by isolating T0 in Equation (55) we can get a formula for the limiting magnitude for detection at a given S/N:

Equation (56)
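
The ensemble ${({\rm{S}}/{\rm{N}})}^{2}$ can also be evaluated directly by summing the per-pixel contributions of Equation (51), without invoking the closed forms of Equations (53)-(55). A MATLAB sketch with hypothetical observing parameters:

T0 = 1e3;  F = [1.0 0.9 0.8];  s = [2.0 3.0 2.5];  B = [600 900 750];   % hypothetical parameters
[X,Y] = meshgrid(-30:30);  snr2 = 0;
for j = 1:numel(F)
    P    = exp(-(X.^2+Y.^2)/(2*s(j)^2));  P = P/sum(P(:));              % Gaussian PSF of width s_j
    snr2 = snr2 + sum(sum((F(j)*T0*P).^2./(B(j) + F(j)*T0*P)));         % per-pixel (S/N)^2, summed
end
snr = sqrt(snr2);       % expected PSF-photometry S/N for a source of flux T0 over the ensemble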

6. Robust Coaddition

A drawback of the method presented so far is that it uses the summation operator, which is not resilient against outliers. Outliers can be a major source of noise and often require special treatment. The typical way outliers are dealt with is either using median coaddition, min/max rejection, or sigma clipping.

The summation in either Equations (26) or (27) includes weights, and in either case we may want to choose an operation that is more robust. Because of the weights in Equations (26) and (27), a simple median operation is not possible. Instead, one can use the weighted median estimator to replace the weighted average. The weighted median of a set of estimators

Equation (57)

is simply defined as the 50% weighted percentile of the normalized estimators $\tfrac{{\theta }_{j}}{{\mu }_{j}}$, where each of the estimators is given a weight equal to its inverse standard deviation, $\tfrac{{\mu }_{j}}{{\sigma }_{j}}$. In a similar manner, one could apply sigma clipping to the estimators, removing estimators that deviate from the average estimated value of T by more than three times their own standard deviation.
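
A minimal MATLAB sketch of this weighted median, applied to hypothetical per-source estimates (one of which is a strong outlier) taken from the matched-filtered images:

theta = [95 110 60 310 102];   mu = [1.0 0.9 0.5 1.1 1.0];   sigma = [10 12 20 11 10];

est = theta./mu;                           % normalized estimators theta_j/mu_j
w   = mu./sigma;                           % weights: inverse standard deviation of each estimator
[est, ix] = sort(est);  w = w(ix);         % sort the estimators, carrying the weights along
cw   = cumsum(w)/sum(w);                   % cumulative weight fraction
Tmed = est(find(cw >= 0.5, 1, 'first'));   % 50% weighted percentile, i.e., the weighted median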

It is important to note that these operations make sense only when we measure the flux of unblended objects using the matched-filtered images. Note that since our method uses the matched-filter image as the estimator, sharp outliers like cosmic-rays or hot pixels will tend to reduce in significance (as we are using the wrong filter for their detection).

Given the problems involved in using robust estimators, we would like to suggest a different approach, which is to identify and remove sharp outliers using image subtraction, given an artifact-free image subtraction statistic. Such an image-subtraction algorithm can be found in Zackay et al. (2016). In the situation where a large number of images is coadded ($\gtrsim 10$), any outlier that will influence the final measurement must be easily visible in the subtraction image of any two images. Therefore, the detection and removal of any significant outliers prior to coaddition should be possible and probably constitutes the most accurate approach for dealing with artifacts. We will further discuss this in Zackay et al. (2016).

7. Tests

To verify that our method indeed works properly and to try and identify possible problems, we tested it on simulated data (Section 7.1) and real data (Section 7.2).

7.1. Simulated Data

The actual improvement of the coaddition technique depends on the distribution of the observing conditions (e.g., seeing, background, and transparency). Therefore, we use parameters that roughly mimic typical observing conditions. Obviously, for different surveys and use cases, these parameters could vary significantly.

We repeated the following simulation 100 times. We simulated twenty 1024 × 1024 pix$^{2}$ images, each with 2048 stars in an equally spaced grid. For simplicity, we assumed all the images have equal transparency. The seeing distribution was taken as uniform with FWHM between 1.5 and 6 pixels. The background distribution was uniform between 500 and 1900 e$^{-}$ pixel$^{-1}$. The point spread functions were assumed to be circular two-dimensional Gaussians. We adjusted the brightness of the sources to be close to the detection limit in the coadd image (100 photons per image; see Section 5). The images were created aligned, so no further registration was applied.

Next, we summed the images using various coaddition schemes. The techniques we tested are (i) weighted summation of the images using the Annis et al. (2014) scheme (see Equation (1)); (ii) weighted summation of the images using the Jiang et al. (2014) method (see Equation (1)); and (iii) our method. We note that unlike Annis et al. (2014) or Jiang et al. (2014) we did not remove any images. To evaluate the detection S/N of each method, we averaged the S/N of all sources at their correct positions in the coaddition image (using Equation (29)). Given the number of sources simulated, this average is accurate to a level of about 1%. We then calculated, using the specific seeing and brightness parameters of the images, the theoretically maximal S/N for source detection (Equation (55)).

In Figure 2, we plot the distribution of the ratio of the obtained S/N with the various methods and the theoretically calculated S/N. It is clear that the method presented in this paper achieves the maximum possible S/N and that the symmetric scatter around its value is the measurement error of the achieved S/N. It is further clear that for this specific case, our method provides a $\sim 5 \% $ higher S/N than the method of Jiang et al. (2014), and a $\sim 10 \% $ higher S/N than that of Annis et al. (2014). These translate to a survey speed increase of about 10% over Jiang et al. (2014) and 20% over Annis et al. (2014).

Figure 2. S/N distributions in the simulated images. For each coaddition method and for each simulation, the average of the achieved ${({\rm{S}}/{\rm{N}})}^{2}$ for source detection is calculated. The distributions of these averages divided by the theoretically calculated maximum ${({\rm{S}}/{\rm{N}})}^{2}$ are shown for each coaddition method.

7.2. Real Data

We performed several comparisons of our method with other techniques, using real data. Our test data originates from the Palomar Transient Factory (PTF; Law et al. 2009; Rau et al. 2009) data release 2. The images were obtained under various atmospheric conditions. All the images were reduced using the PTF/IPAC pipeline (Laher et al. 2014), and were photometrically calibrated (Ofek et al. 2012).

We used four sets of images. For each set, we prepared a deep reference image. The reference image is based on coaddition of a subset of the best images using the Annis et al. (2014) weights and no filtering. We also selected a subset of images which we coadded using the various techniques including equal weights, Annis et al. (2014) weighting, Jiang et al. (2014) weighting, our optimal-detection coaddition, and the optimal coaddition method described in Paper II. In order to minimize the nonlinear registration effects, we used a section of 1024 × 1024 pixels near the center of each field. The four sets of images, along with their selection criteria, are listed in Table 1. Small cutouts of the coadd images are presented in Figure 3. We note that it is very difficult to spot the fine differences between these images by eye and quantitative tests are required.

Figure 3. Small cutouts of coadded images based on data set 3. A—deep coadd image; B—no weights, no filter; C—coaddition with the Annis et al. (2014) weights; D—coaddition with the Jiang et al. (2014) weights; E—optimal coaddition for source detection. A 10% variation in information content cannot be detected by eye, but quantitative comparison shows that the optimal coaddition for source detection provides improved S/N for sources, and hence higher survey speed.

Table 1.  Sets of Images Used in the Comparison

Set Field CCD #im Type Criteria
1 100031 6 425 Ref. Variance < 1000 e and FWHM < 4'' and $\gt 10$ PSF stars
      45 coadd All images taken on Oct, Nov, Dec 2012 and $\gt 10$ PSF stars
2 100031 6 425 Ref. Variance < 1000 e and FWHM < 4'' and $\gt 10$ PSF stars
      48 coadd All images taken on the first nine days of each Month in 2011 and $\gt 10$ PSF stars
3 100031 4 263 Ref. Variance < 1000 e and FWHM < 4'' and $\gt 10$ PSF stars
      39 coadd All images taken on Oct, Nov, Dec 2012 and $\gt 10$ PSF stars
4 100031 4 263 Ref. Variance < 1000 e and FWHM < 4'' and $\gt 10$ PSF stars
      17 coadd All images taken on the first nine days of each Month in 2011 and $\gt 10$ PSF stars

Note. Selection criteria for reference images and coadd images in the various sets used for testing the coaddition methods. "Field" indicates the PTF field ID, while "CCD" is the CCD number in the mosaic (see Law et al. 2009; Laher et al. 2014). "Type" is either Ref. for the deep reference image, or coadd for the subset we coadd using the various methods. Note that field 100031 is in the vicinity of M51.


We multiply each image by its CCD gain (to work in electron units). The images were coadded using sim_coadd.m (Ofek 2014). This function first registers the images (using sim_align_shift.m), subtracts the background from all the images (sim_back_std.m), calculates the weights (for the various methods, weights4coadd.m), estimates the PSF for each image (build_psf.m), and finally coadds the images. The program optionally filters each image with its PSF prior to coaddition (sim_filter.m). All these functions are available as part of the astronomy and astrophysics toolbox for MATLAB (Ofek 2014) (see additional details in Section 8). We note that the coaddition technique described in Paper II is implemented in sim_coadd_proper.m.

In order to compare the various techniques, we first run our source extraction code mextractor.m. Like SExtractor (Bertin & Arnouts 1996), this code uses a linear matched filter to search for sources. However, while SExtractor thresholds the filtered image relative to the standard deviation of the un-filtered image, our code does the thresholding relative to the standard deviation of the filtered image. This small, but important, difference means that our thresholding is always done in units of standard deviations, so images with different PSFs can be compared easily. We set the detection threshold to $4\sigma $. We note that the coadded images were always filtered using their PSF.

We matched the sources found in each coadded image against sources found in the deep reference image. Table 2 lists the number of sources detected in each image (${N}_{{\rm{d}}}$), the number of sources in the reference that are matched in the coadd (${N}_{{\rm{r}}}$), the number of sources in the reference that are unmatched in the coadd image (${N}_{{\rm{u}}}$), and the number of false detections in the coadd (${N}_{{\rm{f}}}$). Our method consistently finds more (3% to 12%) real sources than other weighted-summation methods. We note that the slight increase in the number of false sources using our method is due to the effectively better PSF that our method has relative to regular weighted addition (after matched-filtering; see Paper II). An image with better seeing, once matched-filtered, has a larger number of uncorrelated scores, and therefore an increased number of sources with scores larger than some threshold. This effect is reproduced in simulations and could be easily accounted for in the survey design by slightly increasing the threshold above which a source is declared statistically significant. It is further important to note that even if we fix the number of false positives to some constant, our method still detects more real sources than the other techniques. In Table 2 we also list the results based on the method presented in Paper II. For source detection, the technique described in Paper II is identical (up to small numerical errors) to the method discussed in this paper. However, the method described in Paper II has several important advantages and hence should be used whenever adequate.

Table 2.  Comparison Between Coaddition Methods

Set Method ${I}_{{\rm{r}}}$ ${N}_{{\rm{d}}}$ ${N}_{{\rm{r}}}$ ${N}_{{\rm{u}}}$ ${N}_{{\rm{f}}}$
1 Deep 0 2433
  Equal weights 0.65 1136 1087 1346 49
  Annis et al. (2014) 0.91 1332 1255 1178 77
  Jiang et al. (2014) 0.89 1299 1229 1204 70
  This work 1 1417 1324 1109 93
  Paper II 1 1419 1325 1108 94
2 Deep 0 2433
  Equal weights 0.59 1183 1085 1348 98
  Annis et al. (2014) 0.93 1481 1343 1090 138
  Jiang et al. (2014) 0.90 1421 1301 1132 120
  This work 1 1541 1390 1043 151
  Paper II 1 1542 1391 1042 151
3 Deep 0 4062
  Equal weights 0.53 1645 1539 2523 106
  Annis et al. (2014) 0.84 1955 1796 2266 159
  Jiang et al. (2014) 0.81 1912 1759 2303 153
  This work 1 2206 1976 2086 230
  Paper II 1 2206 1976 2086 230
4 Deep 0 4062
  Equal weights 0.83 2319 2007 2055 312
  Annis et al. (2014) 0.82 2144 1981 2081 163
  Jiang et al. (2014) 0.92 2272 2062 2000 210
  This work 1 2401 2125 1937 276
  Paper II 0.99 2400 2124 1938 276

Note. Quantitative comparison between images coadded using various techniques matched against a deep reference image. The four blocks, separated by horizontal lines, correspond to the four data sets in Table 1. ${I}_{{\rm{r}}}$ is the approximate survey speed ratio between the image compared and our optimal coaddition method (see the text for details). ${N}_{{\rm{d}}}$ is the number of detected sources to a detection threshold of 4σ. ${N}_{{\rm{r}}}$ is the number of real sources (i.e., detected sources that have a counterpart within $3^{\prime\prime} $ in the deep reference image). ${N}_{{\rm{u}}}$ is the number of real undetected sources (i.e., sources detected in the reference that are not detected in the coadd image). ${N}_{{\rm{f}}}={N}_{{\rm{d}}}-{N}_{{\rm{r}}}$ is the number of false detections.


Next, we compared the detection ${({\rm{S}}/{\rm{N}})}^{2}$ of the sources in the various images. The reason for using ${({\rm{S}}/{\rm{N}})}^{2}$ is that it is an additive quantity that is proportional to the survey speed and to the detection information. In order to do the comparison, we run mextractor.m again in forced-detection mode. In this mode, we provide the code with a list of positions of the sources detected in the deep reference image, and the code measures the detection significance and S/N at these locations. For each of the different coaddition methods on the subset of images, we calculated the following quantity: for each star detected in the reference image, we divide its S/N measured in a coaddition image by its S/N measured in our optimally coadded image. Then, we sum the squares of these ratios, and calculate their median. This roughly gives the survey speed in each coadd image, relative to our coaddition method, and is listed in Table 2 (${I}_{{\rm{r}}}$). In these specific examples, our method provides an improvement of 17% to 47% over unweighted coaddition, and 8% to 18% over the weighted addition methods. We note that this is sensitive to the exact seeing distribution of the observations and we estimate that the typical improvement will be between a few percent and 25%, relative to Annis et al. (2014). Finally, in Figure 4 we present, for data set 3, the ratio between the detection S/N of individual sources in the coadded images relative to the detection S/N in the optimal coadd image. Again, our coaddition provides better S/N at both the faint and bright ends. We note that by using the techniques described in Section 4, the flux measurement S/N at the bright end can be further improved.

Figure 4. Ratio between the detection S/N of individual sources in a coadded image relative to the optimal coaddition. Our method is represented by the 1:1 line, while the Annis et al. (2014) and unweighted coadditions are represented by the gray and black dots, respectively. This example is based on the third data set.

8. Code and Implementation Details

The formulae we present in this paper are straightforward to code. However, most of the attention in the implementation should be given to pre-processing steps and measurements of the properties of the images. This includes estimating the background mean, variance, PSF, and cosmic-ray removal.

The code that performs the coaddition technique suggested in this paper is available as part of the Astronomy and Astrophysics toolbox (available online) for MATLAB (Ofek 2014). This package can be used for all of the image-processing steps, including the de-bias, flat-field correction, and cosmic-ray removal.

We note that usual implementation details like registration, resampling to the same grid, measuring the background and variance levels, and measuring/interpolating the PSF, as always, require attention. However, the attention to the details required by this method is not different from that required for the successful application of other methods. Below we comment on several important implementation details.

Background and variance estimation: the background and variance in real wide field of view astronomical images cannot be treated as constants over the entire field of view. Therefore, we suggest estimating them locally and interpolating. To estimate the background and variance one needs to make sure that the estimators are not biased by stars or small galaxies. Our suggestion is to fit a Gaussian to the histogram of the image pixels in small regions, and to reject from the fitting process pixels with high values (e.g., the upper 10th percentile of pixel values).
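
A simplified MATLAB sketch of this suggestion (the toy region, bin count, and clipping fraction are arbitrary choices):

region = 800 + sqrt(800)*randn(256);  region(100:105,100:105) = 5e3;   % toy region with a bright star
pix  = sort(region(:));
pix  = pix(1:round(0.9*end));                        % reject the upper 10th percentile of pixel values
[cnt, edges] = histcounts(pix, 50);                  % histogram of the remaining pixels
ctr   = (edges(1:end-1) + edges(2:end))/2;           % bin centers
gauss = @(p,x) p(1)*exp(-(x - p(2)).^2/(2*p(3)^2));  % Gaussian model: [amplitude, mean, std]
p0    = [max(cnt), median(pix), std(pix)];           % starting guess for the fit
pBest = fminsearch(@(p) sum((gauss(p,ctr) - cnt).^2), p0);   % least-squares fit to the histogram
backLevel = pBest(2);  backVar = pBest(3)^2;         % local background level and variance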

Estimating the transparency: the transparency Fj of each image is simply its flux-based photometric zero point. However, one has to make sure that these zero-points are measured using PSF photometry rather than aperture photometry, otherwise the zero-points may depend on the seeing.

Estimating the PSF: among the complications that may affect the PSF measurement are pixelization, interpolation, and resampling. Furthermore, the PSF is not spatially constant across large fields of view (typically, changes are important on scales larger than a few arcminutes), and it may also change with intensity due to charge self-repulsion. This specifically may lead to the brighter-fatter effect (e.g., Walter 2015). We note that the fact that one needs to estimate the PSF in order to run this method should not be viewed as a drawback, as any decent method that finds sources in the image requires this step anyway.

9. Summary

We argue that popular image coaddition methods do not achieve the maximal S/N possible for source detection and photometry. We derive the optimal statistic for source detection and photometry under the assumptions that the noise is approximately Gaussian and that the target is well separated from other bright sources. We show that the optimal way to coadd images for source detection is by filtering (i.e., cross-correlating) each image with its PSF, and then summing with weights. This method is summarized by Equation (27) (or equivalently, Equation (28)). In order to find sources, we need to find local maxima in the calculated score map. The significance of these sources in units of standard deviations is given by Equation (29). We note that the computational requirements of applying the method are low, as the cross-correlation operation can be computed using the fast Fourier transform (FFT). We also derive optimal statistics for PSF photometry (Equation (36)) and aperture photometry (Equation (45)) for an ensemble of images. Finally, we derive a formula to calculate the S/N for PSF photometry in an ensemble of images with a symmetric Gaussian PSF (Equation (53)). For statistical tasks other than source detection or photometric measurements, we refer readers to Paper II in this series, in which we provide a coaddition method that is optimal for any statistical measurement or decision, under the more restrictive (but common) assumption of background-dominated noise.

We demonstrate our coaddition method for source detection on simulated images as well as on real images. Our method increases the survey speed (for faint source detection and photometry) by a few percent to 25% over traditional methods, both in theory and in practice.

We thank Avishay Gal-Yam, Assaf Horesh, Frank Masci, William Newman, and Ora Zackay for discussions. This paper is based on observations obtained with the Samuel Oschin Telescope as part of the Palomar Transient Factory project, a scientific collaboration between the California Institute of Technology, Columbia University, Las Cumbres Observatory, the Lawrence Berkeley National Laboratory, the National Energy Research Scientific Computing Center, the University of Oxford, and the Weizmann Institute of Science. B.Z. is grateful for receiving the Clore fellowship. E.O.O. is incumbent of the Arye Dissentshik career development chair and is grateful for support by grants from the Willner Family Leadership Institute Ilan Gluzman (Secaucus NJ), Israel Science Foundation, Minerva, Weizmann-UK, and the I-Core program by the Israeli Committee for Planning and Budgeting and the Israel Science Foundation (ISF).

Appendix A: Weighted Addition of Independent Random Variables for Hypothesis Testing

Here, we will show that the same statistic that was derived in Section 2.1 as an exact solution for detection of an attenuated signal in the presence of varying Gaussian noise is the solution of a much more general question—maximizing the S/N of the weighted addition of independent random variables. Given a set of statistical variables Xj with two hypotheses ${{\mathscr{H}}}_{0}$ and ${{\mathscr{H}}}_{1}$ with expectancies

Equation (58)

and assume further that the variance of Xj is equal under both hypotheses, i.e.,

$$V[X_j\,|\,\mathscr{H}_0]=V[X_j\,|\,\mathscr{H}_1]\equiv\sigma_j^2.\qquad(59)$$

If we know the exact distributions of the variables Xj under both hypotheses, we can use the Neyman-Pearson lemma (Neyman & Pearson 1933), which states that the optimal test statistic for deciding between ${{\mathscr{H}}}_{0}$ and ${{\mathscr{H}}}_{1}$ is the log-likelihood ratio. Here, however, we want to construct a test statistic that can be applied even when the exact distributions are unknown and only their first and second moments are known.

Because the variables are assumed to be independent, it is reasonable to combine them via a linear sum (if the distributions were known, we could simply sum, for each Xj, the difference between the log-probabilities of observing it under the two hypotheses). That is, we are interested in a linear statistic of the form:

$$S=\sum_j \beta_j X_j.\qquad(60)$$

If the set of variables is large, so that the distribution of the weighted sum can be approximated as a Gaussian under both hypotheses, then the only important properties of the resulting sum are its expectancy and variance under each hypothesis. Moreover, if the statistic has equal variance under both hypotheses, then a single number characterizes our ability to distinguish between them, namely the squared S/N:

$$({\rm S/N})^2=\frac{\left(E[S\,|\,\mathscr{H}_1]-E[S\,|\,\mathscr{H}_0]\right)^2}{V[S]}.\qquad(61)$$

Thus, we want to find the linear combination that maximizes the S/N, i.e., a set of weights ${\beta }_{j}$ such that the score in Equation (60) has maximal S/N (Equation (61)).

We assume that Xj, ${\mu }_{j}$, and ${\beta }_{j}$ are complex variables (this will become useful in future papers in the series). Substituting {0, ${\mu }_{j}$} as the expectancies of Xj given the two hypotheses, we get

$$E[S\,|\,\mathscr{H}_1]-E[S\,|\,\mathscr{H}_0]=\sum_j \beta_j\mu_j.\qquad(62)$$

Given the hypothesis ${{\mathscr{H}}}_{\alpha }$, where $\alpha \in \{0,1\}$, the variance of S is

$$V[S\,|\,\mathscr{H}_\alpha]=\sum_j |\beta_j|^2\sigma_j^2.\qquad(63)$$

In order to maximize the squared S/N with respect to ${\beta }_{j}$, we equate all the partial derivatives to zero:

$$\frac{\partial\,({\rm S/N})^2}{\partial\bar{\beta}_j}=0\qquad\forall j,\qquad(64)$$

which gives us

$$\bar{\mu}_j\left(\sum_k\beta_k\mu_k\right)\sum_k|\beta_k|^2\sigma_k^2-\beta_j\sigma_j^2\left|\sum_k\beta_k\mu_k\right|^2=0,\qquad(65)$$

where the overline denotes complex conjugation. Simplifying, we get the equation

$$\bar{\mu}_j\sum_k|\beta_k|^2\sigma_k^2=\beta_j\sigma_j^2\sum_k\bar{\beta}_k\bar{\mu}_k.\qquad(66)$$

Noting that multiplying all the weights ${\beta }_{j}$ by a constant factor does not change the S/N, we can simplify further and get

$$\beta_j=\frac{\bar{\mu}_j}{\sigma_j^2}.\qquad(67)$$

Thus, we now have a general formula for the weighted addition of random variables.
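
As a toy numerical example of Equation (67) (the numbers below are our own and purely illustrative), consider two independent measurements with expectancies μ = [1, 1] under ${{\mathscr{H}}}_{1}$ and standard deviations σ = [1, 2]. The optimal weights are [1, 0.25], and the resulting S/N exceeds that of an unweighted sum:

    % Toy check of the optimal weights in Equation (67); illustrative numbers only.
    mu    = [1, 1];                          % E[X_j | H1]
    sigma = [1, 2];                          % standard deviation of each variable
    beta  = conj(mu) ./ sigma.^2;            % optimal weights: [1, 0.25]
    snr_opt   = abs(sum(beta .* mu)) / sqrt(sum(abs(beta).^2 .* sigma.^2));  % ~1.118
    snr_equal = abs(sum(mu)) / sqrt(sum(sigma.^2));                          % ~0.894
    % snr_opt = sqrt((mu(1)/sigma(1))^2 + (mu(2)/sigma(2))^2), i.e., the individual
    % (S/N)^2 values add (see Appendix C), and is ~25% higher than equal weighting.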

We note that if the exact distributions of the random variables are known, then the optimal combination is the log-likelihood ratio test statistic, which is, in general, a nonlinear function of the variables. That is, the best score is of the form

$$S=\sum_j\ln\left[\frac{P(X_j\,|\,\mathscr{H}_1)}{P(X_j\,|\,\mathscr{H}_0)}\right].\qquad(68)$$

What we have derived in this appendix is the best linear approximation to this score that can be obtained with much weaker knowledge of the variables, namely their first two moments only.

Appendix B: Weighted Addition of Independent Estimators

Here, we will show that the same statistic that was derived in Section 2.1 and Appendix A is the solution (up to a multiplicative constant) of another general question: what is the maximum S/N linear combination of independent estimators?

Given a set of statistics ${\theta }_{j}$ such that

$$E[\theta_j]=\mu_j\,T,\qquad(69)$$

where ${\mu }_{j}$ are known complex variables,

$$V[\theta_j]=\sigma_j^2,\qquad(70)$$

we want to find the best weighted linear combination estimator

$$\theta=\sum_j\beta_j\theta_j,\qquad(71)$$

such that $E[\theta ]=T$ and $V[\theta ]$ is minimal. Calculating the expectancy of θ we get

$$E[\theta]=\sum_j\beta_j\mu_j\,T,\qquad(72)$$

while the variance of θ is

$$V[\theta]=\sum_j|\beta_j|^2\sigma_j^2.\qquad(73)$$

Maximizing the ${({\rm{S}}/{\rm{N}})}^{2}$ of the estimator

$$({\rm S/N})^2[\theta]=\frac{|E[\theta]|^2}{V[\theta]}\qquad(74)$$

is equivalent to minimizing the variance of θ at a fixed $E[\theta ]$. Moreover, ${({\rm{S}}/{\rm{N}})}^{2}[\theta ]$ is invariant under scalar multiplication of θ because

$$({\rm S/N})^2[a\,\theta]=\frac{|a|^2\,|E[\theta]|^2}{|a|^2\,V[\theta]}=({\rm S/N})^2[\theta].\qquad(75)$$

Therefore, we can maximize ${({\rm{S}}/{\rm{N}})}^{2}[\theta ]$ with respect to all ${\beta }_{j}$. After normalization, the minimum variance estimator is

$$\theta=\frac{\sum_j(\bar{\mu}_j/\sigma_j^2)\,\theta_j}{\sum_j|\mu_j|^2/\sigma_j^2}.\qquad(76)$$

To maximize ${({\rm{S}}/{\rm{N}})}^{2}[\theta ]$ with respect to ${\beta }_{j}$, we equate all the partial derivatives to zero:

$$\frac{\partial\,({\rm S/N})^2[\theta]}{\partial\bar{\beta}_j}=0\qquad\forall j,\qquad(77)$$

which gives us

$$\bar{\mu}_j\left(\sum_k\beta_k\mu_k\right)\sum_k|\beta_k|^2\sigma_k^2-\beta_j\sigma_j^2\left|\sum_k\beta_k\mu_k\right|^2=0,\qquad(78)$$

where the overline denotes complex conjugation. Simplifying, we get the equation

$$\bar{\mu}_j\sum_k|\beta_k|^2\sigma_k^2=\beta_j\sigma_j^2\sum_k\bar{\beta}_k\bar{\mu}_k.\qquad(79)$$

Because ${({\rm{S}}/{\rm{N}})}^{2}[\theta ]$ is invariant under multiplication by a constant factor, we get

$$\beta_j=\frac{\bar{\mu}_j}{\sigma_j^2}.\qquad(80)$$

Adjusting the estimator to match our first requirement that $E[\theta ]=T$ we can finally write

$$\theta=\frac{\sum_j(\bar{\mu}_j/\sigma_j^2)\,\theta_j}{\sum_j|\mu_j|^2/\sigma_j^2}.\qquad(81)$$
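
As a toy numerical illustration of the estimator in Equation (81) (the numbers are our own and purely illustrative), consider two flux measurements of the same source taken through different transparencies. The weighted combination is unbiased and has a smaller variance than either rescaled measurement alone:

    % Toy check of the minimum-variance unbiased combination in Equation (81).
    T       = 10;                            % true flux (unknown in practice)
    mu      = [1.0, 0.5];                    % known transparencies: E[theta_j] = mu_j*T
    sigma   = [1.0, 1.0];                    % noise of each measurement
    theta_j = mu * T + sigma .* randn(1, 2); % two simulated measurements
    theta   = sum(conj(mu) .* theta_j ./ sigma.^2) / sum(abs(mu).^2 ./ sigma.^2);
    % E[theta] = T and V[theta] = 1 / sum(|mu_j|^2 / sigma_j^2) = 0.8, compared with
    % variances of 1.0 and 4.0 for the two individually rescaled measurements.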

Appendix C: S/N of the Addition of Optimally Weighted Statistics

In Appendix A we found the best way to add statistics even when their exact distributions are unknown. Here we derive a closed-form expression for the S/N of the resulting statistic. Denote the S/N of the individual statistic Xj by

$$({\rm S/N})_j\equiv\frac{|\mu_j|}{\sigma_j}.\qquad(82)$$

Write the statistic S, the maximum-S/N weighted sum of the variables defined in Appendix A, as

$$S=\sum_j\frac{\bar{\mu}_j}{\sigma_j^2}X_j.\qquad(83)$$

Calculating the expected difference in S given the two different hypotheses, we get

$$E[S\,|\,\mathscr{H}_1]-E[S\,|\,\mathscr{H}_0]=\sum_j\frac{|\mu_j|^2}{\sigma_j^2},\qquad(84)$$

while the variance of S is

$$V[S]=\sum_j\frac{|\mu_j|^2}{\sigma_j^2}.\qquad(85)$$

Therefore, the squared S/N of S is simply given by

$$({\rm S/N})^2[S]=\frac{\left(\sum_j|\mu_j|^2/\sigma_j^2\right)^2}{\sum_j|\mu_j|^2/\sigma_j^2}=\sum_j\frac{|\mu_j|^2}{\sigma_j^2}=\sum_j({\rm S/N})_j^2.\qquad(86)$$

This means that the squared S/N of the optimal weighted sum of a set of statistics is simply the sum of the squared S/Ns of the individual statistics. This property allows us to estimate the sensitivity of a set of observations ahead of time with an intuitive closed-form expression. Essentially the same calculation applies to the optimal weighted addition of estimators, and we omit it to avoid repetition.
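
For example (with purely illustrative numbers of our own), a source observed in 30 exposures with an individual detection S/N of 1.3 per exposure is expected to reach a coadded S/N of $\sqrt{30\times {1.3}^{2}}\approx 7.1$, i.e., a secure detection built from exposures in which the source is individually far below the detection threshold.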

Appendix D: How Much Survey Speed is Lost When We Use the Wrong PSF?

Here we derive an analytic solution to the question of how much sensitivity is lost in the detection process when the wrong linear matched filter is used. Following previous sections, we define the measured image by

$$M=F\,(T\otimes P)+\epsilon,\qquad(87)$$

and assume that the source we are trying to detect is well separated from other sources. We also assume that the source we are looking for is faint relative to the background, leading to a spatially invariant noise variance:

$$V[M_x]\approx\sigma^2={\rm const}.\qquad(88)$$

For simplicity, we assume that the PSF, P, can be approximated as a symmetric Gaussian. We define sr as the width of the true PSF P and su as the width of the Gaussian used for matched filtering. Denote by ${w}_{x}({s}_{u})$ the weight of pixel x when using a kernel of width su. Now, we can calculate the ${({\rm{S}}/{\rm{N}})}^{2}$ of the statistic

$$S(s_u)=\sum_x w_x(s_u)\,M_x\qquad(89)$$

by

$$({\rm S/N})^2[S]=\frac{\left(F\sum_x w_x(s_u)\,P_x\right)^2}{\sigma^2\sum_x w_x(s_u)^2}.\qquad(90)$$

Substituting the relevant Gaussians for the PSF and the matched-filter kernel, and approximating the sums with integrals, we get

$$({\rm S/N})^2=\frac{F^2}{\sigma^2}\,\frac{\left[\int\frac{1}{2\pi s_u^2}e^{-|x|^2/(2s_u^2)}\,\frac{1}{2\pi s_r^2}e^{-|x|^2/(2s_r^2)}\,d^2x\right]^2}{\int\frac{1}{(2\pi s_u^2)^2}e^{-|x|^2/s_u^2}\,d^2x}\qquad(91)$$

$$=\frac{F^2}{\pi\sigma^2}\,\frac{s_u^2}{(s_r^2+s_u^2)^2},\qquad(92)$$

where $| x{| }^{2}={x}_{1}^{2}+{x}_{2}^{2}$.

For a given sr, the maximum of this function is obtained at su = sr. Substituting su = ksr and dividing the ${({\rm{S}}/{\rm{N}})}^{2}$ by the maximum possible value, we get

$$({\rm S/N})^2(k\,s_r)=\frac{F^2}{\pi\sigma^2 s_r^2}\,\frac{k^2}{(1+k^2)^2},\qquad(93)$$

$$\frac{({\rm S/N})^2(k\,s_r)}{({\rm S/N})^2(s_r)}=\frac{4k^2}{(1+k^2)^2}.\qquad(94)$$

Figure 1 shows the loss in survey speed when the wrong PSF width is used, as derived above. This simple exercise gives an idea of how much sensitivity is lost when images with different PSFs are coadded and only then match-filtered, because in that case all of the images are effectively filtered with a single kernel, which cannot be optimal for all of them simultaneously.
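
The loss implied by Equation (94) is easy to evaluate numerically. The short MATLAB sketch below (our own illustrative check, not part of the code released with this paper) shows, for example, that a 20% mismatch between the filter width and the true PSF width costs only about 3% in survey speed, whereas a factor-of-two mismatch costs about 36%. Note that the expression is symmetric under $k\to 1/k$, so overestimating and underestimating the PSF width by the same factor are equally harmful.

    % Survey-speed loss from Equation (94) for a few width ratios k = s_u / s_r.
    k    = [1.0, 1.1, 1.2, 1.5, 2.0];
    loss = 1 - 4 * k.^2 ./ (1 + k.^2).^2;
    % loss ~ [0, 0.009, 0.033, 0.15, 0.36]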

Footnotes

  • The constant noise requirement can be relaxed as one can solve the problem locally in a region in which the noise variance is roughly constant.

  • A statistic is sufficient with respect to a model and its associated parameters if no other statistic that can be calculated from the same sample provides any additional information as to the value of the model parameters.

  • This threshold was calculated under the convenient approximation that $\gamma ={\int }_{x}^{\infty }\tfrac{{e}^{-{t}^{2}/2}}{\sqrt{2\pi }}{dt}\approx \tfrac{1}{x\sqrt{2\pi }}{e}^{-{x}^{2}/2}\approx {e}^{-{x}^{2}/2-2}$, which is a good approximation for $4\lt x\lt 8$, a reasonable significance range for practical surveys.

  • In practice, the pixels may be slightly correlated due to hardware effects in the CCD.

  • When detecting a source in the vicinity of another source, our estimator is obviously biased by the nearby source, which is not accounted for in the statistical model presented in this work. Correctly resolving and measuring blended sources generally requires a more elaborate technique. In Paper II in this series, we show a coaddition process that is optimal for all hypothesis-testing cases, including resolving blended sources.

  • Direct calculation of the convolution (without an FFT) can also be very fast if the convolution kernel is small.

  • Inverse standard deviation and not inverse variance. A way to derive this is to define a symmetrical exponential loss function for the error in each image and find the value that minimizes the global loss function. This value is given analytically by the weighted median.

  • Performing a median operation on regular images will produce strange artifacts when the images have different PSFs. Examples include a seeing-distribution-dependent bias, a flux-dependent PSF, and the effective removal of any observation whose PSF differs (for better or worse) from that of the majority of the images.

  • Given the fact that weights (if calculated correctly) should take care of low-quality images, we regard this step as damaging rather than beneficial.

  • 10 
  • 11 
  • 12 

    Note that the detection S/N is defined without the source-noise term in the denominator (see Equation (55)).

  • 13 

    The S/N is measured by filtering the image with its PSF, normalizing the filtered image by its own standard deviation, and reading the value at the local maximum that corresponds to the source.

  • 14 
  • 15 

    We are currently using 256 × 256 arcsec2 blocks.

  • 16 

    See Appendix A of Ofek et al. (2011) for a method to calculate these zero-points.

  • 17 

    The run time of FFT on an image is typically an order of magnitude faster than reading/writing an image from the hard disk.

  • 18 

    We note that even if $E[{X}_{j}| {{\mathscr{H}}}_{0}]={z}_{j}$, it is always possible to transfer the problem to hypothesis testing on ${Y}_{j}={X}_{j}-{z}_{j}$, in which case the expectancy given the null hypothesis is again zero.
