
Uncertainty Propagation in (Gaussian) Convolution

Published February 2021. © 2021. The Author(s). Published by the American Astronomical Society. Focus on AAS 237.

Citation: Randolf Klein 2021 Res. Notes AAS 5 39. DOI: 10.3847/2515-5172/abe8df

Abstract

Convolution of spectra, maps, or even higher dimensional data is often part of data reduction or analysis. Often a Gaussian kernel is used. When the convolved data are measurements, they are associated with uncertainties. This research note derives how uncertainties propagate through the convolution. While the math is straightforward algebra, the results are not readily available. Here, the uncertainty propagation is worked out for regularly gridded data, first for uncorrelated data and then for correlated data.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Let's assume you are given observational data sampled on a regular grid with associated uncertainties, and you need to convolve the data, e.g., with a Gaussian. Below, we derive how the uncertainties associated with the measurements propagate through the convolution. The variance and covariance are derived first for a general kernel; then a Gaussian kernel is assumed. The first part of this Note goes through the exercise assuming uncorrelated data. The second part assumes data whose neighboring points are correlated as if by a convolution with a Gaussian kernel. Even when the data were not correlated by an actual convolution, similar correlations can be expected when the data are correlated, e.g., by an instrumental response that can be approximated by a Gaussian.

2. Uncorrelated Data

Let's start with uncorrelated regularly gridded data (pixels) in d dimensions. The data is indexed with ${\boldsymbol{x}}=({x}_{1},{x}_{2},\ldots ,{x}_{d})$ with ${x}_{i}\in {\mathbb{N}}$ and $1\leqslant {x}_{i}\leqslant {N}_{i},\forall i\in \{1,\ldots ,d\}$. Let's call this set of indices, representing the pixels, ${\mathbb{M}}$. It is a subset of ${{\mathbb{Z}}}^{d}$.

We describe the observation O with random variables ${O}_{{\boldsymbol{x}}}$, ${\boldsymbol{x}}\in {{\mathbb{Z}}}^{d}$. The expected value is $E({O}_{{\boldsymbol{x}}})={s}_{{\boldsymbol{x}}}$, the true signal. The variance $\mathrm{Var}({O}_{{\boldsymbol{x}}})={\sigma }_{{\boldsymbol{x}}}^{2}$ describes the uncertainty associated with the observations. To easily handle the edges of the data volume 1 , we set ${s}_{{\boldsymbol{x}}}={\sigma }_{{\boldsymbol{x}}}=0$ for ${\boldsymbol{x}}\notin {\mathbb{M}}$.

The observations are independent of each other, thus the covariance is

$\mathrm{Cov}({O}_{{\boldsymbol{x}}},{O}_{{\boldsymbol{y}}})=0\ \mathrm{for}\ {\boldsymbol{x}}\ne {\boldsymbol{y}}.$

It follows from $\mathrm{Cov}({O}_{{\boldsymbol{x}}},{O}_{{\boldsymbol{y}}})=E({O}_{{\boldsymbol{x}}}{O}_{{\boldsymbol{y}}})-E({O}_{{\boldsymbol{x}}})E({O}_{{\boldsymbol{y}}})$ that

$E({O}_{{\boldsymbol{x}}}{O}_{{\boldsymbol{y}}})={s}_{{\boldsymbol{x}}}{s}_{{\boldsymbol{y}}}+{\delta }_{{\boldsymbol{x}}{\boldsymbol{y}}}\,{\sigma }_{{\boldsymbol{x}}}^{2},$

Equation (1)

with the Kronecker delta ${\delta }_{{\boldsymbol{x}}{\boldsymbol{y}}}$.

For the convolution of O to the convolved data C, we use a kernel $K({\boldsymbol{z}})$ (${\boldsymbol{z}}=({z}_{1},{z}_{2},\ldots ,{z}_{d})\in {{\mathbb{Z}}}^{d}$) with a support ${\mathbb{K}}$, i.e., ${\boldsymbol{z}}\in {\mathbb{K}}\iff K({\boldsymbol{z}})\ne 0$, smaller than the data grid ($\forall {\boldsymbol{z}}\in {\mathbb{K}}:-{Z}_{i}\lt {z}_{i}\lt {Z}_{i}$ and ${Z}_{i}\lt {N}_{i},\forall i\in \{1,\ldots ,d\}$). The kernel is normalized (${\sum }_{{\boldsymbol{z}}}K({\boldsymbol{z}})=1$). The convolution C of O with K can then be written as

${C}_{{\boldsymbol{x}}}={\sum }_{{\boldsymbol{z}}}{O}_{{\boldsymbol{z}}}K({\boldsymbol{x}}-{\boldsymbol{z}}).$

To calculate how much the convolution correlates the initially uncorrelated data, we derive the covariance of C:

$\mathrm{Cov}({C}_{{\boldsymbol{x}}},{C}_{{\boldsymbol{y}}})={\sum }_{{\boldsymbol{z}}}K({\boldsymbol{x}}-{\boldsymbol{z}})K({\boldsymbol{y}}-{\boldsymbol{z}}){\sigma }_{{\boldsymbol{z}}}^{2}$

from

$E({C}_{{\boldsymbol{x}}}{C}_{{\boldsymbol{y}}})={\sum }_{{\boldsymbol{z}}}{\sum }_{{\boldsymbol{w}}}K({\boldsymbol{x}}-{\boldsymbol{z}})K({\boldsymbol{y}}-{\boldsymbol{w}})({s}_{{\boldsymbol{z}}}{s}_{{\boldsymbol{w}}}+{\delta }_{{\boldsymbol{z}}{\boldsymbol{w}}}\,{\sigma }_{{\boldsymbol{z}}}^{2})$

and

$E({C}_{{\boldsymbol{x}}})E({C}_{{\boldsymbol{y}}})={\sum }_{{\boldsymbol{z}}}{\sum }_{{\boldsymbol{w}}}K({\boldsymbol{x}}-{\boldsymbol{z}})K({\boldsymbol{y}}-{\boldsymbol{w}}){s}_{{\boldsymbol{z}}}{s}_{{\boldsymbol{w}}}.$

Thus, the variance is

$\mathrm{Var}({C}_{{\boldsymbol{x}}})=\mathrm{Cov}({C}_{{\boldsymbol{x}}},{C}_{{\boldsymbol{x}}})={\sum }_{{\boldsymbol{z}}}{K}^{2}({\boldsymbol{x}}-{\boldsymbol{z}}){\sigma }_{{\boldsymbol{z}}}^{2}.$

Equation (2)

In other words, to obtain the variance of convolved data, the variance of the data needs to be convolved with the square of the kernel used to convolve the data. Note that generally ${\sum }_{{\boldsymbol{z}}}{K}^{2}({\boldsymbol{z}})\ne 1$ and that is the correct scaling for the variance.
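As a numerical sanity check of Equation (2), the following sketch (grid size, uncertainties, and kernel are hypothetical choices; numpy assumed available) compares a Monte Carlo estimate of the variance of convolved noise with the variance convolved with the squared kernel:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-d data set: zero-mean noise with per-pixel uncertainties.
n = 64
sigma = rng.uniform(0.5, 2.0, size=n)     # per-pixel standard deviations
kernel = np.array([0.25, 0.5, 0.25])      # any normalized kernel

# Monte Carlo: convolve many noise realizations and measure the variance.
samples = sigma * rng.standard_normal((20000, n))
conv = np.array([np.convolve(s, kernel, mode="same") for s in samples])
var_mc = conv.var(axis=0)

# Equation (2): convolve the variances with the *square* of the kernel.
var_eq2 = np.convolve(sigma**2, kernel**2, mode="same")

# Note that kernel**2 does not sum to 1; that is the correct scaling.
print(np.abs(var_mc - var_eq2).max())     # small (Monte Carlo noise only)
```

The two estimates agree to within the Monte Carlo noise of about a percent.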

If the kernel is a d-dimensional Gaussian,

${K}_{b}({\boldsymbol{z}})={A}^{-d}\exp \left(-\tfrac{1}{2}\tfrac{{{\boldsymbol{z}}}^{2}}{{b}^{2}}\right),$

with A so that ${\sum }_{{\boldsymbol{z}}}{K}_{b}({\boldsymbol{z}})=1$, the sums can be evaluated further. While mathematically the support of ${K}_{b}$ is infinite, practically the support can be limited to a finite set of points ${\mathbb{K}}$. For useful values of b ($1\lt b\ll \min ({N}_{i})$), the normalization is $A\approx b\sqrt{2\pi }$.
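The approximation for the normalization can be checked quickly; the sketch below (the b values and the support size are arbitrary choices) sums the unnormalized one-dimensional Gaussian on the integer grid and compares with $b\sqrt{2\pi }$:

```python
import numpy as np

# Sum the unnormalized 1-d Gaussian over a support much wider than b and
# compare to the continuum approximation b * sqrt(2 * pi).
for b in (1.5, 2.0, 4.0):
    z = np.arange(-10 * int(np.ceil(b)), 10 * int(np.ceil(b)) + 1)
    A = np.exp(-0.5 * z**2 / b**2).sum()
    print(b, A, b * np.sqrt(2.0 * np.pi))
```

For b above roughly unity the discrete sum and the integral approximation agree to many digits.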

The covariance then is

$\mathrm{Cov}({C}_{{\boldsymbol{x}}},{C}_{{\boldsymbol{y}}})=\exp \left(-\tfrac{{({\boldsymbol{x}}-{\boldsymbol{y}})}^{2}}{4{b}^{2}}\right)\mathrm{Var}({C}_{\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}}).$

Equation (3)

The covariance between two points is not zero, but is actually the variance of the midpoint between the two points scaled down with a Gaussian coefficient depending on the distance between the two points and the smoothing kernel width. While $\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}$ may not be an actual pixel, the expression $\mathrm{Var}({C}_{\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}})$ is still well defined via the convolution in Equation (2).
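Equation (3) can be verified numerically. The sketch below (one-dimensional, with arbitrary width b, grid size, and variances; the pixel pair is chosen so that the midpoint falls on the grid) compares the direct double sum for the covariance with the right-hand side of Equation (3):

```python
import numpy as np

b = 2.0                                   # kernel width (arbitrary choice)
n = 61
idx = np.arange(n)

# Discrete Gaussian kernel K_b(t) = A^-1 exp(-t^2 / (2 b^2)), A ~ b*sqrt(2 pi).
support = np.arange(-30, 31)
A = np.exp(-0.5 * support**2 / b**2).sum()

def K(t):
    return np.exp(-0.5 * np.asarray(t) ** 2 / b**2) / A

rng = np.random.default_rng(1)
sigma2 = rng.uniform(0.5, 2.0, size=n)    # per-pixel variances (made up)

# Left-hand side: Cov(C_x, C_y) = sum_z K(x - z) K(y - z) sigma_z^2.
x, y = 28, 32                             # x + y even, so the midpoint is a pixel
cov_direct = np.sum(K(x - idx) * K(y - idx) * sigma2)

# Right-hand side: Gaussian factor times Var(C_m), m = (x + y) / 2,
# where Var(C_m) = sum_z K^2(m - z) sigma_z^2 per Equation (2).
m = (x + y) // 2
var_mid = np.sum(K(m - idx) ** 2 * sigma2)
cov_eq3 = np.exp(-((x - y) ** 2) / (4 * b**2)) * var_mid

print(cov_direct, cov_eq3)                # identical to machine precision
```

For a pixel pair with an on-grid midpoint the identity is exact, not just approximate, as the Gaussian product identity holds term by term in the sum.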

Now, we have quantified how much the values in different pixels are correlated after smoothing the data set. That is important when, for example, doing aperture photometry on the smoothed map: the variance of the flux in the aperture is not just the sum of the variances of the pixels in the aperture; the covariance terms must be included.
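To illustrate (with made-up one-dimensional numbers and an arbitrary aperture), the variance of a sum over aperture pixels on a smoothed map, including the covariance terms, substantially exceeds the naive sum of the pixel variances:

```python
import numpy as np

b = 2.0                                   # smoothing kernel width (arbitrary)
n = 61
idx = np.arange(n)
support = np.arange(-30, 31)
A = np.exp(-0.5 * support**2 / b**2).sum()

def K(t):
    return np.exp(-0.5 * np.asarray(t) ** 2 / b**2) / A

rng = np.random.default_rng(2)
sigma2 = rng.uniform(0.5, 2.0, size=n)    # variances of the unsmoothed pixels

def cov(x, y):
    # Cov(C_x, C_y) = sum_z K(x - z) K(y - z) sigma_z^2 (exact double sum)
    return np.sum(K(x - idx) * K(y - idx) * sigma2)

aperture = np.arange(25, 36)              # 11 pixels around the map center

# Naive error estimate: sum of the pixel variances only.
var_naive = sum(cov(x, x) for x in aperture)
# Correct variance of the summed flux: include all covariance terms.
var_full = sum(cov(x, y) for x in aperture for y in aperture)

print(var_full / var_naive)               # clearly larger than 1
```

With these numbers the full variance is several times the naive estimate, so ignoring the covariance terms would badly underestimate the photometric uncertainty.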

3. Correlated Data

However, the pixels in the original data set may not be uncorrelated to begin with. Because of instrumental resolution, over-sampling, or other reasons, the data at hand may be correlated. For the following, we assume correlated data r with uncertainties σ, and that the covariance can be estimated using the expression above (Equation (3)). The Gaussian correlation is motivated by the fact that instrumental responses can often be described by Gaussians.

Let's describe the correlated data with the above defined random variables ${C}_{{\boldsymbol{x}}}$, ${\boldsymbol{x}}\in {{\mathbb{Z}}}^{d}$, with the expected value being $E({C}_{{\boldsymbol{x}}})={r}_{{\boldsymbol{x}}}$ and the variance $\mathrm{Var}({C}_{{\boldsymbol{x}}})={\sigma }_{{\boldsymbol{x}}}^{2}$. As above, ${r}_{{\boldsymbol{x}}}={\sigma }_{{\boldsymbol{x}}}=0$ for ${\boldsymbol{x}}\notin {\mathbb{M}}$. The covariance is generally not zero for ${\boldsymbol{x}}\ne {\boldsymbol{y}}$, but from above it should be reasonably described by

$\mathrm{Cov}({C}_{{\boldsymbol{x}}},{C}_{{\boldsymbol{y}}})=\exp \left(-\tfrac{{({\boldsymbol{x}}-{\boldsymbol{y}})}^{2}}{4{b}^{2}}\right){\sigma }_{\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}}^{2}$

with b the width of the correlating Gaussian.

If $\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}$ is not a pixel but specifies an inter-pixel location, some interpolation is needed to define ${\sigma }_{\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}}$. We use nearest neighbors here (rounding up) for reasons that will become clear.

The convolution is as above, now smoothing the correlated data C to ${\widetilde{C}}_{{\boldsymbol{z}}}={\sum }_{{\boldsymbol{x}}}{C}_{{\boldsymbol{x}}}\widetilde{K}({\boldsymbol{z}}-{\boldsymbol{x}})$, with ${\boldsymbol{x}}$ now a summation variable and ${\boldsymbol{z}}$ the pixel of the smoothed map. Thus, the variance is

$\mathrm{Var}({\widetilde{C}}_{{\boldsymbol{z}}})={\sum }_{{\boldsymbol{x}}}{\sum }_{{\boldsymbol{y}}}\widetilde{K}({\boldsymbol{z}}-{\boldsymbol{x}})\widetilde{K}({\boldsymbol{z}}-{\boldsymbol{y}})\mathrm{Cov}({C}_{{\boldsymbol{x}}},{C}_{{\boldsymbol{y}}})$

because

$\mathrm{Var}\left({\sum }_{i}{a}_{i}{X}_{i}\right)={\sum }_{i}{\sum }_{j}{a}_{i}{a}_{j}\,\mathrm{Cov}({X}_{i},{X}_{j}).$

Let's take the kernel again as a d-dimensional Gaussian with a reasonable width θ (see above): ${K}_{\theta }({\boldsymbol{z}})={A}_{\theta }^{-d}\exp \left(-\tfrac{1}{2}\tfrac{{{\boldsymbol{z}}}^{2}}{{\theta }^{2}}\right)$ with ${A}_{\theta }$ so that ${\sum }_{{\boldsymbol{z}}}{K}_{\theta }({\boldsymbol{z}})=1$. Then the variance is

$\mathrm{Var}({\widetilde{C}}_{{\boldsymbol{z}}})={A}_{\theta }^{-2d}{\sum }_{{\boldsymbol{x}}}{\sum }_{{\boldsymbol{y}}}\exp \left(-\tfrac{{({\boldsymbol{z}}-{\boldsymbol{x}})}^{2}+{({\boldsymbol{z}}-{\boldsymbol{y}})}^{2}}{2{\theta }^{2}}\right)\exp \left(-\tfrac{{({\boldsymbol{x}}-{\boldsymbol{y}})}^{2}}{4{b}^{2}}\right){\sigma }_{\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}}^{2}.$

Transforming the summation variables, $({\boldsymbol{x}},{\boldsymbol{y}})\leftrightarrow ({\boldsymbol{\epsilon }},{\boldsymbol{\delta }})$ with ${\boldsymbol{\epsilon }}={\boldsymbol{x}}-{\boldsymbol{y}}$ and ${\boldsymbol{\delta }}={\boldsymbol{x}}+{\boldsymbol{y}}-2{\boldsymbol{z}}$, is a relation between ${{\mathbb{Z}}}^{2d}$ and the points in ${{\mathbb{Z}}}^{2d}$ satisfying $({\epsilon }_{i}+{\delta }_{i})\,\mathrm{mod}\ 2=0,\forall i=1\ldots d$, because ${\boldsymbol{\epsilon }}+{\boldsymbol{\delta }}=2({\boldsymbol{x}}-{\boldsymbol{z}})$. The variance becomes

$\mathrm{Var}({\widetilde{C}}_{{\boldsymbol{z}}})={A}_{\theta }^{-2d}\sum _{({\boldsymbol{\epsilon }},{\boldsymbol{\delta }})}\exp \left(-\tfrac{{{\boldsymbol{\epsilon }}}^{2}+{{\boldsymbol{\delta }}}^{2}}{4{\theta }^{2}}\right)\exp \left(-\tfrac{{{\boldsymbol{\epsilon }}}^{2}}{4{b}^{2}}\right){\sigma }_{{\boldsymbol{z}}+{\boldsymbol{\delta }}/2}^{2}.$

To separate the sums, let us change the condition from one involving both summation variables to ${\delta }_{i}\,\mathrm{mod}\ 2=0,\forall i=1\ldots d$. That changes the sum only slightly, because the Gaussian containing δ is a smooth function with respect to the grid sampling it ($\theta \gt 1$), and this change in the summation just shifts the grid. Furthermore, since the index $\tfrac{{\boldsymbol{x}}+{\boldsymbol{y}}}{2}={\boldsymbol{z}}+{\boldsymbol{\delta }}/2$ to ${\sigma }^{2}$ got rounded up to correspond to a ${\boldsymbol{\delta }}$ with only even entries, the shift does not change anything for sampling ${\sigma }^{2}$.

And instead of summing over all $({\boldsymbol{\epsilon }},{\boldsymbol{\delta }})\in {{\mathbb{Z}}}^{2d}$ with the condition ${\delta }_{i}\,\mathrm{mod}\ 2=0,\forall i\,=\,1\ldots d$, we can sum over all $({\boldsymbol{\epsilon }},{{\boldsymbol{\delta }}}^{{\prime} })\in {{\mathbb{Z}}}^{2d}$ and set ${\boldsymbol{\delta }}=2({{\boldsymbol{\delta }}}^{{\prime} }-{\boldsymbol{z}})$:

$\mathrm{Var}({\widetilde{C}}_{{\boldsymbol{z}}})\approx \left[\sum _{{\boldsymbol{\epsilon }}}\exp \left(-\tfrac{{\theta }^{2}+{b}^{2}}{4{\theta }^{2}{b}^{2}}{{\boldsymbol{\epsilon }}}^{2}\right)\right]{\sum }_{{{\boldsymbol{\delta }}}^{{\prime} }}{K}_{\theta }^{2}({\boldsymbol{z}}-{{\boldsymbol{\delta }}}^{{\prime} })\,{\sigma }_{{{\boldsymbol{\delta }}}^{{\prime} }}^{2}$

Equation (4)

Approximating the remaining sum over ${\boldsymbol{\epsilon }}$ by a Gaussian integral yields

$\mathrm{Var}({\widetilde{C}}_{{\boldsymbol{z}}})\approx {\left(\tfrac{4\pi {\theta }^{2}{b}^{2}}{{\theta }^{2}+{b}^{2}}\right)}^{d/2}{\sum }_{{{\boldsymbol{\delta }}}^{{\prime} }}{K}_{\theta }^{2}({\boldsymbol{z}}-{{\boldsymbol{\delta }}}^{{\prime} })\,{\sigma }_{{{\boldsymbol{\delta }}}^{{\prime} }}^{2}.$

Equation (5)

The variance of the convolved correlated data is again the variance of the correlated data convolved with the square of the Gaussian kernel, but scaled with a factor depending on the width θ of the Gaussian kernel and the width b of the Gaussian coefficient in the covariance.
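As a one-dimensional numerical check of Equation (5) (the widths θ and b, the grid, and the variances are illustrative choices; since Equation (5) is an approximation, only few-percent agreement is expected), the direct double sum over the modeled covariance can be compared with the scaled convolution:

```python
import numpy as np

theta, b = 2.5, 2.0                       # kernel width and correlation width
n = 81
idx = np.arange(n)

support = np.arange(-40, 41)
A = np.exp(-0.5 * support**2 / theta**2).sum()

def K(t):
    return np.exp(-0.5 * np.asarray(t) ** 2 / theta**2) / A

rng = np.random.default_rng(3)
sigma2 = rng.uniform(0.5, 2.0, size=n)    # variances of the correlated data

# Direct double sum with the modeled covariance
# Cov(C_x, C_y) = exp(-(x - y)^2 / (4 b^2)) * sigma^2 at the rounded midpoint.
zp = 40                                   # pixel well away from the map edges
var_direct = 0.0
for xx in idx:
    for yy in idx:
        m = int(np.ceil((xx + yy) / 2))   # nearest neighbor, rounding up
        var_direct += (K(zp - xx) * K(zp - yy)
                       * np.exp(-((xx - yy) ** 2) / (4 * b**2)) * sigma2[m])

# Equation (5): scale factor times the variance convolved with K_theta^2.
factor = np.sqrt(4 * np.pi * theta**2 * b**2 / (theta**2 + b**2))
var_eq5 = factor * np.sum(K(zp - idx) ** 2 * sigma2)

print(var_direct, var_eq5)
```

With these widths the two values agree to within a few percent, consistent with the grid-shift and integral approximations made above.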

4. Summary

The main take-away points around uncertainty propagation through a (Gaussian) convolution are:

1. To obtain the variances of convolved, initially uncorrelated data, the variances of the data need to be convolved with the square of the kernel used to convolve the data (Equation (2)).
2. The covariance of data convolved with a Gaussian is the variance of the convolved data at the midpoint between the two points, scaled down by a Gaussian factor in the distance between the points (Equation (3)).
3. The variance of the convolved correlated data is again the variance of the correlated data convolved with the square of the Gaussian kernel, but scaled with a factor depending on the width of the Gaussian kernel and the width of the Gaussian coefficient in the covariance (Equation (5)).

This research was conducted at the SOFIA Science Center, which is operated by the Universities Space Research Association under contract NNA17BF53C with the National Aeronautics and Space Administration.

Footnotes

  • 1  

    The edges can be handled differently. The results derived here will not depend on how the edges are handled.
