Image restoration using sparse approximations of spatially varying blur operators in the wavelet domain

Restoration of images degraded by spatially varying blurs is an issue of increasing importance in the context of photography, satellite or microscopy imaging. One of the main difficulty to solve this problem comes from the huge dimensions of the blur matrix. It prevents the use of naive approaches for performing matrix-vector multiplications. In this paper, we propose to approximate the blur operator by a matrix sparse in the wavelet domain. We justify this approach from a mathematical point of view and investigate the approximation quality numerically. We finish by showing that the sparsity pattern of the matrix can be pre-defined, which is central in tasks such as blind deconvolution.


Introduction
This problem of image restoration in the presence of spatially varying blurs drew a considerable attention in the context of satellite imaging with the Hubble space telescope [1].It is now becoming increasingly important in various imaging modalities such as 3D fluorescence microscopy [2].In this paper, we aim at providing new computational tools to tackle this problem.We consider a blurring operator H in R d and defined for any u ∈ L 2 (Ω) as the following integral operator, ∀x ∈ Ω, Hu(x) = y∈Ω K(x, y)u(y)dy, where Ω ⊂ R d is the image domain.The function K is a spatially varying kernel defining the Point Spread Function (PSF) at each location x ∈ Ω.
The naive computation of a product Hu is simply obtained by direct discretization of (1) and costs O N 2d arithmetic operations for an image of dimension d with N pixels in each dimension.It is unsuitable for large scale problems encountered in imaging.Previous works attempted to reduce this computational cost by approximating H using either tensor product or piecewise convolutions [3,1,4].
Unfortunately, these strategies do not have a property that is highly desirable in numerical analysis: the ability to approximate the original operator with an arbitrary precision.In practice, we observed that they are often too coarse for complex situations encountered in microscopy imaging.
In a recent work [5], we proposed to approximate H by an operator diagonal in the wavelet domain.However, this approximation was shown to be too coarse for some practical applications.Our main contribution in this paper consists of showing that the diagonal approximation can be replaced by a sparse approximation and lead to arbitrarily close approximations.More precisely, we show that H − ΨΘΨ * , can be made arbitrarily small if Ψ * denotes a forward wavelet transform and S is a sparse matrix that contains only O(N d ) non zero coefficients instead of N 2d .As a practical consequence, matrix-vector products using the proposed approach costs O d log(N)N d arithmetic operations, which is doable even for very large images.We also show that the sparsity pattern of S can be known a priori, which is a highly desirable property in the case of blind deconvolution, where the operator should be inferred from the blurred images.
In section 2, we recall the theoretical results motivating this approximation.Section 3.1, contains preliminary restoration results in order to illustrate theoretical results and to identify a number of coefficients offering a compromise between the storage costs and the restoration quality.In this section, the kept coefficients are obtained by simple thresholding and can have arbitrary coordinates.In section 3.2, we finally show that the sparsity pattern of the matrix Θ can be pre-defined and that this approach leads to near equivalent results to a thresholding.

Theoretical motivations
Sparse approximations of integral operators have been theoretically analyzed in [6,7].Surprisingly this approach was never applied to the approximation of spatially varying blur operators.The closest application found in the literature deals with foveation [8], but it is not adapted to a large class of kernels.
We define an orthonormal wavelet basis of L 2 (R) as the family of functions: Each ψ j,m is a dilated and translated of the mother wavelet ψ, Let us recall a typical result that motivate the proposed approach.We stick to the one-dimensional case for the ease of exposition.Since H is a linear operator in a Hilbert space, it can be written as H = ΨΘΨ * , where Θ : l 2 → l 2 is the matrix representation of the blur operator in the wavelet domain.Matrix Θ is characterized by the coefficients: Theorem 1 (Decay of θ j,m,k,n - [6]).Supposing that: • the wavelets are compactly supported with supp(ψ j,m ) = I j,m , • the wavelets have M vanishing moments, • the operator H belongs to the class of Calderon-Zygmund operators, meaning that K satisfies: then the coefficients θ j,m,k,n satisfy the following estimate: with C M a constant depending on M and the wavelets, dist(I j,m , I k,n ) denotes the distance between the two supports of the wavelets ψ j,m and ψ k,n .Moreover, if the kernel K is compactly supported, then for sufficiently large |m − n|, This results ensures that the coefficients away from the diagonal fastly decay to zero.A more precise analysis shows that O dN d log(N) non zero coefficients allow to approximate H with an arbitrary precision [6]. Figure 1 shows the matrix Θ associated to a typical spatially varying blur operator in log scale.It is readily seen that the decrease of the coefficients away from the diagonal is extremely fast.The kernel is a Gaussian with equal variances increasing in the vertical direction.
In the next section, we reconstruct degraded images from the knowledge of various thresholded Θ matrices.

Restoration using thresholded matrices
In a first experiment, the Θ matrix is thresholded in order to keep a number of coefficients being a multiple of N d (the number of pixels of the image).That is, we define the thresholded matrix Θ T k by zeroing all but the T k = kN d largest coefficients of Θ.This experiment allows the identification of an admissible number of coefficients offering a compromise between the storage costs and the image restoration quality.To achieve this experiment, a full Θ matrix is numerically computed and then successively thresholded for various k ∈ N * .Thereafter, the image is restored assuming the following classical degradation model: where v is the degraded image observed (see Figure 3c), u is the image to restore (see Figure 3a), and H is a blurring operator corresponding to the PSF displayed in Figure 3d.A standard TV-L2 optimization problem is solved to restore the image u: where TV (u) is the isotropic total variation of u. Figure 4 displays restored images for various values of k and the sparsity pattern of Θ T k .As expected, considering more coefficients improves the deblurring results.However, in the presence of noise, the two images restored using the full matrix and the thresholded matrix with k = 20, provide very similar results.Therefore, it is possible to store only a small amount of coefficients and obtain results close to the ones obtained having a full (unrealistic) knowledge of the blur matrix.Furthermore, these results are obtained drastically faster, see the speed-up Figure 4.
Remark: it is near impossible to solve the deblurring problem with an exact operator due to the numerical burden of computing matrix-vector products in large dimensions.We thus chose a spatially varying blur that is sparse in the wavelet domain in order to be able to make comparisons between an exact computation and an approximated one.Speed up times might thus be much higher for other kinds of blurs.

Restoration using pre-defined sparsity patterns
We are currently developping strategies to solve blind deconvolution problems using the proposed approximation.In this setting, it is not possible to threshold the exact (unknown) matrix to obtain a sparsity pattern.The knowledge of such a structure is primordial to approximate K using prior information.Our aim in this section is to show that sparsity patterns can be pre-defined, allowing convincing restoration results.
Our main observation is that for a given wavelet ψ j,m , only its neighbours in the same scale, the finer scale and the coarser one lead to significant correlation coefficients Hψ j,m , ψ k,n .This concept of neighbours is also supported by Theorem 1.Thus, sparsity patterns can be defined using notions of intra and inter-scale neighbourhoods.
For each node of the wavelet decomposition quadtree, a neighbourhood is defined.It describes which correlation coefficients shall be preserved to generate the sparsity pattern.Figure 5 illustrates this idea.In this example, we consider a wavelet transform of depth 2. A neighbourhood N i is i = 1 . . .7 is associated to each sub-band.For instance N 1 can be represented as follows: where l stands for the low frequency wavelet, h, v, d for the horizontal, vertical and diagonal orientations respectively.Figure 6 illustrates this neighbourhood N 1 for a given wavelet at the center of the image.
same scale same scale same scale same scale same scale all orientations all orientations all orientations all orientations all orientations 0 −1

Conclusion
In this paper we investigated the use of sparse matrices in the wavelet domain in order to approximate spatially varying blur operators.We observed that very large compression factors still allow to obtain near optimal reconstruction results.Compared to previously proposed approach, this technique allows to approximate the exact operator with an arbitrary precision.It is thus suited to noiseless and noisy problems.We also showed that pre-established sparsity patterns can be used to efficiently approximate the blurring integral operator.This property is central in order to tackle blind deconvolution problems.
We are currently working on this topic.

Introduction
This problem of image restoration in the presence of spatially varying blurs drew a considerable attention in the context of satellite imaging with the Hubble space telescope [1].It is now becoming increasingly important in various imaging modalities such as 3D fluorescence microscopy [2].In this paper, we aim at providing new computational tools to tackle this problem.We consider a blurring operator H in R d and defined for any u ∈ L 2 (Ω) as the following integral operator, where Ω ⊂ R d is the image domain.The function K is a spatially varying kernel defining the Point Spread Function (PSF) at each location x ∈ Ω.
The naive computation of a product Hu is simply obtained by direct discretization of (1) and costs O N 2d arithmetic operations for an image of dimension d with N pixels in each dimension.It is unsuitable for large scale problems encountered in imaging.Previous works attempted to reduce this computational cost by approximating H using either tensor product or piecewise convolutions [3,1,4].Unfortunately, these strategies do not have a property that is highly desirable in numerical analysis: the ability to approximate the original operator with an arbitrary precision.In practice, we observed that they are often too coarse for complex situations encountered in microscopy imaging.In a recent work [5], we proposed to approximate H by an operator diagonal in the wavelet domain.However, this approximation was shown to be too coarse for some practical applications.Our main contribution in this paper consists of showing that the diagonal approximation can be replaced by a sparse approximation and lead to arbitrarily close approximations.More precisely, we show that H − ΨΘΨ * , can be made arbitrarily small if Ψ * denotes a forward wavelet transform and S is a sparse matrix that contains only O(N d ) non zero coefficients instead of N 2d .As a practical consequence, matrix-vector products using the proposed approach costs O d log(N)N d arithmetic operations, which is doable even for very large images.We also show that the sparsity pattern of S can be known a priori, which is a highly desirable property in the case of blind deconvolution, where the operator should be inferred from the blurred images.
In section 2, we recall the theoretical results motivating this approximation.Section 3.1, contains preliminary restoration results in order to illustrate theoretical results and to identify a number of coefficients offering a compromise between the storage costs and the restoration quality.In this section, the kept coefficients are obtained by simple thresholding and can have arbitrary coordinates.In section 3.2, we finally show that the sparsity pattern of the matrix Θ can be pre-defined and that this approach leads to near equivalent results to a thresholding.

Theoretical motivations
Sparse approximations of integral operators have been theoretically analyzed in [6,7].Surprisingly this approach was never applied to the approximation of spatially varying blur operators.The closest application found in the literature deals with foveation [8], but it is not adapted to a large class of kernels.
We define an orthonormal wavelet basis of L 2 (R) as the family of functions: Each ψ j,m is a dilated and translated of the mother wavelet ψ, Let us recall a typical result that motivate the proposed approach.We stick to the one-dimensional case for the ease of exposition.Since H is a linear operator in a Hilbert space, it can be written as H = ΨΘΨ * , where Θ : l 2 → l 2 is the matrix representation of the blur operator in the wavelet domain.Matrix Θ is characterized by the coefficients: Theorem 1 (Decay of θ j,m,k,n - [6]).Supposing that: • the wavelets are compactly supported with supp(ψ j,m ) = I j,m , • the wavelets have M vanishing moments, • the operator H belongs to the class of Calderon-Zygmund operators, meaning that K satisfies: then the coefficients θ j,m,k,n satisfy the following estimate: with C M a constant depending on M and the wavelets, dist(I j,m , I k,n ) denotes the distance between the two supports of the wavelets ψ j,m and ψ k,n .Moreover, if the kernel K is compactly supported, then for sufficiently large |m − n|, This results ensures that the coefficients away from the diagonal fastly decay to zero.A more precise analysis shows that O dN d log(N) non zero coefficients allow to approximate H with an arbitrary precision [6]. Figure 1 shows the matrix Θ associated to a typical spatially varying blur operator in log scale.It is readily seen that the decrease of the coefficients away from the diagonal is extremely fast.The kernel is a Gaussian with equal variances increasing in the vertical direction.
In the next section, we reconstruct degraded images from the knowledge of various thresholded Θ matrices.

Restoration using thresholded matrices
In a first experiment, the Θ matrix is thresholded in order to keep a number of coefficients being a multiple of N d (the number of pixels of the image).That is, we define the thresholded matrix Θ T k by zeroing all but the T k = kN d largest coefficients of Θ.This experiment allows the identification of an admissible number of coefficients offering a compromise between the storage costs and the image restoration quality.To achieve this experiment, a full Θ matrix is numerically computed and then successively thresholded for various k ∈ N * .Thereafter, the image is restored assuming the following classical degradation model: where v is the degraded image observed (see Figure 3c), u is the image to restore (see Figure 3a), and H is a blurring operator corresponding to the PSF displayed in Figure 3d.A standard TV-L2 optimization problem is solved to restore the image u: where TV (u) is the isotropic total variation of u. Figure 4 displays restored images for various values of k and the sparsity pattern of Θ T k .As expected, considering more coefficients improves the deblurring results.However, in the presence of noise, the two images restored using the full matrix and the thresholded matrix with k = 20, provide very similar results.Therefore, it is possible to store only a small amount of coefficients and obtain results close to the ones obtained having a full (unrealistic) knowledge of the blur matrix.Furthermore, these results are obtained drastically faster, see the speed-up Figure 4.
Remark: it is near impossible to solve the deblurring problem with an exact operator due to the numerical burden of computing matrix-vector products in large dimensions.We thus chose a spatially varying blur that is sparse in the wavelet domain in order to be able to make comparisons between an exact computation and an approximated one.Speed up times might thus be much higher for other kinds of blurs.

Restoration using pre-defined sparsity patterns
We are currently developping strategies to solve blind deconvolution problems using the proposed approximation.In this setting, it is not possible to threshold the exact (unknown) matrix to obtain a sparsity pattern.The knowledge of such a structure is primordial to approximate K using prior information.Our aim in this section is to show that sparsity patterns can be pre-defined, allowing convincing restoration results.Our main observation is that for a given wavelet ψ j,m , only its neighbours in the same scale, the finer scale and the coarser one lead to significant correlation coefficients Hψ j,m , ψ k,n .This concept of neighbours is also supported by Theorem 1.Thus, sparsity patterns can be defined using notions of intra and inter-scale neighbourhoods.
For each node of the wavelet decomposition quadtree, a neighbourhood is defined.It describes which correlation coefficients shall be preserved to generate the sparsity pattern.Figure 5 illustrates this idea.In this example, we consider a wavelet transform of depth 2. A neighbourhood N i is i = 1 . . .7 is associated to each sub-band.For instance N 1 can be represented as follows: where l stands for the low frequency wavelet, h, v, d for the horizontal, vertical and diagonal orientations respectively.Figure 6 illustrates this neighbourhood N 1 for a given wavelet at the center of the image.

Conclusion
In this paper we investigated the use of sparse matrices in the wavelet domain in order to approximate spatially varying blur operators.We observed that very large compression factors still allow to obtain near optimal reconstruction results.Compared to previously proposed approach, this technique allows to approximate the exact operator with an arbitrary precision.It is thus suited to noiseless and noisy problems.We also showed that pre-established sparsity patterns can be used to efficiently approximate the blurring integral operator.This property is central in order to tackle blind deconvolution problems.We are currently working on this topic.

Figure 1 .
Figure 1.An example of a Θ matrix in a log 10 scale for a 256 × 256 images.The matrix contains about 4.3 billion coefficients.

Figure 2 .
Figure 2. The point spread function of the blur used to generate the matrix on the left.The kernel is a Gaussian with equal variances increasing in the vertical direction.

Figure 3 .
Figure 3. Images involved in the deblurring problem

Figure 5 .Figure 6 .
Figure 5. Quadtree corresponding to a wavelet transform of depth 2 Figure 6.Illustration of the neighbourhood N 1 .In black, the given wavelet at the center of the image, and in red all the preserved neighbouring wavelets.

Figure 7
Figure7displays restored images using sparsity patterns generated from these two neighbourhoods.This experiment shows that sparsity patterns defined from the correlations between neighbouring wavelets are relevant for our aim: approximating and interpolating the blurring operator.

Figure 7 .
Figure 7. Deblurred images using matrices defined by sparsity patterns

Figure 1 .
Figure 1.An example of a Θ matrix in a log 10 scale for a 256 × 256 images.The matrix contains about 4.3 billion coefficients.

Figure 2 .
Figure 2. The point spread function of the blur used to generate the matrix on the left.The kernel is a Gaussian with equal variances increasing in the vertical direction.

Figure 3 .
Figure 3. Images involved in the deblurring problem

Figure 5 .Figure 6 .Figure 7
Figure 5. Quadtree corresponding to a wavelet transform of depth 2 Figure 6.Illustration of the neighbourhood N 1 .In black, the given wavelet at the center of the image, and in red all the preserved neighbouring wavelets.

Figure 7 .
Figure 7. Deblurred images using matrices defined by sparsity patterns