Enhanced total variation minimization for stable image reconstruction

The total variation (TV) regularization has greatly advanced various variational models for image processing tasks. We propose to combine the backward diffusion process from the earlier image enhancement literature with the TV regularization, and show that the resulting enhanced TV minimization model is particularly effective for reducing the loss of contrast. The main purpose of this paper is to establish stable reconstruction guarantees for the enhanced TV model from noisy subsampled measurements under two sampling strategies: non-adaptive sampling for general linear measurements and variable-density sampling for Fourier measurements. In particular, under some weaker restricted isometry property conditions, the enhanced TV minimization model is shown to have tighter reconstruction error bounds than various TV-based models in the scenario where the level of noise is significant and the number of measurements is limited. Advantages of the enhanced TV model are also numerically validated by preliminary experiments on the reconstruction of some synthetic, natural, and medical images.


Introduction
Since the work of Rudin, Osher and Fatemi [52], various variational models based on the total variation (TV) have been intensively studied for image processing problems; see, e.g., [14,16] for reviews. Given linear measurements y ∈ C^m observed via

y = M(X) + e (1.1)

from an unknown image X ∈ C^{N×N}, where M : C^{N×N} → C^m is a linear operator defined componentwise by [M(X)]_j := ⟨M_j, X⟩ = tr(M_j X^*) for suitable matrices M_j with m considerably smaller than N², and e ∈ C^m is a noise term bounded by ‖e‖₂ ≤ τ with level τ ≥ 0, reconstruction of the unknown X can be modeled as the following TV minimization problem:

min_{X ∈ C^{N×N}} ‖X‖_TV subject to ‖M(X) − y‖₂ ≤ τ, (1.2)

where ‖·‖_TV is the TV semi-norm. Note that the TV semi-norm can be mainly categorized into the isotropic [12] and anisotropic [13] cases for discrete images. In this paper, we discuss how to enhance the canonical constrained TV model (1.2) by the recently proposed springback regularization in [4] for image reconstruction, and establish stable reconstruction guarantees. As profoundly analyzed in [43], the constrained TV model (1.2) has the advantage of reconstructing high-quality images from a relatively small number of measurements. Theoretical analysis in [43] is mainly based on the seminal compressed sensing (CS) works [9,23]. Note that the classic CS theory assumes the sparsity of the (vector) signal of interest or of its coefficients under a certain transform, and correspondingly the signal reconstruction can be modeled as some ℓ₁-norm minimization problems.
The CS theory can be extended to image reconstruction because natural images usually have (approximately) sparse gradients. Indeed, mathematically the TV semi-norm of a discrete image X ∈ C^{N×N} is just the sum of the magnitudes of its gradient ∇X ∈ C^{N×N×2}. That is,

‖X‖_TV := ‖∇X‖₁, (1.3)

where the definition of ∇X can be found in Section 2.2. We note here that the definition (1.3) leads to the anisotropic version of the TV semi-norm. Since the anisotropic and isotropic TV semi-norms are equivalent up to a factor of √2 (see an explanation in Section 2.2), as in [43], we only consider the anisotropic case for succinctness; the following discussion can be extended to the isotropic case analogously.
Models using the ℓ₁-norm are fundamental to various CS problems, but solutions to such models may be over-penalized because the ℓ₁ regularization tends to underestimate high-amplitude components of the solution, as analyzed in [25]. Accordingly, many non-convex alternatives have been proposed in the literature to overcome this pitfall and thus promote sparsity more firmly; see, e.g., the ℓ_p (0 < p < 1) regularization [21,27], the ℓ_{1−2} regularization [64], and the transformed ℓ₁ regularization [65]. The role of non-convexity in image processing has also been emphasized in various papers; see, e.g., [45]. Recently, we proposed the springback regularization in [4], and it can be generalized to discrete images as follows:

R_α(X) := ‖∇X‖₁ − (α/2)‖∇X‖₂², (1.4)

where α > 0 is a carefully chosen parameter ensuring the positiveness or the well-definedness of (1.4), and ‖∇X‖₂² is the sum of the squared magnitudes of ∇X. Note that the springback regularization (1.4) is a difference-of-convex function. To some extent, it retains both the nice recoverability of various non-convex surrogates of the TV regularization and the computability of the original TV regularization. To be consistent with the TV literature, we call (1.4) an enhanced TV regularization in this paper.
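For concreteness, here is a minimal MATLAB sketch of evaluating (1.4), assuming forward differences padded with a zero row/column so that both gradient components are N-by-N; the function name enhanced_tv is ours, not from [4]:

    % Enhanced TV (springback-type) regularization value of an image X.
    function r = enhanced_tv(X, alpha)
        Xx = [diff(X, 1, 1); zeros(1, size(X, 2))];   % vertical forward differences
        Xy = [diff(X, 1, 2), zeros(size(X, 1), 1)];   % horizontal forward differences
        tv = sum(abs(Xx(:))) + sum(abs(Xy(:)));       % anisotropic TV semi-norm (1.3)
        g2 = sum(abs(Xx(:)).^2 + abs(Xy(:)).^2);      % squared l_{2,2} norm of the gradient
        r  = tv - (alpha / 2) * g2;                   % enhanced TV value (1.4)
    end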
Non-convex penalties proposed in the CS literature are mainly rooted in the field of statistics, and they are usually applied in straightforward ways in the image processing literature. Interestingly, as elaborated in Section 1.1, the enhanced TV regularization (1.4) has some intrinsic interpretations from the perspective of image processing. We are thus encouraged to consider the enhanced TV model

min_{X ∈ C^{N×N}} R_α(X) subject to ‖M(X) − y‖₂ ≤ τ (1.5)

for image reconstruction, and we aim at establishing some stable reconstruction guarantees theoretically. It is worth noting that, although theoretical reconstruction guarantees were established in [4] for sparse signals or signals that are sparse after an orthonormal transform, those guarantees are not applicable to the enhanced TV model (1.5). The reason is that the gradient transform ∇ : X ↦ ∇X fails to be orthonormal, as mentioned in [43]. Also, we notice that the idea of enhancing the TV regularization (the isotropic version) with a subtraction of a squared norm of the image gradient was touched upon in [41], where it was empirically tested on some image denoising problems, but without a rigorous study of reconstruction guarantees from a few measurements.

An image processing view of the enhanced TV regularization
Solutions to TV-based models may lose contrast across edges. That is, the contrast of the regions on both sides of an edge may be reduced, and blur may thus occur near the edge. We refer the reader to [5,55] for discussions on the loss of contrast caused by various image processing models using TV regularization. Partial differential equations (PDEs) and variational approaches have been intensively investigated to enhance the contrast. On the PDE side, some well-known approaches were proposed to tackle the loss of contrast for image enhancement. For example, the shock filter was proposed in [46] to deal with blur-like image degradations, creating strong discontinuities at image edges and flattening the image within homogeneous regions. Afterwards, the shock filter was generalized in many ways; see, e.g., [3,62]. Another important example is the forward-and-backward (FAB) diffusion scheme proposed in [29] to simultaneously remove noise and enhance contrast. Since then, a number of influential works on FAB diffusion have been conducted; see, e.g., [59,61,63]. Although different PDE schemes were designed, a common feature of these works is that a backward diffusion process is adopted to enhance the contrast of the edges in the image of interest. Since backward diffusion is a classical example of an ill-posed problem [58], most of these PDE schemes are numerically challenging; we refer the reader to [17,18,60] on how to discretize and solve these PDEs efficiently. On the variational side, it was shown in [19,44,45] that the contrast of edges can be enhanced by using non-smooth data fidelity terms, e.g., by replacing the squared ℓ₂-norm data fidelity term with the ℓ₁-norm. There are also some attempts to add negative terms into the variational model to maximize the contrast; see, e.g., [28,47], though their connections with the TV regularization are not considered there.
We remark that the enhanced TV model (1.5) is related to backward diffusion from the PDE perspective. An explanation in the context of the Euler-Lagrange (E-L) equation in a continuum setting is included in Appendix A. Briefly speaking, the term −(α/2)‖∇X‖₂² generates an additional backward diffusion term −α∆X in the E-L equation corresponding to the classic TV regularization. In Figure 1, we empirically illustrate that the enhanced TV regularization (1.4) is very effective for some fundamental denoising and deblurring problems. In Figure 1, we also note that the enhanced TV regularization (1.4) may not perfectly overcome another drawback of TV: the staircase effect. That is, solutions to TV-based models may have stair-like edges. There have been many efforts to avoid this effect, including the replacement of the TV regularization with an exponentiated variant of it [6], the usage of the infimal convolution of functionals with first- and second-order derivatives as a regularizer [15], the addition of some higher-order terms into the E-L equation corresponding to the variational TV model [20], the total generalized variation [7], the usage of some modified infimal convolutions [53,54] based on [15], and many others.
In a nutshell, the enhanced TV regularization (1.4) can be interpreted as introducing an additional backward diffusion term into the underlying E-L equation corresponding to the TV regularization, for the purpose of enhancing the contrast along edges.

Contributions
In the CS context, it is possible to exactly recover a signal if the signal is exactly sparse and its measurements are noise-free; otherwise, we can only establish stable recovery guarantees. The term "stable" in this paper is mainly concerned with both inexact sparsity and measurement noise. Our analysis is conducted under the restricted isometry property (RIP) framework studied in [10]. We say that a linear operator A : C^{n₁×n₂} → C^m has the RIP of order s and level δ ∈ (0, 1) if

(1 − δ)‖X‖₂² ≤ ‖A(X)‖₂² ≤ (1 + δ)‖X‖₂² for all s-sparse X ∈ C^{n₁×n₂}, (1.6)

and the smallest δ for which (1.6) holds is said to be the restricted isometry constant (RIC) associated with A.
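While computing the RIC exactly is intractable, the two-sided inequality (1.6) can be probed empirically. A hedged MATLAB sketch with a Gaussian test operator and random s-sparse vectorized images (all sizes and the operator are illustrative assumptions):

    % Monte Carlo sanity check of (1.6): a heuristic lower estimate of the RIC,
    % not a certificate (verifying the RIP exactly is computationally hard).
    N = 32; s = 20; m = 400;
    A = randn(m, N^2) / sqrt(m);              % a Gaussian test operator
    delta = 0;
    for trial = 1:1000
        X = zeros(N^2, 1);
        idx = randperm(N^2, s);
        X(idx) = randn(s, 1);                 % random s-sparse image (vectorized)
        ratio = norm(A * X)^2 / norm(X)^2;
        delta = max(delta, abs(ratio - 1));   % track the worst observed deviation
    end
    fprintf('empirical RIC lower bound: %.3f\n', delta);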
We first investigate non-adaptive subsampled linear RIP measurements of an image X ∈ C^{N×N} with noise level τ > 0. By "non-adaptive," we mean that the sampling strategy is not designed with specific structures or under certain distributions. In Theorem 3.1, we show that the enhanced TV model (1.5) can stably reconstruct an image X ∈ C^{N×N} from non-adaptive subsampled linear RIP measurements contaminated by noise, with the RIP order O(s), the RIP level δ < 0.6, and the noise level τ > 0. Moreover, the required RIP level δ < 1/3 derived in [43] for the TV model (1.2) is weakened to δ < 0.6 for the enhanced TV model (1.5) under the additional condition (3.8) on the parameter α. We also show in Theorem 3.2 that the reconstruction error bound in Theorem 3.1 can be further improved if more measurements are allowed.
The above reconstruction guarantees for non-adaptive measurements require the subsampled measurements and the Haar wavelet basis to be sufficiently incoherent. This requirement is satisfied by many kinds of measurements except for Fourier frequency measurements, because low-order wavelets and Fourier measurements are highly correlated, as analyzed in [34]. Fourier measurements play essential roles in many imaging tasks. For example, as discussed in [26,34], the measurement processes of various imaging procedures such as radar, sonar, and computed tomography can be modeled (with appropriate approximation and discretization) by taking samples from weighted discrete Fourier transforms. It is also known (see, e.g., [38]) that measurements taken for magnetic resonance imaging (MRI) can be well modeled as Fourier coefficients of the desired image.
On the other hand, much empirical evidence, including the first works [37,38] on compressed sensing MRI, has shown that better reconstruction quality is possible by subsampling Fourier frequency measurements with a preference for low frequencies over high frequencies. Thus, we follow the variable-density sampling strategy proposed in [34] and choose Fourier measurements randomly according to an inverse square law density. We show that from at least m ≳ s log³(s) log⁵(N) such subsampled Fourier measurements with s ≳ log(N), the enhanced TV model (1.5) reconstructs an unknown image X stably with high probability. We also show that the least amount of Fourier measurements required by the enhanced TV model (1.5) is only (0.6/(1/3))⁻² ≈ 30.86% of that required by the TV model (1.2) as established in [34].

Related works
We briefly review some TV-related works on image reconstruction. The reconstruction of a one-dimensional image in C^N with an exactly s-sparse gradient from noise-free, uniformly subsampled Fourier measurements was considered in [9], without stability analysis concerning inexact sparsity or noise. It was shown that this one-dimensional image can be recovered exactly by solving the corresponding TV model with high probability, provided that the number of measurements m satisfies m ≳ s log(N). The reconstruction of a one-dimensional image from noisy measurements was then considered in [8]. Stability of the reconstruction of approximately sparse images from noisy measurements was first shown in [43] for two-dimensional images and soon extended to higher-dimensional cases in [42]. More specifically, it was asserted in [43] that, from some non-adaptive subsampled linear RIP measurements of an image X ∈ C^{N×N} with the RIP order O(s), the RIP level δ < 1/3, and the noise level τ > 0, the solution X_opt to the TV model (1.2) satisfies

‖X − X_opt‖₂ ≲ log(N²/s) · ( ‖∇X − (∇X)_s‖₁/√s + τ ), (1.7)

where (∇X)_s is the best s-sparse approximation to the discrete gradient ∇X. Moreover, with more measurements, it was shown in [43] that the log factor in the bound (1.7) can be removed, and thus the bound (1.7) can be improved to

‖X − X_opt‖₂ ≲ ‖∇X − (∇X)_s‖₁/√s + τ. (1.8)

In comparison with the bound (1.8), the reconstruction error bound for the enhanced TV model (1.5) in Theorem 3.2 is tighter if the level of noise τ is relatively large and the number of measurements m is limited. More discussions can be found in Section 3.3. Besides, the RIP level is assumed to satisfy δ < 1/3 in [43] for the TV model (1.2), while we weaken it to δ < 0.6 for the enhanced TV model (1.5). Though δ < 1/3 can be improved, as remarked in [43], the reconstruction error bounds (1.7) and (1.8) for the TV model (1.2) tend to infinity as δ → 0.6 (cf. the proof of Proposition 3 in [43]). On the other hand, the bounds in Theorems 3.1 and 3.2 for the enhanced TV model (1.5) remain reasonably valid as δ → 0.6; meanwhile, the upper bound required for α tends to 0 correspondingly. Thus, as δ → 0.6, the bounds (3.11) and (3.13) in Theorems 3.1 and 3.2 for the enhanced TV model (1.5) assert the stability of the TV model (1.2) in image reconstruction from a few linear RIP measurements.
As mentioned, guarantees for non-adaptive measurements require the subsampled measurements and the Haar wavelet basis to be sufficiently incoherent. Thus, the mentioned guarantees in [42,43] cannot be directly applied to the situation of Fourier measurements. The first results on image reconstruction from Fourier measurements were derived in [34] and [48], in which uniform and non-uniform reconstruction guarantees are considered, respectively. More specifically, the approach in [34] requires a larger number of measurements than [48], while its reconstruction error bound is sharper than that in [48]. In [34], uniform reconstruction guarantees were derived for two-dimensional images from noisy Fourier measurements chosen randomly according to an inverse square law density. Specifically, from at least m ≳ s log³(s) log⁵(N) such subsampled Fourier measurements with s ≳ log(N), the reconstruction error bound for the TV model (1.2) was derived in the same form as (1.8). We refer to, e.g., [1,2,32] for more discussions. As we focus on uniform reconstruction from non-adaptive measurements, we follow the approach in [34] to consider Fourier measurements.

Outline
The rest of this paper is organized as follows. In the next section, we summarize some notation and technical background. In Section 3, we establish stable image reconstruction guarantees for the enhanced TV model (1.5) from non-adaptive subsampled linear RIP measurements and variable-density subsampled Fourier measurements, respectively. Proofs of the results in Section 3 are presented in Section 4. In Section 5, we report some numerical results when the enhanced TV model (1.5) is applied to some image reconstruction problems; different kinds of images with subsampled Fourier measurements are tested. Finally, we draw some conclusions in Section 6.

Preliminaries
We first summarize some notation and recall some preliminary technical background.

Notation
For any x, y ∈ R^n, let ⟨x, y⟩ = x^T y be their inner product, and ‖x‖_p (p ≥ 1) be the usual ℓ_p-norm of x. For a matrix X ∈ R^{m×n}, let supp(X) := {(j, k) : X_{j,k} ≠ 0} be the support of X, and ‖X‖₀ be the cardinality of supp(X); X is said to be s-sparse if ‖X‖₀ ≤ s. Let

‖X‖_{p,q} := ( Σ_{j=1}^m ( Σ_{k=1}^n |X_{j,k}|^p )^{q/p} )^{1/q}

be the entry-wise ℓ_{p,q} norm (p, q ≥ 1) of X. If p = q, ‖X‖_{p,p} is denoted by ‖X‖_p for short. In particular, the ℓ_{2,2} norm is also known as the Frobenius norm, which is induced by the inner product ⟨X, Y⟩ := Σ_{j=1}^m Σ_{k=1}^n X_{j,k} \overline{Y_{j,k}} = tr(XY^*) for any X, Y ∈ C^{m×n}, where the superscript * denotes the adjoint (conjugate transpose) of a matrix. For an index set S ⊂ {1, 2, ..., m} × {1, 2, ..., n}, let X_S ∈ R^{m×n} be the matrix with the same entries as X on indices in S and zero entries on indices in S^c. The only exception is F_Ω: we denote by F_Ω the restriction of the bivariate discrete Fourier transform F to a subset Ω ⊂ {−N/2 + 1, ..., N/2}². Logarithms without an indicated base are base 2. For matrices or vectors x and y of the same dimension, x ∘ y denotes the Hadamard (entry-wise) product of x and y. We use the notation a ≲ b to mean that there exists C > 0 such that a ≤ Cb, and likewise for the symbol ≳.

Gradients and TV semi-norms
For any image X ∈ C^{N×N} represented by an N × N block of pixel intensities with all intensities X_{j,k} in [0, 1], the discrete directional derivatives of X ∈ C^{N×N} are defined pixel-wise as

(X_x)_{j,k} := X_{j+1,k} − X_{j,k}, 1 ≤ j ≤ N − 1, 1 ≤ k ≤ N,
(X_y)_{j,k} := X_{j,k+1} − X_{j,k}, 1 ≤ j ≤ N, 1 ≤ k ≤ N − 1,

and the discrete gradient ∇X ∈ C^{N×N×2} collects the pair ((X_x)_{j,k}, (X_y)_{j,k}) at each pixel, with zero padding at the boundary. Recall the definition (1.3) of the TV semi-norm. This definition of ∇X leads to the anisotropic TV semi-norm defined in [13], that is, the sum of the magnitudes of the discrete gradient:

‖X‖_{TV_a} := Σ_{j,k} ( |(X_x)_{j,k}| + |(X_y)_{j,k}| ).

If the pair ((X_x)_{j,k}, (X_y)_{j,k}) in the definition of ∇X is replaced by (X_x)_{j,k} + i(X_y)_{j,k}, then it leads to the isotropic TV semi-norm as defined in [12]:

‖X‖_{TV_i} := Σ_{j,k} ( |(X_x)_{j,k}|² + |(X_y)_{j,k}|² )^{1/2}.

If we regard ∇X as an N² × 2 matrix, then ‖X‖_{TV_a} and ‖X‖_{TV_i} are the ℓ_{1,1} and ℓ_{2,1} norms of ∇X, respectively. Note that the two TV semi-norms are equivalent up to a factor of √2:

‖X‖_{TV_i} ≤ ‖X‖_{TV_a} ≤ √2 ‖X‖_{TV_i}.

Moreover, note that ‖∇X‖₂ = ( Σ_{j,k} |(X_x)_{j,k}|² + |(X_y)_{j,k}|² )^{1/2} in the second component of the enhanced TV regularization (1.4) is the ℓ_{2,2} norm of ∇X.
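The √2-equivalence above can be verified numerically; a short MATLAB sketch, assuming real intensities and the zero-padded forward differences defined above:

    % Numerical check of TV_i <= TV_a <= sqrt(2) * TV_i for a random image.
    X = rand(64);
    Xx = [diff(X, 1, 1); zeros(1, 64)];
    Xy = [diff(X, 1, 2), zeros(64, 1)];
    tv_a = sum(abs(Xx(:)) + abs(Xy(:)));                 % l_{1,1} norm of the gradient
    tv_i = sum(sqrt(abs(Xx(:)).^2 + abs(Xy(:)).^2));     % l_{2,1} norm of the gradient
    assert(tv_i <= tv_a + 1e-12 && tv_a <= sqrt(2) * tv_i + 1e-12);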

Haar wavelet system
The Haar wavelet system provides a simple yet powerful sparse approximation of digital images. The following description of this system can be found in, e.g., [43]. The univariate Haar wavelet system is a complete orthonormal system of square-integrable functions on the unit interval, consisting of the constant function h⁰ ≡ 1 on [0, 1), the mother wavelet

h¹(t) := 1 for t ∈ [0, 1/2), h¹(t) := −1 for t ∈ [1/2, 1),

and the dyadic dilations and translates of the mother wavelet, h_{j,k}(t) := 2^{j/2} h¹(2^j t − k) for j ≥ 0 and 0 ≤ k < 2^j. The bivariate Haar wavelet system is an orthonormal system for the space L²(Q) of square-integrable functions on the unit square Q = [0, 1)², and it is derived from the univariate Haar system by tensor product. The bivariate Haar system consists of the constant function on Q and all functions

h^ℓ_{j,k}(u, v) := 2^j ψ^ℓ(2^j u − k₁, 2^j v − k₂),

where j ≥ 0, k = (k₁, k₂) ∈ Z² ∩ 2^j Q, and ψ^ℓ(u, v) := h^{ℓ₁}(u) h^{ℓ₂}(v) for ℓ = (ℓ₁, ℓ₂) ∈ V := {(0, 1), (1, 0), (1, 1)}. If N = 2^n, then the bivariate Haar basis is restricted to the 2^n × 2^n = N² basis functions {h^ℓ_{j,k} : j ≤ n − 1}, which, identified with discrete images h^ℓ_{j,k} via (2.2), form an orthonormal basis for C^{N×N}. We denote by H the bivariate Haar transform X ↦ (⟨X, h^ℓ_{j,k}⟩)_{j,k,ℓ}. By a slight abuse of notation, we also denote by H the unitary matrix representing this bivariate Haar transform; that is, HX is the matrix product that generates (⟨X, h^ℓ_{j,k}⟩)_{j,k,ℓ}. Some properties of the bivariate Haar wavelet system are summarized below; proofs can be found in [43].
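For concreteness, a hedged MATLAB sketch of the transform H via the standard averaging-and-differencing recursion; this is one common implementation consistent with the orthonormality stated above, not necessarily the implementation used in the experiments:

    % Orthonormal bivariate Haar transform of an N-by-N image, N = 2^n.
    function C = haar2(X)
        N = size(X, 1); C = X;
        while N > 1
            B = C(1:N, 1:N);
            lo = (B(1:2:N, :) + B(2:2:N, :)) / sqrt(2);   % row averages
            hi = (B(1:2:N, :) - B(2:2:N, :)) / sqrt(2);   % row differences
            B = [lo; hi];
            lo = (B(:, 1:2:N) + B(:, 2:2:N)) / sqrt(2);   % column averages
            hi = (B(:, 1:2:N) - B(:, 2:2:N)) / sqrt(2);   % column differences
            C(1:N, 1:N) = [lo, hi];                       % recurse on the low-pass block
            N = N / 2;
        end
    end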
Lemma 2.1 Suppose X ∈ C^{N×N} is mean-zero, and let c^{(k)}(X) be the bivariate Haar coefficient of X having the kth largest magnitude, i.e., the entry of the bivariate Haar transform HX having the kth largest magnitude. Then, for all k ≥ 1,

|c^{(k)}(X)| ≤ C ‖∇X‖₁ / k,

where C > 0 is an absolute constant.

Lemma 2.2 For any pair of adjacent indices (j, k) and (j, k + 1), there are at most 6n bivariate Haar wavelets that are not constant on these two indices.

Lemma 2.3 The bivariate Haar wavelets satisfy ‖∇h^ℓ_{j,k}‖₁ ≤ 8 for all j, k, ℓ.

Discrete Fourier system
In addition to general RIP measurements, we particularly investigate Fourier measurements. Let N = 2^n be a power of 2, where n ∈ N⁺. The following facts about the Fourier basis and transform in the context of imaging can be found in, e.g., [34]. The univariate discrete Fourier basis of C^N consists of the vectors

φ_k(t) := N^{−1/2} e^{2πi k t / N}, t = 0, 1, ..., N − 1,

indexed by the discrete frequencies in the range −N/2 + 1 ≤ k ≤ N/2. The bivariate discrete Fourier basis of C^{N×N} is the tensor product of univariate bases, i.e.,

φ_{k₁,k₂}(t₁, t₂) := N^{−1} e^{2πi (k₁ t₁ + k₂ t₂) / N}, t₁, t₂ = 0, 1, ..., N − 1,

indexed by the discrete frequencies in the range −N/2 + 1 ≤ k₁, k₂ ≤ N/2. We denote by F the bivariate discrete Fourier transform X ↦ (⟨X, φ_{k₁,k₂}⟩)_{k₁,k₂}. Again, by a slight abuse of notation, we denote by F the unitary matrix representing this linear map; that is, F X is the matrix product that generates (⟨X, φ_{k₁,k₂}⟩)_{k₁,k₂}. Moreover, since limited measurements are considered, we denote by F_Ω the restriction of F to a subset of frequencies Ω ⊂ {−N/2 + 1, ..., N/2}².
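A small MATLAB sketch of F_Ω, assuming the unitary normalization fft2(X)/N used in Section 5 and Ω given as an m-by-2 integer array of frequencies; the function name F_Omega is ours:

    % Subsampled bivariate discrete Fourier transform F_Omega.
    function y = F_Omega(X, Omega)
        N = size(X, 1);
        F = fft2(X) / N;                      % unitary 2D DFT of X
        r = mod(Omega(:, 1), N) + 1;          % frequency k maps to index mod(k, N) + 1
        c = mod(Omega(:, 2), N) + 1;
        y = F(sub2ind([N, N], r, c));         % keep only the frequencies in Omega
    end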

Main results
We now establish reconstruction guarantees for the enhanced TV model (1.5) from non-adaptive linear RIP measurements and variable-density Fourier measurements, respectively.The following proposition generalizes Theorem 4.1 in [4] for signal recovery, and it allows us to bound the norm of an image D when it is close to the null space of an RIP operator.
Proposition 3.1 Let k ∈ N⁺, γ ≥ 1, β₁, β₂ ≥ 0, and ε ≥ 0. Suppose that A has the RIP of order k + 4kγ² and level δ, and that the image D ∈ C^{N×N} satisfies the tube constraint

‖A(D)‖₂ ≤ ε. (3.1)

Suppose further that, for a subset S of cardinality |S| ≤ k, D satisfies the cone constraint (3.2), where E₁, E₂ may be scalars, vectors, or matrices, and E₂ is assumed to satisfy ‖E₂‖₂ = ‖D‖₂. Here ‖·‖₂ denotes the absolute value for scalars, the usual ℓ₂ vector norm for vectors, and the ℓ_{2,2} (Frobenius) norm for matrices. If β₂ satisfies the a posteriori condition (3.3), then the bound (3.4) on ‖D‖₂ holds, with constants K₁ and K₂ specified therein. Furthermore, we have the bound (3.5) on ‖D‖₁.

Remark 3.1 In light of the fact (3.6), the bounds on ‖D‖₂ and ‖D‖₁ are still reasonable as δ → 0.6. As the whole analysis below rests upon Proposition 3.1, this suggests that the following reconstruction error bounds (3.11), (3.13), and (3.18) are all reasonable as δ → 0.6.

Remark 3.2
If A is assumed to have the RIP of order 5kγ² ≥ k + 4kγ² (note that γ ≥ 1), then Proposition 3.1 still holds. Thus, we assume the order 5kγ² for simplicity in the following theorems.
For any image X ∈ C^{N×N}, the derivatives X_x and X_y belong to C^{(N−1)×N} and C^{N×(N−1)}, respectively. Thus, it is convenient to consider the matrices Π₀ and Π⁰ obtained from a matrix Π by concatenating a row of zeros to the bottom and to the top of Π, respectively. More concretely, for a matrix Π ∈ C^{(N−1)×N}, we denote by Π⁰ ∈ C^{N×N} the augmented matrix whose first row is zero and whose remaining rows are those of Π; similarly, we denote by Π₀ ∈ C^{N×N} the matrix constructed by adding a row of zeros to the bottom of Π. For a linear operator A : C^{(N−1)×N} → C^m with [A(X)]_j = ⟨A_j, X⟩, we denote by A₀ : C^{N×N} → C^m the linear operator with [A₀(X)]_j = ⟨(A_j)₀, X⟩, and we define A⁰ : C^{N×N} → C^m similarly. It was shown in [43] (Lemma 3.1) that the entire image and its gradients can be related through these padded operators applied to X and to X^T, where X^T denotes the (non-conjugate) transpose of the matrix X.
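In MATLAB, the padding operators admit one-line sketches (with the convention, as above, that the subscript pads at the bottom and the superscript at the top):

    % Zero-row padding of an (N-1)-by-N matrix Pi to an N-by-N matrix.
    pad_bottom = @(Pi) [Pi; zeros(1, size(Pi, 2))];   % Pi_0: zero row appended at the bottom
    pad_top    = @(Pi) [zeros(1, size(Pi, 2)); Pi];   % Pi^0: zero row appended at the top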

Reconstruction from non-adaptive linear RIP measurements
We are prepared to state our first result on stable image reconstruction from non-adaptive linear RIP measurements.
Theorem 3.1 Let N = 2^n be a power of two, where n ∈ N⁺. Let A : C^{(N−1)×N} → C^{m₁} and A′ : C^{(N−1)×N} → C^{m₁} be such that the concatenated operator [A, A′] has the RIP of order 5s and level δ < 0.6. Let H : C^{N×N} → C^{N×N} be the orthonormal bivariate Haar wavelet transform, and B : C^{N×N} → C^{m₂} be such that the composite operator BH* : C^{N×N} → C^{m₂} has the RIP of order 2s + 1 and level δ < 1. Let m = 4m₁ + m₂, and consider the linear operator M : C^{N×N} → C^m defined in (3.7) by concatenating the components A₀(X), A⁰(X), A′₀(X^T), A′⁰(X^T), and B(X). Let X ∈ C^{N×N} be an image and X_opt the solution to the enhanced TV model (1.5) with M defined as in (3.7). If α satisfies the condition (3.8), then we have the stable gradient reconstruction results (3.9) and (3.10), and the stable image reconstruction result (3.11).

Corollary 3.2 If α additionally satisfies a slightly stronger condition which is compatible with (3.9), then the linear term ‖∇X − (∇X)_s‖₁ in (3.10), and hence the term log(N²/s) · ‖∇X − (∇X)_s‖₁/√s in (3.11), can be removed. This corollary will be proved after Theorem 3.1.

Remark 3.3
The proof of Theorem 3.1 is inspired by the proof in [43] for the TV model (1.2), in which it was conjectured that the 4m₁ measurements derived from A in the construction (3.7) of M are artifacts of the proof. The components A₀(X), A⁰(X), A′₀(X^T), and A′⁰(X^T) are only used for deriving the stable gradient reconstruction bounds (3.9) and (3.10); the component B(X) is only used to derive the bound (3.11) from (3.9) and (3.10).
If more measurements are allowed, then the bound (3.11) can be further improved, the requirement (3.8) on α can be relaxed, and the artificial components in M can be removed.

Theorem 3.2 Let N = 2^n be a power of two, where n ∈ N⁺. Let H : C^{N×N} → C^{N×N} be the orthonormal bivariate Haar wavelet transform, and M : C^{N×N} → C^m be such that the composite operator MH* : C^{N×N} → C^m has the RIP of order Cs log³(N) and level δ < 0.6. Let X ∈ C^{N×N} be a mean-zero image or an image containing some zero-valued pixels, and X_opt be the solution to the enhanced TV model (1.5). If α satisfies the condition (3.12), then the reconstruction error bound (3.13) holds.

Remark 3.4 The RIP requirements in both theorems above indicate that the linear measurements should be generated from standard RIP matrix ensembles, which are incoherent with the Haar wavelet system. Many classes of random matrices can be used to generate RIP matrix ensembles. For example, a matrix in R^{m×N²} with i.i.d. normalized Gaussian random entries has a small RIP constant δ_s < c with high probability if m ≳ c⁻² s log(N²/s), as shown in [10]. Similar results were extended to sub-Gaussian matrices in [39]. If m ≳ s log⁴(N), then it was proved in [11,51] that the RIP holds with overwhelming probability for a partial Fourier matrix F_Ω ∈ C^{m×N²}. The RIP also holds for randomly generated circulant matrices (see [49]) and randomly subsampled bounded orthonormal systems (see [50]). Most of these measurements are incoherent with the Haar wavelet system, but the partial Fourier matrix with uniformly subsampled rows is an exception. Thus, some specific sampling strategies for Fourier measurements should be considered. For example, it was asserted in [33] that F_Ω ∈ C^{m×N²} with m ≳ s log⁴(N) and randomized column signs has the RIP; it was also shown in [34] that F_Ω with rows subsampled according to some power-law densities is incoherent with the Haar wavelet system after preconditioning.
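As an illustration of the first ensemble mentioned above, a hedged MATLAB sketch of a normalized Gaussian measurement operator; the constant factor 8 is indicative only:

    % Gaussian measurement ensemble: i.i.d. N(0, 1/m) entries applied to a
    % vectorized image; m on the order of s*log(N^2/s) suffices for a small RIC.
    N = 64; s = 100;
    m = ceil(8 * s * log(N^2 / s));   % illustrative constant factor
    M = randn(m, N^2) / sqrt(m);      % normalized Gaussian matrix
    X = rand(N);                      % a test image
    y = M * X(:);                     % m linear measurements of X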

Reconstruction from variable-density Fourier measurements
As shown in [34], if the measurements are sampled according to appropriate power-law densities, then they are incoherent with the Haar wavelet system. We consider a particular variable-density sampling strategy proposed in [34] and derive a stable image reconstruction theorem tailored for partial Fourier measurements. Following the idea of [34], our guarantees are based on a weighted ℓ₂-norm for measuring noise, under which high-frequency measurements have a higher sensitivity to noise; that is, the ℓ₂-norm in the constraint ‖M(X) − y‖₂ ≤ τ of the enhanced TV model (1.5) is replaced by a weighted ℓ₂-norm. For this particular scenario with Fourier measurements, the general linear operator M is specified as F_Ω, the restriction of the Fourier transform matrix to a set Ω of frequencies as defined in Section 2.4.

Theorem 3.3 Let N = 2^n be a power of 2, where n ∈ N⁺. Let m and s satisfy s ≳ log(N) and m ≳ s log³(s) log⁵(N). Suppose that the frequencies (ω_{j1}, ω_{j2}), j = 1, ..., m, forming Ω are drawn i.i.d. according to the inverse square law density η given in (3.14), where C is an absolute constant and C_N is chosen such that η is a probability distribution. Consider the weight vector ρ = (ρ_j)_{j=1}^m with ρ_j = [1/η(ω_{j1}, ω_{j2})]^{1/2}. Then the following assertion holds for all mean-zero images, or images containing some zero-valued pixels, X ∈ C^{N×N}, with probability exceeding 1 − N^{−c log³(s)}: if α satisfies the condition (3.16), then the solution X_opt to the weighted model (see (3.17)) satisfies the reconstruction error bound (3.18).
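A hedged MATLAB sketch of this sampling step, assuming the inverse square law takes the concrete form η(k₁, k₂) ∝ 1/max(1, k₁² + k₂²), which is one plausible reading of (3.14); the exact constants are specified in [34]:

    % Draw m Fourier frequencies i.i.d. from an inverse square law density and
    % form the weights rho_j = eta(omega_j)^(-1/2) for the weighted l2 norm.
    N = 256; m = 6000;
    [k1, k2] = meshgrid(-N/2+1:N/2, -N/2+1:N/2);
    eta = 1 ./ max(1, k1.^2 + k2.^2);
    eta = eta / sum(eta(:));                              % normalize to a probability density
    cdf = cumsum(eta(:));
    idx = arrayfun(@(u) find(cdf >= u, 1), rand(m, 1));   % inverse-CDF sampling
    Omega = [k1(idx), k2(idx)];                           % sampled frequency pairs
    rho = 1 ./ sqrt(eta(idx));                            % per-measurement weights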

Further discussion
We supplement more details about the theoretical results presented in Sections 3.1 and 3.2.
The a posteriori verification on α. Three conditions, (3.8), (3.12), and (3.16), on α are required in Theorems 3.1, 3.2, and 3.3, respectively. Determining the value of α is possible only if we have an a priori estimate of ‖X_opt‖₂. Thus, these conditions should be interpreted as a posteriori verifications, because they can be verified once X_opt is obtained by solving the model (1.5). In practice, we solve the model (1.5) numerically and thus obtain an approximate solution, denoted by X*, subject to a preset accuracy ε > 0; that is, ‖X* − X_opt‖₂ ≤ ε. If the conditions hold with ‖X*‖₂ + ε in place of ‖X_opt‖₂, then (3.12) and (3.16) are satisfied.
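Schematically, the verification can be carried out as follows, where solve_enhanced_tv and alpha_max are hypothetical placeholders, the latter standing for the right-hand sides of (3.8), (3.12), or (3.16):

    % A posteriori check on alpha: solve (1.5) approximately, then verify the
    % condition using ||X*||_F + eps as a surrogate for ||X_opt||_2.
    [Xstar, eps_acc] = solve_enhanced_tv(y, M, tau, alpha);   % hypothetical solver
    ok = alpha <= alpha_max(norm(Xstar, 'fro') + eps_acc);    % hypothetical bound function
    if ~ok, warning('alpha fails the a posteriori condition'); end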
The RIP level δ < 0.6 in Theorems 3.1 and 3.2. The bound 0.6 is sharp, as we need to ensure √(1 − δ) − √(1 + δ)/2 > 0 (cf. the proof in Section 4.1). For the reconstruction guarantees derived in [43] for the TV model (1.2), the level is assumed to satisfy δ < 1/3, which is not sharp, as remarked in [43]. Though δ < 1/3 can be improved, the reconstruction error bound in [43] for the TV model (1.2) tends to infinity as δ → 0.6. In light of Remark 3.1, the bounds (3.11) and (3.13) are still valid in this case, and the upper bound required for α tends to 0 correspondingly, in view of the behavior of K₂. That is, Theorems 3.1 and 3.2 can guarantee the stability of the TV model (1.2) when δ → 0.6, yielding reconstruction error bounds of the forms (3.11) and (3.13).
The required amount m of Fourier measurements in Theorem 3.3. The RIP level δ does not appear explicitly in Theorem 3.3, but we shall in fact assume m ≳ sδ⁻² log³(s) log⁵(N), and the constant δ is absorbed into the implicit constant of this inequality; see our proof in Section 4.4. The least required amount m for the TV model (1.2) also satisfies this relation with s, N, and δ, as proved in [34]. Since the upper bound on the RIP level δ is enlarged from 1/3 for the TV model (1.2) (see [34]) to 0.6 for the enhanced TV model (1.5), the least amount of Fourier measurements required for the enhanced TV model (1.5) is about (0.6/(1/3))⁻² ≈ 30.86% of the least amount required in [34] for the TV model (1.2).

Inconsistency when α → 0. The enhanced TV regularization (1.4) tends to the anisotropic TV term as α → 0. At the same time, the reconstruction error bounds (3.11), (3.13), and (3.18) do not reduce to the corresponding bounds (1.7) and (1.8) for the TV model (1.2); note that the bounds (3.13) and (3.18) are of the same form. To explain this inconsistency, note that Proposition 3.1 is a pillar of the proofs of Theorems 3.1, 3.2, and 3.3. In contrast, the proof for the TV model (1.2) in [43] relies on the following fact: if D satisfies the tube constraint (3.1) and the cone constraint ‖D_{S^c}‖₁ ≤ γ‖D_S‖₁ + σ, then the bound (3.19) on ‖D‖₂ holds, as shown in [43]. Indeed, the left-hand side of the estimation (4.2) in the proof of Proposition 3.1 contains a quadratic term ‖D‖₂² and a linear term ‖D‖₂, and only the linear term remains as β₁, β₂ → 0, which then leads to the same result as (3.19). However, in the proof of Proposition 3.1, we remove this linear term and keep the quadratic term, and hence the obtained result does not reduce to (3.19) as β₁, β₂ → 0. Such an inconsistency is also encountered by the springback model in [4].

Comparison between (1.8) and (3.13). We are interested in whether the bound (3.13) (as well as the bound (3.18), which shares the same form as (3.13)) can be tighter than (1.8) in the sense of (3.20) with a given α > 0.
If the image X is known to have an s-sparse gradient, then the comparison (3.20) reduces to √s ≲ ατ. As s is fixed in this scenario, we can claim that the estimation (3.13) is tighter than the estimation (1.8) in the sense of (3.20) if τ ≳ √s/α, i.e., if the level of noise τ is relatively large. If the sparsity of ∇X is not assumed, but the linear measurements are noise-free, i.e., τ = 0, then the comparison (3.20) reduces to (3.21), whose left-hand side is an increasing function of s. In order to discern the scenario where (3.21) holds, a key fact from Remark 3.4 should be noticed: for the RIP measurements mentioned there, a small number m of measurements admits an RIP only with a small s. The bound O(s log(N²/s)) for Gaussian measurements appears not to be monotonic with respect to s; however, with the implicit constant factors derived in [51], this bound is indeed monotonically increasing with respect to s. Thus, if the number of measurements m is limited, which only renders an RIP with a small s, then (3.21) holds. This coincides with the intuition that, as ‖∇X − (∇X)_s‖₁ ≫ 1 for many digital images, especially when the number of measurements is limited (so that s is small), taking a square root leads to a smaller bound than not doing so. Combining both scenarios, we can claim that if the level of noise τ is relatively large and the number of measurements m is limited, then the enhanced TV model (1.5) performs better than the TV model (1.2) in the sense of (3.20), because (3.20) is guaranteed to hold when both √s ≲ ατ and (3.21) hold, and these two conditions can be studied separately. This comparison can be analogously extended to other cases in which the corresponding reconstruction error bounds are also linear with respect to the terms ‖∇X − (∇X)_s‖₁/√s and τ. One such example is the model in [36], which has the regularization term ‖X‖_{TV_a} − ‖X‖_{TV_i}. For the model in [36], reconstruction guarantees leading to an error bound without the log factor log(N²/s) appear to be missing. Note that this log factor also occurs in the bound (1.7) for the TV model (1.2) and the bound (3.11) for the enhanced TV model (1.5), but it is removed if the required RIP order increases from O(s) to O(s log³(N)), whence both bounds improve to (1.8) and (3.13), respectively. Reconstruction guarantees for the model in [36] have been investigated in [35]. However, the derived error bound (see Theorem 3.8 in [35]) still fails to remove the log factor log(N²/s), even though the subsampled measurements are required to have the RIP of order O(s² log(N)) with a more complicated level δ that depends on N, s, and the constant C in Lemma 2.1.

Proofs
In this section, we present the complete proofs for the theoretical results in Section 3.

4.1 Proofs of Proposition 3.1 and Corollary 3.1

Proof of Proposition 3.1. We arrange the indices in S^c in order of decreasing magnitudes (in absolute value) of D_{S^c} and divide S^c into subsets of size 4kγ², i.e., S^c = S₁ ∪ S₂ ∪ ⋯ ∪ S_r, where r = ⌈(N² − |S|)/(4kγ²)⌉. In other words, D_{S^c} = D_{S₁} + D_{S₂} + ⋯ + D_{S_r}, where D_{S₁} consists of the 4kγ² largest-magnitude components of D over S^c, D_{S₂} consists of the next 4kγ² largest-magnitude components of D over S^c \ S₁, and so forth. As the magnitude of each component of D_{S_j} is at most the average magnitude ‖D_{S_{j−1}}‖₁/(4kγ²) of the components of D_{S_{j−1}}, we have

Σ_{j=2}^r ‖D_{S_j}‖₂ ≤ ‖D_{S^c}‖₁ / (2γ√k).

Thus, combining this with the cone constraint (3.2), we can bound Σ_{j=2}^r ‖D_{S_j}‖₂ in terms of ‖D_S‖₁, ‖E₁‖₂, and ‖E₂‖₂. The assumption |S| ≤ k leads to ‖D_S‖₁ ≤ √k ‖D_S‖₂, which gives the bound (4.1). Together with this bound (4.1), the tube constraint (3.1), and the RIP of A, we obtain a lower bound on ε in terms of ‖D_S + D_{S₁}‖₂. The assumption δ < 0.6 ensures √(1 − δ) − √(1 + δ)/2 > 0. Hence, we obtain an upper bound on ‖D_S + D_{S₁}‖₂. As ‖D‖₂ is bounded by the sum of ‖D_S + D_{S₁}‖₂ and Σ_{j=2}^r ‖D_{S_j}‖₂, we arrive at the quadratic inequality (4.2). The requirement (3.3) on β₂ ensures that the quadratic inequality (4.2) can be solved for ‖D‖₂, where the last inequality is due to the Cauchy–Schwarz inequality and ‖E₂‖₂ = ‖D‖₂. Then, solving the quadratic inequality yields the estimation (3.4). Finally, we derive (3.5). As |S| ≤ k, we have ‖D_S‖₁ ≤ √k ‖D_S‖₂ ≤ √k ‖D‖₂. Then, together with the requirement (3.3) on β₂ and the cone constraint (3.2), we bound ‖D‖₁ = ‖D_S‖₁ + ‖D_{S^c}‖₁ as in (3.5), which completes the proof of Proposition 3.1.
Proof of Corollary 3.1. In the second inequality of (4.3), we use the fact assumed in Corollary 3.1; this completes the proof of Corollary 3.1.

Proof of Theorem 3.1 and Corollary 3.2
We first prove the stable gradient reconstruction results (3.9) and (3.10), and then obtain the stable image reconstruction result (3.11) with the aid of a strong Sobolev inequality. The following Sobolev inequality was derived in [43] for images; a multivariate generalization is given in [42].

Lemma 4.1 (Strong Sobolev inequality) Let B : C^{N×N} → C^m be a linear map such that BH* : C^{N×N} → C^m has the RIP of order 2s + 1 and level δ < 1, where H : C^{N×N} → C^{N×N} is the bivariate Haar transform. Suppose that D ∈ C^{N×N} satisfies the tube constraint ‖B(D)‖₂ ≤ ε. Then ‖D‖₂ can be bounded by a multiple of log(N²/s) · ‖∇D‖₁/√s plus a multiple of ε.
Proof of Theorem 3.1. The proof is divided into two parts: stable gradient reconstruction and stable image reconstruction.
Stable gradient reconstruction. We plan to apply Proposition 3.1 to the term ∇(X_opt − X). Let V = X_opt − X and L = (V_x, V_y^T). For convenience, let P denote the mapping of indices which maps the index of a nonzero entry in ∇V to its corresponding index in L. By the definition of ∇, L has the same norms as ∇V, i.e., ‖L‖₂ = ‖∇V‖₂ and ‖L‖₁ = ‖∇V‖₁. Thus, it suffices to apply Proposition 3.1 to L.
• Cone constraint. Let S denote the support of the largest s entries of ∇X. Using the optimality of X_opt for (1.5) on the one hand, and the triangle inequality on the other hand, we obtain a cone constraint for ∇V. As L contains all the same nonzero entries as ∇V, it satisfies the corresponding cone constraint.

• Tube constraint. We note that V satisfies a tube constraint, as both X and X_opt are feasible for (1.5). Then, it follows from Lemma 3.1 that L also satisfies a tube constraint. By virtue of Proposition 3.1, we obtain the gradient estimation (3.9).
Furthermore, by (3.5), we obtain (3.10), which completes the proof of the stable gradient reconstruction results (3.9) and (3.10).
Stable image reconstruction. We now apply the strong Sobolev inequality in Lemma 4.1 to V. Together with the bound (3.10), we obtain the stable image reconstruction result (3.11).
Proof of Corollary 3.2. If α satisfies the additional condition, then it follows from Corollary 3.1 that the linear term ‖∇X − (∇X)_s‖₁ in the estimation (4.4) can be removed. Thus, passing from (4.4) to (3.11), the term log(N²/s) · ‖∇X − (∇X)_s‖₁/√s in (3.11) can also be removed.

Proof of Theorem 3.2
We apply Proposition 3.1 to c = HV instead of ∇V. Some properties of the bivariate Haar wavelet system, stated as Lemmas 2.1, 2.2, and 2.3, are needed in the proof. Besides, a classical Sobolev inequality, weaker than the strong Sobolev inequality in Lemma 4.1, is needed.

Lemma 4.2 ([43]) Let X ∈ C^{N×N} be a mean-zero image or contain some zero-valued pixels. Then ‖X‖₂ ≲ ‖∇X‖₁.

Proof of Theorem 3.2. Let V = X_opt − X, and apply Proposition 3.1 to c = HV, where c^{(1)} := c^{(1)}(V) denotes the Haar coefficient corresponding to the constant wavelet, and c^{(j)} := c^{(j)}(V) (j ≥ 2) denotes the (j − 1)-st largest-magnitude Haar coefficient among the remaining ones. We use this ordering because Lemma 2.1 applies only to mean-zero images. Let h^{(j)} denote the Haar wavelet associated with c^{(j)}.
We have assumed that the composite operator MH* : C^{N×N} → C^m has the RIP of order Cs log³(N) and level δ < 0.6; we now derive the constant C.
• Cone constraint on c = HV. As shown in Section 4.2, we have a cone constraint (4.6) for ∇V. Recall that S is the index set of the s largest-magnitude entries of ∇V. It follows from Lemma 2.2 that the set Ω of wavelets which are non-constant over S has cardinality at most 6s log(N), i.e., |Ω| ≤ 6s log(N). Decompose V as V = V_Ω + V_{Ω^c}. By the linearity of ∇, we have ∇V = ∇V_Ω + ∇V_{Ω^c}. By the construction of Ω, we have (∇V_{Ω^c})_S = 0, which leads to (∇V)_S = (∇V_Ω)_S. Then, it follows from Lemma 2.3 that

‖∇V_Ω‖₁ ≤ Σ_{j∈Ω} |c^{(j)}| ‖∇h^{(j)}‖₁ ≤ 8 Σ_{j∈Ω} |c^{(j)}|.

Let k = 6s log(N), and let ‖c_Ω‖₁ and ‖c_{Ω^c}‖₁ denote Σ_{j∈Ω} |c^{(j)}| and Σ_{j∈Ω^c} |c^{(j)}|, respectively. Concerning the decay of the wavelet coefficients in Lemma 2.1, we have |c^{(j+1)}| ≤ C ‖∇V‖₁ / j. Together with the cone constraint (4.6) for ∇V, we obtain the cone constraint (4.8) on c, where (⋄) is due to the partial sums of the harmonic series [22], and (*) is due to the fact that ‖∇‖₂² ≤ 8 [12]. As we prepare to apply Proposition 3.1 to c = HV, we also need to bound ‖∇V‖₂ below in terms of ‖V‖₂ = ‖c‖₂, where ‖V‖₂ = ‖c‖₂ is due to Parseval's identity and the fact that {h^{(j)}} forms an orthonormal basis for C^{N×N}.

• Tube constraint ‖MH*c‖₂ ≤ 2τ. As X and X_opt are in the feasible region of the model (1.5), for c = HV = HX_opt − HX, we have ‖MH*c‖₂ = ‖M(X_opt) − M(X)‖₂ ≤ 2τ.

Under the derived cone and tube constraints on c, along with the RIP condition on MH*, Theorem 3.2 is proved by applying Proposition 3.1 with particular choices of k and γ, and with β₂ = αC′ log(N²/s). In fact, the order 5kγ² with these particular k and γ leads to the required RIP order Cs log³(N) for MH*. Together with all these factors and Proposition 3.1, we conclude that if α satisfies the condition (3.12), then the resulting bound leads to the estimation (3.13).

Proof of Theorem 3.3
The proof of Theorem 3.3 follows the approach of Theorem 3.2, in which the local coherence of the sensing basis (Fourier) with respect to the sparsity basis (Haar wavelet) plays a major role.
The following result indicates that, with high probability, signals can be stably reconstructed from subsampled measurements when the local coherence function is appropriately used. It can be deemed a finite-dimensional analog of [50, Theorem 2.1], and a proof can be found in [34].
In particular, the following result describes the local coherence of the orthonormal Fourier basis with respect to the orthonormal Haar wavelet basis; it first appeared in [34].

Lemma 4.4 (Theorem 4 in [34], slightly modified) Let N = 2^n be a power of 2, where n ∈ N⁺. The local coherence µ_loc of the orthonormal two-dimensional Fourier basis {φ_{k₁,k₂}} with respect to the orthonormal bivariate Haar wavelet basis {h^ℓ_{j,k}} in C^{N×N} is point-wise bounded by a function κ ≤ κ′, and one has ‖κ‖₂² ≤ ‖κ′‖₂² ≤ 17200 + 502 log(N).

Remark 4.1 In Theorem 4 of [34], n ≥ 8 was assumed to ensure 17200 + 502 log(N) ≤ 2700 log(N) and hence ‖κ‖₂ ≤ ‖κ′‖₂ ≤ 52√(log(N)). We regard this assumption as a restriction on the size N × N of images; thus we remove it and adopt the bound √(17200 + 502 log(N)) in our subsequent proof. Besides, it was conjectured in [34] that the factor 2700 is due to the lack of smoothness of the Haar wavelets, and this factor might be removed by considering smoother wavelets.
Proof of Theorem 3.3. Let P ∈ C^{m×m} be the diagonal matrix encoding the weights in the noise model; that is, P = diag(ρ), where, for κ′ as in Lemma 4.4, ρ ∈ C^m is the weight vector built from κ′. Note that P g = ρ ∘ g for g ∈ C^m. Together with the particular incoherence estimate in Lemma 4.4, Lemma 4.3 implies that, with probability at least 1 − N^{−2c log³(s)} (as c is a generic constant, the factor 2 on c is dropped in the statement of Theorem 3.3), A := (1/√m) P F_Ω H* has the RIP of order s̃ and level δ < 0.6 once s̃ ≳ log(N²) log(N) and m is sufficiently large in the sense of Lemma 4.3. By the assumption m ≳ s log³(s) log⁵(N) (in fact, we shall assume m ≳ sδ⁻² log³(s) log⁵(N)), we can assume that A has the RIP of order s̃ = Cs log³(N) and level δ < 0.6, where C is the constant derived in Theorem 3.2. Moreover, let V = X_opt − X and apply Proposition 3.1 again to c = HV, where c^{(1)} := c^{(1)}(V) denotes the Haar coefficient corresponding to the constant wavelet, and c^{(j)} := c^{(j)}(V) (j ≥ 2) denotes the (j − 1)-st largest-magnitude Haar coefficient among the remaining ones. To apply Proposition 3.1, we need to find cone and tube constraints for c = HV.
• Cone constraint on c = HV , which is the same as (4.8) in the proof of Theorem 3.2.
The rest is similar to the proof of Theorem 3.2; the only minor difference is the tube constraint, where the bound 2τ there is replaced by √2 τ here. Hence, we omit the remaining steps; the estimation for the setting of this theorem, with constants removed, is of the same form as (3.13).

Numerical experiments
We now report some experimental results to validate the reconstruction quality and numerical solvability of the enhanced TV model (1.5). As mentioned, the model (1.5) is a difference-of-convex model, and it can be solved by some well-developed algorithms in the literature. We include the details of an algorithm in Appendix C. For comparison, we consider the TV model (1.2) and the TV_a − TV_i model in [36]. In our experiments, the TV model (1.2) is solved by the split Bregman method studied in [31], and the TV_a − TV_i model is solved by the difference-of-convex functions algorithm (DCA) with subproblems solved by the split Bregman method in [36]. Details of the tuned parameters of these algorithms are stated in Appendix C. As displayed in Figure 2, we test the standard Shepp-Logan phantom, three more synthetic piecewise-constant images (Shape, Circle, and USC Mosaic), two natural images (Peppers and Clock), and two medical images (Spine and Brain). Two sampling strategies are considered in our experiments: the first is radial line sampling, and the second is the strategy (3.15) proposed in Theorem 3.3, which is referred to as the MRI-desired sampling strategy below. All codes were written in MATLAB R2021b, and all numerical experiments were conducted on a laptop (16 GB RAM, Intel Core i7-9750H processor) with macOS Monterey 12.1.

Example #1: Shepp-Logan phantom. The Shepp-Logan phantom is standard in the image reconstruction literature. Experiments for this image are organized into three parts. The first part concentrates on the reconstruction of the Shepp-Logan phantom of size 256 × 256 from noise-free measurements, and α is fixed as 0.8 in the enhanced TV model (1.5). We sample along 15, 8, and 7 radial lines, corresponding to sampling rates 6.44%, 3.98%, and 3.03%, respectively, and take MRI-desired measurements with rates 2.29%, 1.91%, and 1.53%. The results shown in Figure 3 suggest that the enhanced TV model (1.5) produces reconstructions with good accuracy in all six sampling settings, and the reconstruction quality is much better than those of the models in comparison when the amount of samples is limited (e.g., 7 radial lines and 1.53% MRI-desired measurements). This observation verifies the result in Section 3.3: when τ = 0, the reconstruction error bound (3.18) for the enhanced TV model (1.5) is tighter than (1.8) for the TV model (1.2) when the amount of measurements is limited. As mentioned in Section 3.3, such a result also pertains to the comparison between the enhanced TV model (1.5) and the TV_a − TV_i model in [36].
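For reference, a hedged MATLAB sketch of a radial-line sampling mask; this is our reading of the radial line strategy, as the exact mask generation is not specified here:

    % Binary mask selecting Fourier samples along L equispaced radial lines.
    N = 256; L = 15;
    mask = false(N);
    c = N/2 + 1;                              % center of the frequency grid
    t = -N/2 : 0.5 : N/2;                     % parameter along each line
    for theta = (0:L-1) * pi / L
        r = round(c + t * cos(theta));
        s = round(c + t * sin(theta));
        ok = r >= 1 & r <= N & s >= 1 & s <= N;
        mask(sub2ind([N, N], r(ok), s(ok))) = true;
    end
    rate = nnz(mask) / N^2;                   % resulting sampling rate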
For comparison, we also report relative errors in the Frobenius norm and SSIM values in Table 1. Advantages of the enhanced TV model (1.5) show when the available measurements are limited (e.g., when the sampling rate is below 3.03%). When the measurements are relatively sufficient, e.g., in the cases of 15 lines and 8 lines, the enhanced TV model (1.5) does not produce the reconstruction with the least error. We see that the outperformance of the enhanced TV model (1.5) is not maintained when the measurements become sufficient, though the difference is too small to be observed visually. Besides, it is worth noting that the SSIM values in all six sampling settings are 1.0000 for the enhanced TV model (1.5), which well illustrates the stability of this model with respect to the amount of measurements for the Shepp-Logan phantom images.

The second part illustrates the robustness of the enhanced TV model (1.5) with respect to noise. We still fix α as 0.8 in the model (1.5), and we take measurements along 15 lines (corresponding to the 6.44% sampling rate) and use 6.5% MRI-desired samples. The Fourier measurements are perturbed by Gaussian noise with standard deviations ("std" for short) of 0.04, 0.06, and 0.08, respectively. The contamination process is implemented in MATLAB commands: for any image X of size N × N, we first compute its Fourier measurements by the fast Fourier transform (FFT), i.e., F=fft2(X)/N; then we perturb F by F=F+1/sqrt(2)*(std*randn(size(F))+std*1i*randn(size(F))). Relative errors and SSIM values listed in Table 2 show that the enhanced TV model (1.5) is the most robust one. In particular, in terms of SSIM values, the enhanced TV model (1.5) produces much better reconstruction quality, and the superiority is more apparent as the level of noise increases. These results corroborate the theoretical result in Section 3.3 that the enhanced TV model (1.5) has a tighter reconstruction error bound than the TV model (1.2) and the TV_a − TV_i model in [36] when the level of noise is relatively large.
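Collected in one place, the measurement-and-contamination pipeline quoted above reads as follows (the variable std shadows the built-in function, as in the quoted commands):

    % Noisy Fourier measurements as used in the robustness experiments.
    % X: N-by-N test image; std: noise standard deviation (e.g., 0.04).
    N = size(X, 1);
    F = fft2(X) / N;                                                  % noise-free measurements
    F = F + 1/sqrt(2) * (std * randn(size(F)) + std * 1i * randn(size(F)));  % complex Gaussian noise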
The third part is focused on the phase transition of the success rates of reconstruction. A reconstruction is recognized as successful if the relative error of the reconstructed image is less than 10⁻³. We consider the Shepp-Logan phantom of size 64 × 64 in this part. We choose α among {0.7, 0.8, ..., 2.7} for the enhanced TV model (1.5), and choose the number of measurements m from 3 to 12 radial lines for radial sampling, and among {100, 140, 180, ..., 900} for MRI-desired sampling. For each case, we test five times and report the success rate. According to Theorem 3.3, stable reconstruction can be achieved if the samples are sufficient in the sense of (3.14) and the model parameter α is bounded in the sense of (3.16). The results in Figure 4 show that a successful reconstruction via the enhanced TV model (1.5) requires relatively sufficient samples and a reasonably bounded parameter α, thus validating the results in Theorem 3.3.

Example #2: Synthetic images. Example #1 shows the superiority of the enhanced TV model (1.5) for the Shepp-Logan phantom with limited samples, and one purpose of the following study is to further assert this superiority. We consider radial line sampling and validate this superiority by testing three synthetic images: Shape, Circle, and USC Mosaic. We again fix α = 0.8 in the enhanced TV model (1.5). When the number of measurements is too limited, none of the three models can generate good reconstructions. Bearing in mind that the criteria for the limitation on the amount of measurements are different for the three models, we now show some cases in which the reconstruction via the enhanced TV model (1.5) is particularly good while those via the TV model (1.2) and the TV_a − TV_i model in [36] may fail. Reconstruction results are displayed in Figure 5, and relative errors and SSIM values are reported in Table 3. From both Figure 5 and Table 3, the reconstruction of the enhanced TV model (1.5) is significantly better than those of the other two models. We also take this example to test how the inner iterations affect the overall performance of the algorithms under comparison. The algorithm presented in Appendix C adopts DCA as the outer iteration and uses the ADMM to solve each DCA subproblem. When the maximum number of inner ADMM iterations is increased from 1,000 to 2,000, the numerical results are reported in the fifth column of Figure 5, labeled as "Enhanced TV-2,000". We see that even though the enhanced TV model (1.5) with at most 1,000 inner iterations is good enough to generate a satisfactory reconstruction, e.g., for Circle and USC Mosaic, more inner iterations can further reduce the relative errors by up to several orders of magnitude. This observation provides a simple recipe for higher-accuracy reconstruction.

Example #3: Natural images. We then test two natural images: Peppers and Clock. We fix α as 1 in the enhanced TV model (1.5). In Figure 6, we display the reconstruction of both images from 9.16% MRI-desired samples. Furthermore, we report relative errors in the Frobenius norm and SSIM values of each reconstruction in Table 4, from MRI-desired samples of rates 9.16%, 13.7%, 18.3%, and 22.9%. The superiority of the enhanced TV model (1.5) is further validated. It is worth noting that the enhanced TV model (1.5) performs less effectively for reconstructing natural images than for the images in Examples #1 and #2, because these natural images have more complicated (non-piecewise-constant) edges; it is thus not surprising that the enhanced TV model (1.5) is less effective in such cases.

Example #4: Medical images. Finally, we test two medical images: Spine and Brain. We again fix α as 1 in the enhanced TV model (1.5). Moreover, we take 15.3% MRI-desired samples for the reconstruction of Spine and 9.16% for Brain; the reconstructed images are displayed in Figure 7. It is shown that the enhanced TV model (1.5) produces better reconstructions than the other models. We test more sampling rates and report the SSIM values of the reconstructions at each rate in Figure 8. It is easy to see that the superiority of the enhanced TV model (1.5) is more apparent when the sampling rate is relatively low; thus, the enhanced TV model (1.5) is preferred when measurements are limited. Similar to Example #3, the enhanced TV model (1.5) performs less effectively for Example #4 than for Examples #1 and #2 due to the non-piecewise-constant edges of these medical images.

Conclusions
We focused on enhancing the canonical constrained total variation (TV) minimization model for image reconstruction via the springback regularization proposed in our previous work [4]. The enhanced TV model improves the original TV model with an additional backward diffusion process, so that the loss of contrast can be further reduced. We theoretically established reconstruction guarantees for the enhanced TV model (1.5) with non-adaptive subsampled linear RIP measurements and variable-density subsampled Fourier measurements, respectively. For non-adaptive linear RIP measurements, the requirement on the RIP level δ was relaxed from δ < 1/3 (derived for the TV model (1.2); see [43]) to δ < 0.6. The reconstruction error bounds estimated in Theorems 3.1 and 3.2 provide reasonable reconstruction error estimations for the TV model (1.2) as δ → 0.6, in which case the bounds derived in [43] for the TV model (1.2) tend to infinity. For variable-density sampled Fourier measurements, the least amount of measurements required by the enhanced TV model (1.5) was shown to be around 30.86% of that established in [34] for the TV model (1.2). This improvement is due to the relaxation of the requirement on δ.
Recall that we only considered the anisotropic TV; proofs of the main theoretical results can be easily generalized to the isotropic TV case. In addition, our results can be generalized from several other perspectives. For example, one can consider other sampling strategies, such as those in [1,48], for the Fourier samples considered in Theorem 3.3. For the guarantee analysis with Fourier measurements, noise is measured by the weighted ℓ₂-norm (see (3.17)); one can instead consider other norms to measure noise, such as those in [1,48]. Our theoretical results for two-dimensional images can also be extended to higher-dimensional signals, as considered in [1,42]. It also seems promising to investigate further extensions along these lines.

Appendix A

Here µ > 0 balances the TV term and the data fidelity term in the model (A.1). Note that the isotropic TV proposed in [52] is used in the model (A.1). Though the anisotropic TV defined in [24] is used in the enhanced TV regularization (1.4), the main purpose of this appendix is to explain how the TV is enhanced in the sense of (1.4); thus, we adopt the model (A.1) for simplicity. We refer the reader to [40] for the anisotropic TV flow. More specifically, the enhanced (isotropic) TV denoising model in a continuum setting can be written as (A.2), whose Euler-Lagrange equation (A.3) comes with the boundary condition ∂u/∂n|_{∂Ω} = 0, where n denotes the outer normal direction along the boundary ∂Ω of Ω. Alternatively, as in [52], we could use gradient descent marching with an artificial time t; that is, the solution procedure of the Euler-Lagrange equation (A.3) uses a parabolic equation with time t as an evolution parameter. This means that, for u : Ω × [0, T] → R, we solve

∂u/∂t = div(∇u/|∇u|) − α∆u − µ(u − f) (A.4)

with a given initial condition u(x, 0) and the boundary condition ∂u/∂n|_{∂Ω} = 0, where f denotes the observed image. Note that there is a backward diffusion term −α∆u in the evolution equation (A.4). Thus, as t increases, we approach a denoised and deblurred version of the image, provided the blur is assumed to follow such a diffusion process.
If the energy functional E_ETV(u) has a minimum, then the minimizer must satisfy the Euler-Lagrange equation (A.3). Certainly, the existence of a minimizer of E_ETV is unknown for an arbitrary α. On the other hand, the existence can be guaranteed when α is sufficiently small relative to µ, e.g., when α < µ inf_{x∈Ω}(·) for a suitable quantity depending on the data.

Figure 1: Illustration of the TV and enhanced TV regularization for image denoising. First row: SSIM values of each image; second row: histograms of pixel intensities of each image.




Definition 4.1 (Local coherence [34]) The local coherence of an orthonormal basis Φ = {φ_j}_{j=1}^N of C^N with respect to an orthonormal basis Ψ = {ψ_k}_{k=1}^N is the function µ_loc(j) := sup_{1≤k≤N} |⟨φ_j, ψ_k⟩|, j = 1, ..., N.

Lemma 4.3 Let Φ = {φ_j}_{j=1}^N and Ψ = {ψ_k}_{k=1}^N be two orthonormal bases of C^N. Assume the local coherence of Φ with respect to Ψ is point-wise bounded by the function κ, i.e., µ_loc(j) ≤ κ_j for all j. If measurements are subsampled from Φ according to the probability density proportional to κ² and preconditioned by the corresponding weights, then, with high probability, the resulting operator has the RIP; see [34] for the precise statement used in the proof of Theorem 3.3.

Figure 3: Shepp-Logan phantom: comparison of three models with radial line-sampled and MRI-desired measurements.

Figure 4: Phase transitions with respect to m and α.

Figure 5: Shape, Circle, and USC Mosaic: comparison of three models with limited measurements.

Figure 6: Peppers and Clock: comparison of three models with the MRI-desired sampling. SSIM values are also reported in the titles of each reconstruction.


Figure 7: Spine and Brain: comparison of three models on medical images with the MRI-desired sampling. SSIM values are also reported in the titles of each reconstruction.

Table 1: Relative errors and SSIM values of the reconstructed images in Figure 3.

Table 2: Relative errors and SSIM values of the reconstructed images in Figure 3, with three levels of noise std = 0.04, 0.06, and 0.08.

Table 3: Relative errors and SSIM values of the reconstructed images in Figure 5.


Table 4: Relative errors and SSIM values of reconstructions of two natural images with various sampling rates.