Convergence of non-linear diagonal frame filtering for regularizing inverse problems

Inverse problems are key issues in several scientific areas, including signal processing and medical imaging. Since inverse problems typically suffer from instability with respect to data perturbations, a variety of regularization techniques have been proposed. In particular, the use of filtered diagonal frame decompositions has proven to be effective and computationally efficient. However, existing convergence analysis applies only to linear filters and a few non-linear filters such as soft thresholding. In this paper, we analyze filtered diagonal frame decompositions with general non-linear filters. In particular, as a special case, our results generalize SVD-based spectral filtering from linear to non-linear filters. As a first approach, we establish a connection between non-linear diagonal frame filtering and variational regularization, allowing us to use results from variational regularization to derive the convergence of non-linear spectral filtering. In a second approach, constituting our main theoretical results, we relax the assumptions involved in the variational case while still deriving convergence. Furthermore, we discuss connections between non-linear filtering and plug-and-play regularization and explore potential benefits of this relationship.


Introduction
Let A : X → Y be a bounded linear operator between two real Hilbert spaces X and Y. We consider the inverse problem of recovering x^+ ∈ X from noisy data

y^δ = A x^+ + z ,    (1.1)

where z is the data perturbation with ∥z∥ ≤ δ for some noise level δ > 0. Inverting the operator A is often ill-posed in the sense that the Moore-Penrose inverse A^+ is discontinuous. Thus, small errors in the data are significantly amplified by the use of exact solution methods. To address this problem, regularization methods have been developed with the aim of finding approximate but stable solution strategies [4,12,26].

Diagonal frame filtering
Diagonal frame decompositions in combination with regularizing filters are a flexible and efficient regularization concept for (1.1). Suppose A has a diagonal frame decomposition (DFD) giving the representations

∀x ∈ X ∀λ ∈ Λ : ⟨Ax, v_λ⟩ = κ_λ ⟨x, u_λ⟩ .    (1.2)

Here (u_λ)_{λ∈Λ} and (v_λ)_{λ∈Λ} are frames of ker(A)^⊥ and ran(A), respectively, with corresponding dual frames (ū_λ)_{λ∈Λ} and (v̄_λ)_{λ∈Λ}. In the special case that (u_λ)_λ and (v_λ)_λ are orthonormal, A = Σ_{λ∈Λ} κ_λ ⟨·, u_λ⟩ v_λ is a singular value decomposition (SVD) for A. More general frame decompositions were first studied by Candès and Donoho [7,9] in the context of statistical estimation and recently in [13,14,10,16,25,21,22] in the context of regularization theory. Specifically, in [10, Theorem 2.10] it has been shown that if the (κ_λ)_{λ∈Λ} accumulate at zero, then A^+ is unbounded. In such a situation, regularization methods have to be applied for the solution of (1.1).

Non-linear extension
A major drawback of linear regularizing filters is that the damping factor depends only on the quasi-singular values and is independent of the data. In practice, certain filters that depend non-linearly on ⟨y^δ, v_λ⟩ tend to perform better in filtering out noise than linear methods; see [1,15,23]. The aim of this paper is to analyze general non-linear frame-based diagonal filtering

B_α(y^δ) = Σ_{λ∈Λ} κ_λ^{-1} φ_α(κ_λ, ⟨y^δ, v_λ⟩) ū_λ ,

where (φ_α)_{α>0} is a non-linear filter rigorously introduced in Definition 3.1 below. The reconstruction mappings B_α : Y → X come with a clear interpretation: in order to avoid noise amplification due to multiplication with κ_λ^{-1}, the filter damps each coefficient according to the damping factors φ_α(κ_λ, ⟨y^δ, v_λ⟩) prior to the inversion. By taking φ_α(κ_λ, ⟨y^δ, v_λ⟩) = (f_α(κ_λ)κ_λ) · ⟨y^δ, v_λ⟩ we recover the linear filtering analyzed in [10]. An example of a non-linear filter is the soft-thresholding filter defined by φ_α(κ, c) = sign(c)(|c| − α/κ)_+ where (·)_+ := max{·, 0}. In [14] it has been shown that this filter yields a convergent regularization method. Affine filters applied with the SVD have recently been studied in [2]. Our paper extends these special cases to general classes of non-linear filters.
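The data dependence can be illustrated numerically. The following sketch (our own toy numbers, not the paper's experiments) compares the soft-thresholding filter quoted above with a linear Tikhonov-type filter whose damping factor depends only on the quasi-singular value, never on the coefficient.

```python
import numpy as np

# Soft-thresholding filter phi_alpha(kappa, c) = sign(c)(|c| - alpha/kappa)_+
def soft_threshold(kappa, c, alpha):
    return np.sign(c) * np.maximum(np.abs(c) - alpha / kappa, 0.0)

# A linear filter phi_alpha(kappa, c) = (kappa^2/(kappa^2 + alpha)) * c:
# the damping factor is data-independent.
def tikhonov_linear(kappa, c, alpha):
    return kappa**2 / (kappa**2 + alpha) * c

rng = np.random.default_rng(0)
kappa = 0.5
x_true = np.array([3.0, 0.0, -2.0, 0.0, 0.0])           # sparse true coefficients
noisy = kappa * x_true + 0.05 * rng.standard_normal(5)  # noisy data coefficients

rec_soft = soft_threshold(kappa, noisy, alpha=0.1) / kappa
rec_lin = tikhonov_linear(kappa, noisy, alpha=0.1) / kappa
print(rec_soft)  # zero coefficients of the signal are restored exactly to zero
print(rec_lin)   # every coefficient retains a residual noise contribution
```

The non-linear filter removes noise-dominated coefficients entirely, whereas the linear filter can only shrink all coefficients by the same factor.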
Specifically, we establish assumptions on the non-linear regularizing filter (φ_α)_{α>0} to demonstrate stability and convergence in the following sense:
• Stability: For fixed α > 0, let y^δ, y_k ∈ Y with y_k → y^δ. Then we have B_α(y_k) ⇀ B_α(y^δ) as k → ∞.
As our main theoretical findings, we derive such results for a broader class of non-linear filters by exploiting the specific diagonal structure of (1.2); see Section 4.2. In addition, for a broad class of homogeneous filters, such results will be derived by a reduction to the well-established case of variational regularization [27]; see Section 4.1. Note that even in the case where we reduce our analysis to variational regularization, the non-linear filtered regularization technique is related to but different from variational regularization with separable constraints [17,5]. This is discussed in detail for the special case of soft thresholding in [14], where diagonal frame filtering has been contrasted with frame-analysis and frame-synthesis regularization.
Note that, just as we can express the filtered DFD as a variational regularization, we can also express it as a plug-and-play (PnP) regularization. While convergence results in the context of regularization methods for PnP have not been extensively studied, with the first in-depth study presented in [11], it is important to acknowledge that the assumptions for convergence of PnP are particularly stringent. A specific class of non-linear filters conforms to PnP regularization by satisfying the necessary conditions; as a result, the established stability and convergence results hold, and even stronger stability can be shown in these cases. Conversely, because the assumptions of our analysis of the filtered DFD are quite general, it covers various PnP methods that are not addressed by the theory presented in [11]. Consequently, as discussed in Section 5, this paper also contributes to the regularization theory of PnP.

Outline
In Section 2 we present preliminaries in the form of notation, auxiliary results, and some technical lemmas. In Section 3 we rigorously introduce non-linear filters and non-linear filtered diagonal frame decompositions. The main results of this paper are presented in Section 4, where we provide two approaches to proving convergence: by linking to existing theories of variational regularization for stationary filters, and by a direct proof in the general case. Further, in Section 5 we analyze connections between the filtered DFD and PnP regularization. In Section 6, we summarize our findings and offer some future research directions.

Preliminaries
Let X, Y be Hilbert spaces. If B : dom(B) ⊆ X → Y, then dom(B) denotes the domain and ran(B) = B(dom(B)) the range of B. If B is linear and bounded with dom(B) = X, we write B^+ : dom(B^+) ⊆ Y → X with dom(B^+) := ran(B) ⊕ ran(B)^⊥ for the Moore-Penrose inverse of B. Recall that Φ : X → X is nonexpansive if ∀x, y ∈ X : ∥Φ(x) − Φ(y)∥ ≤ ∥x − y∥.
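A quick finite-dimensional reminder (our own illustration, not from the paper): for a rank-deficient B, the Moore-Penrose inverse returns the minimum-norm least-squares solution, discarding the component of the data in ran(B)^⊥, in line with dom(B^+) = ran(B) ⊕ ran(B)^⊥.

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 0.0]])    # rank one, hence not invertible
B_pinv = np.linalg.pinv(B)    # Moore-Penrose inverse B^+

y = np.array([2.0, 5.0])      # second component lies in ran(B)^perp
x = B_pinv @ y
print(x)                      # -> [2. 0.]
```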

Functionals and proximity operators
Functionals on X will be written as R : X → R ∪ {∞}, and we usually use r or s to denote a functional when X = R. We define the domain of R by dom(R) := {x ∈ X | R(x) < ∞} and, for q ∈ R, the lower level set of R with bound q by L(R, q) := {x ∈ X | R(x) ≤ q}. The functional R is called lower semi-continuous if L(R, q) is closed for all q ∈ R. For convex functionals, weak as well as weak sequential lower semi-continuity are equivalent to strong lower semi-continuity. We call R proper if dom(R) ≠ ∅. We define Γ_0(X) as the set of all R : X → R ∪ {∞} that are proper, convex and lower semi-continuous. The subdifferential of R ∈ Γ_0(X) is the set-valued operator ∂R : X → 2^X defined by ∂R(x) := {u ∈ X | ∀y ∈ X : R(y) ≥ R(x) + ⟨u, y − x⟩}. The elements of ∂R(x) are called subgradients of R at x.
Remark 2.2 (Proximity operators on the real line). Let s ∈ Γ_0(R) with s(0) = 0. According to Lemma 2.1(e), prox_s is nonexpansive and increasing. Furthermore, as stated in Lemma 2.1(d), the domain of s is a closed interval that contains zero, and s is continuous on the interior of dom(s). For all x ∈ dom(s), the subgradients at x form a closed interval, and the set-valued function x ↦ ∂s(x) is increasing. This means that for x ≤ y, it holds for all u ∈ ∂s(x) and v ∈ ∂s(y) that u ≤ v. Hence, the same property holds for the inverse of prox_s. When we refer to the min or max of a subdifferential or of the inverse of a proximity operator, we are referring to the largest lower or smallest upper bound of the respective interval.
Applying [3, Equation (2.22)] yields a closed-form expression for s in terms of prox_s; however, it also indicates that this relation is not straightforward.
Figure 1 shows an example of the proximity operator ϕ = prox_s of a functional s ∈ Γ_0(R) with s(0) = 0, highlighting some of its properties. In particular, it shows how jumps of ∂s cause regions where ϕ is constant. Moreover, if s remains ∞, then the proximity operator is constant and thus not surjective. We also see that prox_s is monotonically increasing and nonexpansive.
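These properties can be checked numerically on a toy functional of our own choosing: for s the indicator of [−1, 1] (zero inside, +∞ outside), which lies in Γ_0(R) with s(0) = 0, the proximity operator is the projection onto [−1, 1], i.e. clipping. It is increasing, nonexpansive, and constant where s is infinite, hence not surjective, as stated above.

```python
import numpy as np

def prox_indicator(x, lo=-1.0, hi=1.0):
    # prox of the indicator function of [lo, hi] is the metric projection
    return np.clip(x, lo, hi)

xs = np.linspace(-3.0, 3.0, 601)
px = prox_indicator(xs)

assert np.all(np.diff(px) >= 0.0)                          # increasing
assert np.all(np.abs(np.diff(px)) <= np.diff(xs) + 1e-12)  # nonexpansive
print(px.min(), px.max())                                  # -> -1.0 1.0
```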

Technical Lemmas
In this section we provide some technical lemmas on proximity operators on R and their connection to the associated functionals, which we use for our analysis.
Proof. See Appendix A.1.

Lemma 2.3 establishes that, for a family of increasing and nonexpansive functions, the convergence of the functions to the identity function is equivalent to the convergence of their inverses to the identity function. This lemma holds practical relevance when applied to a family of proximity operators.

Lemma 2.4. Let s : R → R_+ be convex and lower semi-continuous with s(0) = 0, and let κ, α > 0. Suppose there exist b, c > 0 such that Then for all y ∈ R the following holds:
• If |y| > cκ, then s(y) ≥ bc |y/κ| − bc²/2.
Then for all x ∈ R the following hold.

Proof. See Appendix A.3.

Diagonal frame decomposition
Let A : X → Y be a bounded linear operator and Λ an at most countable index set.
The notion of a DFD reduces to the singular value decomposition (SVD) when (u_λ)_{λ∈Λ} and (v_λ)_{λ∈Λ} are orthonormal bases. In particular, DFDs exist in quite general settings. They can also exist when there is no SVD, for example when the spectrum of A is continuous. However, the main advantage of DFDs over SVDs is that the quasi-singular systems can provide better approximation properties than the singular systems from the SVD. An example of this is when (u_λ)_{λ∈Λ} can be taken as a wavelet basis, as in the case of the Radon transform [9,10,22].
Let (u, v, κ) be a DFD for A and ū a dual frame of u, defined by x = T_ū T_u^* x for all x ∈ X. Further, let T_ū and T_v^* denote the synthesis and analysis operators of ū and v, respectively. Using the DFD, the Moore-Penrose inverse of A can be written as

A^+ = T_ū ∘ M_κ^+ ∘ T_v^* ,    (2.2)

where M_κ is the component-wise multiplication operator M_κ((x_λ)_{λ∈Λ}) = (κ_λ x_λ)_{λ∈Λ} and M_κ^+ its Moore-Penrose inverse. Since the frame operators are continuous and invertible, diagonalizing A with a DFD essentially reduces the inverse problem (1.1) to an inverse problem with a diagonal forward operator from ℓ²(Λ) to ℓ²(Λ).
Due to the ill-posedness of inverting A, the values of (κ_λ)_{λ∈Λ} accumulate at zero [10, Thm. 7], which means that (1/κ_λ)_λ is unbounded. As a result, small errors in the data can be significantly amplified using (2.2). To reduce error amplification, we use regularizing filters aiming to damp noisy coefficients. Unlike the widely studied linear regularizing filters [7,10,19,22], we investigate non-linear filters that can depend on both the operator and the data in a non-linear manner.
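The amplification mechanism can be seen directly with made-up numbers: quasi-singular values accumulating at zero turn a data error of size δ into a coefficient error of size δ/κ_λ under exact diagonal inversion.

```python
import numpy as np

kappa = np.array([1.0, 0.5, 0.1, 0.01, 0.001])        # accumulating at zero
x_true = np.ones(5)
noise = 1e-3 * np.array([1.0, -1.0, 1.0, -1.0, 1.0])  # perturbation, delta ~ 1e-3

y_noisy = kappa * x_true + noise
x_exact = y_noisy / kappa        # unregularized diagonal inversion
print(np.abs(x_exact - x_true))  # last entry is off by 1.0 although delta = 1e-3
```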

Non-linear filtered DFD
The following two definitions are central for this work.

Definition 3.1 (Non-linear regularizing filter). We call a family (φ_α)_{α>0} of functions φ_α : R_+ × R → R a non-linear regularizing filter if for all α, κ > 0 the following holds.

The properties required in the definition of non-linear regularizing filters are quite natural. Recall that the field of imaging often exploits the fact that natural images or signals, unlike noise, are sparse in certain frames, which means that the majority of the coefficients are zero. Assuming the original coefficients are small and the noisy coefficient is zero, it makes sense to leave them at zero after filtering. Furthermore, without specific knowledge of the noise structure, it seems reasonable to preserve the order of the coefficients after filtering, which is the monotonicity property. In addition, the distance between denoised coefficients should not exceed the distance between the noisy coefficients, which is nonexpansiveness. The last property is a technical one used to show convergence of the non-linear filtered DFD, defined next, to an inverse of A.

Definition 3.2 (Non-linear filtered DFD). Let (φ_α)_{α>0} be a non-linear regularizing filter. The non-linear filtered DFD (B_α)_{α>0} with B_α : dom(B_α) → X is defined by

B_α := T_ū ∘ M_κ^+ ∘ Φ_α,κ ∘ T_v^* ,    (3.2)

where Φ_α,κ : ℓ²(Λ) → ℓ²(Λ) : (c_λ)_{λ∈Λ} ↦ (φ_α(κ_λ, c_λ))_{λ∈Λ}. According to (F2), Φ_α,κ is well-defined and nonexpansive.
Linear filtered DFDs have been analyzed and shown to be a regularization method in [10]. Here we extend the analysis to filters that are non-linear in the second component.
The goal of this paper is to show stability and convergence of (B_α)_{α>0}. Since T_ū and T_v^* are continuous, according to (2.2), (3.2) it is sufficient to analyze stability and convergence of (M_κ^+ ∘ Φ_α,κ)_{α>0}.
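This composition can be sketched in a few lines using the SVD as the orthonormal special case of a DFD (the matrix and numbers below are our own illustration): the filter Φ_α,κ is applied componentwise before the diagonal inversion M_κ^+.

```python
import numpy as np

def soft(kappa, c, alpha):
    # soft-thresholding filter phi_alpha(kappa, c)
    return np.sign(c) * np.maximum(np.abs(c) - alpha / kappa, 0.0)

def B_alpha(A, y, phi_alpha, alpha):
    U, kappa, Vt = np.linalg.svd(A, full_matrices=False)
    c = U.T @ y                            # analysis: coefficients of y
    filtered = phi_alpha(kappa, c, alpha)  # componentwise filter Phi
    safe = np.where(kappa > 0, kappa, 1.0)
    inv = np.where(kappa > 0, filtered / safe, 0.0)  # diagonal inversion M^+
    return Vt.T @ inv                      # synthesis (orthonormal, self-dual)

A = np.diag([1.0, 0.5, 0.05])
x_true = np.array([1.0, 1.0, 0.0])
y = A @ x_true + np.array([0.0, 0.0, 0.02])  # noise hits the smallest kappa

rec = B_alpha(A, y, soft, alpha=0.02)
print(rec)  # the ill-conditioned third coefficient is filtered to zero
```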

Filters as proximity operators
In the next lemma we demonstrate that non-linear regularizing filters are proximity operators of proper, convex, and lower semi-continuous functionals.
According to Lemma 2.1(d), M_κ^+ ∘ Φ_α,κ(z) is contained in dom(R_α) for any z. While it is known that x = (x_λ)_λ ∈ dom(R_α) if and only if (s_{α,λ}(κ_λ x_λ))_λ ∈ ℓ¹(Λ), this criterion is difficult to verify knowing only the proximity operators. The following lemma provides a more practical condition for verifying that an element belongs to the domain of R_α.

Lemma 3.6. Let (φ_α)_{α>0} be a non-linear regularizing filter and α > 0 be fixed. Then we have ran( ) and consequently, by Fermat's theorem, we conclude x ∈ dom(∂R_α).

Convergence Analysis
In this section we investigate stability and convergence of (M_κ^+ ∘ Φ_α,κ)_{α>0}. By the continuity of the frame operators T_ū and T_v^*, this implies stability and convergence of the non-linear filtered DFD rigorously defined in Definition 3.2. To achieve stability and convergence, we impose additional assumptions on the non-linear regularizing filter (φ_α)_{α>0}. First, we reduce the filtered DFD to the familiar concept of variational regularization, which helps to understand the analytical strategies employed. In a second step, we relax the extra assumptions and analyze (M_κ^+ ∘ Φ_α,κ)_{α>0} directly.

Stationary case: Application of variational regularization
Variational regularization uses minimizers of the generalized Tikhonov functional T_{α,y}(x) = ∥Ax − y∥²/2 + αR(x), where R is a regularizing functional and α > 0 the regularization parameter. It is well investigated in numerous works, such as [12,24,27]. In this paper we use the following convergence result.
• Existence: T α,y has at least one minimizer.
• Stability: For x_k ∈ argmin T_{α,y_k}, there exists a subsequence (x_{k(ℓ)})_{ℓ∈N} and some
• Weak convergence: Assume y ∈ ran(A) and that Ax = y has a solution in dom(R). Let

Proof. See [27, Theorems 3.22, 3.23, 3.26].

Lemma 3.4(d) shows that the filtered DFD can be expressed as an optimization problem with κ-regularizer R_α. In the stationary case where R_α = αR, the DFD reduces to a variational regularization. We will show below that this already covers a large class of non-linear filter-based methods. An example of this relation was given in [14], where it was shown that the soft-thresholding filter yields a regularization method.
Example 4.3. For fixed b, d > 0 consider the function

Then, for all κ > 0, φ_1(κ, ·) is monotonically increasing and nonexpansive, and defines a non-linear regularizing filter (φ_α)_{α>0} satisfying Assumption A. Note that φ_α is the proximity operator of a scaled Huber loss function.

The following lemma reduces the non-linear filtered DFD to variational regularization.
Due to Assumption A, the operator (Φ_α,κ^{-1} − id)/α is independent of α, and by Lemma 3.6 we have

From Lemmas 4.4 and 4.1 we obtain the following.
Additional results in variational regularization are known, indicating convergence in the norm topology and convergence rates under more stringent assumptions.However, the exploration of the connection between these assumptions and filter-based methods falls outside the scope of this paper.The detailed analysis of these additional results is reserved for future research.
Remark 4.6 (Strong convergence). In the context of Proposition 4.5, under the assumption that R is totally convex at c^+ as established in Proposition 3.32 of [27], the sequence (c_k) converges to c^+ in the norm topology.
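Example 4.3 identifies the stationary filter as the proximity operator of a scaled Huber loss. Since the example's exact parametrization is not reproduced above, the following sketch uses the standard Huber function H_d with our own normalization: its proximity operator shrinks small coefficients linearly (Tikhonov-like) and large ones by a constant offset (soft-thresholding-like).

```python
import numpy as np

# Proximity operator of alpha * H_d, where H_d(x) = x^2/2 for |x| <= d and
# H_d(x) = d|x| - d^2/2 otherwise. (The constants b, d of Example 4.3 are
# parametrized differently in the paper; this is only the standard case.)
def prox_huber(y, alpha, d):
    y = np.asarray(y, dtype=float)
    small = np.abs(y) <= d * (1.0 + alpha)
    return np.where(small,
                    y / (1.0 + alpha),            # quadratic region: scaling
                    y - alpha * d * np.sign(y))   # linear region: constant shift

y = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
# large values are shifted by alpha*d, small ones scaled by 1/(1+alpha)
print(prox_huber(y, alpha=1.0, d=1.0))
```

One checks directly that this map is increasing, nonexpansive, and fixes zero, as required of a non-linear regularizing filter.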

Direct analysis in the general case
Assumption (A1) imposes that R_α take the form αR, though such a constraint is not essential. In this section, we relax the additional conditions on (φ_α)_{α>0}. While using approaches similar to variational regularization, we focus on the diagonal structure of M_κ ∘ Φ_α,κ instead of relying on the linear dependence on α.
Next we show that Assumption B is weaker than Assumption A; hence the results in this section generalize the results obtained via variational regularization.
Then (φ_α)_{α>0} is a non-linear regularizing filter and satisfies Assumption B. Note that φ_1 is the same function as in Example 4.3, but for α ≠ 1 the filter functions differ. By Remark 4.2, the construction of a non-linear regularizing filter satisfying Assumption A is uniquely determined by φ_1. Thus, (φ_α)_{α>0} does not satisfy Assumption A. In particular, φ_α is the proximity operator of the Huber function (αb/κ²) · L(dακ/(κ² + αb), x), where we clearly see that the resulting regularizing functional is non-stationary. In Section 4.4, we compare this regularizing filter with the one presented in Example 4.3 and with the soft-thresholding filter in terms of numerical rates.
Further, by the pointwise monotonicity of φ_α as α → 0 and the fact that s_{α,λ}(0) = 0 for all α > 0, we get that s_{α,λ}(y) is monotonically decreasing as α → 0. Then the theorem of monotone convergence implies

Let (x_{n(ℓ)})_ℓ be a weakly convergent subsequence with weak limit x^+. By Lemma 3.4, we have

Thus ∥M_κ x_{n(ℓ)} − z_{n(ℓ)}∥ → 0, and since z_{n(ℓ)} → z and M_κ is linear and bounded, we conclude M_κ x_{n(ℓ)} ⇀ M_κ x^+. Therefore, z = M_κ x^+ and x^+ = M_κ^+ z. Because this holds for every weakly convergent subsequence, we conclude x_k ⇀ M_κ^+ z.

Main theorems
Utilizing the results obtained for M_κ^+ ∘ Φ_{α_k,κ} under Assumption B, we can deduce stability and weak convergence of the non-linear filter-based reconstruction method (B_α)_{α>0}. Notably, these results persist under the stricter Assumption A.

Proof. Write x^+ = A^+ y and

Application of the convergence results of Proposition 4.12 shows c_k ⇀ M_κ^+ M_κ T_u^* x^+ = T_u^* x^+. Thus for any z ∈ X we have ⟨z,

which concludes the proof.

Numerical rates
In this section, we present a brief example of numerical rates by comparing the filters of Examples 4.3 and 4.9 with the soft-thresholding filter. The underlying operator is the 2D Radon transform, and we utilize the wavelet-vaguelette decomposition, a particular DFD, with the Haar wavelet [9]. Figure 4 plots, on the left, the filter functions with κ_λ = 1/4 and α = 1/10 for the regularizing filters of Examples 4.3 and 4.9, as well as the soft-thresholding filter. On the right, the ℓ²-error of the reconstruction compared to the ground truth is plotted against various percentages of added noise. A linear parameter choice, α = Cδ for some constant C > 0, is employed.
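A toy stand-in for this experiment can be run in a few lines (the paper uses the Radon transform with a Haar wavelet-vaguelette DFD; we substitute a synthetic diagonal problem, so absolute numbers are not comparable). It tracks the error of the soft-thresholding filter under the linear rule α = Cδ as δ → 0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
kappa = 1.0 / np.arange(1, n + 1)   # decaying quasi-singular values
x_true = np.zeros(n)
x_true[:10] = 1.0                   # sparse ground truth

def soft(c, thresh):
    return np.sign(c) * np.maximum(np.abs(c) - thresh, 0.0)

errs = []
for delta in [0.1, 0.01, 0.001]:
    noise = rng.standard_normal(n)
    noise *= delta / np.linalg.norm(noise)   # ||z|| = delta exactly
    y = kappa * x_true + noise
    alpha = 2.0 * delta                      # linear rule alpha = C * delta
    x_rec = soft(y, alpha / kappa) / kappa   # filtered DFD reconstruction
    errs.append(np.linalg.norm(x_rec - x_true))

print(errs)  # the l2-error decreases as the noise level decreases
```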

Connection to PnP regularization
PnP regularization and variational regularization are closely linked. Moreover, PnP regularization operates with an implicitly provided regularization parameter, similar to our direct analysis. In this section, our goal is to clarify the connections between the PnP framework and the non-linear filtered DFD. We introduce assumptions on the regularizing filter that enable us to align our method with PnP regularization, similarly to the variational case, thereby achieving strong stability under slightly stronger conditions. Furthermore, we employ the strategies of the direct analysis to establish convergence results for a diagonalized version of PnP regularization under more relaxed assumptions than needed in previous works.
The concept of PnP regularization is to find a fixed point of the operator

T_{α,y}(x) := D_α(x − γ A^*(Ax − y)) ,

where (D_α)_{α>0} is a suitable family of regularization operators (or denoisers) and γ > 0 a step size. The PnP framework has been applied successfully in various fields, including image restoration [8,30] and inverse imaging problems [29,28].
Definition 5.1 (Family of denoisers). We call (D_α)_{α>0} an admissible family of denoisers if the following hold:

In [11], the following results for PnP as a regularization method, notably the first of their kind, have been derived.
• Convergence: Assume y = Ax for some x ∈ E and suppose ∥y_k − y∥ ≤ δ_k with δ_k → 0. Consider α_k → 0 such that (1 − Lip(D_{α_k}))/Lip(D_{α_k}) ≳ δ_k and take x_k = Fix(T_{α_k,y_k}). There exists a subsequence (x_{k(ℓ)})_{ℓ∈N} and a solution x^+ of Ax = y such that x_{k(ℓ)} ⇀ x^+. If the solution is unique, then x_k ⇀ x^+.

Reducing non-linear filtered DFD to PnP
Building upon Lemma 3.4, we apply Proposition 5.2 with D_α = prox_{γR_α} and A = M_κ. To ensure admissibility of the family of denoisers (D_α)_{α>0}, we impose assumptions on the non-linear regularizing filter, contingent on the quasi-singular values and, consequently, on the operator. One crucial condition for ensuring the admissibility of denoisers is contractiveness. Enforcing prox_{γR_α} to be contractive means that the regularizing functional R_α has to be strongly convex and, in turn, all s_{α,λ} have to be strongly convex.
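The role of the strong convexity can be sketched with our own simplifications: the contractive denoiser D(x) = x/(1 + μ) is the proximity operator of the strongly convex functional μ∥x∥²/2, and plugging it into the fixed-point iteration for a diagonal operator M_κ yields, at the fixed point, the Tikhonov-regularized solution.

```python
import numpy as np

kappa = np.array([1.0, 0.5, 0.1])
y = np.array([1.0, 0.4, 0.05])
mu, gamma = 0.01, 1.0

def T(x):
    grad = kappa * (kappa * x - y)           # M_kappa^*(M_kappa x - y)
    return (x - gamma * grad) / (1.0 + mu)   # contractive denoiser D

x = np.zeros(3)
for _ in range(2000):                        # Banach fixed-point iteration
    x = T(x)

# closed-form fixed point: componentwise Tikhonov solution
tikhonov = kappa * y / (kappa**2 + mu / gamma)
print(x, tikhonov)  # the iterate matches the Tikhonov solution
```

Since D is a contraction, the iteration converges geometrically to the unique fixed point, which is the strong-stability mechanism exploited in this section.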

PnP denoiser as a diagonal operator
Since we can express the filtered DFD in the form of plug-and-play regularization, and convergence is demonstrated under weak assumptions in the direct analysis of Section 4, this extends to a version of PnP that goes beyond the scope of [11].
After diagonalizing the operator A = T_v M_κ T_u^* with a frame decomposition, we apply PnP to regularize the discontinuous diagonal operator M_κ. This involves considering the fixed points of

T_{α,z}(x) := D_α(x − γ M_κ(M_κ x − z)) .

Conclusion
In this paper, we have analyzed non-linear diagonal frame filtering for the regularization of inverse problems. Compared to the previously analyzed linear filters, non-linear filters can better exploit the specific structure of the target signal and the noise. We introduced assumptions on the non-linear regularizing filters to prove stability and weak convergence in the context of regularization theory, and established connections to variational regularization and PnP regularization. Future research might focus on convergence in terms of the norm topology and the derivation of convergence rates.

Figure 1:
Figure 1: Example of a functional and its proximity operator on the real line. The dashed line always represents the identity function. The wavy line represents infinity.

Figure 2:
Figure 2: Illustration of the filter (φ_α)_{α>0} from Example 4.3 that is generated by a single filter function φ_1 and satisfies Assumption A.

Example 4.9.
Fix the constants b, d > 0 and consider the function

Figure 4:
Figure 4: On the left, filter functions for Examples 4.3 and 4.9 (with κ_λ = 1/4 and α = 1/10), as well as the soft-thresholding filter, are displayed. On the right, the plot illustrates the ℓ²-error of reconstructions under varying percentages of added noise. A linear parameter choice is applied.