Approximation of discontinuous inverse operators with neural networks

In this work we deal with parametric inverse problems, which consist in recovering a finite number of parameters describing the structure of an unknown object, from indirect measurements. State-of-the-art methods for approximating a regularizing inverse operator by using a dataset of input–output pairs of the forward model rely on deep learning techniques. In these approaches, a neural network (NN) is trained to predict the value of the sought parameters directly from the data. In this paper, we show that these methods provide suboptimal results when a regularizing inverse operator is discontinuous with respect to the Euclidean topology. Hence, we propose a two-step strategy for approximating it by means of a NN, which works under general topological conditions. First, we embed the parameters into a subspace of a low-dimensional Euclidean space; second, we use a NN to approximate a homeomorphism between the subspace and the image of the parameter space through the forward operator. The parameters are then retrieved by applying the inverse of the embedding to the network predictions. The results are shown for the problem of x-ray imaging of solar flares with data from the Spectrometer/Telescope for Imaging X-rays. In this case, the parameter space is homeomorphic to a Moebius strip. Our simulation studies show that the use of a NN for predicting the parameters directly from the data yields systematic errors due to the non-Euclidean topology of the parameter space. The proposed strategy overcomes the discontinuity issues and furnishes stable and accurate reconstructions.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Introduction
In inverse problems, neural networks (NNs) have been used in many different contexts: (i) to approximate a suitable penalty term in Tikhonov-like regularization approaches; (ii) to estimate the value of the regularization parameter; (iii) to replace time-consuming operations in unrolled schemes; (iv) to post-process coarse reconstructions; and (v) to directly approximate an inverse regularizing operator. We refer the reader to [4,14,18] and references therein for a recent overview.
In this paper we are interested in the latter strategy, which can bring important advantages with respect to the classical Tikhonov regularization. First, it is faster, as the solution is simply the evaluation of the trained NN on the assigned data, while the biggest computational effort is spent during the training phase. Second, deep approximation techniques are not limited to the treatment of linear inverse problems, but can be easily extended to the case of non-linear ones [2]. In this work, we focus on parametric inverse problems, where the solution is described by a finite number of parameters.
Although there is some literature that deals with either approximating the inverse operator [3,25,26] or estimating the parameters of interest [8,13] by means of NNs, little attention has been paid to the regularization properties of the inversion method. Specifically, when a regularizing inverse operator is not continuous with respect to the Euclidean topology, issues could arise when approximating it directly with a NN, which is, by definition, a continuous function between Euclidean spaces. For instance, let us consider a parametric inverse problem in which the goal is to estimate an angle θ ∈ [0, 2π) when the data lie in S^1. In this case, data close to (1, 0) would be mapped either close to 0 or close to 2π, thus producing a discontinuity, which cannot be (accurately) approximated by a NN. This simple example suggests that the naive approach based on using a NN to map the data directly into the parameters may lead to suboptimal results in practice.
We provide a general treatment for the solution of parametric inverse problems when a dataset of parameter-data pairs is available. The contribution of the paper is twofold. First, under mild conditions on the forward operator, we derive the correct topology of the parameter space needed for defining a continuous regularizing operator. Second, we describe a strategy for approximating the regularizing operator with NNs. This strategy relies on the knowledge of an embedding of the parameter space into a (low-dimensional) Euclidean one. Then, a NN is used for approximating a homeomorphism between the image of the parameter space through the forward operator and the embedded space. Finally, the parameters are retrieved by applying the inverse of the embedding to the NN predictions. The advantage of this approach is that the network is used for approximating a continuous function between Euclidean spaces, in this way avoiding the discontinuity issues that arise when a NN is used to map non-homeomorphic spaces.
We demonstrate the effectiveness of the proposed method on a parametric imaging problem from synthetic data of the Spectrometer/Telescope for Imaging X-rays (STIX) [12], the x-ray telescope of the Solar Orbiter mission. In particular, we show that, when a specific shape is used for parameterizing the solution of the inverse problem, the parameter space has a topology that makes it homeomorphic to a family of Moebius strips of parameter ε, of equation {ε(x, y, z) : ε ∈ [0, ε_max), (x, y, z) ∈ Moebius}. In this context, our proposed strategy clearly outperforms the naive approach, which does not take into account the topology of the parameter space.
The remainder of the paper is organized as follows. In section 2 we describe the mathematical formulation of the parametric regularization and we provide details about the topology that has to be considered on the parameter space for defining a regularizing operator. Section 3 is devoted to the treatment of discontinuity issues arising when the parameter space is not endowed with the Euclidean topology and the regularizing operator is approximated directly with a NN. Further, we describe the proposed strategy for overcoming these issues. In section 4 we present the image reconstruction problem for STIX and the parametric shape used for approximating the solution. Finally, the results of the numerical experiments are shown in section 5. Section 6 is devoted to conclusions.

Parametric regularization
In the context of this paper, the data space is R^M endowed with the Euclidean topology ε. The object space H is a function space equipped with the coarsest topology η that makes continuous the (possibly non-linear) operator A : H → R^M modeling the data formation and acquisition process. Our problem is then the ill-posed inverse problem of finding an object f ∈ H that satisfies
A(f) = g, (1)
where g is the experimental data corrupted by noise. We propose to seek a regularized solution in a parametric subspace of H. With this in mind, we assume that the exact solution f* belongs to a subset defined as the image of a function Φ : Θ → H, where Θ, called the parameter space, is a subset of R^P with non-empty interior. We denote by ε_Θ the subspace topology induced on Θ by the Euclidean topology of R^P and we assume that Φ is continuous on (Θ, ε_Θ). We can then recast the ill-posed inverse problem (1) as the one of finding θ ∈ Θ such that
(A ∘ Φ)(θ) = g. (2)
Hereafter, we will denote by M the noise-free data subset, i.e. M := (A ∘ Φ)(Θ), and we will equip M with the topology ε_M inherited as a subspace of R^M. We can define a continuous inverse of A ∘ Φ on M provided that the following two conditions hold: (a) A ∘ Φ is injective; (b) we consider on Θ the coarsest topology τ that makes A ∘ Φ continuous, i.e.
τ := {(A ∘ Φ)^{-1}(U) : U ∈ ε_M}. (3)
These conditions are crucial to define a regularizing operator. Indeed, the topology τ also makes the inverse of A ∘ Φ on M continuous, so that A ∘ Φ is a homeomorphism between (Θ, τ) and (M, ε_M) (lemma 2.1). According to the definition given in the introduction of [24], the map from the data space into the parameter space is regularizable by an operator R if and only if R represents a continuous extension of the map. In this case, R is referred to as a regularizing operator. Hence, any continuous extension of the left inverse of A ∘ Φ is a regularizing operator for the inverse problem (2). For any such extension, the composition Φ ∘ R is a regularizing operator for (1), and finding a regularized solution of (1) is then equivalent to determining R. Since A ∘ Φ is continuous on (Θ, ε_Θ), the topology τ needs to be coarser than ε_Θ. In particular, when τ is strictly coarser than ε_Θ, the inverse of A ∘ Φ from M to (Θ, ε_Θ) is not continuous. As a consequence, it cannot be extended to a continuous regularizing operator for (2). We show this fact in the following example.
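As a worked illustration, consider the angle-estimation setting sketched in the introduction. The specific choices below (Θ = [0, 2π), data on the unit circle, and the half-open interval U) are assumptions made here for illustration, and τ denotes the coarsest topology on Θ that makes A ∘ Φ continuous. Since A ∘ Φ is a bijection onto S^1, a subset of Θ belongs to τ exactly when its image is open in S^1, which may be checked on a single set:

```latex
\begin{align*}
&\Theta = [0, 2\pi), \qquad (A \circ \Phi)(\theta) = (\cos\theta, \sin\theta), \qquad \mathcal{M} = S^1 \subset \mathbb{R}^2, \\
&U = [0, a) \in \varepsilon_\Theta \quad \text{for some } 0 < a < 2\pi, \\
&(A \circ \Phi)(U) = \{(\cos\theta, \sin\theta) : 0 \le \theta < a\}
  \ \text{is not open in } (S^1, \varepsilon_{\mathcal{M}}), \\
&\quad \text{because every neighborhood of } (1, 0) \text{ in } S^1
  \text{ contains points with } \theta \text{ slightly below } 2\pi, \\
&\Rightarrow\ U \notin \tau, \quad \text{so } \tau \text{ is strictly coarser than } \varepsilon_\Theta .
\end{align*}
```

In this setting the inverse of A ∘ Φ jumps from values near 2π to values near 0 across the point (1, 0), which is the discontinuity with respect to ε_Θ discussed above.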
In general, we have the following result.

Lemma 2.2. If conditions (a) and (b) hold, and if τ is strictly coarser than ε_Θ, then a regularizing operator R is not continuous when Θ is equipped with the topology ε_Θ.
Proof. Let us assume by contradiction that R is continuous from R^M to (Θ, ε_Θ). As R|_M is a continuous inverse of A ∘ Φ, A ∘ Φ would be a homeomorphism between (Θ, ε_Θ) and (M, ε_M). Hence, thanks to lemma 2.1, (Θ, ε_Θ) would be homeomorphic to (Θ, τ) through the identity map, which is absurd because τ is strictly coarser than ε_Θ.
This result states that, when (Θ, ε_Θ) is not homeomorphic to M, we need to define a regularizing inverse operator that is discontinuous w.r.t. the Euclidean topology. We show in the next section how to construct regularization maps with the desired characteristics.

Regularization from examples
In this section, we show how to construct an approximation of the regularizing operator R by means of a dataset of examples, i.e. a set of pairs representing a noisy sampling of the graph of A ∘ Φ. Towards this aim, we make use of NNs [7,10], which are simply parametric functions N_W obtained by recursively composing a certain number of layers, as in (5), where L > 1 is the number of layers and W is the set of the network weights (the entries of the weight matrices and of the biases). A function defined as in (5) is then trained to perform a task, i.e. the weights are modified by means of an optimization procedure so that it approximates a given function. A straightforward application consists in training a NN to predict the parameter θ from the corresponding g by solving the training problem (6). By doing so, the trained network N_{W*} is an approximation of R that is continuous w.r.t. the Euclidean topology considered both in the domain and in the codomain. Indeed, any NN of the form (5) is implicitly defined from R^n to R^m as Euclidean spaces. However, if the topology τ on Θ is strictly coarser than ε_Θ, as we previously discussed, the NN should be discontinuous in order to provide a good approximation of R, leading to an evident contradiction.
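The idea of a network as a recursive composition of layers can be sketched in a few lines. The sketch below is only illustrative and assumes affine layers followed by a rectified linear unit, with a linear output layer; the tiny weights and the helper names (`relu`, `affine`, `forward`) are hypothetical and are not taken from the authors' code.

```python
def relu(v):
    """Rectified linear unit applied entry by entry."""
    return [max(0.0, x) for x in v]

def affine(weights, bias, v):
    """Affine map v -> W v + b, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def forward(layers, v):
    """Compose the layers in order; ReLU follows every layer except the last
    (the linear output layer is an assumption of this sketch)."""
    for index, (weights, bias) in enumerate(layers):
        v = affine(weights, bias, v)
        if index < len(layers) - 1:
            v = relu(v)
    return v

# Hypothetical network from R^2 to R^1 with L = 2 layers.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer with 2 neurons
    ([[1.0, 1.0]], [0.0]),                     # linear output layer
]
output = forward(layers, [0.3, -0.2])
```

Each building block is continuous, so any function assembled this way is continuous between Euclidean spaces, whatever the values of the weights.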
To show the issues that arise when applying this naive approximation of R with a NN, we consider again example 2.1. By using S = 30 000 samples drawn at random from the set Θ, i.e. θ_i for i = 1, …, S, and their corresponding values g_i = (cos θ_i, sin θ_i), we train a NN N approximating R by solving problem (6). In figure 2(a) we report the scatter plot of the angle θ predicted by N from the point g = (cos(θ), sin(θ)) ∈ M over a test set of 200 000 examples. The plot clearly shows that, in accordance with lemma 2.2, N approximates a discontinuity in (1, 0) in a continuous way, causing a systematic error in a neighborhood of (1, 0).
To overcome this drawback, we propose the following strategy (whose results on the data of example 2.1 are given in figure 2(b)). Denoting by E a subset of R^N and by ε_E the topology inherited as a subspace of R^N, we assume to have an analytical expression of an embedding
γ : (Θ, τ) → (E, ε_E) (7)
and of its inverse γ^{-1} over E. In the usual case, (Θ, τ) is a smooth manifold and the Whitney embedding theorem [1] guarantees the existence of the embedding and ensures that N ≤ 2P. Moreover, typically in applications, the parameters to be estimated are either linear (and hence discontinuity issues do not arise) or angular (as is the case in example 2.1). For this reason, we expect (Θ, τ) to be homeomorphic to canonical topological spaces (like a sphere, a torus, a Klein bottle, or a Cartesian product of them) for which analytical embeddings in R^N are well known. As γ is a homeomorphism, finding a continuous inverse of A ∘ Φ on M can be recast as the problem of approximating a homeomorphism ψ between M and E:
ψ := γ ∘ (A ∘ Φ)^{-1} : M → E. (8)
For this approximation task we can make use of a NN, as the topologies in the domain and codomain are induced by the Euclidean one. The network weights w* are then obtained by solving the corresponding training problem (9). Finally, an approximated regularizing operator for problem (2) can be defined as the composition
γ^{-1} ∘ N_{w*}. (10)
Indeed, we note that the operator R is defined in (4) as a continuous extension on R^M of a homeomorphism between M and (Θ, τ). If we assume that the trained NN N_{w*} is able to accurately approximate the homeomorphism ψ : M → E, then γ^{-1} ∘ N_{w*} is a good approximation of R, since γ^{-1} ∘ N_{w*} is defined and continuous on the whole space R^M and its restriction to M approximates γ^{-1} ∘ ψ, i.e. the inverse of A ∘ Φ. Hence, the only condition we need to verify is that N_{w*} can approximate the map ψ, which is defined and continuous between subsets of Euclidean spaces, and this is guaranteed by the universal approximation theorem. Figure 3 offers a schematic of the operators involved in the definition of R. The role of γ^{-1} is to map each point of E into a parameter value θ in a continuous way w.r.t. the topology τ and in a discontinuous way w.r.t. ε_Θ. Instead, N_{W*} is a continuous transformation between R^M and R^N, both equipped with the Euclidean topology. When N < M (and often in applications N ≪ M), N_{W*} performs a dimensionality reduction task. In any case, N_{W*} is defined and continuous on the entire space R^M. Therefore, when the inverse γ^{-1} can be continuously extended to a neighborhood containing E, the (approximated) regularizing operator R is a continuous extension of the left inverse of A ∘ Φ. It is worth noticing that our proposed strategy represents a generalization of the naive approach. Indeed, when the topology τ coincides with the Euclidean one, we can trivially choose γ as the identity function and problem (9) is the same as (6).
In the case of example 2.1, we define γ(θ) := (cos(θ), sin(θ)) and we train a NN by solving (9). The predicted values of θ are shown in figure 2(b). From the scatterplot, we can appreciate how the discontinuity issue is solved by our proposed method.
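The effect of the embedding in this example can be checked numerically. The following minimal sketch uses only the Python standard library; `identity_network` is a hypothetical stand-in for a trained network (in this toy setting the data already equal γ(θ), so the ideal map between M and E is the identity), and the inverse is written here with `math.atan2` as one natural choice rather than as the authors' implementation.

```python
import math

TWO_PI = 2.0 * math.pi

def gamma(theta):
    """Embedding of the angle theta into R^2: (cos(theta), sin(theta))."""
    return (math.cos(theta), math.sin(theta))

def gamma_inverse(point):
    """Angle in [0, 2*pi) of a nonzero point of R^2; the point need not lie on the circle."""
    return math.atan2(point[1], point[0]) % TWO_PI

def regularizer(network, g):
    """Composition of gamma_inverse with the network, applied to the data g."""
    return gamma_inverse(network(g))

def identity_network(g):
    # Hypothetical stand-in for a trained network (see the lead-in above).
    return g

# Two angles on either side of the cut at (1, 0).
theta_low, theta_high = 1e-3, TWO_PI - 1e-3
label_gap = abs(theta_high - theta_low)                        # raw targets: far apart
target_gap = math.dist(gamma(theta_low), gamma(theta_high))   # embedded targets: close

theta_pred = regularizer(identity_network, gamma(theta_high))
```

Nearby data correspond to raw angle targets that differ by almost 2π, while the embedded targets differ only slightly, so the network is asked to fit a continuous map and the jump is confined to `gamma_inverse`.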

Application to the STIX imaging problem
In this section we describe the parametric imaging problem for STIX [12], an instrument on board the Solar Orbiter satellite launched by the European Space Agency in February 2020. STIX is conceived for the study of solar flares, intense phenomena that arise on the surface of the Sun. During these events, a sudden release of energy stored in the magnetic field of the Sun accelerates electrons and causes the emission of x-ray photons by bremsstrahlung [6]. The goal of the inverse imaging problem from STIX data is to retrieve the image of the x-ray emission from the measurements of the photons incident on the telescope [15-17,20]. STIX exploits a bigrid imaging system that allows the sampling of the Fourier transform of the photon flux at 30 frequencies ξ_j = (u_j, v_j), j = 1, …, 30 [9,12] (see figure 4 for a representation). Therefore, the STIX imaging problem can be described by the equation
F ϕ = V, (11)
where ϕ(x, y) is the function representing the number of photons emitted per unit area from the location (x, y) on the surface of the Sun, V ∈ C^30 is the array containing the experimental values of the Fourier transform, called visibilities, and F is the Fourier transform computed at ξ_1, …, ξ_30. The adopted definition of the Fourier transform is typical of astronomical applications and differs from the usual one because of a plus sign. In the following, C^30 will be considered as R^60.
As the morphology of solar flares is quite simple, the images to reconstruct are usually composed of a few basic geometric shapes such as elliptical Gaussians or loops [5,21,22] (figure 4). Such shapes are bidimensional functions ϕ_θ(x, y) parameterized by an array θ containing, for instance, the coordinates of the center of the shape, the eccentricity, the rotation angle, etc.
Therefore, in the case of the parametric imaging problem for STIX, the parameterization is the function Φ that maps θ into ϕ_θ and problem (11) becomes the one of finding θ such that
F ϕ_θ = V. (13)
Since the Gaussian elliptical shape is a special case of the loop shape with curvature equal to zero, in the following we will consider ϕ_θ as a loop and we will provide a description of the topology τ of the parameter space in such a case. A loop shape is defined by the following parameters (see figure 5):
• the coordinates (x_c, y_c) of the center of the shape;
• the intensity F, also named total flux, that is the integral of ϕ_θ over R^2;
• the full width at half maximum (FWHM) σ, which represents the width of the level curve of the loop at 50% of the peak;
• the curvature c, which describes the bending of the loop;
• the eccentricity ε, which, when the curvature is 0, is related to the eccentricity of the elliptical level curve at half maximum of the shape;
• the rotation angle α.
In our case, then, θ := (x_c, y_c, F, σ, ε, α, c). In more detail, as shown in figure 5, a loop shape is given by a superimposition of circular Gaussian shapes with FWHM equal to σ and centers (x_j, y_j) located on a parabola of equation y = cx^2 rotated by an angle α. In the resulting expression of the loop, the weights satisfy Σ_j w_j = 1, each w_j > 0 decreases with increasing distance of (x_j, y_j) from (x_c, y_c), and the distance between (x_j, y_j) and (x_{j−1}, y_{j−1}) along the parabola is proportional to ε ≥ 0. We point out that when ε = 0, the loop shape becomes a circular Gaussian shape that is invariant with respect to rotations of angle α and bending with curvature c. In the following, we will set c = 0 and α = 0 when ε = 0. The parameter space Θ of this inverse problem is defined in terms of the intervals of definition I_X, I_Y, I_F and I_σ of x_c, y_c, F and σ, respectively. A loop shape with orientation angle 0 and curvature c coincides with a loop shape with orientation angle 180 and curvature −c, from which it follows that M is a Moebius strip in R^60. Therefore, since (Θ, τ) is homeomorphic to M, we have that τ makes Θ homeomorphic to a Moebius strip and that τ is strictly coarser than ε_Θ (see figure 6).
We now propose a rather general strategy for visualizing the Moebius strip M in the data space, which ideally would be suitable for gaining insights on any topology under examination. Indeed, one can randomly sample M by randomly sampling Θ and computing the data corresponding to each parameter. Then, it is possible to visualize M by performing a principal component analysis of the sampled data and plotting the first three components. Also, the use of a fourth dimension, represented by the color map associated with c, permits the separation of the visibilities that lie close to the central knot.
We now describe the embedding of (Θ, τ) in R^N and its inverse. If we consider the simple case of Θ = [0, 180) × [c_min, c_max], we obtain that the embedding is the parameterization γ of the Moebius strip given by
γ(α, c) := ((1 + c sin(α)) cos(2α), (1 + c sin(α)) sin(2α), c cos(α)). (16)
Its inverse is expressed in terms of arctan2, the function that retrieves the value of the angle in polar coordinates corresponding to a point (x, y). On the other hand, in the general case, we have that the embedding is γ_g defined by
γ_g(x_c, y_c, F, σ, ε, α, c) := (x_c, y_c, F, σ, ε, εγ(α, c)), (18)
whose inverse acts on pairs (s, t) with s ∈ R^5 and t ∈ R^3. We point out that in (18) the parameterization of the Moebius strip in the last three components is multiplied by ε to take into account that, when the eccentricity is equal to 0, the loop collapses into a Gaussian circular shape and, in that case, the orientation angle and the curvature are chosen equal to 0.
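The Moebius parameterization and the general embedding can be checked with a short script. The forward maps below follow (16) and (18) as printed, with α in degrees. The inverse functions are derived here under the assumption |c| < 1 and are only one possible way of writing them; they are not the authors' formulas, and the function names are hypothetical.

```python
import math

def gamma(alpha_deg, c):
    """Moebius parameterization (16); alpha_deg is in degrees, in [0, 180)."""
    a = math.radians(alpha_deg)
    radius = 1.0 + c * math.sin(a)
    return (radius * math.cos(2.0 * a), radius * math.sin(2.0 * a), c * math.cos(a))

def gamma_inverse(point):
    """A derived inverse of gamma on its image, assuming |c| < 1."""
    x, y, z = point
    # The polar angle of (x, y) equals 2*alpha modulo 360 degrees.
    alpha_deg = (math.degrees(math.atan2(y, x)) / 2.0) % 180.0
    a = math.radians(alpha_deg)
    # hypot(x, y) - 1 = c*sin(a) and z = c*cos(a), so the sum below equals c.
    c = (math.hypot(x, y) - 1.0) * math.sin(a) + z * math.cos(a)
    return alpha_deg, c

def gamma_g(xc, yc, flux, sigma, ecc, alpha_deg, c):
    """General embedding (18): last three components scaled by the eccentricity."""
    gx, gy, gz = gamma(alpha_deg, c)
    return (xc, yc, flux, sigma, ecc, ecc * gx, ecc * gy, ecc * gz)

def gamma_g_inverse(point):
    """A derived inverse of gamma_g; alpha = c = 0 by convention when ecc = 0."""
    xc, yc, flux, sigma, ecc = point[:5]
    if ecc <= 0.0:
        return (xc, yc, flux, sigma, ecc, 0.0, 0.0)
    alpha_deg, c = gamma_inverse(tuple(t / ecc for t in point[5:]))
    return (xc, yc, flux, sigma, ecc, alpha_deg, c)

# The identification (0, c) = (180, -c) becomes an equality in R^3.
seam_gap = math.dist(gamma(0.0, 0.04), gamma(180.0, -0.04))

theta = (1.0, -2.0, 1000.0, 8.0, 3.0, 130.0, 0.03)
theta_back = gamma_g_inverse(gamma_g(*theta))
```

The value `seam_gap` is zero up to rounding, which is the property that lets a continuous network output cross the seam without producing an artificial jump.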

Numerical experiments
We assess the performance of the proposed method when applied to the STIX imaging problem. First, we consider a scenario in which we fix all the parameters of the loop shape with the exception of α and c. We compare the performance of the proposed method with that of the naive approach based on training a NN to predict α and c from the visibilities. We show that the performance of the naive approach is suboptimal and that this misbehavior is due only to discontinuity issues. Second, we test our method on the more realistic problem of retrieving all the parameters of the loop from the corresponding visibility values.
The implemented NNs are multilayer perceptrons [7] with similar architecture: they take as input an array of 60 real values (the real and imaginary parts of the 30 visibilities) and they have hidden layers composed of 3000 neurons each. We choose the rectified linear unit [7] as activation function of the neurons. For implementing and training the networks we utilize the PyTorch library [19] and the Adam optimizer [11]. Our code is publicly available at https://github.com/paolomassa/Parametric-inverse-problem-topology.

Simple dataset scenario
We fix x_c = y_c = 0, F = 1000, σ = 8, ε = ε_max = 5 and we randomly generate a set of pairs {(α_i, c_i)}, i = 1, …, S, where S = 50 000, α_i ∈ [0, 180) and c_i ∈ [−0.05, 0.05]. For each sample θ_i = (α_i, c_i), we compute the corresponding array of visibilities V_i. Then we split the dataset into a training, a validation and a test set of 30 000, 10 000 and 10 000 samples, respectively. We note that, in this simple dataset scenario, we do not add noise to the visibility values. We consider two NNs, N_n and N_e, with four hidden layers each. The subscripts stand for naive and embedding, respectively. The networks N_n and N_e are trained on the same set of examples for 1000 and 100 epochs, respectively. Figure 8 shows the results obtained on the test set by the naive approach and by the proposed method. Specifically, for each array of visibilities V_i of the test set, we compute ((α_n)_i, (c_n)_i) := N_n(V_i) and ((α_e)_i, (c_e)_i) := γ^{-1}(N_e(V_i)). In the left panel of figure 8, we show the scatter plots of α_n and α_e as functions of the ground truth value α. In the right panel, instead, we report the scatter plots of c_n and c_e as functions of the ground truth value c. It is evident from these results that the naive approach performs suboptimally. Indeed, when the ground truth value of the orientation angle is close to 0 (or to 180), the predictions provided by N_n are affected by large errors. For the same examples, the value of the predicted curvature is also very different from the correct one. This is due to the fact that, as the topology τ on Θ is strictly coarser than ε_Θ, the prediction should be discontinuous w.r.t. the latter topology. However, since N_n intrinsically assumes Θ endowed with ε_Θ and it is continuous w.r.t. that topology, the network approximates the discontinuity in a continuous way. On the other hand, γ^{-1} ∘ N_e provides accurate estimations of both the orientation angle and the curvature.
There are just a few examples for which the predictions seem off-target, but it can be easily noted that, due to the identification (0, c) = (180, −c) the proposed approach still predicts a value of the orientation angle close to 0 instead of close to 180. Coherently, the predicted curvature value has only a different sign w.r.t. the ground truth one. Therefore, the corresponding ground truth and predicted loop shapes are approximately identical.
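When scoring such predictions, the identification can be built into the error itself. The helper below is a hypothetical sketch, not part of the authors' code: it assumes that the identification (0, c) = (180, −c) extends to (α, c) ~ (α ± 180, −c), and it compares the ground truth with whichever equivalent representative of the prediction has the closest orientation angle.

```python
def identified_error(alpha_true, c_true, alpha_pred, c_pred):
    """Return (angle error in degrees, curvature error) after choosing the
    representative of the prediction closest in angle to the ground truth."""
    candidates = [
        (alpha_pred, c_pred),
        (alpha_pred + 180.0, -c_pred),  # same loop shape, angle shifted up
        (alpha_pred - 180.0, -c_pred),  # same loop shape, angle shifted down
    ]
    alpha_best, c_best = min(candidates, key=lambda cand: abs(alpha_true - cand[0]))
    return abs(alpha_true - alpha_best), abs(c_true - c_best)

# Ground truth near 180 degrees, prediction near 0 with opposite curvature sign.
angle_error, curvature_error = identified_error(179.5, 0.03, 0.4, -0.031)
```

With the plain differences this pair would count as an error of about 179 degrees in angle and 0.061 in curvature, whereas the two loop shapes are nearly identical.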
The discontinuity issue arising when the orientation angle is close to 0 or 180 can be further appreciated with the following test. We fix x_c = y_c = 0, F = 1000, σ = 8, ε = 5, α = 0, choose c ∈ {−0.05, −0.025, 0, 0.025, 0.05}, and compute the corresponding visibilities. We then predict the orientation angle and the curvature with both the naive and the proposed approach and visualize the associated loop shapes. Figure 9 shows that N_n clearly fails to provide reliable reconstructions of the ground truth loop shapes, by mis-estimating the orientation angle and curvature. The proposed method, instead, does not suffer from the discontinuity issue and retrieves visually accurate loop shapes.
A couple of comments are necessary at the end of this subsection. First, the discontinuity issue shown in figures 8 and 9 does not depend on the network architecture. Indeed, every NN of the form (5) is not continuous when Θ is equipped with the topology τ. Although there might be network architectures that perform better than the multilayer perceptron, the discontinuity issue would always arise and our method would always represent a valid solution. Furthermore, we note that the definition of γ does not affect the performance of the NN. Indeed, although E is not uniquely defined, it is homeomorphic to M by definition. Hence, a NN is able to approximate the homeomorphism ψ : M → E independently of the actual definition of E and γ, since it is a continuous function between Euclidean spaces.
Second, as we have not added noise to the visibility values of the training, validation and test set, the reported results are not affected by overfitting [7,10] and the encountered misbehavior can be explained only in terms of the topological considerations we have made. Finally, while N_n has been trained for a number of epochs ten times larger than N_e, its performance remains consistently worse than that of γ^{-1} ∘ N_e. This is a further confirmation that the errors in the predictions of the naive approach are not due to implementation or training issues, but just to the topological nature of the problem.

Complete dataset scenario
We generate a set of S = 100 000 pairs {(V_i, θ_i)}, where θ_i is a randomly drawn array of parameters of a loop and V_i is the corresponding array of visibilities. Then, we split this dataset into a training, a validation and a test set of 60 000, 20 000 and 20 000 samples, respectively. In this scenario, the visibility values are perturbed with white Gaussian noise with zero mean and standard deviation equal to 2√F (for simulating realistic STIX data acquisitions [12]). We evaluate the performance of the proposed method by training a NN N whose weights are a solution of (9), where the embedding is γ_g defined in (18). The implemented NN has six hidden layers, and dropout [23] is applied before each layer to avoid overfitting. Training is stopped when the loss on the validation set is minimized. Figure 10 shows the results obtained by γ_g^{-1} ∘ N on the test set. In the left-most panel, we note that the parameters x_c, y_c and F are retrieved with good accuracy, the normalized absolute error being always lower than 10%. On the other hand, the FWHM σ and the eccentricity ε are reconstructed with larger uncertainty, as they present a wider error distribution. However, the 75th percentile is lower than 15% for both parameters.
The middle and the right panel of figure 10 show how the proposed regularization method deals with the discontinuity issues presented before. The reconstruction of the orientation angle as a function of the ground truth value is very close to the identity when the data correspond to elongated Gaussian shapes (orange and red dots in the middle panel scatter plot). Instead, for circular shapes, we note how the method reconstructs arbitrary values of the orientation angle (the blue dots in the middle panel scatter plot), as the shape is indeed invariant under rotations.

Concluding remarks
We presented a regularization method for approximating the solution of parametric inverse problems by leveraging a dataset of examples of input-output pairs of the forward operator. The regularization operator is conceived as the composition of a dimensionality-reduction homeomorphism (performed by means of a NN) and the inverse of a suitable embedding of the parameter space into a Euclidean space. Our results provide new insights on the use of NNs for the solution of inverse problems. Indeed, we proved that approximating a regularizing operator directly with a NN is suitable only when the operator is defined between subsets of R^n and R^m both endowed with the topology induced by the Euclidean one. In the more general case of locally Euclidean topological spaces, the proposed method represents a rigorous strategy to construct a continuous regularizing operator. Even when the parameter space is endowed with a topology that is strictly coarser than the Euclidean one, our method is able to solve the discontinuity issue that makes the naive approach fail, while keeping all the advantages in terms of computational efficiency of using a NN.
A limitation of the proposed method is that it strongly relies on knowledge of the analytical expression of γ. However, we believe that, in applications, γ should be easily defined, as the underlying topologies are typically well known. Indeed, it is reasonable to assume that the discontinuities that could arise in applications are due to angular parameters (as in our example), thus possibly leading to topological spaces that are canonical, e.g. a sphere, a torus, a Klein bottle, or a Cartesian product of them. In practice, we should always be able to derive the topology of the parameter space and the associated γ, and therefore to apply this methodology in a straightforward manner. At the same time, the choice of γ is not unique, since infinitely many embeddings between known topologies are possible. Whichever γ is chosen, a NN is always able to approximate the homeomorphism between the data space and the embedded one.
As far as the application to the STIX imaging problem is concerned, to the best of our knowledge this is the first time that NNs are used for its solution. Since the first data acquisition in June 2020, there has been a huge effort by the STIX team for correcting systematic errors in the data and the visibility calibration is now close to the end. Therefore, assessing the performances of the proposed method on real measurements, which is beyond the scope of this paper, will be material of future studies as well as the comparison with other algorithms already implemented for the solution of this inverse problem.
The ideas we proposed in this paper may apply to a much larger range of practical applications, and future work could be devoted to (i) addressing the problem of deriving the topology τ and the corresponding embedding γ in an automatic way (at least under specific assumptions); (ii) testing NNs with different architectures; (iii) testing loss functions weighting the parameters according to their relevance in describing the solution; (iv) providing uncertainty quantification on the retrieved parameters.
Finally, we believe that the change in topology may serve as a mathematical trick for approximating discontinuous functions with NNs. This will be material for future studies, generalizing our methodology.