Regularized solution of a nonlinear problem in electromagnetic sounding*

Gian Piero Deidda; Caterina Fenu; Giuseppe Rodriguez

doi:10.1088/0266-5611/30/12/125014

1. Introduction

Electromagnetic induction (EMI) technique has had widespread use in hydrological and hydrogeological characterizations [34, 46, 53], hazardous waste characterization studies [17, 38], precision-agriculture applications [7, 16, 61], archaeological surveys [33, 44, 57], geotechnical investigations [47] and unexploded ordnance (UXO) detection [29, 30]. The use of small measurement systems, with rapid response and easy integration into mobile platforms, has been the key factor in the success of EMI techniques for near-surface investigations in these fields, as they allow dense surveying and real-time conductivity mapping over large areas in a cost-effective manner.

EMI theory and foundations of measurement systems are described in the applied geophysics literature [39, 56, 60]. The basic instrument, usually called a ground conductivity meter (GCM), contains two small coils, a transmitter and a receiver, whose axes can be aligned either vertically or horizontally with respect to the ground surface. An alternating sinusoidal current in the transmitter produces a primary magnetic field H_P, which induces small eddy currents in the subsurface. These currents, in turn, produce a secondary magnetic field H_S, which is measured, together with the primary field, at the receiver. The ratio of the secondary to the primary magnetic fields, recorded as in-phase and quadrature components, is then used, along with the instrumental parameters (height above the ground, frequency, inter-coil spacing, and coil configuration), to estimate electrical properties (conductivity and magnetic susceptibility) of the subsurface.

Traditional GCMs, like the pioneering Geonics EM38 and EM31, have been designed as profiling instruments for apparent electrical conductivity (defined as the conductivity of a homogeneous half-space that produces the same response as measured above the real earth with the same device) mapping, mainly with subsequent qualitative interpretation. Nevertheless, they can be also used to perform sounding surveys to get quantitative estimates of depth variations in true electrical conductivity. For this purpose, different approaches have been considered. Assuming a linear dependence between the GCM response and the subsurface electrical conductivity, McNeill [39] presented a method to estimate conductivities for simple multilayered earth models, which is applicable for low induction numbers

$\begin{eqnarray}&&B=\frac{r}{\delta }=r\sqrt{\frac{{{\mu }_{0}}\omega \sigma }{2}}\ll 1,\end{eqnarray} \tag{ 1.1 }$

under the assumption of uniform electrical conductivity σ. Here r is the inter-coil distance while δ represents the skin depth (the depth at which the principal field H_P has been attenuated by a factor e⁻¹); ${{\mu }_{0}}=4\pi {{10}^{-7}}\;{\rm H}/{\rm m}$ is the magnetic permeability of free space; and $\omega =2\pi f$ , where f is the operating frequency of the device in Hz.

Adopting the same linear model of McNeill [39], Borchers et al [3] implemented a Tikhonov inverse procedure, with a finite difference approximation of the second derivative as a regularization matrix, to reconstruct conductivity profiles from measurements taken using a GCM at various heights above the ground. To account for high values of the induction number, Hendrickx et al [26] fitted the technique of Borchers et al [3] to the nonlinear model described in Ward and Hohmann [60]. Besides, Deidda et al [12] proposed a least squares inverse procedure, optimized by mean of a projected conjugate gradient algorithm, to estimate conductivity profiles under the linear model assumption. All these approaches were reliable and useful, as they provided quantitative estimates of depth variation in true electrical conductivity, but they were not very appealing for the practitioners, as long as the use of traditional GCMs prevailed. In fact, collecting the required multiple measurements at several heights above the ground involves time-consuming, laborious, and costly fieldwork.

To overcome these fieldwork troubles, in recent years a new generation of GCMs has been developed to allow for the fast collection of multiple depth responses. Some devices, such as DUALEM-421S (DUALEM, Inc.) and CMD Mini-Explorer (GF Instruments), are designed to record data at multiple coil spacing and orientations simultaneously, using a single frequency; other instruments such as GEM-2 (Geophex, Inc.) work using multiple frequencies simultaneously (usually from three to ten), although with a fixed transmitter-receiver geometry. With this technological improvement, the near-surface community is showing a growing renewed interest towards EMI imaging [5, 31, 38, 40, 41, 48, 54, 55], and is pressing for new efforts to develop reliable and faster inversion procedures in order to make real-time imaging a reality.

With the aim of giving a contribution to this joint effort, we propose here a regularized one-dimensional (1D) inversion procedure designed to swiftly manage multiple GCM depth responses. It is based on the coupling of the damped Gauss–Newton method with either the truncated singular value decomposition (TSVD) or the truncated generalized singular value decomposition (TGSVD), and it implements an explicit (exact) representation of the Jacobian to solve the nonlinear inverse problem. To illustrate its performance, we first describe the results obtained inverting synthetic data sets generated over three 1D layered models with very high conductivities. In particular, we analyze the influence of some experimental settings, such as number and type of measurements (vertical and horizontal coil configurations), highlighting the different behavior of TSVD and TGSVD implemented with two different regularization matrices. In addition, we investigate how to choose the optimal regularization parameter, both when the noise level in the data is known and when it is not. Besides measuring the execution time for all numerical experiments, in contrast with Schultz and Ruppel [55], we prove that the analytical computation of the Jacobian, combined with the Broyden update, makes the inversion algorithm more than ten times faster than approximating the Jacobian by finite differences. Finally, we present a real case study: using a field data set measured at a site where an independent electrical resistivity tomography (ERT) was also collected, we assess the reliability of the inverse procedure by comparing the inverted conductivity profile to the profile obtained by ERT.

A large number of papers concerns the properties of the Gauss–Newton method and its variants, see, e.g., [8, 18, 27, 32]. In [20] the Levenberg–Marquardt method is applied to an inverse problem in groundwater hydrology, using an adaptive strategy to choose the Lagrangian parameter combined with the discrepancy principle for stopping the iteration. In [18] the positivity constraint on the solution is treated by a procedure different from the one adopted in our algorithm, which introduces a further non-linearity in the problem to be solved.

The plan of the paper is as follows: in section 2 we describe the non-linear model which connects the real conductivity of the soil layers to the apparent conductivity, and in section 3.1 we compute the Jacobian matrix of the model. The inversion algorithm is introduced in section 3.2, while section 3.3 describes the adopted regularization procedure, and section 3.4 illustrates various approaches for the choice of the regularization parameter. Finally, sections 4 and 5 report the results of numerical experiments performed on synthetic and real data, respectively. Section 6 contains concluding remarks.

2. The nonlinear forward model

The nonlinear model described in [59, 60], and further analyzed and adapted to the case of a GCM in [26], is derived from Maxwellʼs equations, keeping in mind the cylindrical symmetry of the problem, due to the magnetic field sensed by the receiver coil being independent of the rotation of the instrument around the vertical axis. The input quantities are the distribution of the electrical conductivity and magnetic permeablity in the subsurface; the output is the apparent conductivity at height h. In the following, λ is a variable of integration which has no particular physical meaning. It can be interpreted as the ratio between a length and the skin depth δ; see (1.1).

Following [59, Chapter III], we assume that the soil has a layered structure with n layers, each of thickness d_k, $k=1,...,n$ , and consequently that the electromagnetic variables are piecewise constant; see figure 1. The thickness d_n of the bottom layer is assumed to be infinite. Let ${{\sigma }_{k}}$ and ${{\mu }_{k}}$ be the electrical conductivity and the magnetic permeability of the kth layer, respectively, and let ${{u}_{k}}(\lambda )=\sqrt{{{\lambda }^{2}}+{\rm i}{{\sigma }_{k}}{{\mu }_{k}}\omega }$ , where ${\rm i}=\sqrt{-1}$ is the imaginary unit. Then, the characteristic admittance of the kth layer is given by

$\begin{eqnarray}&&{{N}_{k}}(\lambda )=\frac{{{u}_{k}}(\lambda )}{{\rm i}{{\mu }_{k}}\omega },\qquad k=1,\ldots ,n.\end{eqnarray} \tag{ 2.1 }$

The surface admittance at the top of the kth layer is denoted by ${{Y}_{k}}(\lambda )$ and verifies the following recursion

$\begin{eqnarray}&&{{Y}_{k}}(\lambda )={{N}_{k}}(\lambda )\frac{{{Y}_{k+1}}(\lambda )+{{N}_{k}}(\lambda ){\rm tanh} ({{d}_{k}}{{u}_{k}}(\lambda ))}{{{N}_{k}}(\lambda )+{{Y}_{k+1}}(\lambda ){\rm tanh} ({{d}_{k}}{{u}_{k}}(\lambda ))},\quad k=n-1,\ldots ,1,\end{eqnarray} \tag{ 2.2 }$

which is initialized by setting ${{Y}_{n}}(\lambda )={{N}_{n}}(\lambda )$ at the lowest layer. Numerically, this is equivalent to starting the recursion at k = n with ${{Y}_{n+1}}(\lambda )=0$ .

**Figure 1.** Schematic representation of the subsoil and of the discretization used in the paper.
Download figure:
Standard image High-resolution image

Now let,

$\begin{eqnarray}&&{{R}_{0}}(\lambda )=\frac{{{N}_{0}}(\lambda )-{{Y}_{1}}(\lambda )}{{{N}_{0}}(\lambda )+{{Y}_{1}}(\lambda )},\end{eqnarray} \tag{ 2.3 }$

where ${{N}_{0}}(\lambda )=\lambda /({\rm i}{{\mu }_{0}}\omega )$ , and

$\begin{eqnarray}\begin{array}{rcl} {{T}_{0}}(h) & = & -{{\delta }^{3}}\int _{0}^{\infty }{{\lambda }^{2}}{{{\rm e}}^{-2\,h\lambda }}{{R}_{0}}(\lambda ){{J}_{0}}(r\lambda )\;{\rm d}\lambda , \\ {{T}_{2}}(h) & = & -{{\delta }^{2}}\int _{0}^{\infty }\lambda {{{\rm e}}^{-2\,h\lambda }}{{R}_{0}}(\lambda ){{J}_{1}}(r\lambda )\;{\rm d}\lambda , \\ \end{array}\end{eqnarray} \tag{ 2.4 }$

where ${{J}_{0}}(\lambda )$ and ${{J}_{1}}(\lambda )$ are Bessel functions of the first kind of order 0 and 1, respectively, and r is the inter-coil distance. We express the integrals (2.4) in the variable λ, instead of $g=\delta \lambda$ as in [59]. This has some impact on the numerical computation; see remark 2.1.

The apparent conductivity measured by a GCM can be expressed as

$\begin{eqnarray}&&\frac{4}{{{\mu }_{0}}\omega {{r}^{2}}}{\rm Im}\left( \frac{{{({{H}_{S}})}_{d}}}{{{({{H}_{P}})}_{d}}} \right),\end{eqnarray} \tag{ 2.5 }$

where ${{({{H}_{P}})}_{d}}$ and ${{({{H}_{S}})}_{d}}$ are the components along the dipole axis of the primary and secondary magnetic field, respectively, which can be represented by adapting the expressions given in [59, page 113] to the geometry of a GCM. Substituting in (2.5), we obtain the predicted values of the apparent conductivity measurement ${{m}^{V}}(h)$ (vertical orientation of coils) and ${{m}^{H}}(h)$ (horizontal orientation of coils) at height h above the ground

$\begin{eqnarray*}&&{{m}^{V}}(h)=\frac{4}{{{\mu }_{0}}\omega {{r}^{2}}}{\rm Im}({{B}^{3}}{{T}_{0}}(h)),\quad {{m}^{H}}(h)=\frac{4}{{{\mu }_{0}}\omega {{r}^{2}}}{\rm Im}({{B}^{2}}{{T}_{2}}(h)),\end{eqnarray*}$

where B is the induction number (1.1). Simplifying the formulae, we find

$\begin{eqnarray}\begin{array}{rcl} {{m}^{V}}(h) & = & \frac{4r}{{{\mu }_{0}}\omega }{{\mathcal{H}}_{0}}\left[ -\lambda {{e}^{-2\,h\lambda }}{\rm Im}({{R}_{0}}(\lambda )) \right](r), \\ {{m}^{H}}(h) & = & \frac{4}{{{\mu }_{0}}\omega }{{\mathcal{H}}_{1}}\left[ -{{e}^{-2\,h\lambda }}{\rm Im}({{R}_{0}}(\lambda )) \right](r). \\ \end{array}\end{eqnarray} \tag{ 2.6 }$

Here we denote by

$\begin{eqnarray}&&{{\mathcal{H}}_{\nu }}[{{f}_{h}}](r)=\int _{0}^{\infty }{{f}_{h}}(\lambda ){{J}_{\nu }}(r\lambda )\lambda {\rm d}\lambda \end{eqnarray} \tag{ 2.7 }$

the Hankel transform of order ν of the function ${{f}_{h}}(\lambda )$ , where the height h is a fixed parameter. In our numerical experiments we approximate ${{\mathcal{H}}_{\nu }}[{{f}_{h}}](r)$ by the quadrature formula described in [1], using the nodes and weights adopted in [26].

The model depends upon a number of parameters which influence the value of the apparent conductivity. In particular, it is affected by the instrument orientation (horizontal/vertical), its height h above the ground, the inter-coil distance r, and the angular frequency ω. In view of the technical features of the GCM at our disposal, we consider r and ω to be constant. This constraint could be easily removed.

Remark 2.1. The previous relations (2.6) show that the apparent conductivity predicted by the model does not depend explicitly on the skin depth δ and the induction number B. This has some relevance in numerical computation, as an estimate of the value of δ is not required. To our knowledge, this is the first time that this is noted.

3. Solution of the inverse problem

In our analysis, we let the magnetic permeability take the same value ${{\mu }_{0}}$ in the n layers. This assumption is approximately met if the ground does not contain ferromagnetic materials. Then, we can consider the apparent conductivity as a function of the value ${{\sigma }_{k}}$ of the conductivity in each layer and of the height h, and we write ${{m}^{V}}({\boldsymbol{ \sigma }} ,h)$ and ${{m}^{H}}({\boldsymbol{ \sigma }} ,h)$ , where ${\boldsymbol{ \sigma }} ={{({{\sigma }_{1}},\ldots ,{{\sigma }_{n}})}^{T}}$ , instead of ${{m}^{V}}(h)$ and ${{m}^{H}}(h).$

The problem of data inversion is very important in geophysics, when one is interested in depth localization of inhomogeneities of the soil. To this purpose, multiple measurements are needed to recover the distribution of conductivity with respect to depth. In order to obtain such measurements, we use the two admissible loop orientations and assume to record apparent conductivity at height h_i, i = 1,...,m, as depicted in figure 1. This generates $2\;m$ data values. Our algorithm could be easily adapted to the case when other parameters of the model are varied.

Now, let b^V_i and b^H_i be the data recorded by the GCM at height h_i in the vertical and horizontal orientation, respectively, and let us denote by ${{r}_{i}}({\boldsymbol{ \sigma }} )$ the error in the model prediction for the ith observation

$\begin{eqnarray}{{r}_{i}}({\boldsymbol{ \sigma }} )=\left\{ \begin{array}{ccccccccccccccc} b_{i}^{V}-{{m}^{V}}({\boldsymbol{ \sigma }} ,{{h}_{i}}), & i=1,...,m, \\ b_{i-m}^{H}-{{m}^{H}}({\boldsymbol{ \sigma }} ,{{h}_{i-m}}), & i=m+1,...,2\;m. \\ \end{array} \right.\end{eqnarray} \tag{ 3.1 }$

Setting ${{{\bf b}}^{V}}={{(b_{1}^{V},\ldots ,b_{m}^{V})}^{T}}$ , ${{{\bf m}}^{V}}({\boldsymbol{ \sigma }} )={{({{m}^{V}}({\boldsymbol{ \sigma }} ,{{h}_{1}}),\ldots ,{{m}^{V}}({\boldsymbol{ \sigma }} ,{{h}_{m}}))}^{T}}$ , and defining ${{{\bf b}}^{H}}$ and ${{{\bf m}}^{H}}({\boldsymbol{ \sigma }} )$ similarly, we can write the residual vector, the measured data, and the model predictions as

$\begin{eqnarray}&&{\bf r}({\boldsymbol{ \sigma }} )={\bf b}-{\bf m}({\boldsymbol{ \sigma }} ),\quad {\bf b}=\left[ \begin{array}{ccccccccccccccc} {{{\bf b}}^{V}} \\ {{{\bf b}}^{H}} \\ \end{array} \right],\quad {\bf m}({\boldsymbol{ \sigma }} )=\left[ \begin{array}{ccccccccccccccc} {{{\bf m}}^{V}}({\boldsymbol{ \sigma }} ,{\bf h}) \\ {{{\bf m}}^{H}}({\boldsymbol{ \sigma }} ,{\bf h}) \\ \end{array} \right].\end{eqnarray} \tag{ 3.2 }$

The problem of data inversion consists of computing the conductivity ${{\sigma }_{k}}$ of each layer (k = 1,...,n) which determines a given data set ${\bf b}\in {{\mathbb{R}}^{2\,m}}$ . As it is customary, we use a least squares approach by solving the nonlinear problem

$\begin{eqnarray}&&\mathop{{\rm min} }\limits_{{\boldsymbol{ \sigma }} \in {{\mathbb{R}}^{n}}}\;f({\boldsymbol{ \sigma }} ),\qquad f({\boldsymbol{ \sigma }} )=\frac{1}{2}\parallel {\bf r}({\boldsymbol{ \sigma }} )\;{{\parallel }^{2}}=\frac{1}{2}\mathop{\sum }\limits_{i=1}^{2\,m}r_{i}^{2}({\boldsymbol{ \sigma }} ),\end{eqnarray} \tag{ 3.3 }$

where $\parallel \cdot \parallel$ denotes the Euclidean norm and ${{r}_{i}}({\boldsymbol{ \sigma }} )$ is defined in (3.1).

To estimate the computational complexity needed to evaluate ${\bf r}({\boldsymbol{ \sigma }} ),$ we assume that the complex arithmetic operations are implemented according to the classical definitions, i.e., that 2 floating point operations (flops) are required for each complex sum, 6 for each product and 11 for each division. The count of other functions (exponential, square roots, etc.) is given separately. If n is the number of layers, $2\;m$ the number of data values, and q the nodes in the quadrature formula used to approximate (2.7), we obtain a complexity $O((45n+8\;m)q)$ flops plus $2nq$ evaluations of functions with a complex argument, and mq with a real argument.

3.1. Computing the Jacobian matrix

As we will see in the next section, being able to compute or to approximate the Jacobian matrix $J({\boldsymbol{ \sigma }} )$ of the vector function (3.2) is crucial for the implementation of an effective inversion algorithm and to have information about its speed of convergence and conditioning.

The approach used in [26] is to resort to a finite difference approximation

$\begin{eqnarray}&&\frac{\partial {{r}_{i}}({\boldsymbol{ \sigma }} )}{\partial {{\sigma }_{j}}}\simeq \frac{{{r}_{i}}({\boldsymbol{ \sigma }} +{{{\boldsymbol{ \delta }} }_{j}})-{{r}_{i}}({\boldsymbol{ \sigma }} )}{\delta },\quad i=1,\ldots ,2\;m,\ j=1,\ldots ,n,\end{eqnarray} \tag{ 3.4 }$

where ${{{\boldsymbol{ \delta }} }_{j}}=\delta \;{{{\bf e}}_{j}}={{(0,\ldots ,0,\delta ,0,\ldots ,0)}^{T}}$ and δ is a fixed constant.

In this section we give the explicit expression of the Jacobian matrix. We will show that the complexity of this computation is smaller than required by the finite difference approximation (3.4). The following lemma is one of the main contributions of this work. In its statement we omit, for clarity, the variable λ.

Lemma 3.1. The derivatives $Y_{kj}^{\prime }=\frac{\partial {{Y}_{k}}}{\partial {{\sigma }_{j}}}$ , $k,j=1,\ldots ,n$ , of the surface admittances (2.2) can be obtained starting from

$\begin{eqnarray}&&Y_{nn}^{\prime }=\frac{1}{2{{u}_{n}}},\qquad Y_{nj}^{\prime }=0,\qquad j=1,\ldots ,n-1,\end{eqnarray} \tag{ 3.5 }$

and proceeding recursively for $k=n-1,n-2,...,1$ by

$\begin{eqnarray}\begin{array}{rcl} Y_{kj}^{\prime } & = & N_{k}^{2}{{b}_{k}}Y_{k+1,j}^{\prime },\qquad j=n,n-1,\ldots ,k+1, \\ Y_{kk}^{\prime } & = & \frac{{{a}_{k}}}{2{{u}_{k}}}+\frac{{{b}_{k}}}{2}\left[ N_{k}^{2}{{d}_{k}}-{{Y}_{k+1}}\left( {{d}_{k}}{{Y}_{k+1}}+\frac{1}{{\rm i}{{\mu }_{k}}\omega } \right) \right], \\ Y_{kj}^{\prime } & = & 0,\qquad j=k-1,k-2,\ldots ,1, \\ \end{array}\end{eqnarray} \tag{ 3.6 }$

where

$\begin{eqnarray}&&{{a}_{k}}=\frac{{{Y}_{k+1}}+{{N}_{k}}{\rm tanh} ({{d}_{k}}{{u}_{k}})}{{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})},\quad {{b}_{k}}=\frac{1}{{{[{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})]}^{2}}{{{\rm cosh} }^{2}}({{d}_{k}}{{u}_{k}})}.\end{eqnarray} \tag{ 3.7 }$

Proof. From (2.1) we obtain

$\begin{eqnarray}&&\frac{\partial {{u}_{k}}}{\partial {{\sigma }_{j}}}=\frac{\partial }{\partial {{\sigma }_{j}}}\sqrt{{{\lambda }^{2}}+{\rm i}{{\sigma }_{k}}{{\mu }_{k}}\omega }=\frac{1}{2{{N}_{k}}}{{\delta }_{kj}},\quad \frac{\partial {{N}_{k}}}{\partial {{\sigma }_{j}}}=\frac{\partial }{\partial {{\sigma }_{j}}}\frac{{{u}_{k}}}{{\rm i}{{\mu }_{k}}\omega }=\frac{1}{2{{u}_{k}}}{{\delta }_{kj}},\end{eqnarray} \tag{ 3.8 }$

where ${{\delta }_{kj}}$ is the Kronecker delta, that is, 1 if k = j and 0 otherwise. The recursion initialization (3.5) follows from Y_n = N_n; see section 2. We have

$\begin{eqnarray*}\begin{array}{rcl} Y_{kj}^{\prime } & = & \frac{\partial {{N}_{k}}}{\partial {{\sigma }_{j}}}{{a}_{k}}+{{N}_{k}}\cdot \frac{\frac{\partial {{Y}_{k+1}}}{\partial {{\sigma }_{j}}}+\frac{\partial {{N}_{k}}}{\partial {{\sigma }_{j}}}{\rm tanh} ({{d}_{k}}{{u}_{k}})+{{N}_{k}}\frac{\partial {\rm tanh} ({{d}_{k}}{{u}_{k}})}{\partial {{\sigma }_{j}}}}{{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})} \\ {} & {} & -{{N}_{k}}{{a}_{k}}\cdot \frac{\frac{\partial {{N}_{k}}}{\partial {{\sigma }_{j}}}+\frac{\partial {{Y}_{k+1}}}{\partial {{\sigma }_{j}}}{\rm tanh} ({{d}_{k}}{{u}_{k}})+{{Y}_{k+1}}\frac{\partial {\rm tanh} ({{d}_{k}}{{u}_{k}})}{\partial {{\sigma }_{j}}}}{{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})}, \\ \end{array}\end{eqnarray*}$

with a_k defined as in (3.7). If $j\ne k$ , then $\frac{\partial {{N}_{k}}}{\partial {{\sigma }_{j}}}=\frac{\partial {{u}_{k}}}{\partial {{\sigma }_{j}}}=0$ and we obtain

$\begin{eqnarray*}&&Y_{kj}^{\prime }=N_{k}^{2}\frac{\frac{\partial {{Y}_{k+1}}}{\partial {{\sigma }_{j}}}\left( 1-{{{\rm tanh} }^{2}}({{d}_{k}}{{u}_{k}}) \right)}{{{[{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})]}^{2}}}=N_{k}^{2}{{b}_{k}}Y_{k+1,j}^{\prime }.\end{eqnarray*}$

The last formula, with b_k given by (3.7), avoids the cancellation in $1-{{{\rm tanh} }^{2}}({{d}_{k}}{{u}_{k}})$ .

If j = k, after some straightforward simplifications, we get

$\begin{eqnarray*}\begin{array}{rcl} Y_{kk}^{\prime } & = & \frac{\partial {{N}_{k}}}{\partial {{\sigma }_{k}}}{{a}_{k}}+\frac{{{N}_{k}}}{{{N}_{k}}+{{Y}_{k+1}}{\rm tanh} ({{d}_{k}}{{u}_{k}})}\left[ Y_{k+1,k}^{\prime }(1-{{a}_{k}}{\rm tanh} ({{d}_{k}}{{u}_{k}})) \right. \\ {} & {} & +\left. \frac{\partial {{N}_{k}}}{\partial {{\sigma }_{k}}}({\rm tanh} ({{d}_{k}}{{u}_{k}})-{{a}_{k}})+\frac{{{d}_{k}}}{2}\left( 1-{{a}_{k}}\frac{{{Y}_{k+1}}}{{{N}_{k}}} \right)(1-{{{\rm tanh} }^{2}}({{d}_{k}}{{u}_{k}})) \right]. \\ \end{array}\end{eqnarray*}$

This formula, using (3.7) and (3.8), leads to

$\begin{eqnarray*}&&Y_{kk}^{\prime }=\frac{{{a}_{k}}}{2{{u}_{k}}}+{{N}_{k}}{{b}_{k}}\left[ {{N}_{k}}\left( Y_{k+1,k}^{\prime }+\frac{{{d}_{k}}}{2} \right)-\frac{1}{2}{{Y}_{k+1}}\left( \frac{{{d}_{k}}}{{{N}_{k}}}{{Y}_{k+1}}+\frac{1}{{{u}_{k}}} \right) \right].\end{eqnarray*}$

The initialization (3.5) implies that $Y_{kj}^{\prime }=0$ for any $j\lt k$ . In particular, $Y_{k+1,k}^{\prime }=0$ , and since ${{N}_{k}}/{{u}_{k}}$ is constant, one obtains the expression of $Y_{kk}^{\prime }$ given in (3.6). This completes the proof. □

Remark 3.1. The quantity a_k in (3.7) appears in the right-hand side of (2.2), and its denominator is present also in the expression of b_k. It is therefore possible to implement jointly the recursions (2.2) and (3.6) in order to reduce the number of floating point operations required by the computation of the Jacobian. We also note that, since in the following theorem 3.2 we only need the partial derivatives of Y₁, we can overwrite the values of $Y_{k+1,j}^{\prime }$ with $Y_{kj}^{\prime }$ at each recursion step, so that only n storage locations are needed for each λ value, instead of n².

Theorem 3.2. The partial derivatives of the residual function (3.2) are given by

$\begin{eqnarray*}\frac{\partial {{r}_{i}}({\boldsymbol{ \sigma }} )}{\partial {{\sigma }_{j}}}=\left\{ \begin{array}{ccccccccccccccc} \frac{4r}{{{\mu }_{0}}\omega }{{\mathcal{H}}_{0}}\left[ \lambda {{{\rm e}}^{-2{{h}_{i}}\lambda }}{\rm Im}\left( \frac{\partial {{R}_{0}}(\lambda )}{\partial {{\sigma }_{j}}} \right) \right](r), & {} & i=1,\ldots ,m, \\ \frac{4}{{{\mu }_{0}}\omega }{{\mathcal{H}}_{1}}\left[ {{{\rm e}}^{-2{{h}_{i-m}}\lambda }}{\rm Im}\left( \frac{\partial {{R}_{0}}(\lambda )}{\partial {{\sigma }_{j}}} \right) \right](r), & {} & i=m+1,\ldots ,2\;m, \\ \end{array} \right.\end{eqnarray*}$

for j = 1,...,n. Here ${{\mathcal{H}}_{\nu }}$ ( $\nu =0,1$ ) denotes the Hankel transform (2.7), r is the inter-coil distance, $\frac{\partial {{R}_{0}}(\lambda )}{\partial {{\sigma }_{j}}}$ is the jth component of the gradient of the function (2.3)

$\begin{eqnarray*}&&\frac{\partial {{R}_{0}}(\lambda )}{\partial {{\sigma }_{j}}}=\frac{-2{\rm i}{{\mu }_{0}}\omega \lambda }{{{(\lambda +{\rm i}{{\mu }_{0}}\omega {{Y}_{1}}(\lambda ))}^{2}}}\cdot \frac{\partial {{Y}_{1}}}{\partial {{\sigma }_{j}}},\end{eqnarray*}$

and the partial derivatives $\frac{\partial {{Y}_{1}}}{\partial {{\sigma }_{j}}}$ are given by lemma 3.1.

Proof. The proof follows easily from lemma 3.1 and from equations (2.3), (2.6), and (3.1). □

Remark 3.3. The numerical implementation of the previous formulae needs care. It has already been noted in the proof of lemma 3.1 that equations (3.6)–(3.7) are written in order to avoid cancellations that may introduce huge errors in the computation. Moreover, to prevent overflow in the evaluation of the term

$\begin{eqnarray*}&&{{{\rm cosh} }^{2}}({{d}_{k}}{{u}_{k}}(\lambda ))={{{\rm cosh} }^{2}}({{d}_{k}}\sqrt{{{\lambda }^{2}}+{\rm i}{{\sigma }_{k}}{{\mu }_{k}}\omega })\end{eqnarray*}$

in the denominator of b_k, we fix a value ${{\lambda }_{{\rm max} }}$ and for $\operatorname{Re}({{d}_{k}}{{u}_{k}}(\lambda ))\gt {{\lambda }_{{\rm max} }}$ we let ${{b}_{k}}={{b}_{k}}(\lambda )=0$ . In our numerical experiments we adopt the value ${{\lambda }_{{\rm max} }}=300$ .

The complexity of the joint computation of ${\bf r}({\boldsymbol{ \sigma }} )$ and its Jacobian given in theorem 3.2 amounts to $O((3{{n}^{2}}+8mn)q)$ flops, $3nq$ complex functions, and mnq real functions. To approximate the Jacobian by finite differences, one has to evaluate $n+1$ times ${\bf r}({\boldsymbol{ \sigma }} )$ , corresponding to $O((45{{n}^{2}}+8mn)q)$ flops, 2n²q complex functions, and mnq real functions.

If the Jacobian is a square matrix, i.e., $n=2\;m$ , its computation is seven times faster than approximating it by finite differences. The situation improves when the dimension of the data set is smaller than the number of layers (this is the ideal situation, as will be shown in sections 4 and 5), e.g., the speedup factor is 9 for $n=4\;m$ . The complexity issue is of concern to end users because it is often desirable to process the field data in real time during the measurement campaign using a notebook computer.

In order to further reduce the computational cost, it is possible to resort to the Broyden update of the Jacobian [6], which can be interpreted as a generalization of the secant method. The procedure consists of updating an initial approximation of the Jacobian ${{J}_{0}}=J({{{\boldsymbol{ \sigma }} }_{0}})$ computed in the initial point σ₀. This is realized by the following rank-1 update

$\begin{eqnarray}&&{{J}_{k}}={{J}_{k-1}}+\frac{({{{\bf y}}_{k}}-{{J}_{k-1}}{{{\bf s}}_{k}}){\bf s}_{k}^{T}}{{\bf s}_{k}^{T}{{{\bf s}}_{k}}},\qquad k=1,2,\ldots ,\end{eqnarray} \tag{ 3.9 }$

where ${{{\bf s}}_{k}}={{{\boldsymbol{ \sigma }} }_{k}}-{{{\boldsymbol{ \sigma }} }_{k-1}}$ and ${{{\bf y}}_{k}}=r({{{\boldsymbol{ \sigma }} }_{k}})-r({{{\boldsymbol{ \sigma }} }_{k-1}})$ . This formula makes the linearization $r({{{\boldsymbol{ \sigma }} }_{k}})+{{J}_{k}}({\boldsymbol{ \sigma }} -{{{\boldsymbol{ \sigma }} }_{k}})$ exact in σ_k−1 and guarantees the least change in the Frobenius norm $\parallel {{J}_{k}}-{{J}_{k-1}}{{\parallel }_{F}}$ . As this method works well locally [13, Chapter 8], in the sense that the accuracy of the approximation degrades as the iteration index grows, we apply recursion (3.9) for $k=1,\ldots ,{{k}_{B}}-1$ , and we reinitialize the method with the exact Jacobian after k_B iterations. A single application of (3.9) takes $10mn+2(m+n)$ flops, to be added to the cost of the evaluation of ${\bf r}({\boldsymbol{ \sigma }} )$ . We will investigate the performance of this method in section 4.

3.2. Inversion algorithm

The classical approach for solving (3.3) is to find a stationary point of the gradient ${\bf f}^{\prime} ({\boldsymbol{ \sigma }} )$ of $f({\boldsymbol{ \sigma }} )$ by Newtonʼs method. In this case, the iterative step ${{{\bf s}}_{k}}$ is chosen by solving the n × n linear system

$\begin{eqnarray*}&&{\bf f}^{\prime\prime} ({{{\boldsymbol{ \sigma }} }_{k}}){{{\bf s}}_{k}}=-{\bf f}^{\prime} ({{{\boldsymbol{ \sigma }} }_{k}}),\end{eqnarray*}$

where ${\bf f}^{\prime\prime} ({{{\boldsymbol{ \sigma }} }_{k}})$ is the Hessian of $f({\boldsymbol{ \sigma }} )$ . While ${\bf f}^{\prime} ({\boldsymbol{ \sigma }} )$ can be obtained from theorem 3.2, the analytical expression of ${\bf f}^{\prime\prime} ({\boldsymbol{ \sigma }} )$ is not available; it could be computed by further differentiating the gradient, but we believe that this would imply a large computational cost. To overcome this difficulty we resort to the Gauss–Newton method, which minimizes at each step the norm of a linear approximation of the residual ${\bf r}({\boldsymbol{ \sigma }} +{\bf s})$ ; see (3.2).

Let ${\bf r}({\boldsymbol{ \sigma }} )$ be the Fréchet differentiable and σ_k denote the current approximation; then we can write

$\begin{eqnarray*}&&{\bf r}({{{\boldsymbol{ \sigma }} }_{k+1}})\simeq {\bf r}({{{\boldsymbol{ \sigma }} }_{k}})+J({{{\boldsymbol{ \sigma }} }_{k}}){{{\bf s}}_{k}},\end{eqnarray*}$

where ${{{\boldsymbol{ \sigma }} }_{k+1}}={{{\boldsymbol{ \sigma }} }_{k}}+{{{\bf s}}_{k}}$ and $J({\boldsymbol{ \sigma }} )$ is the $2\;m\times n$ Jacobian of ${\bf r}({\boldsymbol{ \sigma }} )$ , defined by

$\begin{eqnarray*}&&{{[J({\boldsymbol{ \sigma }} )]}_{ij}}=\frac{\partial {{r}_{i}}({\boldsymbol{ \sigma }} )}{\partial {{\sigma }_{j}}},\qquad i=1,\ldots ,2\;m,\ j=1,\ldots ,n.\end{eqnarray*}$

At each step k, ${{{\bf s}}_{k}}$ is the solution of the linear least squares problem

$\begin{eqnarray}&&\mathop{{\rm min} }\limits_{{\bf s}\in {{\mathbb{R}}^{n}}}\parallel {\bf r}({{{\boldsymbol{ \sigma }} }_{k}})+{{J}_{k}}{\bf s}\parallel ,\end{eqnarray} \tag{ 3.10 }$

where ${{J}_{k}}=J({{{\boldsymbol{ \sigma }} }_{k}})$ or some approximation; see, e.g., (3.4) and (3.9). From it we obtain the following iterative method

$\begin{eqnarray}&&{{{\boldsymbol{ \sigma }} }_{k+1}}={{{\boldsymbol{ \sigma }} }_{k}}+{{{\bf s}}_{k}}={{{\boldsymbol{ \sigma }} }_{k}}-J_{k}^{\dagger }\;{\bf r}({{{\boldsymbol{ \sigma }} }_{k}}),\end{eqnarray} \tag{ 3.11 }$

where $J_{k}^{\dagger }$ is the Moore–Penrose pseudoinverse of J_k [2]; if $2\;m\geqslant n$ and J_k has full rank, then $J_{k}^{\dagger }={{(J_{k}^{T}{{J}_{k}})}^{-1}}J_{k}^{T}$ .

When the residuals ${{r}_{i}}({{{\boldsymbol{ \sigma }} }_{k}})$ are small or mildly nonlinear at σ_k, the Gauss–Newton method is expected to behave similarly to Newtonʼs method [2, Chapter 9.2.2]. We remark that, while the physical problem is obviously consistent, this is not necessarily true in our case, where the conductivity $\sigma (z)$ is approximated by a piecewise constant function. Furthermore, in the presence of noise in the data the problem will certainly be inconsistent. At the same time, since we are focused on the nonlinear case, connected to the presence of strong conductors in the subsoil, we do not take into account the second possibility. We remark that in the case of a mildly nonlinear problem, a linear model is available [3, 39]. If the previous conditions are not satisfied, the Gauss–Newton method may not converge.

The damped Gauss–Newton method replaces the approximation (3.11) by

$\begin{eqnarray}&&{{{\boldsymbol{ \sigma }} }_{k+1}}={{{\boldsymbol{ \sigma }} }_{k}}+{{\alpha }_{k}}{{{\bf s}}_{k}},\end{eqnarray} \tag{ 3.12 }$

where ${{\alpha }_{k}}$ is a step length to be determined. To choose it, we use the Armijo–Goldstein principle [43], which selects ${{\alpha }_{k}}$ as the largest number in the sequence ${{2}^{-i}}$ , $i=0,1,...$ , for which the following inequality holds

$\begin{eqnarray*}&&\parallel {\bf r}({{{\boldsymbol{ \sigma }} }_{k}}){{\parallel }^{2}}-\parallel {\bf r}({{{\boldsymbol{ \sigma }} }_{k}}+{{\alpha }_{k}}{{{\bf s}}_{k}})\;{{\parallel }^{2}}\geqslant \frac{1}{2}{{\alpha }_{k}}\parallel {{J}_{k}}{{{\bf s}}_{k}}{{\parallel }^{2}}.\end{eqnarray*}$

This choice of ${{\alpha }_{k}}$ ensures convergence of the method, provided that σ_k is not a critical point [2, Chapter 9.2.1].

The damped method allows us to include an important physical constraint in the inversion algorithm, i.e., the positivity of the solution. In our implementation ${{\alpha }_{k}}$ is the largest step size which both satisfies the Armijo–Goldstein principle and ensures that all the solution components are positive.

3.3. Regularization methods

With the aim of investigating the conditioning of problem (3.3), we examined the numerical behaviour of the singular values of the Jacobian matrix $J=J({\boldsymbol{ \sigma }} )$ of the vector function ${\bf r}({\boldsymbol{ \sigma }} )$ . Let $J=U\Gamma {{V}^{T}}$ be the singular value decomposition (SVD) [2] of the Jacobian, where U and V are orthogonal matrices of size $2\;m$ and n, respectively, $\Gamma ={\rm diag}({{\gamma }_{1}},\ldots ,{{\gamma }_{p}},0,\ldots ,0)$ is the diagonal matrix of the singular values, and p is the rank of J. We recall that the condition number of J is given by ${{\gamma }_{1}}/{{\gamma }_{p}}$ .

Fixed $n=2\;m=20$ , we generate randomly 1000 vectors ${\boldsymbol{ \sigma }} \in {{\mathbb{R}}^{20}}$ , having components in $[0,100]$ . For each of them we evaluate the corresponding Jacobian $J({\boldsymbol{ \sigma }} )$ by the formulae given in theorem 3.2 and compute its SVD. The left graph in figure 2 shows the average of the singular values obtained by the previous procedure and, for each of them, its minimum and maximum value. It is clear that the deviation from the average is small, so that the condition number of the Jacobian matrix has the same order of magnitude in all tests. Consequently, the linearized problem is severely ill-conditioned independently of the value of σ, and we do not expect its condition number to change much during iteration.

**Figure 2.** SVD of the Jacobian matrix: left, average singular values and errors (n = 20); right, average singular values for $n=10,20,30,40$ .
Download figure:
Standard image High-resolution image

The graph on the right in figure 2 reports the average singular values when $n=2\;m=10,20,30,40$ . The figure shows that the condition number is about 10¹⁴ when n = 10 and increases with the dimension. The singular values appear to be exponentially decaying, but zero is not a singular value; that is, the problem is not strictly rank deficient. The decay rate of the computed singular values changes below machine precision ( $2.2\cdot {{10}^{16}}$ ), which is represented in the graph by a horizontal line. The exact singular values are likely to decay with a stronger rate, while the computed ones are probably significantly perturbated by error propagation. A problem of this kind is generally referred to as a discrete ill-posed problem [24].

A typical approach for the solution of ill-posed problems is Tikhonov regularization. It has been applied by various authors to the inversion of geophysical data; see, e.g., [3, 12, 26]. To apply Tikhonovʼs method to the nonlinear problem (3.3), one has to solve the minimization problem

$\begin{eqnarray}&&\mathop{{\rm min} }\limits_{{\boldsymbol{ \sigma }} \in {{\mathbb{R}}^{{\bf n}}}}\{\parallel {\bf r}({\boldsymbol{ \sigma }} )\;{{\parallel }^{2}}+{{\mu }^{2}}\parallel M{\boldsymbol{ \sigma }} {{\parallel }^{2}}\}\end{eqnarray} \tag{ 3.13 }$

for a fixed value of the parameter μ, where M is a regularization matrix, which is often chosen as the identity matrix, or a discrete approximation of the first or second derivatives. In general, choosing the regularization parameter requires the computation of the solution ${{{\boldsymbol{ \sigma }} }_{\mu }}$ of (3.13) for many values of μ. This can be done, for example, by the Gauss–Newton method, leading to a large computational effort.

To reduce the complexity we consider an alternative regularization technique based on a low-rank approximation of the Jacobian matrix. The best rank ℓ approximation ( $\ell \leqslant p$ ) to the Jacobian according to the Euclidean norm, i.e., the matrix ${{A}_{\ell }}$ which minimizes $\parallel J-A\parallel$ over all the matrices of rank ℓ, can be easily obtained by the SVD decomposition $J=U\Gamma {{V}^{T}}$ . This procedure allows us to replace the ill-conditioned Jacobian matrix with a well-conditioned, rank-deficient matrix ${{A}_{\ell }}$ . The corresponding solution to (3.10) is known as the truncated SVD (TSVD) solution [23] and it can be expressed as

$\begin{eqnarray}&&{{{\bf s}}^{(\ell )}}=-A_{\ell }^{\dagger }{\bf r}=-\mathop{\sum }\limits_{i=1}^{\ell }\frac{{\bf u}_{i}^{T}{\bf r}}{{{\gamma }_{i}}}{{{\bf v}}_{i}},\end{eqnarray} \tag{ 3.14 }$

where $\ell =1,\ldots ,p$ is the regularization parameter, ${{\gamma }_{i}}$ are the singular values, the singular vectors ${{{\bf u}}_{i}}$ and ${{{\bf v}}_{i}}$ are the orthogonal columns of U and V, respectively, and ${\bf r}={\bf r}({{{\boldsymbol{ \sigma }} }_{k}})$ .

To introduce a regularization matrix $M\in {{\mathbb{R}}^{t\times n}}$ ( $t\leqslant n$ ), problem (3.10) is usually replaced by

$\begin{eqnarray}&&\mathop{{\rm min} }\limits_{{\bf s}\in \mathcal{S}}\parallel M{\bf s}\parallel ,\qquad \mathcal{S}=\{{\bf s}\in {{\mathbb{R}}^{n}}\;:\;{{J}^{T}}J{\bf s}=-{{J}^{T}}{\bf r}\},\end{eqnarray} \tag{ 3.15 }$

under the assumption $\mathcal{N}(J)\cap \mathcal{N}(M)=\{0\}$ and $t\gt {\rm max} (0,n-2\;m)$ . The generalized singular value decomposition (GSVD) [45] of the matrix pair (J,M) is the factorization

$\begin{eqnarray*}&&J=U{{\Sigma }_{J}}{{Z}^{-1}},\qquad M=V{{\Sigma }_{M}}{{Z}^{-1}},\end{eqnarray*}$

where ${{\Sigma }_{J}}$ , ${{\Sigma }_{M}}$ are diagonal matrices, U, V are orthogonal matrices, and Z is nonsingular. The truncated GSVD (TGSVD) solution ${{{\bf s}}_{\ell }}$ to (3.15) is then defined as

$\begin{eqnarray}&&{{{\bf s}}^{(\ell )}}=-\sum \bar{\ }-\ell +1\bar{p}i=p\frac{{\bf u}_{2\,m-p+i}^{T}{\bf r}}{{{c}_{i}}}\;{{{\bf z}}_{n-p+i}}-\sum \bar{\ }+1pi=p({\bf u}_{2\,m-p+i}^{T}{\bf r}){{{\bf z}}_{n-p+i}},\end{eqnarray} \tag{ 3.16 }$

where $\ell =0,1,\ldots ,\bar{p}$ is the regularization parameter, $\bar{p}=t$ if $2\;m\geqslant n$ , and $\bar{p}=2\;m-n+t$ if $2\;m\lt n$ . Here, c_i ( $i=1,\ldots ,\bar{p}$ ) are the elements of ${{\Sigma }_{J}}$ different from 0 and 1.

Our approach for constructing a smooth solution to (3.3) consists of regularizing each step of the damped Gauss–Newton method (3.12) by either TSVD or TGSVD, depending on the choice of M. For a fixed value of the regularization parameter ℓ, we substitute ${\bf s}$ in (3.12) by ${{{\bf s}}^{(\ell )}}$ expressed by either (3.14) or (3.16). We let the resulting method

$\begin{eqnarray}&&{\boldsymbol{ \sigma }} _{k+1}^{(\ell )}={\boldsymbol{ \sigma }} _{k}^{(\ell )}+{{\alpha }_{k}}{\bf s}_{k}^{(\ell )}\end{eqnarray} \tag{ 3.17 }$

iterate until

$\begin{eqnarray*}&&\parallel {\boldsymbol{ \sigma }} _{k}^{(\ell )}-{\boldsymbol{ \sigma }} _{k-1}^{(\ell )}\parallel \lt \tau \parallel {\boldsymbol{ \sigma }} _{k}^{(\ell )}\parallel \quad {\rm or}\quad k\gt 100\quad {\rm or}\quad {{\alpha }_{k}}\lt {{10}^{-5}},\end{eqnarray*}$

for a given tolerance τ. The constraint on ${{\alpha }_{k}}$ is a failure condition, which indicates that the method does not converge to a positive solution. This typically happens when the solution blows up because of ill-conditioning. We denote the solution at convergence by ${{{\boldsymbol{ \sigma }} }^{(\ell )}}$ . We will discuss the choice of ℓ in the next subsection.

3.4. Choice of the regularization parameter

In the previous section we saw how to regularize the ill-conditioned problem (3.3) with the aid of T(G)SVD. The choice of the regularization parameter is crucial in order to obtain a good approximation ${{{\boldsymbol{ \sigma }} }^{(\ell )}}$ of σ.

In real-world applications, experimental data are always affected by noise. To model this situation, we assume that the data vector in the residual function (3.2), whose norm is minimized in problem (3.3), can be expressed as ${\bf b}={\bf \hat{b}}+{\bf e}$ , where ${\bf \hat{b}}$ contains the exact data and ${\bf e}$ is the noise vector. This vector is generally assumed to have normally distributed entries with mean zero and common variance. In real data sets the last condition is not necessarily met.

If an accurate estimate of the norm of the error ${\bf e}$ is known, the value of ℓ can be determined with the aid of the discrepancy principle [14, section 4.3]. It consists of determining the regularization parameter ℓ as the smallest index $\ell ={{\ell }_{{\rm discr}}}$ such that

$\begin{eqnarray}&&\parallel {\bf b}-{\bf m}({{{\boldsymbol{ \sigma }} }_{{{\ell }_{{\rm discr}}}}})\parallel \leqslant \kappa \parallel {\bf e}\parallel .\end{eqnarray} \tag{ 3.18 }$

Here $\kappa \gt 1$ is a user-supplied constant independent of $\parallel {\bf e}\parallel$ . In our experiments we set $\kappa =1.5$ , since it produced the best numerical results.

We are also interested in the situation when an accurate bound for $\parallel {\bf e}\parallel$ is not available and, therefore, the discrepancy principle cannot be applied. A large number of methods for determining a regularization parameter in such a case have been introduced for linear inverse problems [24]. They are known as heuristic because it is not possible to prove convergence results for them in the strict sense of the definition of a regularization method; see [14, Chapter 4].

It is not possible, in general, to apply all the methods developed for the linear case to a nonlinear problem. The L-curve criterion [22] can be extended quite naturally to the nonlinear case. Let us consider the curve obtained by joining the points

$\begin{eqnarray}&&\left\{ {\rm log} \parallel {\bf r}({{{\boldsymbol{ \sigma }} }^{(\ell )}})\parallel ,{\rm log} \parallel M{{{\boldsymbol{ \sigma }} }^{(\ell )}}\parallel \right\},\quad \ell =1,...,\bar{p},\end{eqnarray} \tag{ 3.19 }$

where ${\bf r}({{{\boldsymbol{ \sigma }} }^{(\ell )}})={\bf b}-{\bf m}({{{\boldsymbol{ \sigma }} }^{(\ell )}})$ is the residual error associated to the approximate solution ${{{\boldsymbol{ \sigma }} }^{(\ell )}}$ computed by the iterative method (3.17), using (3.16) as a regularization method. If (3.14) is used instead, it is sufficient to let M = I and replace $\bar{p}$ by p.

The curve (3.19) exhibits a typical L-shape in many discrete ill-posed problems. The L-curve criterion seeks to determine the regularization parameter by detecting the index ℓ of the point of the curve closer to the corner of the 'L'. This choice produces a solution for which both the norm and the residual are fairly small. There are several papers showing examples in which the L-curve method systematically fails; see, e.g., [19, 58]. Nevertheless, it has been shown by numerical experiments that it provides a good estimation of the optimal regularization parameter in many inverse problems of applicative interest [21, 50].

Various methods have been proposed to determine the corner of the L-curve. The L-corner method considers a sequence of pruned L-curves, obtained by removing an increasing number of points, and constructs a list of candidate vertices produced by two different selection algorithms. The corner is chosen from this list by a procedure which compares the norms and the residuals of the corresponding solutions [21]. It is currently implemented in [25].

We remark that a new approach, based on the comparison of regularized solutions computed by both TSVD and the Tikhonov method, has been recently proposed in [28]. Its application to our algorithm is not immediate, but we plan to investigate its performance in the future.

4. Numerical experiments with synthetic data

To illustrate the performance of the inversion method described in the previous sections, we present the results of a set of numerical experiments. Initially, we apply our method to synthetic data sets, generated starting from a chosen conductivity distribution and adding random noise to data. In the next section we will analyze a real data set.

Figure 3 displays the three functions f_r(z), $r=1,2,3$ , used in our experiments to model the distribution of conductivity, expressed in Siemens/meter, with respect to the depth z, measured in meters. The first one is differentiable $({{f}_{1}}(z)={{{\rm e}}^{-{{(z-1.2)}^{2}}}}),$ the second is piecewise linear, and the third is a step function. All model functions imply the presence of a strongly conductive material at a given depth. We assume that the measurements are taken with the GCM in both vertical and horizontal orientation, placed at height ${{h}_{i}}=(i-1)\bar{h}$ above the ground, i = 1,...,m, for a chosen height step $\bar{h}$ ; see (3.1). In our experiments $\bar{h}\geqslant 0.1$ meters.

In this section we simulate the use of a Geonics EM38, operating at frequency $14.6\;{\rm kHz}$ , with $1\;{\rm m}$ coil separation. For a chosen model function f_r and a fixed number of layers n, we let the layers' thickness assume the constant value ${{d}_{k}}=\bar{d}=2.5/(n-1)$ , $k=1,\ldots ,n-1$ , so that ${{z}_{j}}=(j-1)\bar{d}$ , j=1,..., n; see figure 1. The choice of $\bar{d}$ is motivated by the common assumption that this kind of GCM can give useful information about the conductivity of the ground up to a depth of 2–3 meters.

We assign to each layer the conductivity ${{\sigma }_{k}}={{f}_{r}}({{z}_{k}})$ . Then, we apply the nonlinear model defined in (3.2) to compute the exact data vector ${\bf \hat{b}}={\bf m}({\boldsymbol{ \sigma }} )$ .

To simulate experimental errors, we determine the perturbed data vector ${\bf b}$ by adding a noise vector to ${\bf \hat{b}}$ . Specifically, we let the vector ${\bf w}$ have normally distributed entries with zero mean and unitary variance, and compute

$\begin{eqnarray}&&{\bf b}={\bf \hat{b}}+\frac{\tau \parallel {\bf \hat{b}}\parallel }{\sqrt{2\;m}}{\bf w}.\end{eqnarray} \tag{ 4.1 }$

This implies that $\parallel {\bf b}-{\bf \hat{b}}\parallel \approx \tau \parallel {\bf \hat{b}}\parallel$ . In the computed examples we use the noise levels $\tau ={{10}^{-3}},{{10}^{-2}}$ . Based on our experience, the noise on experimental data may be larger than 10⁻¹, but it can be substantially reduced, e.g., by averaging a small number of repeated measurements.

For each data set, we solve the least squares problem (3.3) by the damped Gauss–Newton method (3.12). The damping parameter is determined by the Armijo–Goldstein principle, modified in order to ensure the positivity of the solution. Each step of the iterative method is regularized by either the TSVD approach (3.14) or TGSVD (3.16) for a given regularization matrix M. In our experiments we use both $M={{D}_{1}}$ and $M={{D}_{2}}$ , the discrete approximations of the first and second derivatives. These two choices for M pose a constraint on the magnitude of the slope and the curvature of the solution, respectively. To assess the accuracy of the computation we use the relative error

$\begin{eqnarray*}&&{{e}_{\ell }}=\frac{\parallel {\boldsymbol{ \sigma }} -{{{\boldsymbol{ \sigma }} }^{(\ell )}}\parallel }{\parallel {\boldsymbol{ \sigma }} \parallel },\end{eqnarray*}$

where σ denotes the exact solution of the problem and ${{{\boldsymbol{ \sigma }} }^{(\ell )}}$ its regularized solution with parameter ℓ, obtained by (3.17). The experiments were performed using Matlab 8.1 (R2013a) on an Intel Core i7/860 computer with 8Gb RAM, running Linux. The software developed is available from the authors upon request.

Our first experiment tries to determine the best experimental setting, that is, the optimal number of measurements and of underground layers to be considered. At the same time, we investigate the difference between the TSVD (3.14) and the TGSVD (3.16) approaches and the effect on the solution of the regularization matrix M. For each of the three test conductivity models, we discretize the soil by 20 or 40 layers, up to the depth of 2.5 meters. We solve the problem after generating synthetic measurements at 5, 10, and 20 equispaced heights up to 1.9 meters. This process is repeated for each regularization matrix. The (exact) Jacobian is computed as described in section 3.1. Table 1 reports the values of the relative error ${{e}_{{\rm opt}}}={{{\rm min} }_{\ell }}{{e}_{\ell }}$ , representing the best possible performance of the method. This value is the average over 20 realizations of the noise, with noise level $\tau ={{10}^{-3}},{{10}^{-2}}.$

Table 1. Optimal error ${{e}_{{\rm opt}}}$ for $m=5,10,20$ and n = 20, 40 for the TSVD solution (M = I) and the TGSVD solution with $M={{D}_{1}}$ and $M={{D}_{2}}$ . The Jacobian is computed as in section 3.1.

		M = I		$M={{D}_{1}}$		$M={{D}_{2}}$
example	m	n = 20	n = 40	n = 20	n = 40	n = 20	n = 40
	5	4.1e-01	3.8e-01	1.8e-01	1.7e-01	2.3e-01	2.9e-01
f₁	10	3.6e-01	3.7e-01	1.4e-01	1.3e-01	1.8e-01	1.6e-01
	20	3.5e-01	3.5e-01	1.5e-01	1.4e-01	1.2e-01	1.3e-01

	5	4.8e-01	4.6e-01	1.3e-01	1.4e-01	2.2e-01	2.6e-01
f₂	10	4.3e-01	4.0e-01	1.2e-01	9.5e-02	1.3e-01	1.9e-01
	20	3.9e-01	3.7e-01	1.1e-01	9.1e-02	1.5e-01	1.4e-01

	5	5.7e-01	5.6e-01	3.9e-01	3.9e-01	4.2e-01	4.1e-01
f₃	10	5.5e-01	5.4e-01	3.6e-01	3.4e-01	3.4e-01	3.2e-01
	20	5.6e-01	5.6e-01	3.5e-01	3.4e-01	2.9e-01	3.3e-01

It is clear that the TSVD approach (see the column labelled M = I in the table) is the least accurate. The TGSVD, with either $M={{D}_{1}}$ or $M={{D}_{2}}$ , gives the best results for the three test functions. The results in table 1 state that they are essentially equivalent, and do not clearly indicate which is the best. We will show in section 5 that the regularization matrix $M={{D}_{2}}$ appears to produce more accurate reconstructions starting from experimental data.

Regarding the size of the soil discretization, it seems convenient to use a large number of layers, that is, n = 40. This choice does not increase significantly the computation time. It is obviously desirable to have at our disposal a large number of measurements; however, the results obtained with m = 5 and m = 10 are not much worse than those computed with m = 20; five measurement heights are often sufficient to give a rough approximation of the depth localization of a conductive layer. This is an important remark, as it reduces the time needed for data acquisition.

Figure 4 gives an idea of the quality of the computed reconstructions for the model functions f₂ and f₃, with n = 40 and noise level $\tau ={{10}^{-3}}$ . The exact solution is compared to the approximations corresponding to $m=5,10,20$ . The previous comments about the influence of the number of measurements m are confirmed. It is also remarkable that the position of the maximum is very well localized.

In the previous experiments we assumed that all the $2\;m$ entries of vector ${\bf b}$ in (3.2) were available. In table 2 we compare these results to those obtained by using only half of them, i.e., those corresponding to either the vertical or horizontal orientation of the instrument. The rows labelled 'both' are extracted from table 1. The results are slightly worse when the number of data is halved, especially for the less regular model functions, while they are almost equivalent for the smooth function f₁. This observation contributes, like the previous one, to simplify and speed up field measurements.

Table 2. Optimal error ${{e}_{{\rm opt}}}$ for $m=5,10,20$ and n = 20, 40 for f₁ $(M={{D}_{2}})$ , f₂ $(M={{D}_{1}})$ , and f₃ $(M={{D}_{2}})$ . The results obtained from measurements collected with the instrument in both vertical and horizontal orientation are compared to those obtained with a single orientation.

		f₁, $M={{D}_{2}}$		f₂, $M={{D}_{1}}$		f₃, $M={{D}_{2}}$
orientation	m	n = 20	n = 40	n = 20	n = 40	n = 20	n = 40
	5	2.3e-01	2.9e-01	1.3e-01	1.4e-01	4.2e-01	4.1e-01
both	10	1.8e-01	1.6e-01	1.2e-01	9.5e-02	3.4e-01	3.2e-01
	20	1.2e-01	1.3e-01	1.1e-01	9.1e-02	2.9e-01	3.3e-01

	5	3.3e-01	2.9e-01	3.5e-01	3.1e-01	6.2e-01	6.6e-01
vertical	10	2.4e-01	1.7e-01	2.9e-01	2.6e-01	5.3e-01	5.0e-01
	20	1.3e-01	2.2e-01	2.4e-01	1.7e-01	4.0e-01	4.3e-01

	5	2.9e-01	2.7e-01	3.6e-01	3.5e-01	6.6e-01	8.5e-01
horizontal	10	2.4e-01	2.6e-01	1.9e-01	1.6e-01	6.3e-01	6.0e-01
	20	2.0e-01	2.1e-01	1.7e-01	1.8e-01	4.4e-01	4.7e-01

In section 3.1 we described the computation of the Jacobian matrix of (3.2) and compared it to the slower finite difference approximation (3.4) and to the Broyden update of the Jacobian (3.9). To investigate the execution time corresponding to each method, we let the method (3.17) perform 100 iterations, with $M={{D}_{2}}$ , for a fixed regularization parameter ( $\ell =4$ ). When the Jacobian is exactly computed, the execution time is $7.18\;{\rm s}$ , while the finite difference approximation requires $18.96\;{\rm s}$ . The speedup factor is 2.6, which is far less than the one theoretically expected. This is probably due to the implementation details and to the features of Matlab programming language. We performed the same experiment by applying the Broyden update (3.9) and recomputing the Jacobian every k_B iterations. For k_B = 5 the execution time is $2.00\;{\rm s}$ , while for k_B = 10 it is $1.32\;{\rm s}$ . Despite this strong speedup (a factor of 14 with respect to finite difference approximation), the accuracy is not substantially affected by this approach. Table 3 reports the relative error ${{e}_{{\rm opt}}}$ obtained by repeating the experiment of table 1 using the Broyden method with k_B = 10. We only report the values of ${{e}_{{\rm opt}}}$ for some of the examples. The loss of accuracy, if any, is minimal.

Table 3. Optimal error ${{e}_{{\rm opt}}}$ for $m=5,10,20$ and n = 20, 40 for f₁ $(M={{D}_{2}})$ , f₂ $(M={{D}_{1}})$ , and f₃ $(M={{D}_{2}})$ . The Jacobian is computed every 10 iterations and then updated by the Broyden method.

	f₁, $M={{D}_{2}}$		f₂, $M={{D}_{1}}$		f₃, $M={{D}_{2}}$
m	n = 20	n = 40	n = 20	n = 40	n = 20	n = 40
5	1.8e-01	2.3e-01	1.3e-01	1.2e-01	4.6e-01	5.0e-01
10	1.7e-01	1.5e-01	1.1e-01	1.0e-01	3.3e-01	3.9e-01
20	1.1e-01	1.3e-01	1.1e-01	9.0e-02	3.1e-01	3.3e-01

The maximum resolution that the inversion algorithm can achieve in imaging high-conductivity thin layers is another important issue we inspected in this work. This situation is typical, e.g., in UXO detection. To this end, we consider the test function f₃ and let the length ξ of the step vary; that is, we set ${{f}_{3,\xi }}(z)=1$ for $z\in [0.5,0.5+\xi ]$ and ${{f}_{3,\xi }}(z)=0.2$ otherwise. Each problem is solved for three regularization matrices and two noise levels, and each test is repeated 20 times for different noise realizations. The left graph of figure 5 reports the average errors for different values of ξ, while the right graph displays the corresponding standard deviations. The choice $M={{D}_{1}}$ appears to be the best for detecting a thin conductive layer. Indeed, not only the errors are better, but the smaller standard deviations ensure that the method is more reliable. Figure 6 shows the reconstructions of ${{f}_{3,\xi }}$ with three different step lengths, $\xi =1.0,0.5,0.2$ , $M={{D}_{1}}$ , and $\tau ={{10}^{-2}}$ . It is remarkable that the position of the maximum is well located by the algorithm even in the presence of a very thin step.

Remark 4.1. It has been shown in recent literature [4, 15] that if a linear inverse problem is solved in L^p spaces, with $p\lt 2$ , the reconstruction of discontinuous functions can greatly improve. This would be particularly helpful in the presence of a highly conductive thin layer, but applying such methods is rather involved even in the linear case.

**Figure 5.** Results for the reconstruction of test function ${{f}_{3,\xi }}$ with a variable step length ξ, which is reported on the horizontal axis. The left graph reports the average error ${{e}_{{\rm opt}}}$ , obtained with three regularization matrices $M=I,{{D}_{1}},{{D}_{2}}$ . Each test is repeated 20 times for each noise level $\tau ={{10}^{-3}},{{10}^{-2}}$ . The right graph reports the corresponding standard deviations.
Download figure:
Standard image High-resolution image

**Figure 5.** Results for the reconstruction of test function ${{f}_{3,\xi }}$ with a variable step length ξ, which is reported on the horizontal axis. The left graph reports the average error ${{e}_{{\rm opt}}}$ , obtained with three regularization matrices $M=I,{{D}_{1}},{{D}_{2}}$ . Each test is repeated 20 times for each noise level $\tau ={{10}^{-3}},{{10}^{-2}}$ . The right graph reports the corresponding standard deviations.
Download figure:
Standard image High-resolution image

**Figure 6.** Optimal reconstructions for the test function ${{f}_{3,\xi }}$ , with step lengths $\xi =1.0$ (left), 0.5 (center), and 0.2 (right), obtained with $M={{D}_{1}}$ and noise level $\tau ={{10}^{-2}}$ .
Download figure:
Standard image High-resolution image

In the previous experiments, the regularization parameter ℓ has been chosen optimally, that is, in order to produce the smallest deviation from the exact solution. Obviously, in real-world applications this is not possible, so it is essential to determine the parameter effectively. When an accurate estimate of the noise level is known, this can be done by the discrepancy principle (3.18). In our experiments, we set $\kappa =1.5$ and substitute $\parallel {\bf e}\parallel$ by $\tau \parallel {\bf b}\parallel$ , where τ is the noise level and ${\bf b}$ is the noisy data vector; see (4.1).

When the noise level is unknown, the regularization parameter may be estimated by a heuristic method. We compared the L-corner method, described in section 3.4, to the restricted Regińska method (ResReg) [49, 50], the residual L-curve [50, 51], and the hybrid quasi-optimality criterion [42, 50]. These methods were designed for linear inverse problems, but they can also be applied to nonlinear problems, as they only require the knowledge of the residual corresponding to each regularized solution. The L-corner method proved to be the most robust, so in the rest of the section we will only refer to it.

Our numerical experiments showed that the discrepancy and the L-corner methods furnish very good estimates for the parameter when M = I, while they are less reliable when $M={{D}_{1}}$ or D₂, that is, for the choice of the regularization matrix which produced the best results in our experiments. This fact is known for the L-curve; see, e.g., [52]. In fact, a good choice for M is a matrix whose kernel (approximately) contains the solution, and this makes the L-curve lose its typical 'L' shape. We remark that we cannot apply the discrepancy principle to real data sets for which an estimate of the noise is not available. Moreover, according to our experience, the noise on real EMI data is not necessarily equally distributed. This will be commented on in section 5.

Figure 7 shows a reconstruction of test function f₁ obtained with m = 10, n = 40, noise level $\tau ={{10}^{-2}}$ , and regularization matrix $M={{D}_{2}}$ . The graph on the left displays the L-curve corresponding to this example; the graph on the right compares the approximations produced by the discrepancy criterion and the L-corner method to the exact solution. In this case the optimal parameter is $\ell =2$ . The discrepancy fails, as it gives the estimate $\ell =1$ , while the L-corner returns $\ell =2$ . This test function is approximately contained in the kernel of D₂, as $\parallel {{D}_{2}}{\boldsymbol{ \sigma }} \parallel \simeq 3\cdot {{10}^{-2}}$ while $\parallel {\boldsymbol{ \sigma }} \parallel \simeq 4$ , and the L-curve appears almost shapeless. In any case, the L-corner method implements a particular strategy to deal with such cases and produce a good reconstruction.

Figure 8 reports the same graphs for test function f₂, with m = 10, n = 40, $\tau ={{10}^{-3}}$ , and $M={{D}_{1}}$ . The optimal parameter is $\ell =4$ . The L-corner method gives $\ell =4$ and discrepancy returns $\ell =2$ . In this case both methods succeed in identifying accurately the depth at which the conductivity is maximal.

5. Numerical experiments with field data

We tested the nonlinear inversion technique described in the previous sections on field data collected at the Cagliari Airport (Sardinia, Italy) in an area where previous geophysical investigations, conducted for UXO detection, established the presence of layered materials with very high electrical conductivity, suitable to be investigated with vertical EMI soundings. The reliability of the inverted conductivity profile was assessed by comparison with conductivities obtained by ERT [11, 37].

The ERT profile was performed using 48 electrodes set up with an inter-electrode spacing of $0.5\;{\rm m}$ deployed in a Wenner–Schlumberger array, which we chose to reach a compromise between a reasonable vertical resolution and a good signal-to-noise ratio [10]. ERT data were collected using an IRIS Syscal Pro Switch 48 resistivity meter, which was set for six-cycle stacking (repetition of measurements), with the requirement of reaching a quality factor (standard deviation) of less than 5%. Data were then inverted using the commercial program Res2Dinv [35, 36]. The software employs a smoothness-constrained least-squares optimization method [9, 36] to minimize the difference between measured and modelled data, which it calculates using either a finite difference or a finite element approximation. The program divides the subsurface into rectangular cells whose corners, along the line, follow the positions of the electrodes in the subsurface [36]. The quality of the fit between measured and modelled data is expressed in terms of the root mean square (RMS) error. Figure 9 shows the ERT result we obtained with an RMS error of 2%, displayed in conductivity units to facilitate direct comparison with the electromagnetic data.

**Figure 9.** ERT results: in the left graph we display the conductivity section; the right graph reports the conductivity profile versus the depth at the position where the electromagnetic data were collected, marked by a dashed line in the first graph.
Download figure:
Standard image High-resolution image

As expected, the section shows a subsurface model where conductivity changes almost exclusively in the vertical direction. At the near surface, electrical conductivity starts with low values ( $\lt 200\;{\rm mS}\;{{{\rm m}}^{-1}}$ ) and keeps them down to $1\;{\rm m}$ depth; then, it abruptly increases, reaching maximum values (up to $1800\;{\rm mS}\;{{{\rm m}}^{-1}}$ ) at the depth of about $1.7\;{\rm m}$ ; finally, it lowers below $800\;{\rm mS}\;{{{\rm m}}^{-1}}$ in the deepest portion of the investigated section. The graph on the right of figure 9 shows the conductivity profile at the location where we carried out the electromagnetic sounding. As the exact solution is not available in this real case study, this profile was used as a benchmark to comparatively assess the reliability of our regularized nonlinear inversion procedure.

Electromagnetic data were measured with the CMD-1 conductivity meter (GF Instruments), a frequency-domain electromagnetic device with a constant operating frequency of $10\;{\rm kHz}$ and $0.98\;{\rm m}$ coil separation. After completing the usual calibration procedure, the electromagnetic vertical sounding was obtained by making measurements in vertical and horizontal coil-mode configurations, lifting the instrument above the ground at heights from 0 to $1.9\;{\rm m}$ , with a $0.1\;{\rm m}$ step, by means of a specially built wooden frame; see picture on the right in figure 10. For both coil orientations and each instrument height, we recorded 20 readings to get the mean value and the standard deviation of each measurement. Figure 10 (left) displays the resulting electromagnetic data versus height curves.

**Figure 10.** Left: mean apparent conductivities measured in vertical (circles) and horizontal (triangles) modes at different heights above the ground; error bars are standard deviations, which are multiplied by 10 for display purposes. Right: the wooden frame used to put the instrument at different heights above the ground. In the picture, the GCM is placed at height $0.5\;{\rm m}$ .
Download figure:
Standard image High-resolution image

The standard deviations on the data (figure 10, left) are rather different from one another. This suggests that the noise is not equally distributed and rules out the use of the discrepancy principle to estimate the regularization parameter, as well as other statistical methods (see, e.g., the generalized cross-validation [24]) for which this assumption is essential. For this reason, in our experiments we only use heuristic parameter selection techniques.

We apply the damped Gauss–Newton method (3.12) to the least squares problem (3.3), where the vector ${\bf b}$ (see (3.2)) contains the field data reported in the left graph of figure 10. We use 20 measurements in vertical and horizontal orientation (m = 20), discretizing the soil by 40 layers (n = 40) up to the depth of $2.5\;{\rm m}$ . The Jacobian is exactly computed, as described in section 3.1, and the damping parameter is determined by the Armijo–Goldstein principle.

We initially set M = I and regularize the solution by TSVD; figure 11 shows the solutions ${{{\boldsymbol{ \sigma }} }_{\ell }}$ , $\ell =2,\ldots ,7$ . In each graph, the dashed line represents the conductivity profile produced by ERT. To assess the performance of the L-corner method, we compare it to the ResReg, residual L-curve, and hybrid quasi-optimality criterions, mentioned in section 4. The values of the regularization parameters are reported in the first row of table 4, together with the coordinates (depth and value) of the maximum of the corresponding regularized solution. These are to be compared with the values (1.68,1.74) which locate the maximum of the conductivity profile predicted by ERT; see figure 9, right. The results of table 4 and figure 11 show that the position of the maximum is well localized, starting from $\ell =5$ , but that the shape of the solution is never accurately determined.

Table 4. Performance of the methods for the estimation of the regularization parameters mentioned in section 4, when the inversion algorithm is applied to field data with m = 20, n = 40, and $M=I,{{D}_{1}},{{D}_{2}}$ . Each entry of the table reports the value of ℓ identified by a particular method and, in parentheses, the depth at which the maximum of ${{{\boldsymbol{ \sigma }} }_{\ell }}$ is located and the value of the maximum (in S m⁻¹). The values predicted by ERT are (1.68,1.74).

	L-corner	ResReg	L-res	Q-hyb
M = I	5 (1.47,1.11)	4 (1.67,1.04)	2 (2.44,0.98)	7 (1.73,1.26)
$M={{D}_{1}}$	2 (1.99,1.48)	1 (2.50,0.98)	2 (1.99,1.48)	2 (1.99,1.48)
$M={{D}_{2}}$	2 (1.60,1.53)	1 (1.60,1.53)	2 (1.60,1.53)	1 (1.60,1.53)

We now report the results obtained by TGSVD with $M={{D}_{1}}$ and $M={{D}_{2}}$ (the discrete approximations of the first and second derivatives). The first six regularized solutions are displayed in figure 12 and figure 13, respectively, together with the ERT solution. The identified regularization parameters and the coordinates of the maximal conductivity predicted by ERT appear in the second and third rows of table 4.

**Figure 12.** Regularized solutions ${{{\boldsymbol{ \sigma }} }_{\ell }}$ , with regularization parameter $\ell =1,\ldots ,6$ , obtained by applying TGSVD ( $M={{D}_{1}}$ ) to each iteration of the Gauss–Newton method. We used all the available measurements (m = 20) and set n = 40. The dashed line represents the conductivity predicted by ERT.
Download figure:
Standard image High-resolution image

**Figure 13.** Regularized solutions ${{{\boldsymbol{ \sigma }} }_{\ell }}$ , with regularization parameter $\ell =1,\ldots ,6$ , obtained by applying TGSVD ( $M={{D}_{2}}$ ) to each iteration of the Gauss–Newton method. We used all the available measurements (m = 20) and set n = 40. The dashed line represents the conductivity predicted by ERT.
Download figure:
Standard image High-resolution image

It is clear that resorting to the TGSVD, using either the first or the second derivative as a regularizing operator, is much more effective than the TSVD approach. In this particular case, the second derivative produces the best results. Among the parameter estimation methods, the L-corner algorithm [21] appears to be the most robust, as it identifies an acceptable solution in all three cases. Moreover, the position of the maximum is localized with sufficient accuracy.

Figure 14 reports the first three regularized solutions with $M={{D}_{2}}$ , corresponding to m = 10 and m = 5, that is, using half of the measurements used in the previous figures ( $h=0\;{\rm m},0.2\;{\rm m},\ldots ,1.8\;{\rm m}$ ), and a quarter of them ( $h=0\;{\rm m},0.4\;{\rm m},\ldots ,1.6\;{\rm m}$ ). Reducing the number of data values leads to less accurate solutions, but the position of the conductivity maximum and its value are very well determined. In both cases, the L-corner method returned $\ell =2$ .

6. Conclusions

In this paper we propose a regularized inversion method to reconstruct the electrical conductivity of the soil with respect to depth, starting from electromagnetic data collected by a GCM. We develop exact formulae for the Jacobian of the function to be inverted, and choose a relaxation parameter in order to ensure both the convergence of the iterative method and the positivity of the solution. This leads to a fast and reliable algorithm. Various methods for the automatic estimation of the regularization parameter are considered. Numerical experiments on synthetic data sets show that the algorithm produces reasonable results, even when the noise level is chosen to be consistent with real applications. The method is finally applied to real field data, producing results that are compatible with those obtained by ERT. In the near future, we plan to adapt our inversion algorithm and our software in order to deal with multiple depth responses, corresponding to multifrequency and/or multioffset GCM measurements, produced by the new generation of instruments currently available.

Acknowledgments

The authors would like to thank SOGAER, the company that manages Cagliari Airport, for allowing them to collect data in the airport area, and in particular Massimo Rodriguez for his help and assistance during the measurement campaign. The authors also thank Luigi Noli and Mario Sitzia for helping in the field data acquisition. Finally, the authors are grateful to Giorgio Cassiani (University of Padova) and Lothar Reichel (Kent State University) for fruitful discussions on the content of this paper, and to two anonymous referees for their suggestions that led to improvements of the presentation.

Regularized solution of a nonlinear problem in electromagnetic sounding^*

Article metrics

Submit

Permissions

Author e-mails

Author affiliations

Author notes

Dates

Peer review information

Abstract

1. Introduction

2. The nonlinear forward model

3. Solution of the inverse problem

3.1. Computing the Jacobian matrix

3.2. Inversion algorithm

3.3. Regularization methods

3.4. Choice of the regularization parameter

4. Numerical experiments with synthetic data

5. Numerical experiments with field data

6. Conclusions

Acknowledgments

Regularized solution of a nonlinear problem in electromagnetic sounding*

Article metrics

Submit

Permissions

Share this article

Author e-mails

Author affiliations

Author notes

Dates

Peer review information

Abstract

1. Introduction

2. The nonlinear forward model

3. Solution of the inverse problem

3.1. Computing the Jacobian matrix

3.2. Inversion algorithm

3.3. Regularization methods

3.4. Choice of the regularization parameter

4. Numerical experiments with synthetic data

5. Numerical experiments with field data

6. Conclusions

Acknowledgments

Regularized solution of a nonlinear problem in electromagnetic sounding^*