
Joint estimation of Robin coefficient and domain boundary for the Poisson problem

Ruanui Nicholson and Matti Niskanen

Published 8 December 2021 © 2021 IOP Publishing Ltd
Citation: Ruanui Nicholson and Matti Niskanen 2022 Inverse Problems 38 015008. DOI: 10.1088/1361-6420/ac3c17


Abstract

We consider the problem of simultaneously inferring the heterogeneous coefficient field for a Robin boundary condition on an inaccessible part of the boundary along with the shape of the boundary for the Poisson problem. Such a problem arises in, for example, corrosion detection, and thermal parameter estimation. We carry out both linearised uncertainty quantification, based on a local Gaussian approximation, and full exploration of the joint posterior using Markov chain Monte Carlo sampling. By exploiting a known invariance property of the Poisson problem, we are able to circumvent the need to re-mesh as the shape of the boundary changes. The linearised uncertainty analysis presented here relies on a local linearisation of the parameter-to-observable map, with respect to both the Robin coefficient and the boundary shape, evaluated at the maximum a posteriori (MAP) estimates. Computation of the MAP estimate is carried out using the Gauss–Newton method. On the other hand, to explore the full joint posterior we use the Metropolis-adjusted Langevin algorithm, which requires the gradient of the log-posterior. We thus derive both the Fréchet derivative of the solution to the Poisson problem with respect to the Robin coefficient and the boundary shape, and the gradient of the log-posterior, which is efficiently computed using the so-called adjoint approach. The performance of the approach is demonstrated via several numerical experiments with simulated data.


1. Introduction

In this paper, we consider a Poisson problem where we measure potentials on an accessible part of a slab to infer the shape and the distributed Robin coefficient field of an inaccessible part of the slab. The problem setup is inspired by physical applications such as corrosion detection or thermal engineering, see for example [15] among others. There is a considerable amount of literature dealing with theoretical and computational aspects of the individual problems, i.e. estimation of the Robin coefficient or estimation of the domain shape, see e.g. [1–3, 5–15], as well as of the problem of joint estimation [16–26]. We also mention the works [27, 28], in which a similar problem is considered using a generalized impedance boundary condition.

We pose the problem in the Bayesian framework [29–31], leading to a problem of statistical inference, whose solution is the posterior density. We first quantify the posterior uncertainty in the parameters by using the Laplace approximation, see e.g. [32], which entails constructing a local Gaussian distribution about the maximum a posteriori (MAP) estimate. To efficiently calculate the MAP estimate, we use the Gauss–Newton method [33], which requires the computation of the Fréchet derivative of the potential with respect to both the Robin coefficient and the boundary shape. The first of these derivatives is straightforward [7], while on the other hand, determination of the shape derivative is more involved. The specific approach we take to finding the shape derivative is similar to that provided in [34], though for an in-depth discussion of shape derivatives, see [35]. We also note the works of [36–38] for an alternative approach to estimation of the boundary shape for a Poisson problem based on conformal maps.

To accurately characterise the full joint posterior, we use a Markov chain Monte Carlo (MCMC) sampling method. Specifically, we employ the Metropolis-adjusted Langevin algorithm (MALA), which requires the gradient of the log-posterior. We use the adjoint approach [39–41] to compute the gradient efficiently.

Current approaches to handling the shape estimation problem can be loosely separated into those which use boundary integral methods (such as the boundary element method), see e.g. [24–26], and those which use volume integral methods (such as the finite element method) [21, 34, 42–44]. Both of these approaches, however, suffer from several limitations. First, the boundary-based methods are only useful in a fairly restrictive setting, as they are typically based on the fundamental solutions. A major drawback of volume-based methods is that they require re-meshing the domain as the estimate of the domain shape changes, or embedding the problem in a larger mesh and using interpolation. The additional computational overheads incurred by having to re-mesh may be negligible when finding the MAP estimate. However, when carrying out MCMC sampling the mesh would need to be adapted for each new proposal, leading to a significant increase in computational time. Furthermore, as alluded to in [11, 42, 43], altering the mesh can place a limit on the feasible boundary movements, and can also introduce spurious artefacts.

Current approaches are further complicated by including (simultaneous) inference for the Robin coefficient over the inaccessible part of the domain. As the problem is posed in the Bayesian framework, we must specify a prior distribution for the Robin coefficient (and for the boundary shape). However, as the Robin coefficient is defined on the inaccessible part of the boundary, changing the boundary shape would require updating (i.e., recomputing) the prior, which adds significant additional cost.

To avoid re-meshing, while also allowing for fairly arbitrary conductivities, we exploit the fact that the Poisson problem is invariant under the push forward [36, 45, 46], as discussed in the following section. This allows for all computations to be carried out in a simple predefined reference domain with a fixed mesh, though we point out that the resulting forward problem will generally be transformed to the anisotropic version of the Poisson equation even if the actual conductivity in the true domain is isotropic.

The remainder of the paper is organised as follows. In section 2, we review the forward problem including the key invariance property, and formulate the required shape derivative based on it. In section 3, we briefly review the Bayesian framework for inverse problems, providing the details on the Laplace approximation, as well as on MALA. In section 4, we consider three numerical examples, based on different physical situations. Lastly, section 5 lists the concluding remarks.

2. The forward problem

Let ${\Omega}\subset {\mathbb{R}}^{2}$ be a bounded domain, the specific nature of which is given below. Furthermore, suppose $\sigma \in {L}^{\infty }({\Omega})$ takes real scalar values with $\sigma {\geqslant}{c}_{1} > 0$ for all x ∈ Ω. Assume also that the boundary can be decomposed as ∂Ω = ΓA ∪ ΓI ∪ ΓD, with ΓA, ΓI, and ΓD all having positive measure. Finally, let $E=\left\{{e}_{1},{e}_{2},\dots ,{e}_{q}\right\}$ denote the set of measurement points with ei ∈ ΓA for i = 1, 2, ..., q. The forward problem considered is then: find ${\left.{u}_{k}\right\vert }_{E}$ such that

Equation (2.1)

for k = 1, 2, ..., $\ell$, where $\ell$ is the number of different boundary conditions applied, and η is the outward facing unit normal. In the case of an electrostatic conductor, u represents the electrostatic potential, σ the electrical conductivity, exp(β) the boundary admittance (footnote 3), and gk the applied current densities. As discussed in e.g. [16, 17], the application of several current densities helps ensure uniqueness for the solution of the estimation problem in the continuous setting.
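The displayed form of (2.1) did not survive extraction. The following is a plausible reconstruction assumed from the surrounding description (a Poisson equation in Ω, the applied current density gk on ΓA, a Robin condition with admittance exp(β) on ΓI, and a homogeneous Dirichlet condition on ΓD); the exact statement in the paper may differ.

```latex
% Assumed reconstruction of (2.1), not a quotation of the original.
\begin{aligned}
  -\nabla\cdot\bigl(\sigma\nabla u_k\bigr) &= 0
     && \text{in } \Omega,\\
  \sigma\,\partial u_k/\partial\eta &= g_k
     && \text{on } \Gamma_{\mathrm{A}},\\
  \sigma\,\partial u_k/\partial\eta + \exp(\beta)\,u_k &= 0
     && \text{on } \Gamma_{\mathrm{I}},\\
  u_k &= 0
     && \text{on } \Gamma_{\mathrm{D}},
\end{aligned}
```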

We take the domain to be of the form ${\Omega}=\left\{({x}_{1},{x}_{2}):{x}_{1}\in (0,L)\enspace \;\text{and}\;\enspace 0< {x}_{2}< f({x}_{1})\right\}$ with $f\in {C}_{\text{per}}^{1}([0,L])$ where ${C}_{\text{per}}^{1}([0,L]){:=}\left\{f\in {C}^{1}([0,L])\vert f(0)=f(L)\right\}$ and f(s) > 0 for all s ∈ [0, L]. Then the inaccessible part of the boundary is ${{\Gamma}}_{\text{I}}=\left\{(s,f(s)):s\in [0,L]\right\}$. That is to say, the inaccessible part of the boundary ΓI is the top of the domain and can be represented as a periodic function. On the other hand, the accessible part of the boundary ΓA is taken to be the bottom of the domain, i.e., ${{\Gamma}}_{\text{A}}=[0,L]\times \left\{0\right\}$, while ΓD makes up the union of the lateral boundaries. See figure 1 for a schematic of the forward problem. In line with the literature [8, 18, 20], we choose boundary currents

Equation (2.2)


Figure 1. Illustration of the invariance property of the Poisson problem. The diffeomorphism, ψ, maps the true (corroded) domain, Ω, to a reference (uncorroded) domain $\tilde{{\Omega}}$, which in turn leads to an updated potential, $\tilde{u}(\tilde{\boldsymbol{x}})=u{\circ}{\psi }^{-1}(\tilde{\boldsymbol{x}})$, and an updated conductivity $\tilde{\sigma }$, and boundary conditions (see (2.4)–(2.6)). The updated conductivity is in general anisotropic. To illustrate this we have plotted the anisotropy of the conductivity using ellipses, the principal axis of which shows the anisotropy while the colour represents the magnitude.


2.1. Invariance of the forward problem

In the current paper, we denote by $\tilde{{\Omega}}$ the (known) uncorroded reference geometry, which is assumed to be a slab, i.e., $\tilde{{\Omega}}=(0,L)\times (0,H)$, with L, H > 0. The Poisson equation, as outlined in (2.1), is known to be invariant under certain deformations of the domain Ω [36, 45, 46]. More specifically, we let ψ denote the diffeomorphism mapping Ω to $\tilde{{\Omega}}$ given by

Equation (2.3)

We use x to denote points in Ω, while points in $\tilde{{\Omega}}$ will be denoted by $\tilde{\boldsymbol{x}}$, i.e., x ∈ Ω and $\tilde{\boldsymbol{x}}\in \tilde{{\Omega}}$. Then, taking into account that such a diffeomorphism leaves ΓA unchanged (footnote 4), i.e., ${\left.\psi \right\vert }_{{{\Gamma}}_{\text{A}}}=\mathrm{I}\mathrm{d}$ (the identity mapping), the functions ${\tilde{u}}_{k}(\tilde{\boldsymbol{x}})={u}_{k}{\circ}{\psi }^{-1}(\tilde{\boldsymbol{x}})$ are known to satisfy a transformed Poisson equation [36, 45, 46]:

Equation (2.4)

for k = 1, 2, ..., $\ell$, where $\tilde{\boldsymbol{\eta }}$ is the outward facing normal of $\tilde{{\Omega}}$, the push forward of σ (by ψ) is given by

Equation (2.5)

with J the Jacobian matrix of ψ, and

Equation (2.6)

where φ is the restriction of ψ to ΓI. It is worth pointing out that even if the conductivity σ in Ω is isotropic (as is the case in the current paper), the pushed forward conductivity $\tilde{\sigma }$ in $\tilde{{\Omega}}$ will in general be anisotropic. We show a schematic of the invariance property of the Poisson problem in figure 1.
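To make the push forward concrete, the following is a minimal numerical sketch. It assumes the diffeomorphism takes the form ψ(x1, x2) = (x1, H x2/f(x1)) (consistent with the construction in section 4.1, but an assumption here), and that the transformation in (2.5) is the standard one, σ̃ = JσJᵀ/det J evaluated at ψ⁻¹(x̃); the boundary profile f below is purely illustrative.

```python
import numpy as np

H, L = 0.05, 1.0

def f(x1):
    # Illustrative boundary profile with f(x1) > 0 (the true f is what gets estimated).
    return H * (1.0 - 0.3 * np.sin(2 * np.pi * x1 / L) ** 2)

def push_forward_sigma(x1, x2, sigma=1.0, eps=1e-6):
    """Pushed-forward (generally anisotropic) conductivity on the reference slab,
    assuming psi(x1, x2) = (x1, H*x2/f(x1)) and sigma_tilde = J sigma J^T / det J."""
    dfdx1 = (f(x1 + eps) - f(x1 - eps)) / (2 * eps)   # df/dx1 by central differences
    J = np.array([[1.0, 0.0],
                  [-H * x2 * dfdx1 / f(x1) ** 2, H / f(x1)]])  # Jacobian of psi
    return sigma * (J @ J.T) / np.linalg.det(J)

# Example: evaluate the transformed conductivity at one point of the true domain.
print(push_forward_sigma(0.3, 0.5 * f(0.3)))
```

The off-diagonal entries produced by the Jacobian are what the ellipses in figure 1 visualise: an isotropic σ in Ω becomes anisotropic in the reference slab.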

The invariance of the Poisson equation, as outlined above, means that for any domain, Ω, which is diffeomorphic to $\tilde{{\Omega}}$, the solution to (2.1) can be found by solving (2.4). If we only require the solution to (2.1) in a single domain, Ω, there is little use in solving (2.4) instead. However, when considering an estimation problem, such as that considered in the current paper, the forward problem will need to be solved repeatedly. Then, since we want to estimate the domain boundary, we will require the solution to (2.1) in a number of different domains. Computing each of these solutions would require re-meshing of the domain, or at least of the inaccessible part of the boundary ΓI. This increases the computational overhead, which can become significant if the number of forward solutions required is large. Furthermore, as the Robin boundary coefficient is defined on ΓI, special care would need to be taken to ensure that spatial prior information, such as smoothness of the coefficient, was kept consistent across each of the domains. This would further increase the computational requirements, and could also lead to convergence issues for the estimates, see e.g. [31, 47, 48]. To reduce these issues, in the current paper, we propose only ever solving the forward problem in the fixed reference domain, $\tilde{{\Omega}}$, by exploiting the discussed invariance property.

2.2. Derivatives

Key to further reducing the computational costs of solving the estimation problem is efficient calculation of generalised derivatives of the forward model with respect to both the shape of the inaccessible part of the domain boundary ΓI and the Robin coefficient β, which we now outline. A related approach is provided in [34, 43], though the procedure used there requires, among other things, a dual procedure to cope with distributional boundary conditions.

In what follows, we will slightly abuse notation and set g = gk and u = uk for some fixed $k\in \left\{1,2,\dots ,\ell \right\}$. Furthermore, it will be useful to consider writing the weak form of (2.1) as a(u, v) = F(v), where

Equation (2.7)

for $v\in {H}_{D}^{1}({\Omega})$, with ${H}_{D}^{1}({\Omega})=\left\{v\in {H}^{1}({\Omega}):{\left.v\right\vert }_{{{\Gamma}}_{\text{D}}}=0\right\}$.
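The displayed forms in (2.7) were lost in extraction; a plausible reading, consistent with the boundary conditions described above (assumed rather than quoted), is the following.

```latex
% Assumed reconstruction of the weak form (2.7).
a(u,v)=\int_{\Omega}\sigma\,\nabla u\cdot\nabla v\,\mathrm{d}\boldsymbol{x}
      +\int_{\Gamma_{\mathrm{I}}}\exp(\beta)\,u\,v\,\mathrm{d}s,
\qquad
F(v)=\int_{\Gamma_{\mathrm{A}}}g\,v\,\mathrm{d}s .
```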

To calculate the generalised derivative for the boundary, we first take $h\in {C}_{\text{per}}^{1}([0,L])$ and introduce the perturbed problem: ${a}^{h}({u}^{h},{v}^{h})=F({v}^{h})$, for which the weak solution, ${u}^{h}\in {H}^{1}({{\Omega}}^{h})$, satisfies

Equation (2.8)

where ${{\Omega}}^{h}=\left\{({x}_{1},{x}_{2}):{x}_{1}\in (0,L)\enspace \text{and}\enspace 0< {x}_{2}< f({x}_{1})+th({x}_{1})\right\}$ with $t\in {\mathbb{R}}^{+}$. We next introduce the diffeomorphism which maps Ωh to $\tilde{{\Omega}}$,

Finally, the Gateaux derivative, see e.g. [49, chapter 8], of $u\in {H}^{1}({\Omega})$ evaluated at Ω in the direction h is then

The potential u can be written (footnote 5) as $u={T}^{-1}F$, see e.g. [52]. Furthermore, as a in (2.7) is the symmetric form associated with T, straightforward differentiation then yields

Equation (2.9)

Due to the nature of the admissible set of boundary perturbations, it follows that dF(Ω; h) = 0. That is to say, finding da(Ω; h) renders calculating du(Ω; h) essentially trivial.

With the two diffeomorphisms, ψ (see (2.3)) and ψh , in hand, we can rewrite both a and ah using the invariance property of the Poisson problem (see section 2.1) as

respectively, where $\tilde{\sigma }$ and $\mathrm{exp}(\tilde{\beta })$ are defined in (2.5) and (2.6) respectively, $\tilde{u}=u{\circ}{\psi }^{-1}$, while ${\tilde{u}}^{h}={u}^{h}{\circ}{\left({\psi }^{h}\right)}^{-1}$, and

with J h the Jacobian matrix of ψh , and where ${\boldsymbol{x}}^{h}={\left({\psi }^{h}\right)}^{-1}(\tilde{\boldsymbol{x}})$ and $s={\varphi }^{-1}(\tilde{s})$. We can then write the derivative as,

Equation (2.10)

where we have used the substitution $B=\mathrm{exp}(\tilde{\beta })$, and where $\mathrm{d}\tilde{\sigma }({\Omega};h)\in {S}^{2}\left(C\left([0,L]\right)\right)$, the space of symmetric matrices with elements in $C\left([0,L]\right)$ and $\mathrm{d}B({\Omega};h)\in C\left([0,L]\right)$.

In the current setup, due to the assumptions on the domain shape perturbations, each of the derivatives in (2.10) is straightforward to compute as follows. First, since the push forward of σ is given by

we have $\mathrm{d}\tilde{\sigma }({\Omega};h)=\sigma (\boldsymbol{x})\mathbf{\Sigma }(\boldsymbol{x})h({x}_{1})$, with

On the other hand, in line with e.g. [53], by setting $\hat{h}=\mathrm{d}\tilde{\sigma }({\Omega};h)$ we have

Next,

and finally, by setting $\check{h}=\mathrm{d}B({\Omega};h)$, we have

Notice that in the process of determining the derivatives of the forward problem with respect to perturbations of the inaccessible part of the boundary, we have also determined the derivatives of the forward problem with respect to perturbations of the Robin coefficient.

3. Inference

Let us now pose the problem of finding the inaccessible part of the domain and the Robin coefficient as a problem of statistical inference within the Bayesian framework. The solution to the inference problem is the posterior density, i.e., the conditional probability distribution of the parameters conditioned on the measured data. We initially treat the inaccessible part of the domain boundary ΓI as a function of x1, and the Robin coefficient β as a function defined on ΓI. The posterior is then expressed using the infinite dimensional version of Bayes' theorem,

Equation (3.1)

where Z is a normalization constant, $\frac{\mathrm{d}{\mu }_{\text{post}}}{\mathrm{d}{\mu }_{\text{prior}}}$ denotes the Radon–Nikodym derivative of the posterior measure μpost with respect to the prior measure μprior, and πlike denotes the likelihood. For an in-depth discussion of (3.1), and of when the posterior is well defined, see [31].

Although (3.1) is valid in infinite (and finite) dimensions, we follow the more intuitive form of Bayes' formula that uses Lebesgue measures and thus only holds in finite dimensions, see e.g. [31, 48]. As such, in the following we denote by πprior and πpost the finite dimensional prior and posterior densities, respectively, and work with the more familiar finite-dimensional version of Bayes' theorem.

In the next section, we provide details on the discretisations and parameterisations used for the unknowns, as well as the specific prior distributions used. First, though, we provide a brief summary of the key steps:

  • (a) Calculate the MAP estimate. The MAP estimate is calculated by solving an optimisation problem. We solve the optimisation problem using the Gauss–Newton method, which requires the computation of the Jacobian matrix.
  • (b) Compute the local Gaussian approximation to the posterior. To approximately quantify the posterior uncertainty, we carry out a local Gaussian approximation to the posterior, based on the Gauss–Newton Hessian of the negative log-posterior (the Laplace approximation).
  • (c) Sample the full posterior. To assess the validity of the Laplace approximation, we sample the full posterior using MCMC. Specifically, we use MALA, and compute the required gradients with the adjoint method. The proposal distribution is adapted during the sampling, with the above local Gaussian approximation used as the initial proposal distribution.

3.1. Parameterisations and the prior

As stated previously, the inaccessible part of the domain boundary is treated as a function of x1 only. Similarly to [34, 43], we take ${{\Gamma}}_{\text{I}}=\left\{(s,f(s)):s\in [0,L]\right\}$ with $f\in {C}_{\text{per}}^{\infty }([0,L])$ and take

Equation (3.2)

As such, the problem of inferring the shape of the inaccessible part of the domain is posed as inferring the Fourier coefficients (footnote 6), $\boldsymbol{\alpha }={[{\alpha }_{0},{\alpha }_{1},\dots ,{\alpha }_{2p}]}^{T}\in {\mathbb{R}}^{2p+1}$. As in [34, 43], we encode our prior beliefs about the geometry of the domain through a Gaussian prior on α , with mean α * and covariance matrix Γα . Hence, the prior density for α can be written as $\pi (\boldsymbol{\alpha })\propto \mathrm{exp}\left\{-\frac{1}{2}{(\boldsymbol{\alpha }-{\boldsymbol{\alpha }}_{\ast })}^{T}{\mathbf{\Gamma }}_{\alpha }^{-1}(\boldsymbol{\alpha }-{\boldsymbol{\alpha }}_{\ast })\right\}$.

Specification of α * and Γα is left to section 4.

We postulate a Gaussian prior measure on the Robin coefficient, i.e., ${\mu }_{\beta }=\mathcal{N}({\beta }_{\ast },{\mathcal{C}}_{\beta })$. In line with e.g. [48, 55, 56], to ensure the inverse problem is well-posed in infinite dimensions [31], while simultaneously promoting smoothness, we use a squared inverse elliptic operator to define the prior covariance operator. Specifically, we take ${\mathcal{C}}_{\beta }={\mathcal{A}}^{-1}$, where $\mathcal{A}$ is the second order elliptic differential operator defined by

Equation (3.3)

where δβ controls the marginal variance and l is the inverse of the correlation length. As discussed in [57–59], suitable boundary conditions should also be applied to reduce so-called boundary artefacts. To this end, letting η I denote the outward facing unit vector normal to ∂ΓI, we equip $\mathcal{A}$ with a Robin boundary condition,

which leads to a constant (i.e., homogeneous) marginal variance for β over ΓI [58, 59].

The Robin coefficient is discretised using continuous piece-wise linear Lagrange basis functions ${\left\{{\phi }_{i}\right\}}_{i=1}^{q}$, leading to a discrete approximation for β of the form ${\beta }_{h}={\sum }_{i=1}^{q}\;{\beta }_{i}{\phi }_{i}$. Hence, the parameters of interest for the Robin coefficient are $\boldsymbol{\beta }={[{\beta }_{1},{\beta }_{2},\dots ,{\beta }_{q}]}^{T}\in {\mathbb{R}}^{q}$, and the resulting discrete representation of the prior covariance operator, denoted Γβ , is

see for example [48, 56, 60]. That is, the covariance matrix can be written as

Equation (3.4)

where

for i, j = 1, 2, ..., q. The prior distribution for β can then be written as $\pi (\boldsymbol{\beta })\propto \mathrm{exp}\left\{-\frac{1}{2}{(\boldsymbol{\beta }-{\boldsymbol{\beta }}_{\ast })}^{T}{\mathbf{\Gamma }}_{\beta }^{-1}(\boldsymbol{\beta }-{\boldsymbol{\beta }}_{\ast })\right\}$.

In the current paper, we take the boundary shape and Robin coefficient to be (statistically) mutually independent a priori, allowing us to decompose the prior as πprior( α , β ) = π( α )π( β ). The joint prior density is thus given by ${\pi }_{\text{prior}}(\boldsymbol{\alpha },\boldsymbol{\beta })\propto \mathrm{exp}\left\{-\frac{1}{2}{(\boldsymbol{\alpha }-{\boldsymbol{\alpha }}_{\ast })}^{T}{\mathbf{\Gamma }}_{\alpha }^{-1}(\boldsymbol{\alpha }-{\boldsymbol{\alpha }}_{\ast })-\frac{1}{2}{(\boldsymbol{\beta }-{\boldsymbol{\beta }}_{\ast })}^{T}{\mathbf{\Gamma }}_{\beta }^{-1}(\boldsymbol{\beta }-{\boldsymbol{\beta }}_{\ast })\right\}$.
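As a rough illustration of how a prior covariance of this type can be assembled, the sketch below builds a one-dimensional elliptic-operator (Whittle–Matérn-type) covariance with a Robin boundary term on a uniform mesh. The operator A_h = δβ(K + l²M) + δβ l R and the choice Γβ = A_h⁻¹ are assumptions standing in for (3.3) and (3.4), not the paper's exact construction.

```python
import numpy as np

def robin_elliptic_prior_cov(q=78, L=1.0, delta_beta=np.sqrt(50.0), ell=10.0):
    """Sketch of a discretised elliptic-operator Gaussian prior on [0, L]:
    A_h ~ delta_beta*(K + ell**2 * M) + delta_beta*ell*R,  Gamma_beta = inv(A_h).
    The scalings are illustrative assumptions."""
    h = L / (q - 1)
    # 1D piecewise-linear FEM stiffness K and lumped mass M on a uniform mesh.
    K = (np.diag(np.full(q, 2.0)) - np.diag(np.ones(q - 1), 1)
         - np.diag(np.ones(q - 1), -1)) / h
    K[0, 0] = K[-1, -1] = 1.0 / h
    M = np.diag(np.full(q, h)); M[0, 0] = M[-1, -1] = h / 2.0
    R = np.zeros((q, q)); R[0, 0] = R[-1, -1] = 1.0   # Robin term at the endpoints
    A_h = delta_beta * (K + ell**2 * M) + delta_beta * ell * R
    return np.linalg.inv(A_h)

Gamma_beta = robin_elliptic_prior_cov()
print(np.sqrt(np.diag(Gamma_beta))[:5])   # approximate marginal standard deviations
```

The Robin endpoint term plays the role described above: without it, the marginal variance grows towards the ends of ΓI (a boundary artefact).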

3.2. The likelihood

As is fairly standard, see e.g. [7, 24, 25], we assume that the data, $\boldsymbol{y}\in {\mathbb{R}}^{m}$, is corrupted by additive noise and is related to the parameters through $\boldsymbol{y}=\mathcal{G}(\boldsymbol{\alpha },\boldsymbol{\beta })+\boldsymbol{e},$

where $\mathcal{G}:{\mathbb{R}}^{2p+1}\times {\mathbb{R}}^{q}\to {\mathbb{R}}^{m}$ is termed the parameters-to-observable mapping (footnote 7), and $\boldsymbol{e}\in {\mathbb{R}}^{m}$ denotes the additive noise. As discussed in section 2, we take the data to be comprised of point-wise noisy measurements of u, satisfying (2.1), at points along the accessible part of the boundary, ΓA. As such, we can rewrite the parameters-to-observable mapping as $\mathcal{G}(\boldsymbol{\alpha },\boldsymbol{\beta })=\boldsymbol{\mathcal{B}}\boldsymbol{u}\in {\mathbb{R}}^{m}$, where u = vec(u1, u2, ..., uℓ), and $\boldsymbol{\mathcal{B}}$ denotes the linear observation (interpolation) operator which gives the values of each ui at the measurement locations. In practice, each of the ui are discretised and thus, with a slight abuse of notation, we take the observation operator to be $\boldsymbol{\mathcal{B}}:{\mathbb{R}}^{r}\to {\mathbb{R}}^{m}$.

We assume that the additive noise is normally distributed, i.e., $\boldsymbol{e}\sim \mathcal{N}({\boldsymbol{e}}_{\ast },{\mathbf{\Gamma }}_{e})$, and mutually independent of the parameters. These assumptions lead to the likelihood taking the form

see for example [29–31].
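The displayed likelihood was lost in extraction; the standard form implied by the Gaussian additive-noise assumptions (a reconstruction, not a quotation) is the following.

```latex
% Assumed Gaussian likelihood implied by the noise model above.
\pi_{\text{like}}(\boldsymbol{y}\,\vert\,\boldsymbol{\alpha},\boldsymbol{\beta})
\propto \exp\!\Big\{-\tfrac{1}{2}
\big(\boldsymbol{y}-\mathcal{G}(\boldsymbol{\alpha},\boldsymbol{\beta})-\boldsymbol{e}_{\ast}\big)^{T}
\mathbf{\Gamma}_{e}^{-1}
\big(\boldsymbol{y}-\mathcal{G}(\boldsymbol{\alpha},\boldsymbol{\beta})-\boldsymbol{e}_{\ast}\big)\Big\}.
```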

3.3. The posterior density

By employing the above parameterisations for the inaccessible part of the boundary, ΓI, and the Robin coefficient, β, the solution to the inference problem is the joint posterior density of α and β . This can be expressed using the finite dimensional version of Bayes' formula, as

We define the posterior potential (of the posterior density),

Equation (3.5)

so that the posterior can be written as ${\pi }_{\text{post}}(\boldsymbol{\alpha },\boldsymbol{\beta }\vert \boldsymbol{y})\propto \mathrm{exp}\left\{-\mathcal{J}\right\}$.

3.3.1. The Laplace approximation to the posterior

In the context of Bayesian inference for problems governed by partial differential equations, the most common Gaussian approximation to the posterior is the Laplace approximation, see e.g. [7, 48, 61]. Assuming the parameter-to-observable mapping is Fréchet differentiable with respect to α and β, the Laplace approximation is ${\hat{\pi }}_{\text{post}}(\boldsymbol{\alpha },\boldsymbol{\beta }\vert \boldsymbol{y})=\mathcal{N}(({\boldsymbol{\alpha }}_{\text{MAP}},{\boldsymbol{\beta }}_{\text{MAP}}),{\hat{\mathbf{\Gamma }}}_{\text{post}})$, where ( α MAP, β MAP) denotes the MAP estimate of ( α , β ),

and the approximate posterior covariance matrix is given by

Equation (3.6)

where Γ m = diag(Γα , Γβ ), and $\mathsf{G}({\boldsymbol{\alpha }}_{\text{MAP}},{\boldsymbol{\beta }}_{\text{MAP}})$ denotes the generalised derivative of $\mathcal{G}$ with respect to α and β evaluated at the MAP estimate, i.e.,

Equation (3.7)

It is worth noting that the approximate posterior covariance matrix coincides with the inverse of the Gauss–Newton approximation to the Hessian of the negative log-posterior (referred to simply as the Hessian in what follows), denoted H , i.e., ${\hat{\mathbf{\Gamma }}}_{\text{post}}={\mathbf{H}}^{-1}$.
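A minimal sketch of assembling this Laplace/Gauss–Newton posterior covariance from a Jacobian of the parameter-to-observable map; the dimensions and the random stand-in Jacobian below are illustrative only.

```python
import numpy as np
from scipy.linalg import block_diag, solve

def laplace_posterior_cov(G, Gamma_e, Gamma_alpha, Gamma_beta):
    """Gauss-Newton/Laplace covariance (G^T Gamma_e^{-1} G + Gamma_m^{-1})^{-1},
    with Gamma_m = blkdiag(Gamma_alpha, Gamma_beta) and G the Jacobian of the
    parameter-to-observable map evaluated at the MAP estimate."""
    Gamma_m = block_diag(Gamma_alpha, Gamma_beta)
    H = G.T @ solve(Gamma_e, G) + np.linalg.inv(Gamma_m)  # GN Hessian of -log posterior
    return np.linalg.inv(H)

# Toy stand-in: 256 measurements (32 sensors x 8 loads), 15 + 78 = 93 parameters.
rng = np.random.default_rng(0)
G = rng.standard_normal((256, 93))
cov = laplace_posterior_cov(G, 0.01 * np.eye(256), 0.01 * np.eye(15), np.eye(78))
print(cov.shape)  # (93, 93)
```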

3.3.2. Full characterisation of the posterior

In some cases, the MAP estimate and a localised Gaussian approximation, such as the Laplace approximation, may be sufficient for the quantification of uncertainty in the parameters. However, without fully characterising the posterior it is essentially impossible to determine the validity, and thus relevance, of the local Gaussian approximation. As stated previously, fully characterising the posterior for nontrivial problems requires sampling based methods such as MCMC.

Sampling in high dimensional problems can often become infeasible in practice. However, this curse of dimensionality can often be offset to some extent by implementing sampling schemes which exploit the geometry of the posterior, see for example [62, 63]. One popular MCMC sampling scheme which takes into account the geometry of the posterior is MALA [64–66], which can be motivated using Langevin dynamics. The basic idea in MALA is to guide the proposal towards an area of higher probability, thus speeding up the convergence. Use of MALA requires the gradient of the negative log-posterior with respect to the unknown parameters.

We now briefly recall the key steps to MALA, for more details see for example [64, 65]. Langevin diffusion is defined by the following stochastic differential equation

Equation (3.8)

where we have denoted all the unknowns as m = ( α , β), A is a positive definite (preconditioning) matrix, and B t is standard Brownian motion.

A key feature of the above process is that in the continuous-time case its stationary and limiting distribution is πpost. However, discrete approximations to (3.8) can have vastly different asymptotic behaviours from the diffusion process they try to approximate [64]. As such, it is necessary to introduce a Metropolis–Hastings accept/reject step that ensures the convergence to πpost. In MALA, the proposal for the next step, ${\boldsymbol{m}}_{k+1}^{\ast }$, is generated as

where now the preconditioning matrix A represents the covariance of the proposal density $q({\boldsymbol{m}}_{k+1}^{\ast }\vert {\boldsymbol{m}}_{k})$, τ > 0 can be seen as a step size parameter that globally scales the proposal density, and ξ is a draw from a (q + 2p + 1)-dimensional standard normal distribution. The proposal is then accepted or rejected according to the Metropolis–Hastings algorithm, where

Equation (3.9)

is the probability of accepting the proposal. In practice, we work with the logarithm of the acceptance probability α to avoid numerical underflow.

In MALA, the drift towards the posterior gradient makes the proposal density q(⋅|⋅) nonsymmetric, and therefore the ratio $q({\boldsymbol{m}}_{k}\vert {\boldsymbol{m}}_{k+1}^{\ast })/q({\boldsymbol{m}}_{k+1}^{\ast }\vert {\boldsymbol{m}}_{k})$ does not cancel out. The log-ratio of the proposal densities is given by

where $\delta \boldsymbol{m}{:=}{\boldsymbol{m}}_{k+1}^{\ast }-{\boldsymbol{m}}_{k}$ is the taken step, and L T L = A −1.

Let us now discuss the selection of the proposal covariance A and the proposal scaling parameter τ. To achieve optimal sampling efficiency, we should have A = Σπ , where Σπ denotes the covariance of the true posterior [67]. In addition, it has been shown that, under various assumptions, the asymptotically optimal acceptance rate for MALA is 0.574 [65]. The acceptance rate can be adjusted using the scaling parameter τ. Since Σπ and the optimal value of τ are unknown prior to carrying out the inversion, we adapt them continuously during the sampling, see [67–69]. A reasonable starting location for MCMC is the MAP estimate given by the Gauss–Newton iterations, and as the initial proposal covariance before the start of the adaptation we use the Laplace approximation (3.6).
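For concreteness, a minimal sketch of one preconditioned MALA step with the Metropolis–Hastings correction for the nonsymmetric proposal; log_post and grad_log_post are hypothetical callables for the log-posterior and its gradient (e.g. computed via the adjoint approach), and A stands for the adapted proposal covariance.

```python
import numpy as np

def mala_step(m, log_post, grad_log_post, A, tau, rng):
    """One preconditioned MALA step (sketch). A is the proposal covariance,
    tau the global step size; the callables are user-supplied assumptions."""
    L_A = np.linalg.cholesky(A)
    drift = lambda x: x + 0.5 * tau**2 * (A @ grad_log_post(x))
    m_prop = drift(m) + tau * (L_A @ rng.standard_normal(m.size))

    # log q(y | x) for the Gaussian proposal N(drift(x), tau^2 A), up to a constant.
    def log_q(y, x):
        r = np.linalg.solve(L_A, y - drift(x))
        return -0.5 * (r @ r) / tau**2

    log_alpha = (log_post(m_prop) - log_post(m)
                 + log_q(m, m_prop) - log_q(m_prop, m))
    if np.log(rng.uniform()) < min(0.0, log_alpha):
        return m_prop, True
    return m, False

# Toy usage on a standard Gaussian target.
rng = np.random.default_rng(1)
lp, glp = (lambda x: -0.5 * x @ x), (lambda x: -x)
m, n_acc = np.zeros(3), 0
for _ in range(5000):
    m, acc = mala_step(m, lp, glp, np.eye(3), 0.9, rng)
    n_acc += acc
print("acceptance rate:", n_acc / 5000)
```

In practice τ would be tuned (or adapted) towards the 0.574 acceptance rate mentioned above.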

Before any conclusions can be drawn from an MCMC run, we need to be reasonably sure that the chain has converged to its limiting distribution. Naturally, the convergence can never be guaranteed since any method of diagnosing convergence can by definition only diagnose the non-convergence of a chain. In practice, however, we can arrive at a reasonable certainty by testing the convergence with multiple complementary methods. These methods include visual inspection of the chains and measuring the sampling quality by autocorrelation, comparing within-chain and between-chain variances across multiple runs (the Gelman–Rubin diagnostic) [70, 71], and estimating sampling uncertainty by computing the Monte Carlo standard error (MCSE) [72, 73].

In this paper, we implement a stopping rule based on the estimated MCSE and its relation to the posterior variance of the parameters, as in [74]. First, we run the sampler for enough steps so that the MCSE can be reliably calculated using the non-overlapping batch means method, see [72]. Then, at regular intervals we check, for each unknown individually, if the MCSE estimate at 98% confidence level is smaller than 10% of the posterior standard deviation. When this is true for each unknown, we terminate the run.
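A minimal sketch of the batch-means MCSE estimate and the stopping check described above; the 2.33 factor is the two-sided 98% normal quantile, and the i.i.d. toy chain is only for illustration.

```python
import numpy as np

def batch_means_mcse(chain, n_batches=30):
    """Non-overlapping batch means estimate of the Monte Carlo standard error
    for each parameter; chain has shape (n_samples, n_params)."""
    n = (chain.shape[0] // n_batches) * n_batches
    batch_means = chain[:n].reshape(n_batches, -1, chain.shape[1]).mean(axis=1)
    return batch_means.std(axis=0, ddof=1) / np.sqrt(n_batches)

def stop_sampling(chain, z=2.33, rel_tol=0.10):
    """Terminate when the 98% confidence MCSE is below 10% of the posterior
    standard deviation for every unknown (sketch of the rule in the text)."""
    mcse = batch_means_mcse(chain)
    return np.all(z * mcse < rel_tol * chain.std(axis=0, ddof=1))

chain = np.random.default_rng(2).standard_normal((60_000, 4))  # toy i.i.d. "chain"
print(stop_sampling(chain))
```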

3.4. The adjoint approach and further derivatives

Calculation of the MAP estimate, ( α MAP, β MAP), and construction of the Laplace approximation of the posterior covariance matrix, ${\hat{\mathbf{\Gamma }}}_{\text{post}}$, requires the generalised derivative $\mathsf{G}$ (see (3.7)), details of which were provided in section 2.2. On the other hand, the use of MALA necessitates computation of only the gradient of the negative log-posterior with respect to both the inaccessible part of the boundary and the Robin coefficient. Although the gradient is essentially trivial to calculate with the Jacobian in hand, computing the Jacobian can be costly, requiring as many forward solves as the number of measurements. To avoid this potential bottleneck, we employ the so-called adjoint approach [39–41], which computes the gradient at a cost of one additional (adjoint) forward solve. In what follows, we will set g = gk and u = uk for some fixed $k\in \left\{1,2,\dots ,\ell \right\}$. We begin by introducing a Lagrangian, $\mathcal{L}:{H}_{D}^{1}({\Omega})\times {H}_{D}^{1}({\Omega})\times {\mathbb{R}}^{2p+1}\times {H}^{1}([0,L])\to \mathbb{R}$, as

with $\mathcal{J}$ defined in (3.5), and where we have used the invariance property of the Poisson equation. The subscript α is used to denote the dependence on α .

Determining the gradient of $\mathcal{J}$ (with respect to the unknowns) is achieved by requiring that variations of the Lagrangian with respect to the forward potential, u, and the so-called adjoint potential, v, are 0, see e.g. [41]. The variation of the Lagrangian with respect to the adjoint potential gives

for $\tilde{w}\in {H}_{D}^{1}(\tilde{{\Omega}})$, i.e., u must satisfy the forward problem. The variation with respect to the forward potential gives the adjoint equation,

for $\tilde{z}\in {H}_{D}^{1}(\tilde{{\Omega}})$, which the adjoint variable v must satisfy (this is the one additional forward solve required). Finally, the gradients with respect to the Fourier coefficients and the Robin coefficient are then given (in strong form) as

and

respectively, where

where $\boldsymbol{x}={\psi }^{-1}(\tilde{\boldsymbol{x}})$ and $s={\varphi }^{-1}(\tilde{s})$, and i = 0, 1, ..., 2p.

Finally, the gradient of the cost function evaluated at (u, v, α , β) is then given by

Equation (3.10)
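The overall cost structure of the adjoint approach can be summarised in a short skeleton: one forward solve and one adjoint solve per gradient evaluation, independent of the number of measurements, followed by assembly of boundary integrals such as those in (3.10). All solver routines below are hypothetical placeholders, and the finite-difference check is a generic sanity test rather than anything taken from the paper.

```python
import numpy as np

def gradient_via_adjoint(m, solve_forward, solve_adjoint, assemble_grad):
    """Skeleton of the adjoint approach (placeholders only)."""
    u = solve_forward(m)           # state: the (transformed) Poisson problem
    v = solve_adjoint(m, u)        # adjoint: driven by the data misfit at the sensors
    return assemble_grad(m, u, v)  # evaluate the gradient expressions on Gamma_I

def fd_check(cost, grad, m, eps=1e-6, rng=np.random.default_rng(3)):
    """Directional finite-difference check of a (e.g. adjoint-based) gradient."""
    d = rng.standard_normal(m.size); d /= np.linalg.norm(d)
    fd = (cost(m + eps * d) - cost(m - eps * d)) / (2 * eps)
    return abs(fd - grad(m) @ d)

# Toy verification on a quadratic cost, where the gradient is known exactly.
cost, grad = (lambda x: 0.5 * x @ x), (lambda x: x)
print(fd_check(cost, grad, np.ones(5)))   # should be close to machine precision
```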

4. Numerical examples

In this section, we outline three numerical examples to assess the applicability, performance, and robustness of the proposed approach. In each example, we take the uncorroded (reference) domain to be a slab with length L = 1, and height H = 0.05.

To reduce so-called inverse crimes [75], the data-generating meshes have a substantially finer discretisation than the meshes used at the inference stage. Furthermore, in each example, neither the true Robin coefficient β, nor the true boundary shape, ΓI, is generated from their respective priors, which we describe below.

The synthetic data is generated by solving (2.1) with the finite element method, using $\ell = 8$ different Neumann boundary conditions (see (2.2)). The measurement data consists of 32 equally spaced noisy point-wise measurements of u along ΓA, at the locations shown in figure 2 as black dots. In each example the noise is taken to have mean e * = 0 and covariance matrix ${\mathbf{\Gamma }}_{e}={\delta }_{e}^{2}\boldsymbol{I}$, where ${\delta }_{e}=\left(\mathrm{max}(\mathcal{B}u)-\mathrm{min}(\mathcal{B}u)\right)\times 1/100$, that is, the noise level is 1% of the range of the noiseless measurements.
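A small sketch of the stated noise model (the stacked noiseless sensor values Bu are replaced by a toy vector here):

```python
import numpy as np

def add_measurement_noise(Bu, noise_level=0.01, rng=np.random.default_rng(4)):
    """Zero-mean i.i.d. Gaussian noise with standard deviation equal to 1% of
    the range of the noiseless measurements Bu, as described in the text."""
    delta_e = noise_level * (Bu.max() - Bu.min())
    return Bu + delta_e * rng.standard_normal(Bu.shape), delta_e**2 * np.eye(Bu.size)

y, Gamma_e = add_measurement_noise(np.linspace(0.0, 2.0, 256))  # toy stand-in for Bu
print(Gamma_e[0, 0])  # noise variance delta_e^2
```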

  • Example 1: the first example we consider exhibits a smooth, shallow dip in the inaccessible part of the boundary (see figure 2), and a smooth true Robin coefficient. For brevity, the true Robin coefficients for all examples are shown with the inversion results in figures 5–7.
  • Example 2: the second example has a wider corroded area on the left side of the slab, and an increase in the thickness at the right, which could represent an area with deposition or buildup. Furthermore, changes in the boundary are steeper relative to example 1. In addition, a small amount of white noise is added to the boundary shape and the Robin coefficient to model, perhaps, a more realistic situation where the boundary and Robin coefficient are rough.
  • Example 3: finally, in the third example we place three deep and steep cavities across the inaccessible part of the slab. This example is the most challenging as the true boundary shape and Robin coefficient have 0 prior probability. However, this example does provide a good test of the problem's robustness to the choice of prior.


Figure 2. Data-generating meshes of the three examples considered. The black dots denote the measurement locations.


Before discussing the results of each example we outline further computational details.

4.1. Computational details

For each example considered, we take the diffeomorphism (from the true domain Ω to $\tilde{{\Omega}}$) to be of the form

with

The choice of p can be viewed as making a trade-off between computational speed and the maximum possible accuracy of the solution. That is, a smaller p gives fewer parameters to infer, but it also limits how complex a boundary shape can be represented. In the current paper we take p = 7, meaning we have 15 Fourier coefficients to estimate.

For examples 2 and 3, representing the true boundary shapes accurately requires substantially more basis functions than the 15 used. Thus, to better assess the estimates, we can compare them to the true boundary shape projected onto the truncated Fourier basis. This is illustrated in figure 3, where we show the reference inversion mesh (used in all examples) as well as the inversion mesh corresponding to the true domain for example 3 using the first 15 Fourier basis functions. It is clear that steeper changes in the boundary shape cannot be represented by the truncated Fourier series.


Figure 3. Top: the reference inversion mesh used for all examples. Bottom: true domain discretised using the inversion mesh with the true boundary shape projected onto the first 15 Fourier basis functions.


We use the same prior distribution for all examples. Specifically, for the boundary shape we take α * = 0 and

Equation (4.1)

where ${\sigma }_{\alpha }^{2}$ controls the overall variance of the parameters, and sα < 0 controls how quickly the variance diminishes towards the higher frequency components. The faster the Fourier coefficients converge towards zero, the smoother the resulting function is [76], and as such, sα can be interpreted as a smoothness parameter. We choose ${\sigma }_{\alpha }^{2}=0.01$ and sα = − 1 for all considered examples. For the Robin coefficient we set β * = 0 and Γβ as in (3.4), where we set ${\delta }_{\beta }^{2}=50$ and l = 10. Draws from both priors are shown in figure 4.
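The sketch below draws boundary shapes from a prior of this type, mimicking figure 4. The Fourier convention assumed for (3.2) (a constant term plus p cosine/sine pairs around the nominal height H) and the power-law decay (1 + k)^(2sα) assumed for (4.1) are our guesses, not the paper's exact definitions.

```python
import numpy as np

def draw_boundary_shapes(n_draws=5, p=7, H=0.05, L=1.0,
                         sigma_alpha2=0.01, s_alpha=-1.0,
                         rng=np.random.default_rng(5)):
    """Prior draws of f(x1) under an assumed Fourier parameterisation and an
    assumed variance decay sigma_alpha2 * (1 + k)**(2 * s_alpha)."""
    x1 = np.linspace(0.0, L, 200)
    k = np.arange(1, p + 1)
    var = np.concatenate(([sigma_alpha2],
                          np.repeat(sigma_alpha2 * (1.0 + k) ** (2 * s_alpha), 2)))
    shapes = []
    for _ in range(n_draws):
        a = np.sqrt(var) * rng.standard_normal(2 * p + 1)   # alpha ~ N(0, Gamma_alpha)
        f = H + a[0] + sum(a[2 * j - 1] * np.cos(2 * np.pi * j * x1 / L)
                           + a[2 * j] * np.sin(2 * np.pi * j * x1 / L)
                           for j in k)
        shapes.append(f)
    return x1, np.array(shapes)

x1, shapes = draw_boundary_shapes()
print(shapes.shape)   # (5, 200): five prior draws of the boundary profile f(x1)
```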


Figure 4. Draws from the prior over the true values of example 2. The shaded areas indicate the one to three standard deviations of the prior.


The meshes used for the generation of the synthetic data are different in each case (see figure 2), and each mesh has between 2000 and 3000 nodes. On the other hand, for all examples, in the data-generation mesh the Robin coefficient is represented using 230 finite element basis functions. The discretisation of the mesh used for inference has a total of 640 nodes, and results in 78 degrees of freedom for the Robin coefficient, i.e., $\boldsymbol{\beta }\in {\mathbb{R}}^{78}$. Thus, the total number of the parameters to be inferred is 78 + 15 = 93.

Finally, we remark that, in line with e.g. [77], the Gauss–Newton method used to compute the MAP estimate is terminated when the norm of the gradient has decreased by a factor of ${10}^{5}$, while the line search is carried out using the Armijo condition with ${c}_{1}={10}^{-4}$ and ${c}_{2}=0.9$, see [33]. All computations were carried out using MATLAB 2018b on a laptop with an Intel i7-8850H CPU and 16 GB RAM.
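A generic sketch of a Gauss–Newton iteration with Armijo backtracking and a gradient-norm stopping rule of the kind described above; the residual/Jacobian callables and the toy exponential fit are illustrative, not the paper's implementation (which also enforces a curvature condition with c2 = 0.9).

```python
import numpy as np

def gauss_newton(residual, jacobian, m0, grad_tol_factor=1e5, c1=1e-4, max_iter=100):
    """Gauss-Newton with Armijo backtracking; stops once the gradient norm has
    dropped by grad_tol_factor relative to its initial value. residual/jacobian
    are hypothetical callables for a (prior-whitened) least-squares misfit."""
    m = m0.copy()
    cost = lambda x: 0.5 * residual(x) @ residual(x)
    g0_norm = None
    for _ in range(max_iter):
        r, J = residual(m), jacobian(m)
        g = J.T @ r                                  # gradient of the cost
        g0_norm = np.linalg.norm(g) if g0_norm is None else g0_norm
        if np.linalg.norm(g) <= g0_norm / grad_tol_factor:
            break
        dm = np.linalg.solve(J.T @ J, -g)            # Gauss-Newton step
        t, c = 1.0, cost(m)
        while t > 1e-12 and cost(m + t * dm) > c + c1 * t * (g @ dm):  # Armijo
            t *= 0.5
        m = m + t * dm
    return m

# Toy usage: nonlinear least-squares fit of y = exp(a*x).
x = np.linspace(0, 1, 20); y = np.exp(1.3 * x)
res = lambda m: np.exp(m[0] * x) - y
jac = lambda m: (x * np.exp(m[0] * x)).reshape(-1, 1)
print(gauss_newton(res, jac, np.array([0.0])))   # should approach [1.3]
```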

4.2. Results

Here we discuss and compare the Laplace approximation of the posterior and the accurate posterior found using MCMC for each of the examples outlined above.

Example 1. For the first example, the MAP estimate was found after 24 Gauss–Newton iterations, while the MCMC convergence criterion was reached in approximately 70 000 steps after burn-in (10 000 steps in all examples), which took approximately two hours. The MAP estimate, Laplace approximation of the (marginal) posterior, and the conditional mean and the (marginal) posterior found using MCMC are all shown in figure 5.

Both the Laplace approximation of the posterior and accurate posterior support the true boundary shape and Robin coefficient well. Comparing the Laplace approximation and accurate posterior for the boundary shape, it is clear that the accurate posterior has significantly narrower confidence intervals, particularly towards the lateral boundaries. Moreover, for the boundary shape, the CM and MAP estimates are in fairly good agreement, while the accurate posterior seems to be symmetric, indicating that the posterior is well approximated by a normal distribution. On the other hand, when comparing the Laplace approximation and accurate posterior for the Robin coefficient, we see that the accurate posterior is negatively skewed (evident from the confidence intervals and the fact that the CM estimate is generally smaller than the MAP estimate).


Figure 5. Results for example 1: MAP and CM estimates of the boundary shape and Robin coefficient, some posterior draws, and the true boundary shape and Robin coefficients. The shaded areas denote the posterior uncertainty estimates as one, two, and three sigma. These are computed as the MAP estimate ± nσ, n = 1, 2, 3, for the Laplace approximation, and as 68–95–99.7% credibility regions for the MCMC results.


Example 2. For the second example, calculating the MAP estimate took 30 Gauss–Newton iterations, while the MCMC convergence criterion was reached in approximately 140 000 steps after burn-in, which took approximately three and a half hours. In figure 6 we show the MAP estimate, Laplace approximation of the posterior and the posterior and conditional mean found using MCMC.

As in example 1, the Laplace approximation of the posterior and the accurate posterior support the true boundary shape and Robin coefficient well. It is also clear that, for both the boundary shape and the Robin coefficient, the estimates are fairly insensitive to the small, highly oscillatory variations in the corresponding truths. Similarly to example 1, the Laplace approximation has slightly wider confidence intervals compared to the accurate posterior, particularly towards the lateral boundaries, and the CM and MAP estimates are in good agreement for the boundary shape. Again, the accurate posterior for the Robin coefficient is negatively skewed.


Figure 6. Results for example 2.


Example 3. For the final example 31 Gauss–Newton iterations were needed to compute the MAP estimate, while the MCMC convergence criterion was reached in approximately 80 000 steps after burn-in, taking approximately two hours. Figure 7 shows the MAP estimate, Laplace approximation of the posterior and the posterior and CM found using MCMC. Since the true boundary shape lies well outside the prior distribution, we also show the projection of the true boundary shape onto the first 15 Fourier basis functions. This provides a reference for the best possible reconstruction.


Figure 7. Results for example 3. The black dashed line shows the projection of the true boundary shape onto the first 15 Fourier basis functions.


As in the previous examples, the Laplace approximation to the posterior and the accurate posterior for the Robin coefficient support the true coefficient well. Conversely, though as expected, the sharp jumps in the boundary shape are poorly supported by both the approximate and accurate posterior densities as these are impossible to estimate using the given prior on the boundary shape. However, we do see that both posterior densities support the projection of the true boundary shape onto the first 15 Fourier basis functions fairly well. It is also evident that the uncertainty in the boundary shape is significantly lower in regions where the domain is narrowest. This is in line with our intuition, as the data (collected on the bottom of the slab) is much more informative about the parts of the boundary that lie closer to where it is measured. Finally, the accurate posterior does show some improvement over the approximate posterior for the boundary shape, particularly at the sharp jumps, as well as being slightly positively skewed.

5. Conclusion

In this paper, we considered the simultaneous inference of the shape of an inaccessible part of the domain boundary and of the Robin coefficient defined on that part of the boundary. We considered three examples with different shapes and associated Robin boundary conditions for the inaccessible part of the domain. By exploiting an invariance property of the Poisson problem we were able to avoid re-meshing as the domain shape changes, which in turn negates the need to recompute the discretised covariance operator of the Robin coefficient on the inaccessible part of the domain. Furthermore, use of the invariance property allowed for a straightforward and efficient means of calculating the required shape derivatives.

In each case, we carried out both local (optimisation-based) approximate uncertainty quantification for both parameters and full exploration of the posterior using MCMC with MALA. To ensure efficiency of the MCMC algorithm, and possible application to larger-scale problems, we initially posed the problem in infinite dimensions and then employed the adjoint approach to compute the required derivatives.

Our results suggest that for the specific cases of the problem considered here, the Laplace approximation generally provides a fairly accurate representation of the true posterior, though it cannot model any skew, which was present in the accurate posterior densities found using MCMC. However, when assessing the applicability and performance of our proposed approach, we need to keep in mind that the current study was only carried out in two dimensions, with domain shape changes essentially restricted to the top surface. Furthermore, the types of boundary shapes which could be estimated were limited to those which could be represented using only a small number of Fourier basis functions. Finally, though common in the literature, the measurement setup used here is hard to carry out in reality. Natural next steps for future work, then, would be to apply the same framework to three dimensional problems, consider a more realistic measurement setup, and investigate other types of representations for the shape of the inaccessible part of the domain.

Acknowledgments

The research of M Niskanen is supported by the Academy of Finland project 321 761. The authors wish to acknowledge Tom ter Elst and Gareth Gordon for their insights into the use of diffeomorphisms as well as Jari Kaipio for several valuable discussions.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Footnotes

  • 3. This representation is used to ensure positivity of the boundary admittance.

  • 4. The shape of the accessible part of the domain is assumed known and fixed.

  • 5. The bilinear form $a:{H}_{D}^{1}({\Omega})\times {H}_{D}^{1}({\Omega})\to \mathbb{R}$ in (2.7) is bounded and coercive, thus by the Lax–Milgram theorem, see e.g., [50, 51], there exists a unique bijection $T:{H}_{D}^{1}({\Omega})\to {({H}_{D}^{1}({\Omega}))}^{\ast }$ such that a(u, v) = (Tu)(v) for all $u,v\in {H}_{D}^{1}({\Omega})$. Thus we may also write the weak form of (2.1) as Tu = F, the solution of which is $u={T}^{-1}F$.

  • 6. Various other representations, such as wavelets [54], could also be used for the boundary shape.

  • 7. Evaluated at the functions corresponding to the Fourier coefficients α and the finite element coefficient vector β .
