Abstract
We study a Bayesian inverse problem arising in the context of resin transfer molding (RTM), a process commonly used for the manufacturing of fiber-reinforced composite materials. The forward model is described by a moving boundary problem in a porous medium. During the injection of resin in RTM, our aim is to update, on the fly, our probabilistic knowledge of the permeability of the material as soon as pressure measurements and observations of the moving resin domain become available. Such an on-the-fly probabilistic characterisation of the material permeability via the inversion of those measurements/observations is crucial for optimal real-time control aimed at minimising both the process duration and the risk of defect formation within RTM. We consider both one-dimensional (1D) and two-dimensional (2D) forward models for RTM. Based on the analytical solution for the 1D case, we prove existence of the sequence of posteriors that arise from a sequential Bayesian formulation within the infinite-dimensional framework. For the numerical characterisation of the Bayesian posteriors in the 1D case, we investigate the application of a fully Bayesian sequential Monte Carlo (SMC) method for high-dimensional inverse problems. By means of SMC we construct a benchmark against which we compare the performance of a novel regularizing ensemble Kalman algorithm (REnKA) that we propose to approximate the posteriors in a computationally efficient manner under practical scenarios. We investigate the robustness of the proposed REnKA with respect to its tuneable parameters and computational cost, and demonstrate its advantages over SMC run with a small number of particles. We further investigate, in both the 1D and 2D settings, practical aspects of REnKA relevant to RTM, including the effect of the pressure-sensor configuration and the observational noise level on the uncertainty in the log-permeability quantified via the sequence of Bayesian posteriors.
The results of this work are also useful for applications other than RTM that can be modelled by a random moving boundary problem.
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
In this paper we study a Bayesian inverse problem within a moving boundary setting motivated by applications in the manufacturing of fiber-reinforced composite materials. Due to their light weight and high strength, as well as their flexibility to fit mechanical requirements and complex designs, such materials are playing a major role in the automotive, marine and aerospace industries [3, 4, 27]. The moving boundary problem under consideration arises from the resin transfer molding (RTM) process, one of the most commonly used processes for manufacturing composite materials. RTM consists of injecting resin into a mold cavity that has the shape of the intended composite part and encloses a previously fabricated fiber-reinforced preform. The next stage of RTM is curing of the resin-impregnated preform, which may start during or after the resin injection. Once curing has taken place, the solidified part is demolded from the cavity. In the present work we are concerned with the resin injection stage of RTM under the reasonable assumption that curing starts after the resin has filled the preform. Though the current study is motivated by RTM, the results can also be used for other applications where a moving boundary problem is a suitable model.
We now describe the forward model (see further details in [3, 38, 47]). Let , , be an open domain representing the physical domain of a porous medium with permeability and porosity φ. The boundary of the domain is , where is the inlet, is the perfectly sealed boundary, and is the outlet. The domain is initially filled with air at a pressure p0. The medium is infused through the inlet boundary with a fluid (resin) of viscosity μ at a pressure pI; the resin moves through the medium, occupying a time-dependent domain , which is bounded by the moving boundary and the appropriate parts of . An example of the physical configuration of this problem in 2D is illustrated in figure 1.
The forward problem for the pressure of resin consists of the conservation of mass
where the flux is given by Darcy's law
with the following initial and boundary conditions
Here is the velocity of the point x on the moving boundary in the normal direction at x, and are the unit outer normals to the corresponding boundaries.
We remark that for definiteness we have assumed that at the initial time the moving boundary coincides with the inlet boundary and that a constant pressure condition is imposed at the inlet. It is not difficult to carry over the inverse problem methodology considered in this paper to other geometries and other conditions on the inlet (e.g. constant rate). Further, in two (three) dimensional RTM settings one usually models permeability via a second (third)-order permeability tensor to take into account the anisotropic structure of the medium [3, 38], but here, for simplicity of exposition, the permeability is a scalar function. Again, the developed methodology generalizes easily to the tensor case.
Let us note that in the 1D case the nonlinear problem (1)–(9) is analytically simple and admits a closed-form solution (see section 2 and [38]), but the two- and three-dimensional cases are much more complicated and an analytical solution is in general not available. We remark that in two and three dimensions the resin can race around low-permeability regions and the front can become discontinuous, creating macroscopic voids behind the main front (see further details in [3, 38]); in this paper we ignore such effects, which deserve further study.
It has been extensively recognized [13, 14, 30, 31, 37, 42] that imperfections in a preform that arise during its fabrication and packing in the molding cavity can lead to variability in fiber placement, which results in a heterogeneous, highly uncertain preform permeability. In turn, these unknown heterogeneities in the permeability of the preform give rise to inhomogeneous resin flow patterns, which can have a profound detrimental effect on the quality of the produced part, reducing its mechanical properties and ultimately leading to scrap. To limit these undesirable effects, conservative designs are used that avoid compromising performance at the cost of heavier, thicker and, consequently, more expensive materials. Clearly, the uncertainty quantification of material properties is essential for making RTM more cost-effective. One of the key elements in tackling this problem is the ability to quantify, in real time, the uncertain permeability, which can, in turn, be used in active control systems aimed at reducing the risk of defect formation.
In this work we assume that , , , , pI, p0, μ and φ are known deterministic parameters while the permeability is unknown. Our objective, within the Bayesian framework, is to infer or, more precisely, its natural logarithm from measurements of pressure at some sensor locations as well as measurements of the front , or alternatively, of the time-dependent domain at a given time t > 0. We put special emphasis on the computational efficiency of the inference, which is crucial from the practical point of view.
1.1. Practical approaches for permeability estimation in fiber-reinforced composites
While the estimation of preform permeability during resin injection in RTM is clearly an inverse problem constrained by a moving boundary PDE such as (1)–(9), most existing practical approaches pose the estimation of permeability in neither a deterministic nor a stochastic inverse problems framework. For example, the very extensive review published in 2010 [41] reveals that most conventional methods for measuring permeability assume that (i) the material permeability tensor is homogeneous and (ii) the flow is 1D (including 2D radial flow configurations). Under these assumptions the resin injection in RTM can be described analytically, via expressions derived from Darcy's law, which enable a direct computation of the permeability in terms of quantities that can be measured before or during resin injection. These conventional methods suffer from two substantial practical limitations. First, they do not account for the heterogeneous structure of the preform permeability, and although they provide an estimate of an effective permeability, this does not enable the prediction of the potential formation of voids and dry spots. Second, those conventional methods compute the permeability in an off-line fashion (i.e. before RTM) with specific mold designs that satisfy the aforementioned assumptions intrinsic to those methods (e.g. rectangular flat molds). This second limitation is not only detrimental to the operational efficiency of RTM but also neglects the potential changes in permeability that can result from encapsulating the preform in cavities with complex designs.
Some practical methodologies for online (i.e. during resin injection) estimation of heterogeneous permeability have been proposed in [36, 49]. While these approaches seem to address the aforementioned limitations of conventional methods, they also use a direct approach for the estimation of permeability which faces unresolved challenges. As an example, let us consider the recent work of [49], which uses an experimental configuration similar to the one described in figure 1 and which, by using pressure measurements from sensors located within the domain occupied by the preform, computes a finite-difference approximation of the normal flux to the front . In addition, by means of images from CCT cameras, the seepage velocity of the resin front is computed in [49]; this velocity is nothing but the quantity defined by (5) in the context of the moving boundary problem (1)–(9). Under the assumption that μ and φ are known, the approach proposed in [49] consists of finding
with and computed from measurements as described above. This approach offers a practical technique for estimating κ on the moving front, from which the whole permeability field could potentially be inferred during the resin injection in RTM. However, from the mathematical inverse problems perspective, this ad hoc approach is not recommended as it involves differentiating observations of pressure data for the computation of . Indeed, it is well known [23] that differentiation of data is an ill-posed problem that requires regularization. In addition, rather than an inverse problem, the least-squares formulation in (10) is a data fitting exercise that excludes the underlying constraint given by the moving boundary problem, which encodes the global effect of κ. As a result, the estimate of permeability obtained via (10) has no spatial correlation and thus fails to provide an accurate global estimate of the permeability field.
The recent work of [29] demonstrates considerable advantages of using systematic data assimilation approaches to infer permeability during the resin injection stage of RTM. By means of a standard ensemble Kalman methodology for data assimilation, the approach of [29] uses measurements from visual observations of the front location to produce updates of the preform permeability within the context of a discrete approximation of the moving boundary problem (1)–(9). While the methodology used in [29] is focused on producing deterministic estimates, the standard Kalman methodology can potentially be used to quantify uncertainty in the preform permeability. However, it has been shown that standard Kalman methodologies, such as the one used in [29], can result in unstable estimates unless further regularisation of the algorithm is applied [18].
In addition to lacking an inverse problems framework, which can lead to unstable and ultimately inaccurate estimates of the permeability during resin injection in RTM, most existing approaches (i) do not incorporate the uncertainty in the observed variables and (ii) do not quantify uncertainty in the estimates of the preform permeability. It is indeed clear from our literature review that the estimation of preform permeability during resin injection deserves substantial attention from an inverse problems perspective capable of quantifying the uncertainty inherent in the fabrication and packing of the preform.
1.2. The Bayesian approach to inverse problems
In this paper we propose the application of the Bayesian approach to inverse problems [46] in order to infer the logarithm of the permeability , from observations collected at some prescribed measurement/observation times during the resin injection in RTM. At each time tn we observe a vector, yn, that contains noisy measurements of resin pressure from sensors as well as some information of the moving domain (or alternatively front location) observed, for example, via CCT cameras or dielectric sensors [32]. In the Bayesian approach, the unknown is a random function that belongs to a space of inputs X. A prior probability measure on u must be specified before the data are collected; this enables us to incorporate prior knowledge which may include design parameters as well as the uncertainty that arises from preform fabrication (i.e. prior to resin injection). In our work we consider Gaussian priors which have been identified as adequate for characterizing the aforementioned uncertainty in log-permeability from the preform fabrication [30, 31, 50] (see also references therein).
At each observation time tn during the infusion of resin in RTM, we then pose the inverse problem in terms of computing, , the (posterior) probability measure of the log-permeability conditioned on measurements . Each posterior then provides a rigorous quantification of the uncertainty in the log-permeability field given all available measurements up to the time tn. Knowledge of each of these posteriors during RTM can then be used to compute statistical moments of the log-permeability under (e.g. mean, variance) as well as expectations of quantities of interest that may be needed for the real-time optimization of controls (e.g. pressure injection) in RTM.
Although the proposed application of the Bayesian formulation assumes Gaussian priors, the nonlinear structure of the PDE problem that describes resin injection in RTM gives rise to a sequence of non-Gaussian Bayesian posteriors which cannot be characterized in closed form. A sampling approach is then required to compute approximations of these posteriors. Among existing sampling methodologies, sequential Monte Carlo (SMC) samplers [1, 8, 22, 34] are particularly relevant for the formulation of the above described inverse problem as they provide a recursive mechanism to approximate the sequence of Bayesian posteriors . Markov chain Monte Carlo (MCMC) approaches [9] can also be used to compute , at each observation time tn. However, conventional MCMC formulations do not exploit the sequential nature of the problem by enabling a recursive estimation of , which is crucial for the optimisation of the RTM process via active control systems.
Starting with J i.i.d. samples $u_0^{(j)} \sim \mu_0$, $j = 1, \ldots, J$, from the prior, the idea behind SMC is to transform a system of weighted particles $\{u_{n-1}^{(j)}, W_{n-1}^{(j)}\}_{j=1}^{J}$ that defines $\mu_{n-1}$ into an updated set $\{u_{n}^{(j)}, W_{n}^{(j)}\}_{j=1}^{J}$ that approximates $\mu_n$ as the new data $y_n$ collected at time $t_n$ become available. The weights are normalised (i.e. $W_n^{(j)} \geq 0$, $\sum_{j=1}^{J} W_n^{(j)} = 1$) and the empirical measure

$$\mu_n^{J} \equiv \sum_{j=1}^{J} W_n^{(j)} \delta_{u_n^{(j)}} \qquad (11)$$

converges to $\mu_n$ as $J \to \infty$ ($\delta_w$ denotes the Dirac measure concentrated at $w$). Moreover, if $f(u)$ denotes a quantity of interest of the unknown log-permeability $u$, the weighted particles can be easily used to compute the sample mean

$$\overline{f}_n^{J} \equiv \sum_{j=1}^{J} W_n^{(j)} f\big(u_n^{(j)}\big),$$

which converges (see for example [34]) to the expectation $\mathbb{E}^{\mu_n}[f]$ of the quantity of interest under $\mu_n$.
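As a concrete, deliberately minimal illustration of how weighted particles are used, the sketch below computes such a sample mean; the function and variable names are ours and not part of the SMC implementation discussed in this paper.

```python
import numpy as np

def weighted_mean(particles, weights, qoi):
    """Approximate the expectation of a quantity of interest qoi(u) under the
    weighted-particle (empirical) approximation of the posterior."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise the weights so they sum to one
    return sum(wj * qoi(uj) for wj, uj in zip(w, particles))

# Toy check: three particles with weights 1/2, 1/4, 1/4.
particles = [np.array([2.0]), np.array([4.0]), np.array([8.0])]
mean = weighted_mean(particles, [0.5, 0.25, 0.25], lambda u: u[0])  # 4.0
```

With equal weights this reduces to the ordinary Monte Carlo sample mean, which is a quick way to test an implementation.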
The recursive computation of the weighted particles in SMC is suitable for the proposed application in RTM as it allows us to update, potentially in real time, our knowledge of the uncertainty in the log-permeability. However, producing accurate approximations of the Bayesian posteriors in the context of the inference of the preform log-permeability in RTM represents a substantial computational challenge that arises from the fact that these posterior measures are defined on an (infinite-dimensional) function space. Upon discretization, these posteriors could potentially be defined on a very high-dimensional space. Unfortunately, it has been shown [5, 9] that standard Bayesian sampling methodologies such as standard SMC do not scale well with the dimension of the (discretized) unknown; this leads to unstable and ultimately inaccurate algorithms.
The recent works [9, 22] developed scalable (dimension-independent) sampling algorithms for the approximation of the Bayesian posterior that arises from high-dimensional inverse problems. While these algorithms have a solid theoretical background that ensures their stability and convergence properties, achieving a desirable level of accuracy often comes at an extremely high computational cost. More specifically, Bayesian methodologies that provide approximations of the form (11) and that converge asymptotically to the underlying posterior measure often involve solving the forward model thousands or even millions of times. In the context of the inverse problem for RTM, the numerical solution of the moving boundary (forward) problem in 2D or 3D settings is computationally very intensive. Therefore, the sequential approximation of the Bayesian posteriors of the preform log-permeability must be conducted with scalable computational efficiency so that it can be realistically used within a near real-time optimization loop for RTM. In the present work we develop a computational inverse framework that possesses such computational efficiency, with the ultimate aim of real-time uncertainty quantification of the reinforced preform's log-permeability.
1.3. Contributions of this work
The contributions of this article are the following:
- (A) A Bayesian formulation of the inverse problem to infer the log-permeability from sequential data collected during resin injection in RTM. Both the 2D forward model described by (1)–(9) and the corresponding 1D version are considered. For the 1D case, we show that the application of the infinite-dimensional Bayesian framework of [46] leads to well-posedness of the sequence of Bayesian posteriors.
- (B) Application of a state-of-the-art SMC framework [22] for the approximation of the sequence of Bayesian posteriors that arises from the Bayesian formulation. Our use of SMC serves two purposes. First, the SMC framework motivates the construction of a novel regularizing ensemble Kalman algorithm (REnKA) that aims at approximating this sequence of posteriors in a computationally efficient manner, thus suitable for implementation in a practical RTM setting. Second, we use a benchmark obtained via SMC to test the accuracy of REnKA.
- (C) Numerical investigation of the accuracy and robustness of the proposed REnKA scheme in the 1D case; this involves constructing, via the SMC sampler of [22], accurate approximations of the posteriors that we use as a benchmark against which we compare the proposed REnKA. The advantages of REnKA in terms of accuracy versus computational cost are showcased by comparing it with the implementation of a low-resolution SMC whose computational cost is comparable to REnKA's.
- (D) Application of REnKA for further investigations of the Bayesian inverse problem in both 1D and 2D. In particular, for the 1D case we conduct a numerical investigation of the added value of assimilating the front location relative to the number of pressure sensors. Since the number of pressure sensors that can be physically deployed for preform permeability monitoring in RTM is usually limited, this investigation aims at providing practitioners with guidelines on the number of sensors needed to accurately infer the preform permeability along with its uncertainty. In addition, for the 1D case we study the effect of the frequency of the observations, as well as the observational noise level, on the inference of the log-permeability. We further apply REnKA to the 2D forward model and, analogously to the 1D case, we study the effect that the number of pressure sensors has on the inferred log-permeability.
The rest of the paper is organized as follows. In section 2 we introduce the Bayesian inverse problem of inferring the permeability of a porous medium in a 1D moving boundary problem for resin injection in RTM. In section 3 we discuss and apply SMC to approximate the Bayesian posteriors that arise from the Bayesian approach. In section 4 we introduce REnKA and then conduct, via numerical experiments with the 1D RTM model, a numerical investigation of its approximation properties relative to its computational cost. In section S1 of the supplementary material (stacks.iop.org/IP/34/105002/mmedia) we further investigate relevant practical aspects of the inverse problem in 1D, including the study of the effect of the number of pressure sensors as well as the noise level on the accuracy of the inferred log-permeability and its uncertainty. In section 5 we demonstrate the applicability of REnKA to approximate the Bayesian inverse problem in the 2D RTM model described earlier. Some conclusions are presented in section 6.
2. Bayesian inversion of a 1D RTM model
In this section we apply the Bayesian approach to infer log-permeability in the context of the 1D version of the forward problem defined in (1)–(9). The corresponding 1D moving boundary problem induces a sequence of forward maps that we define in section 2.1 and that we aim at inverting with the Bayesian formalism that we introduce in section 2.2. This sequence of 1D forward maps admits a closed-form solution that can be numerically approximated at a very low computational cost. This will enable us in section 3 to obtain accurate numerical approximations of the solution to the Bayesian inverse problem; we use these accurate approximations as a benchmark for assessing the approximation properties of the ensemble Kalman algorithm that we introduce in section 4.
2.1. The forward 1D RTM model
Let us consider a 1D porous medium with physical domain . As before, we denote by () and the permeability and porosity of the porous medium, respectively. Resin with viscosity μ is injected at x = 0 at a pressure pI. The pressure at the moving front (outlet) is prescribed and equal to p0. The initial pressure distribution before injection is also set to p0. For convenience of the subsequent analysis, we parameterize the permeability in terms of its natural logarithm . The pressure and the moving front are given by the solution to the following model
The solution to (13)–(17) can be obtained analytically by the following proposition (see [3, 38, 47]).
Proposition 2.1. Given , let us define
The unique solution , of (13)–(17) is given for by
The quantity of interest arising from the RTM injection model is the so-called filling time: the time it takes the front to reach the right boundary of the domain of interest . Filling time, denoted by , is defined by . From (19) and the definition in (18) it follows [38] that is given by
Note that the parameters p0 and pI are prescribed control variables and thus known. In addition, we assume that μ and φ are known constants. As stated earlier, we are interested in the inverse problem of estimating the permeability, or more precisely its natural logarithm given time-discrete measurements of the front location as well as the pressure from M sensors located at . We denote by the set of N observation times. For fixed (assumed known) parameters pI, p0, φ and μ, the solution to the PDE model (13)–(17) induces the nth forward map defined by
Given , the evaluation of the forward map predicts the location of the front and the pressure at the sensor locations at the time t = tn. Since observation times are prescribed before the experiment, there is no assurance that for a given u the corresponding filling time satisfies for all . In other words, the front could reach the right end of the domain before we observe it at time tn. In a real experimental setting, the process stops at time . However, in the inverse problem of interest here, observation times are selected beforehand, and the exploration of candidate u's within the Bayesian calibration of the nth forward map can lead to filling times greater than some observation times. In this case, the definition (22) yields .
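Although the closed-form expressions of proposition 2.1 are not reproduced above, a sketch of the induced forward map can be reconstructed from the standard Darcy-based derivation (uniform flux in the filled region, with front speed given by the seepage velocity). Everything below, from the function name `forward_map_1d` and the uniform grid to the default parameter values, is our own illustrative assumption and not the authors' code.

```python
import numpy as np

def forward_map_1d(u, t, sensors, p_I=2.0, p_0=1.0, mu=1.0, phi=0.5, L=1.0):
    """Sketch of the nth forward map: given log-permeability samples u on a
    uniform grid of [0, L], return (front location, pressures at the sensor
    positions) at time t.  With w(x) = int_0^x exp(-u(s)) ds, the standard
    Darcy derivation gives the implicit front equation
        phi * mu * int_0^{x_f} w(s) ds = (p_I - p_0) * t,
    and p(x, t) = p_I - (p_I - p_0) * w(x) / w(x_f) for x <= x_f(t)."""
    x = np.linspace(0.0, L, len(u))
    dx = x[1] - x[0]
    integrand = np.exp(-u)
    # w(x) and its antiderivative W(x) by cumulative trapezoidal integration
    w = np.concatenate(([0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dx)))
    W = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * dx)))
    rhs = (p_I - p_0) * t / (phi * mu)
    if rhs >= W[-1]:
        x_front = L          # t is beyond the filling time; front is at the outlet
    else:
        x_front = np.interp(rhs, W, x)   # invert the monotone map W(x_f) = rhs
    w_front = np.interp(x_front, x, w)
    pressures = np.where(np.asarray(sensors) <= x_front,
                         p_I - (p_I - p_0) * np.interp(sensors, x, w) / w_front,
                         p_0)
    return x_front, pressures
```

For constant log-permeability u = 0 this reduces to the textbook square-root-of-time front law, which provides a quick sanity check of the sketch.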
The following theorem ensures the continuity of the forward map, which is necessary for justifying the application of the Bayesian framework in section 2.2.2.
Theorem 2.2. The forward map is continuous.
For the proof of this theorem, see appendix A.
In the following subsection we apply the Bayesian framework for inverse problems in order to invert observations of .
Remark 2.3. We note that for the present work the porosity φ is an assumed known constant; our objective is to infer the log-permeability . However, the Bayesian methodology that we apply can be extended to the case where the unknown is not only but also φ, and can include the case where is a spatial function defined on the physical domain .
2.2. The Bayesian inverse problem
Suppose that, at each observation time t = tn, we collect noisy measurements of the front location as well as pressure measurements from sensors. We denote these measurements by and , respectively. Our aim is to solve the inverse problem of estimating the log-permeability given all the data up to time t = tn. We assume that the aforementioned observations are related to the unknown , via the forward map (22), in terms of
where and are realizations of Gaussian noise with zero mean and covariance and , respectively, i.e. and (i.i.d.). For simplicity we assume that both measurements of the front location and pressures are uncorrelated in time. We additionally assume that and are uncorrelated for all .
Note that (23) and (24) can be written as
with
Remark 2.4. Due to the nature of the RTM problem, the pressure p(xm,t) at each sensor xm should increase with time, and we also have that . However, the Gaussian noise in (23) and (24) can make the observations and 'unphysical'. In practice, observations need to be post-processed before they are used for the Bayesian inverse problem, and unphysical values should be excluded. We leave the question of how to incorporate such a post-processing framework for future study. Here we follow the traditional point of view on data modeled via (23) and (24) and choose sufficiently small and so that the probability of observations being unphysical is very low.
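The phrase 'sufficiently small' in remark 2.4 can be made quantitative with the Gaussian tail: the sketch below (our own illustration, with made-up numbers for the front position, noise level and domain length) evaluates the probability that a noisy front observation falls outside the physical domain [0, L].

```python
import math

def prob_unphysical_front(z, sigma, L):
    """Probability that a Gaussian observation z + N(0, sigma^2) of a front
    located at z in [0, L] falls outside the physical domain."""
    # Standard normal CDF expressed via the error function.
    cdf = lambda s: 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))
    below = cdf((0.0 - z) / sigma)        # observation below the inlet
    above = 1.0 - cdf((L - z) / sigma)    # observation beyond the outlet
    return below + above

# With the front at z = 0.5 in a unit-length domain and sigma = 0.05 the
# probability of an unphysical observation is negligible; near the outlet
# (z = 0.98) it is not, which is when post-processing would be needed.
print(prob_unphysical_front(0.5, 0.05, 1.0))
print(prob_unphysical_front(0.98, 0.05, 1.0))
```

This gives a simple rule of thumb for choosing noise levels in synthetic experiments: keep several standard deviations between the true value and the physical bounds.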
We adopt the Bayesian framework for inverse problems where the unknown is a random field and our objective is to characterize the sequence of distributions of u conditioned on the observations which we express as . In other words, at each observation time t = tn we aim at computing the Bayesian posterior . From this distribution we can obtain point estimates of the unknown that can be used in real time to, for example, modify controls (e.g. pI). More importantly, as we stated in the Introduction, the aforementioned distribution enables us to quantify uncertainty not only of the unknown but also of quantities of interest that may be relevant to an optimization of resin injection in RTM.
Even though, for illustrative purposes, the model presented in this section is discretized on a relatively low-dimensional space (e.g. 60 cells), our aim is to introduce a general computational framework independent of the size of the discretized domain. We therefore consider an infinite-dimensional formulation of the Bayesian inverse problem for which the unknown u belongs to a function space X. The discretization of the Bayesian inverse problem is conducted at the last stage of the computational algorithm, when the posteriors are sampled/approximated. Thus, we are aiming at robust mesh-invariant computational algorithms.
2.2.1. The prior.
For the Bayesian approach that we adopt in this work, we need to specify a prior distribution of the unknown before the data are collected. This distribution comprises all our prior knowledge of the unknown and may include, for example, the regularity of the space of admissible solutions to the inverse problem. For the present work we consider Gaussian priors, which have been used to characterize the uncertainty in the (log) permeability that arises from the preform fabrication [30, 31, 50] (see also references therein). In particular, here we consider stationary Gaussian priors with covariance operator that arises from the Whittle–Matérn correlation function defined by [25, 28, 39, 43]:
where Γ is the gamma function, l is the characteristic length scale, is an amplitude scale and is the modified Bessel function of the second kind of order ν. The parameter ν controls the regularity of the samples. It can be shown [11, 43] that, for any , if , then almost-surely, i.e. . This requirement, together with the continuity of the forward map ensures the well-posedness of the Bayesian inverse problems as we discuss in the next subsection. In the context of composite preform's permeability, it is natural to choose the mean according to the log-permeability intended by the design of the composite part [38].
For computational purposes we use the prior to parametrize the unknown u in terms of its Karhunen–Loève (KL) expansion [2]:
with coefficients uk, where and are the eigenvalues and eigenfunctions of , respectively. A random draw from the prior can then be obtained from (28) by drawing i.i.d.
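A discrete analogue of this construction is easy to sketch: on a grid, the covariance operator becomes a matrix whose eigendecomposition plays the role of the KL expansion. The code below is our own illustration, restricted to the half-integer values ν = 1/2 and ν = 3/2, for which the Whittle–Matérn correlation has closed forms that avoid the modified Bessel function; the names and default values (`ell`, `sigma2`, the 60-cell grid) are assumptions for illustration only.

```python
import numpy as np

def matern_covariance(x, sigma2=1.0, ell=0.2, nu=0.5):
    """Whittle-Matern covariance matrix on the grid x, for half-integer nu
    (closed forms that avoid the modified Bessel function K_nu)."""
    r = np.abs(x[:, None] - x[None, :])
    if nu == 0.5:                       # exponential correlation
        return sigma2 * np.exp(-r / ell)
    if nu == 1.5:
        s = np.sqrt(3.0) * r / ell
        return sigma2 * (1.0 + s) * np.exp(-s)
    raise NotImplementedError("use scipy.special.kv for general nu")

def sample_prior(x, n_samples, rng, **kwargs):
    """Draw prior samples via the discrete Karhunen-Loeve expansion
    u = sum_k sqrt(lambda_k) xi_k v_k with xi_k ~ N(0, 1) i.i.d."""
    C = matern_covariance(x, **kwargs)
    lam, V = np.linalg.eigh(C)          # eigenvalues/eigenvectors of C
    lam = np.clip(lam, 0.0, None)       # guard against tiny negative round-off
    xi = rng.standard_normal((len(x), n_samples))
    return V @ (np.sqrt(lam)[:, None] * xi)

x = np.linspace(0.0, 1.0, 60)           # e.g. a 60-cell grid, as in this section
u = sample_prior(x, 1000, np.random.default_rng(0), nu=1.5)
```

Larger ν produces smoother draws, consistent with the regularity statement above; adding the prior mean to each column would center the samples at the designed log-permeability.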
2.2.2. The posterior.
From (25) and our Gaussian assumptions on the observational noise, it follows that for a fixed , we have . Therefore, the likelihood of is given by
At a given time t = tn, the Bayesian posterior is defined by the following infinite-dimensional version of Bayes's rule.
Theorem 2.5 (Bayes theorem [46]). Let be the sequence of forward maps defined by (22) and let be the corresponding likelihood functions (29). Let be the prior distribution with correlation function (27). Then, for each , the conditional distribution of , denoted by , exists. Moreover, with the Radon–Nikodym derivative
where
Proof. The proof follows from the application of theorem 6.31 in [46] and the continuity of the forward maps (theorem 2.2) on a full -measure set X. □
Note that from our assumption of independence of , the right hand side of (30) is the likelihood of .
Remark 2.6. Due to the assumption of independence between front location and pressure measurements, the likelihood (29) can be expressed as
where
This enables us to define two particular cases of the inverse problem. The first case corresponds to the assimilation of only pressure measurements , while in the second case only front location measurements are assimilated. Similar arguments to those that led to theorem 2.5 can be applied (with instead of ) to define the Bayesian posteriors and associated with these two Bayesian inverse problems. In section S1 of the supplementary material we study these two particular cases with an eye towards understanding the added value of assimilating observations of the front location with respect to assimilating only pressure measurements.
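Under the independent Gaussian noise model, the factorised likelihood of remark 2.6 corresponds to a sum of two misfit terms in the negative log-likelihood. The following sketch (our own notation, assuming scalar noise levels rather than full covariance matrices) makes this explicit.

```python
import numpy as np

def neg_log_likelihood(pred_front, pred_pressures, obs_front, obs_pressures,
                       sigma_front, sigma_pressure):
    """Negative log-likelihood (up to an additive constant) under independent
    Gaussian noise on the front location and the pressure measurements:
    a front misfit term plus a pressure misfit term."""
    front_term = 0.5 * ((obs_front - pred_front) / sigma_front) ** 2
    pressure_term = 0.5 * np.sum(
        ((np.asarray(obs_pressures) - np.asarray(pred_pressures))
         / sigma_pressure) ** 2)
    return front_term + pressure_term
```

Dropping one of the two terms recovers the pressure-only and front-only cases described above, which is how the two particular Bayesian inverse problems arise in practice.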
3. Approximating the posteriors via sequential Monte Carlo method
In the previous section we established the well-posedness of the Bayesian inverse problem associated with inferring the log-permeability in the 1D moving boundary problem (13)–(17). The solution of this inverse problem is the sequence of posterior measures defined by theorem 2.5. As we discussed in section 1, these posteriors cannot be expressed analytically, so a sampling approach is required to compute the corresponding approximations. As stated earlier, the sampling of each posterior () can be performed independently by, for example, Markov chain Monte Carlo (MCMC) methods. However, we reiterate that, for the present application, SMC samplers are rather convenient as they exploit the sequential nature of the considered inverse problem by enabling a recursive approximation of the posterior measures as new data (in time) become available. Such recursive approximations of the posterior could enable practitioners to update their probabilistic knowledge of the preform log-permeability, which is, in turn, essential to develop real-time optimal control strategies for RTM under the presence of uncertainty.
Recognizing that the inverse problem under consideration involves inferring a function potentially discretized on a very fine grid, it is vital to consider SMC samplers such as the one introduced in [22], which is carefully designed for approximating measures defined on a high-dimensional space. In this section we review and apply this scheme for the approximation of the Bayesian posteriors defined in the previous section. The aims of this section are to (i) provide a deeper quantitative understanding of the accuracy of the fully-Bayesian methodology of [22] with respect to its computational cost under practical computational conditions; (ii) motivate the REnKA that we derive from this SMC sampler in section 4; and (iii) define accurate approximations of  which we use as a benchmark for testing our REnKA scheme.
In section 3.1 we briefly discuss the essence of the standard SMC that we then use in sections 3.2–3.3 to review methodological aspects of the adaptive-tempering SMC sampler for high-dimensional inverse problems of [22]. We then apply this SMC in section 3.4 for the solution of the Bayesian inverse problem in the 1D case defined in the previous section. In section 3.5 we assess practical limitations of the SMC.
3.1. Standard SMC for Bayesian inference
As we discussed in the Introduction, starting with the prior , the objective of SMC is to recursively compute an approximation of the sequence of Bayesian posteriors in terms of weighted particles. More specifically, assume that at the observation time tn, we have a set of J particles with, for simplicity, equal weights (, ), which provides the following particle approximation of :
The objective now is to construct a particle approximation of , which includes the new data yn collected at time tn. In a standard SMC framework [8, 10, 33], this particle approximation is constructed by means of an importance sampling step with proposal distribution . To illustrate this methodology, let us first note formally that
where we have used
which can be obtained directly from theorem 2.5. An approximation of (35) can be obtained by
where
From (37) we see that the importance (normalized) weights assigned to each particle define the following empirical (particle) approximation of :
However, the accuracy of such an empirical approximation relies on  being sufficiently close to ; when this is not the case, after a few iterations (observation times) the algorithm may produce only a few particles with nonzero weights. This is the well-known issue of weight degeneracy that often arises from the application of empirical (importance sampling) approximations within the context of SMC samplers [5]. Weight degeneracy is routinely measured in terms of the effective sample size (ESS) statistic [24]:
which takes a value between 1 and J; ESS = J when all weights are equal and ESS = 1 when the distribution is concentrated at a single particle. A common approach to alleviate weight degeneracy is, for example, to specify a threshold for the ESS below which resampling (often multinomial) according to the weights is performed. Resampling discards particles with low weights by replacing them with several copies of particles with higher weights. The approximation of a sequence of measures via the importance sampling step followed by resampling leads to the sequential importance resampling (SIR) scheme [10].
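The ESS statistic (40) and the multinomial resampling step can be sketched as follows. This is a minimal illustration; the toy weights and particle labels are our own and not taken from the experiments in this paper.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS of importance weights: takes a value between 1 and J."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize the weights
    return 1.0 / np.sum(w ** 2)

def multinomial_resample(particles, weights, rng):
    """Draw J particles with replacement according to their weights."""
    w = np.asarray(weights, dtype=float)
    J = len(particles)
    idx = rng.choice(J, size=J, p=w / w.sum())
    return [particles[i] for i in idx]

# Equal weights give ESS = J; a degenerate weight vector gives ESS = 1,
# and resampling then produces J copies of the dominant particle.
rng = np.random.default_rng(0)
ess_equal = effective_sample_size([0.25, 0.25, 0.25, 0.25])
ess_degen = effective_sample_size([1.0, 0.0, 0.0, 0.0])
resampled = multinomial_resample(['a', 'b', 'c', 'd'],
                                 [1.0, 0.0, 0.0, 0.0], rng)
```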
It is important to note that the aforementioned resampling step in SIR can clearly lead to a lack of diversity in the population of resampled particles. This is, in turn, detrimental to the approximation of the sequence of posteriors. The general aim of the standard SMC approach is to diversify these particles by a mutation step, which involves replacing them with samples from a Markov kernel with invariant distribution . In the following subsection we discuss this mutation in the context of the SMC sampler for high-dimensional inverse problems of [22]. We refer the reader to [8, 10, 33, 34] for a thorough treatment of more standard SMC samplers.
3.2. SMC for high-dimensional inverse problems
The weight degeneracy in the importance sampling step described above is more pronounced when the two consecutive measures and differ substantially from each other. This has been particularly associated with complex (e.g. multimodal) measures defined in high-dimensional spaces. When the change from to is abrupt, the importance sampling step can result in a sharp failure, whereby the approximation of is concentrated on a single particle [5]. Recent work for high-dimensional inference problems has suggested [1, 22] that further stabilization of the importance weights is needed by defining a smooth transition between and . For the present work, we consider the annealing approach of [34, 35], where qn intermediate artificial measures are defined such that and . These measures can be bridged by introducing a set of qn tempering parameters denoted by that satisfy and defining each as the probability measure with density proportional to with respect to . More specifically, satisfies
which, formally, implies
Note that when qn = 1, and so expression (42) reduces to (36). We now follow the SMC algorithm for high-dimensional inverse problems as described in [22].
3.2.1. Selection step.
The first stage of the SMC approach of [22] is a selection step which consists of careful selection of the tempering parameters which define the intermediate measures ; these are in turn approximated by the application of the SIR scheme described above. Let us then assume that at an observation time tn and iteration level r − 1, the tempering parameter has been specified, and that a set of particles provides the following approximation (with equal weights) of the intermediate measure :
From (42) we can see that the new tempering parameter must be selected to ensure that is sufficiently small, so that the subsequent measure is close to thus preventing a sharp failure of the empirical approximation of (39). In particular, once the next tempering parameter is specified, we note from expression (42) that the importance weights for the approximation of are given by
Recognizing that the ESS in (40) quantifies weight degeneracy in SIR, the approach of [22] (see also [21]) proposes to define on-the-fly the next tempering parameter by imposing a fixed, user-defined value Jthres on the ESS. More specifically, is defined by the solution to the following equation:
which may, in turn, be solved by a simple bisection algorithm on the interval . An approximation of is then given by the weighted particle set . If at the r − 1 level, we find that , it implies that no further tempering is required and thus one can simply define . We note that the number of tempering steps qn is random.
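The adaptive selection of the next tempering parameter via the bisection solve of (45) can be sketched as follows. This is a hedged sketch: we assume the incremental weights take the standard tempering form, proportional to exp of minus the tempering increment times the data misfit of each particle, and all function names are illustrative.

```python
import numpy as np

def ess_at(phi_new, phi_old, misfits):
    """ESS of incremental weights w_j ∝ exp(-(phi_new - phi_old) * misfit_j)."""
    logw = -(phi_new - phi_old) * np.asarray(misfits, dtype=float)
    logw -= logw.max()                   # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def next_tempering(phi_old, misfits, J_thres, tol=1e-10):
    """Bisection on (phi_old, 1] for ESS(phi) = J_thres, clipped at 1."""
    if ess_at(1.0, phi_old, misfits) >= J_thres:
        return 1.0                       # no further tempering needed
    lo, hi = phi_old, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess_at(mid, phi_old, misfits) > J_thres:
            lo = mid                     # weights still healthy: move toward 1
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the ESS at the full step (tempering parameter equal to one) already exceeds the threshold, the routine returns 1 and the tempering at this observation time terminates, mirroring the stopping rule described above.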
While the tempering approach described above is aimed at preventing ESS from falling below a specified threshold Jthres and thus avoiding a sharp failure of the empirical approximation of , resampling is still required to discard particles with very low weights. Let us then denote by () the particles, with equal weights, that result from resampling with replacement of the set of particles according to the weights .
3.2.2. Mutation phase.
As stated in the preceding subsection, at the core of the SMC methodology is a mutation phase that adds diversity to the population of the resampled particles . In the context of the tempering approach described above, this mutation is conducted by means of sampling from a Markov kernel with invariant distribution . Similar to the approach of [22], here we consider mutations given by running steps of an MCMC algorithm with as its target distribution. More specifically, we consider the preconditioned Crank–Nicolson (pcn)-MCMC method from [9] with target distribution and reference measure . Formally, these two measures are related by
The pcn-MCMC method for sampling  is summarised in algorithm 2 (see appendix B). Under reasonable assumptions this algorithm produces a -invariant Markov kernel [22]. The resulting particles, denoted by  (), then provide the following particle approximation of :
where the convergence is proven in a suitable metric for measures [1]. Note that at the end of the iteration r = qn, the corresponding particle approximation provides the desired approximation of the posterior that arises from the Bayesian inverse problem of interest. This SMC sampler is summarized in algorithm 3 (see appendix B).
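A single pCN move of the mutation phase might look as follows. This is a sketch under the standard pCN acceptance rule for a tempered target whose density with respect to the Gaussian prior is proportional to exp(-phi * Phi(u)); the scalar state, the step size beta, and the function names are illustrative assumptions, not the authors' implementation (algorithm 2 in appendix B).

```python
import numpy as np

def pcn_step(u, neg_log_like, phi, beta, sample_prior, rng):
    """One pCN move targeting d(mu)/d(mu0) ∝ exp(-phi * Phi(u)).

    The proposal v = sqrt(1 - beta^2) u + beta xi (xi drawn from the
    prior) is prior-reversible, so the accept ratio involves only the
    data misfit Phi and not the prior density.
    """
    v = np.sqrt(1.0 - beta ** 2) * u + beta * sample_prior(rng)
    log_alpha = phi * (neg_log_like(u) - neg_log_like(v))
    if np.log(rng.uniform()) < min(0.0, log_alpha):
        return v, True                   # accept the proposal
    return u, False                      # reject: keep the current state
```

In the SMC sampler this kernel is applied a fixed number of times to each resampled particle, with the current tempering parameter playing the role of phi.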
Remark 3.1. For simplicity, here we use the resampling step at every iteration of the SMC sampler. However, whenever (and so ) the resampling step can be skipped; this involves using the corresponding weighted particle approximation and modifying the formula for the incremental weights as discussed in [22, section 4.3].
3.3. A note on tempering
Let us define the following inverse of the increment in tempering parameters:
and note that implies . In addition, expression (42) can be written as
where we have used the definition of the likelihood in (29). Informally, we can then interpret each iteration of the SMC sampler (at a given observation time tn) as the solution of a Bayesian inverse problem that consists of finding given the prior and the data:
From (48) and the fact that , it follows that . Therefore, (50) is nothing but the original problem (25) albeit with a noise that has an inflated covariance .
We also note that  plays the role of a regularization parameter in the sense that it controls the transition between  and : the larger the , the smoother this transition. Alternatively,  can be interpreted as a 'temperature' in the tempering scheme which, in turn, flattens out the likelihood function at the observation time tn. Clearly, more tempering will be required whenever  is large; this can, for example, happen if the observational data are accurate (i.e. small ) and/or many observations are available.
The amount of tempering is controlled by the number of parameters  obtained via (45). The greater the number of tempering parameters, the larger the 's, which in turn indicates that more regularization is needed to ensure a stable transition between those measures. This, in turn, also increases the number of iterations and thus the computational cost.
3.3.1. Computational aspects of SMC.
The main computational cost of the SMC sampler previously discussed is attributed to the mutation step for which steps of the pcn-MCMC algorithm are performed. At each observation time tn and iteration r, the SMC sampler then requires evaluations of the nth forward map . Therefore, the computational cost of computing is , where gn denotes the computational cost of evaluating which, in turn, corresponds to solving the moving boundary problem from time t = 0 up to time tn. The total computational cost of computing the full sequence of posteriors is then
which is expressed in terms of gN, the cost of evaluating (i.e. solving the forward model from time zero up to the final observation time).
The work of [22] has suggested that accurate approximations of the posterior via SMC samplers require, for example, values of  and J = 10^4. If we assume for a moment that only one observation time N = 1 is considered and that only one tempering step q1 = 1 is required to compute , the computational cost in this case would be approximately 10^5 times the cost of solving the forward model from time t = 0 up to time t1. Such a cost would clearly be computationally prohibitive for practical applications, where the aforementioned forward simulation may take several minutes of CPU time. In particular, for the 2D or 3D version of the RTM process, the high computational cost of the SMC sampler becomes impractical. While reducing the values of J and  may result in a more affordable computational cost, this is substantially detrimental to the level of accuracy of the SMC sampler, as we show via numerical experiments in section 3.5. Alternatively, a parallel implementation of the J forward model evaluations can substantially reduce the cost of this algorithm, by a factor of up to J. However, within the manufacturing industry, the availability of computer resources that can deliver parallel simulations is the exception rather than the norm.
3.4. Numerical examples with SMC
In this subsection we report the results from the numerical application of the SMC sampler discussed in the previous subsection. The objective is to approximate the sequence of Bayesian posteriors that arise from the 1D moving boundary problem defined in section 2 for the experimental set-up described in section 3.4.1. In section 3.4.2 we discuss the numerical results obtained via the SMC sampler with a very high number of particles which results in accurate approximations of the Bayesian posteriors. These approximations are then used in section 3.5 to assess the practical limitations of the scheme under certain choices of tunable parameters and number of particles. These limitations motivate the approximate methods that we propose in section 4.
3.4.1. Experimental set-up.
We consider a dimensionless version of the 1D model (13)–(17) which together with its numerical approximation is described in appendix C. The dimensionless values for the control variables are p0 = 1 and pI = 2. We use a Gaussian prior distribution with the covariance operator that arises from the covariance function defined in (27). We numerically solve (off-line) the eigenvalue problem associated to the matrix that results from discretizing ; the corresponding eigenvectors/eigenvalues are then stored for subsequent use in the parameterization of the log-permeability in the SMC sampler. The KL expansion (28) becomes a truncated sum with a number of elements equal to the total number of eigenvalues of this matrix; these are, in turn, equal to the number of cells used for the discretization of the domain . No further truncation to this KL expansion is carried out. A few samples from the prior are displayed in figure 2 (right). Pointwise percentiles (0.02, 0.25, 0.5, 0.75 and 0.98) of the prior are displayed in figure 3 (top-left). Tuneable parameters of the prior for the present experiments are , , l = 0.05 and  for all .
In order to generate synthetic data, we define the 'true/reference' log permeability field whose graph (red curve) is displayed in figure 3 (top-left); this function is a random draw from the prior described above. We use  in the numerical implementation of (13)–(17) in order to compute the true pressure field as well as the true front location . The plot of  is shown in figure 2 (left) together with the space-time configuration of M = 9 pressure sensors and N = 5 observation times. The graphs of  are shown in figure 2 (middle). The true locations of the front are 0.21, 0.40, 0.58, 0.73 and 0.87. Synthetic data are then generated by means of  and , where  and  are Gaussian noise (see section 2.2) with standard deviations equal to 1.5% of the size of the noise-free observations. Synthetic pressure data are superimposed on the graphs of  in figure 2 (middle). Synthetic front locations are 0.21, 0.39, 0.59, 0.74, 0.86. In order to avoid inverse crimes, synthetic data are generated by using a finer discretization (with 120 cells) than the one used to approximate the posteriors (with 60 cells).
3.4.2. Application of SMC.
In this subsection we report the application of the SMC sampler of [22] (see algorithm 3 in appendix B) which, as described in the preceding section, provides a particle approximation of each posterior that converges to the exact posterior measure as the number of particles J goes to infinity. In order to achieve a high level of accuracy we use J = 10^5 particles, which is substantially larger than the number of particles (e.g. 10^3 to 10^4) often used in existing applications of SMC for high-dimensional inverse problems [8, 22]. In addition, we consider the selection of tunable parameters  and  similar to the ones suggested in [22]. For each observation time tn, we store the ensemble of particles that approximates the corresponding posterior . From this ensemble, we compute the 0.02, 0.25, 0.5, 0.75, 0.98 posterior percentiles displayed in figure 3 (top-middle to bottom-right), where we also include the graph of the true log-permeability (red curve). The vertical line in these figures indicates the true location of the front at each observation time tn. We can clearly appreciate that the uncertainty band defined by these percentiles is substantially reduced as more observations (in time) are assimilated. In fact, the main reduction of the uncertainty is observed in the region of the moving domain at the corresponding observation time tn. It is then clear that at each observation time tn, measurements collected from pressure sensors with  are not very informative of the log-permeability field. This comes as no surprise when we recognize that the pressure field given by (20) depends on the permeability field only in the region of the moving domain . In other words, the values of the permeability in the region defined by  have no effect on p(x, tn); hence the nth likelihood function is independent of u in this region.
We can indeed observe from figure 3 that the percentiles of the log-permeability in this region (see domain to the right of the vertical lines) is similar to those from the prior. However, due to the regularity of the log-permeability enforced in the prior , there is a smooth transition in the uncertainty band at the interface defined by the front location .
The number of intermediate tempering distributions that SMC adaptively computed to approximate the sequence of posteriors were the following:
We use these numbers in (51) to compute the total computational cost of approximating the sequence . The values of gn (i.e. cost of evaluating each ) are estimated by the average CPU time from 1000 simulations computed with different log-permeabilities sampled from the prior. We obtain that the total cost is approximately  times the cost of evaluating the 5th forward map (i.e. at the final observation time). Clearly, this computational cost is prohibitive for the two and three dimensional problems where, as stated earlier, evaluating the forward map can take several minutes of CPU time. For the present 1D case we are able to afford this cost due to the relatively low cost associated with solving the 1D moving boundary problem.
3.5. Reducing the cost of SMC by adjusting tunable parameters
Given the high computational cost of computing accurate approximations of the posteriors with SMC, it is reasonable to ask whether this cost can be reduced by adjusting the tunable parameters in (51). By reducing either the number of particles J and/or the number of MCMC steps , we can achieve a substantial decrease in the computational cost. The selection of Jthresh also determines the computational cost as it, in turn, defines the number of tempering steps for each posterior. However, it is essential to understand the effect of decreasing these tunable parameters on the accuracy of the SMC sampler. In this subsection we aim at understanding this effect by comparing the application of the SMC sampler with a smaller number of particles J and different choices of the tunable parameters  and Jthresh. This requires creating a benchmark against which we can compare the performance of SMC. The benchmark is obtained from the highly-resolved characterization of the posteriors that we computed in the preceding section by using the SMC sampler with a large number of particles (J = 10^5). In section S1 of the supplementary material SupMat we provide further discussions of the performance and diagnostics of the SMC sampler applied to approximate these posterior measures. These diagnostics offer evidence that the SMC sampler has been successfully applied, thereby providing an accurate characterization of the posterior that we may use as a benchmark to compare against the posteriors computed via algorithms with lower resolution/accuracy. The numerical investigation below is aimed at assessing SMC with different selections of ensemble size J as well as the tunable parameters  and Jthresh.
For the reasons stated above, throughout the rest of this and the following sections, we refer to the aforementioned highly-resolved SMC particle approximations (with J = 10^5) as the 'exact' sequence of posteriors that we use for subsequent comparison purposes. Moreover, for these comparisons we assume that the sample mean and variance of these SMC samples are exact approximations of the mean and variance of the posterior . In other words, we assume
where
Let us now consider the application of the SMC sampler for the following choices of small numbers of particles: . We also consider three choices of the tunable parameter Jthresh () and two choices of  (). In figure 4 we show percentiles of the log-permeability posteriors (for  and 5) obtained using the aforementioned SMC sampler for some of those choices of the number of particles J, and with the same selection of tunable parameters ,  that we used for the highly-resolved SMC; percentiles from the latter are included in the right column of figure 4 for comparison purposes. We can see that, as the number of particles increases, the SMC approximation improves when compared to the one provided by the highly-resolved SMC. Note that a very small number of particles results in very poor approximations of these percentiles.
In order to quantify the level of approximation obtained with SMC with the aforementioned selections of parameters, we compute the -relative errors of the mean and variance with respect to the posterior measure approximated with the highly-resolved SMC computed as described in the preceding subsection. More precisely, let us define
where and are the -posterior mean and variance characterized via SMC with large J from (53) and (54). In the previous expressions and are the sample mean and variance defined in (53) obtained from the ensemble computed via SMC for the choices of small J stated above and with the aforementioned selections of tunable parameters. In addition, we consider the estimator of the true log-permeability defined by the ensemble mean and thus we monitor the corresponding -relative error defined by
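The relative errors of the ensemble mean and variance against the benchmark moments can be computed along these lines. This is a minimal sketch; the array layout (particles stored as rows) and the function names are our own.

```python
import numpy as np

def rel_l2(a, b):
    """Relative l2 error ||a - b|| / ||b||."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.linalg.norm(a - b) / np.linalg.norm(b)

def ensemble_errors(ensemble, ref_mean, ref_var):
    """Relative errors of the sample mean and (unbiased) sample variance
    of an ensemble against reference (benchmark) moments."""
    ensemble = np.asarray(ensemble, dtype=float)
    m = ensemble.mean(axis=0)
    v = ensemble.var(axis=0, ddof=1)
    return rel_l2(m, ref_mean), rel_l2(v, ref_var)
```

In the experiments described above, the reference moments would be the sample mean and variance of the highly-resolved SMC ensemble, and the errors would be averaged over repeated runs with different initial ensembles.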
Quantities , and are random variables that depend on the initial ensemble of particles that we generate from the prior . We thus report these quantities (for each ) averaged over 15 experiments corresponding to different selections of the initial ensemble of particles. In figure 5 we display (top), (middle) and (bottom) for (from left to right) as a function of the aforementioned selections of ensemble size J. For brevity we omit the results for as they display similar behaviour. The total computational cost of computing the full sequence of posteriors (i.e. from (51)) is shown in figure 6 (left). We reiterate that this cost is expressed in terms of the number of evaluations of the 5th forward map .
While the numerical analysis of the convergence of the SMC sampler is beyond the scope of this work, the results presented in this section are aimed at understanding the level of accuracy of SMC with a relatively small number of particles and for a selection of tunable parameters which may enable the use of this method in more practical scenarios. From these results it is clear that the selections of Jthresh have no substantial effect on the accuracy of the scheme in terms of approximating the mean and variance of each posterior. Similarly, the computational cost does not seem to vary significantly with respect to our selections of Jthresh. It is evident that the main effect on accuracy comes from the choice of MCMC steps (i.e. the parameter ). Indeed, note that the error obtained with  is considerably larger than the one with , although the computational cost of the former is one quarter of the computational cost of the latter. We conclude that even though decreasing  can offer computational affordability, it is detrimental to the approximation properties of the scheme. This comes as no surprise, as it is well known that the mutation step that involves running MCMC is crucial for the accuracy of any SMC methodology.
The behavior of the SMC sampler with respect to the number of particles J is as expected. On the one hand, an increase in J corresponds to a decrease in the error with respect to the mean and variance. On the other hand, the computational cost, , increases with J. Note that there is a clear linear relationship between these two variables, which is, in turn, obvious from (51) provided that qn is invariant with respect to J. Indeed, for the cases considered here, the number of intermediate tempering distributions (not reported) computed at each observation time is invariant with respect to our choices of J. This is a somewhat expected outcome, since our choice of Jthres in (45) is always a fraction of J. It is also worth mentioning that the effect of J is less noticeable when we look at the error with respect to the truth. At each observation time, we notice that  seems to converge to a nonzero value as J increases. Note that convergence to the truth is not ensured due to the limited number of measurements inverted and the potential lack of identifiability of the log-permeability.
The results reported in this subsection suggest that achieving a reduction in the computational cost by reducing  has a severely detrimental effect on the accuracy of the SMC sampler with a small number of particles. In addition, Jthresh does not seem to have a substantial effect on either the accuracy or the computational cost. Clearly, we are then left with only the number of particles J to control the computational cost of the sampler without severely compromising the accuracy of the approximate posteriors.
4. Approximating the posteriors via a regularizing ensemble Kalman algorithm
In the previous section we demonstrated, by means of numerical examples, that an accurate approximation of the Bayesian posteriors via state-of-the-art SMC samplers results in a very high computational cost; hence it is infeasible for practical applications such as the 2D resin injection in RTM introduced in section 1. In this section we propose a regularizing ensemble Kalman algorithm (REnKA) that aims at providing an accurate approximation of the sequence of Bayesian posteriors at a much lower computational cost. In section 4.1 we introduce REnKA as a Gaussian approximation derived from the SMC sampler of [22] discussed in the preceding section. The proposed REnKA is placed in the context of existing ensemble Kalman methods in section 4.2. A numerical investigation of the convergence properties of REnKA is reported in section 4.3.
For the subsequent development of the proposed scheme, we extend the domain of definition of the sequence of forward maps introduced in (22). More specifically, we assume where is a Hilbert space such that (compactly). We denote by and the inner products in and , respectively. In addition, we define with inner product denoted by .
4.1. Motivation for REnKA
Motivated by the SMC tempering approach described in the previous section, we now propose an ensemble Kalman algorithm whose aim is to approximate by a sequence of Gaussian measures which are, in turn, characterised by a set of particles with equal weights. Suppose that, at time t = tn we have an ensemble of J samples from a Gaussian measure that approximates , and a prescribed tempering parameter . We may then solve (45) for the new and define the regularization parameter in (48). We now wish to make a transition from to a Gaussian measure that approximates . To this end, it is convenient to define the augmented variable
and note that, in terms of this variable, we may rewrite (50) as
where and I is the identity operator. One can see that by reformulating the inverse problem in terms of the augmented variable, the resulting forward map (i.e. H) acting on this variable is linear.
From (57) we define the following augmented particles
and construct the empirical Gaussian measure:
where
and
The Gaussian measure is used to approximate the measure, denoted by , that arises from pushing forward under (57). By using this Gaussian approximation of , we then provide a Bayesian formulation of the inverse problem given by (58). More specifically, we wish to compute given and the data from (58). A formal application of Bayes theorem yields
Moreover, from (60) and the linearity of the forward map H (on the augmented variable), it follows by standard arguments [26] that
where
Let us then note that Cn,r−1 in (62) can be written as
where
and
Informally, we use the block structure of (66) and define , the approximation of , as the marginal measure of given by
where
Although the measure (70) is fully characterised by its mean and covariance, for the subsequent tempering step we need a particle approximation of . We can obtain those particles by updating the current set of particles via the formula
where
Indeed, under the standard assumption that the noise is independent from , it can be shown by the standard arguments in Kalman-based methods (see for example [7]) that
Expression (72) leads to an iterative (regularised) ensemble Kalman-based algorithm, summarized in algorithm 1, for which the proposed selection of the regularisation parameter is based on the tempering approach discussed in section 3.2. While this selection of ensures a smooth transition between the measures and in SMC, we conjecture that a similar effect should be observed between the corresponding Gaussian approximations computed via the REnKA scheme, thereby leading to stable and reasonably accurate approximations of the posteriors computed via small number of samples. Our conjecture will be verified via numerical experiments in section 4.3.
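One REnKA ensemble update, in the spirit of (72), can be sketched as follows. This is an illustrative finite-dimensional sketch with empirical cross- and auto-covariances of the augmented particles and perturbed observations drawn with the inflated noise covariance (the regularization parameter times the noise covariance); the function signature and array layout are our own assumptions.

```python
import numpy as np

def renka_update(U, W, y, alpha, Gamma, rng):
    """One regularized ensemble Kalman update (cf. (72)).

    U: (J, d) parameter particles; W: (J, m) forward-map outputs G(u_j);
    y: (m,) data; alpha: regularization parameter (inverse tempering
    increment); Gamma: (m, m) observational noise covariance.
    """
    J = U.shape[0]
    du = U - U.mean(axis=0)
    dw = W - W.mean(axis=0)
    Cuw = du.T @ dw / (J - 1)            # empirical cross-covariance C^{uw}
    Cww = dw.T @ dw / (J - 1)            # empirical output covariance C^{ww}
    K = Cuw @ np.linalg.inv(Cww + alpha * Gamma)   # Kalman-type gain
    # perturbed observations with inflated noise covariance alpha * Gamma
    Y = y + rng.multivariate_normal(np.zeros(len(y)), alpha * Gamma, size=J)
    return U + (Y - W) @ K.T
```

For a linear forward map this update moves the ensemble mean toward values consistent with the data, while alpha damps the step in the same way the tempering increment smooths the transition between consecutive measures.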
Algorithm 1. Regularizing ensemble Kalman algorithm (REnKA).
Let  be the initial ensemble of J particles.
Define the tunable parameter Jthresh.
for  do
    Set r = 0 and ;
    while  do
        Compute the nth likelihood (29) for .
        (This implies computing  needed below.)
        Compute the tempering parameter :
        if  then
            set .
        else
            compute  such that  using a bisection algorithm on .
        end if
        Construct ,  defined by expressions (67) and (68).
        Update each ensemble member:
        for  do
            
            where 
            with 
        end for
    end while
    Set . Approximate  with 
end for
Remark 4.1. Note that the key assumption for the proposed scheme is the Gaussian approximation of provided by (64). It is clear that the measure is, as a rule, non-Gaussian and the aforementioned assumption will result in a methodology that will, in general, not converge to the posteriors as the ensemble size . Nevertheless, we will show via numerical examples that this approximation provides reasonably accurate estimates using only a small number of particles.
It is not difficult to see that the main computational cost of REnKA, in terms of the cost of evaluating the forward model at the final observation time, is given by
where, as before, gn denotes the computational cost of evaluating the -forward map. As we will demonstrate via numerical experiments, for the moving boundary problem of section 2.1, REnKA offers a computationally affordable and thus practical approach to approximate the solution to the Bayesian inverse problem that arises from RTM.
It is important to mention that, at a given observation time tn and iteration level r, the value of  may be zero to machine precision. In this case, the tempering parameter is not computable via a bisection scheme on , as stated in algorithm 1. This computational issue is more likely to arise at the early iterations of the scheme, for which the value of  is not sufficiently close to one. This can be overcome, for example, by simply adapting the bisection algorithm in order to first compute a  such that . If , we then set ; otherwise, we find  by solving (45) via a bisection algorithm on . For the numerical experiments reported in the present work, zero values to machine precision for  were encountered only when a large number of measurements were inverted in the 2D setting of section 5.
4.2. REnKA in the context of existing ensemble Kalman methods for inverse problems
Ensemble Kalman methods for inverse/calibration problems have been widely used in recent decades [15, 20]. More recently, iterative Kalman methods with a regularization parameter (e.g. in (75)) have been proposed for a wide class of applications. In particular, the proposed REnKA scheme can be related to the recently developed regularizing ensemble Kalman method introduced in [19] for solving classical (deterministic) inverse problems. More precisely, algorithm 1 is nothing but a sequential version of the iterative scheme presented in [19], except for the selection of the regularization parameter . While in the present work we have motivated algorithm 1 from the SMC framework, the algorithm in [19] was obtained as an ensemble approximation of the regularizing Levenberg–Marquardt scheme initially developed in [16] for solving ill-posed inverse problems. In the context of the present work, REnKA aims at providing a derivative-free approximation to the solution of
in a stable (regularized) fashion. The regularization parameter in [19] was selected according to the discrepancy principle in order to regularize the inverse problem posed by (77). In contrast, the present work uses the adaptive tempering approach of [22] for the selection of this regularization parameter in the context of SMC. It is clear that tempering can be understood as a regularization of the Bayesian inverse problem; the effect of is to flatten out the posterior and allow for a more controlled/regularized transition between the sequence of measures. Other works highlighting the connection between ensemble Kalman methods and SMC approaches include [40, 44, 45]. In addition, by noticing from (48) that, for each , , the proposed REnKA can also be understood as a sequential version of the ensemble smoother with multiple data assimilation proposed in [12]. However, it is important to reiterate that the adaptive selection of proposed here is inherited from the SMC approach of [22]. This selection differs substantially from the strategy proposed in [12], for which the number of intermediate tempering distributions qn is fixed and selected a priori.
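To make the connection with the ensemble smoother with multiple data assimilation concrete, a single tempered ensemble-Kalman update, in which the observational noise covariance is inflated by the reciprocal of the tempering increment, can be sketched as follows. This is a schematic Python illustration of the generic ES-MDA-style update, not the paper's implementation; all names are ours.

```python
import numpy as np

def renka_update(U, G, y, Gamma, dalpha, rng):
    """One tempered ensemble-Kalman update (ES-MDA-style sketch).
    U: (d, J) parameter ensemble; G: (m, J) predicted data; y: (m,) data;
    Gamma: (m, m) noise covariance; dalpha: tempering increment in (0, 1]."""
    J = U.shape[1]
    dU = U - U.mean(axis=1, keepdims=True)
    dG = G - G.mean(axis=1, keepdims=True)
    C_ug = dU @ dG.T / (J - 1)          # parameter-data cross-covariance
    C_gg = dG @ dG.T / (J - 1)          # data covariance
    R = Gamma / dalpha                  # noise covariance inflated by 1/dalpha
    K = C_ug @ np.linalg.solve(C_gg + R, np.eye(len(y)))
    # Perturb observations with the inflated noise, then update each member.
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=J).T
    return U + K @ (Y - G)
```

Because the increments at a given observation time sum to one, iterating this update over the intermediate temperatures assimilates the data once in total, which is the multiple-data-assimilation interpretation noted above.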
4.3. Numerical approximation of the posteriors with REnKA
In this subsection we report the results from applying the REnKA scheme proposed in section 4.1 to the approximation of the sequence of posteriors that we introduced in the framework of section 3.4. The algorithm is applied with the same selection of ensemble sizes (J = 50, 100, 200, 400, 800, 1600, 3200, 6400) that we use for the SMC sampler of section 3.5. In addition, we consider three choices of the tunable parameter Jthresh (). In figure 7 we display the percentiles of the log-permeability posteriors , and obtained with REnKA, for a fixed set of initial ensembles and for some of these choices of J. For comparison purposes we include the fully resolved posterior (via SMC) in the right column of figure 7. We can clearly observe that the approximations provided by REnKA improve as the ensemble size J increases. More importantly, we note that the uncertainty band defined by these percentiles provides better approximations than those from SMC with the same number of particles (see figure 4).
We quantify the level of accuracy of REnKA with respect to the benchmark defined by the highly-resolved SMC sampler reported in section 3.4.2. In figure 8 we display (top), (middle) and (bottom) for (from left to right) computed with the REnKA samples with various ensemble sizes J. Similar results (not shown) are obtained for . The total computational cost ( from (76)) of computing the full sequence of posteriors is shown in figure 6 (right). In figures 8 and 6 (right) we also include some of the results obtained with the SMC samplers with the same choice of a small number of particles discussed in section 3.5. These results speak for themselves: given a small number of particles, REnKA provides a much more accurate approximation of the posterior measures than SMC. For example, note that for the final measure , REnKA (applied with ) with an ensemble of J = 200 particles yields , at a computational cost of -forward model evaluations. In order to obtain a similar level of accuracy (, ), we need to apply the SMC sampler (say with , and ) with J = 6400 particles, for which the computational cost is approximately -forward model evaluations.
The results above not only demonstrate that, when a small number of particles is used, REnKA outperforms SMC in terms of accuracy versus computational cost, but also show that REnKA is robust for reasonable selections of the tunable parameter Jthresh. Similar to SMC, this parameter determines the number of tempering distributions at each observation time and thus has an impact on the computational cost of the scheme. It is also important to remark that, even though the relative errors of REnKA with respect to the mean and variance decrease as the ensemble size increases, these errors seem to converge to a non-zero value, thereby indicating that REnKA does not converge asymptotically to the posterior measures as . Nevertheless, our results clearly showcase the advantages of REnKA for approximating these measures in an accurate and efficient fashion at a limited and realistic computational cost.
Having numerical evidence from section 4.3 that REnKA is a robust and computationally feasible approach for addressing the Bayesian inverse problem defined in section 2, in section S1 of the supplementary material we provide further practical insights into the Bayesian inverse problem for the 1D case studied earlier. In particular, we study the effect of (i) the number/type of measurements, (ii) the number of observation times and (iii) the observational noise level on the sequence of Bayesian posteriors that result from approximating, via REnKA, the Bayesian inverse problem in the 1D case.
5. Bayesian inversion in 2D RTM
In this section we apply REnKA for the Bayesian inversion of the 2D moving boundary problem described by (1)–(9). In contrast to the 1D case, the numerical solution of the 2D moving boundary problem is much more computationally intensive. Therefore, the application of a fully Bayesian methodology such as the SMC sampler considered in section 3 is impractical for an online computation of the Bayesian posteriors in the 2D case. In what follows, we exploit the capabilities of REnKA for providing an accurate yet computationally tractable approach for inferring the preform (log) permeability along with its uncertainty.
5.1. Formulation of the 2D Bayesian inverse problem
Let us consider now the 2D moving boundary problem introduced in (1)–(9) in section 1. We recall that we are interested in the inference of the log-permeability given noisy measurements of the pressure field from M sensors located at points , collected at discrete observation times . In addition, we wish to invert observations of the front location or, alternatively, of the moving domain , which can potentially be obtained from CCT cameras as in [32, 49]. While in the 1D case we can trivially define observations of the (single-point) front, in 2D the front is a curve which defines the moving domain . Therefore, rather than dealing with measurements of the front itself, we may assume pointwise measurements of via its characteristic function defined by
More precisely, we define observations of the form , where is an array of points at which the characteristic function of is observed. In practice this array should be considered dense when a high-resolution camera is used for capturing the moving domain. Note that this mathematical description of front measurements is suitable as it enables direct comparison with observations from digitized images, and also with the discrete formulation of (2)–(9) via the control volume finite element method (CV/FEM), in which the front location is characterized in terms of the filling factor (see details in [38]) rather than via a parameterization of .
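Schematically, observing the moving domain via its characteristic function amounts to evaluating a binary indicator at the array of observation points. The Python sketch below is purely illustrative: in the CV/FEM discretization the indicator would be derived from the filling factor, whereas here the membership test is a hypothetical closed-form example of our own.

```python
import numpy as np

def front_observations(obs_points, filled):
    """Observation of the moving domain D(t_n) via its characteristic function.
    obs_points: (P, 2) array of observation locations (e.g. camera pixels);
    filled: callable returning True where a point lies in the filled region.
    Returns a binary vector, one entry per observation point."""
    return np.array([1.0 if filled(x, y) else 0.0 for x, y in obs_points])

# Hypothetical example: a straight front at x = 0.3 in the unit square.
pts = np.array([[0.1, 0.5], [0.5, 0.5], [0.25, 0.9]])
obs = front_observations(pts, lambda x, y: x <= 0.3)  # -> [1.0, 0.0, 1.0]
```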
For specified (known) pI, p0, , , φ, the solution of (2)–(9) induces a map which enables us to define the sequence of forward maps by means of
To the best of our knowledge, existence, uniqueness and regularity theory for problem (2)–(9) with non-constant is an open problem for d > 1 (see a related discussion in [38]). However, for the present work we assume that the following condition holds.
Assumption 1. The sequence of forward maps () are continuous.
We now follow the same formulation of the Bayesian inverse problem as the one described in section 2.2. At each observation time t = tn, we collect noisy measurements of the front location as well as pressure measurements from sensors . We assume that observations are related to the unknown via expressions (25)–(26) with defined in (79). As before, both measurements of (via its characteristic function) and pressures are assumed to be uncorrelated in time and independent from each other. Furthermore, we consider Gaussian priors with a covariance operator that arises from the Whittle–Matérn correlation function defined in (27). The assumption of continuity of the forward maps, as well as the fact that , ensures existence of the sequence of posterior measures given by theorem 2.5 with the same definition of the likelihood functions introduced in (29). In the following section we use REnKA to compute an ensemble approximation of .
5.2. 2D numerical experiments
In this subsection we apply REnKA for the solution of the 2D Bayesian inverse problem defined in the previous subsection. The forward model described by (1)–(9) is solved numerically with the MATLAB code developed in [38] and available from https://github.com/parkmh/MATCVFEM. This code is based on the interface-tracking control volume finite element method (CV/FEM) [3, 17, 48]. For experiments of this subsection, we consider the following fixed values:
Samples from are generated via the KL parametrization as described in section 2.2.1 with parameters , , l = 0.1 and for all . Some draws from the prior are displayed in figure 9 (middle row). The log-permeability field plotted in figure 9 (top-left) is a random draw from the prior that we use as the truth for the present experiments.
We use one dense configuration of measurement locations for the observation of the moving domain given in terms of (78); these locations are displayed in figure 9 (bottom-left). We have selected a large number of measurement locations assuming that the moving domain can be densely observed with high-resolution cameras or dielectric sensors. In addition, we consider three possible configurations of M = 9, M = 25 and M = 49 pressure sensors , whose locations are shown in the left-middle to right panels of figure 9 (bottom). The measurement configurations that we investigate are summarised below:
where ✓ (resp. ) indicates whether the moving domain D(tn) has been observed via (78).
We use the true log-permeability to numerically solve the forward model (1)–(9) via the CV/FEM code described above. Then, for each of these measurement configurations, synthetic data with a realistic choice of 2.5% Gaussian noise are generated in a similar manner to the one described in section 3.4.1. In order to avoid inverse crimes, synthetic data are computed on a finer grid than the one we use for the Bayesian inversion via REnKA. These data are shown in the middle and right panels of figure 9 (top). Snapshots of the true pressure field at the initial time t0 and observation times (in seconds) are displayed in figure 10, together with the corresponding true moving domain .
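As an illustration of this data-generation step, a minimal Python sketch of adding relative Gaussian noise to synthetic measurements is given below. This is a generic recipe under the assumption that "2.5% noise" means a noise standard deviation of 2.5% of the data magnitude; the exact noise scaling used for the experiments may differ, and all names are ours.

```python
import numpy as np

def add_noise(data, level=0.025, rng=None):
    """Perturb synthetic measurements with Gaussian noise whose standard
    deviation is the fraction `level` of the data magnitude (an assumed
    convention for "2.5% noise"). Returns the noisy data and the noise
    standard deviations used."""
    rng = np.random.default_rng(0) if rng is None else rng
    sigma = level * np.abs(data)
    return data + sigma * rng.standard_normal(data.shape), sigma
```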
For the application of REnKA we use a fixed ensemble size of J = 150 particles with tunable parameter . For clarity of notation, we omit the index J in the expressions for the ensemble mean and variance (54) as appropriate. The numerical investigations below focus on the following quantities, defined at each observation time tn ():
- (A)a measure of the uncertainty provided by the -norm of the ensemble variance relative to the prior, i.e.
- (B)a normalized -error with respect to the truth defined by
We report the quantities in (A) and (B) averaged over 15 experiments corresponding to different selections of the initial ensemble from the prior. The motivation for using the normalized -error instead of the relative error defined in (56) is discussed in section S2 of the supplementary material.
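Under the assumption that both quantities are discrete L2-norms on a uniform grid (the exact definitions are given by the displayed formulas above; the discretizations below are our own), (A) and (B) can be sketched in Python as:

```python
import numpy as np

def uncertainty_measure(ens_var, prior_var, h):
    """(A) L2-norm of the ensemble variance relative to the prior variance.
    ens_var, prior_var: variance fields sampled on a grid with cell size h."""
    l2 = lambda f: np.sqrt(h * np.sum(f**2))
    return l2(ens_var) / l2(prior_var)

def normalized_error(ens_mean, truth, h):
    """(B) normalized L2-error of the ensemble mean with respect to the truth."""
    l2 = lambda f: np.sqrt(h * np.sum(f**2))
    return l2(ens_mean - truth) / l2(truth)
```

By construction, (A) equals one when the ensemble variance matches the prior variance, and (B) vanishes when the ensemble mean coincides with the truth.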
For the configuration with () the ensemble mean and (log) variance of each posterior approximated with REnKA are displayed in figure 11. We note that as more observations (in time) are assimilated, the ensemble mean better captures the spatial features of the truth while the (variance) uncertainty is reduced in the region of the true moving domain . In figure 12 we show the ensemble mean and (log) variance of the final time posterior approximated via REnKA for different choices of the number of pressure sensors and with the inversion of the moving domain (✓). Note that the pure inversion of the moving domain (i.e. M = 0 pressure sensors) results in an informative measure of the log-permeability. It is clear that the accuracy in the estimation of the log-permeability improves with the number of pressure sensors.
From the plot of displayed in figure 13 (left), we note that, at the latest observation times (), there is a clear improvement in the accuracy as the number of pressure sensor locations increases. Similar to the 1D case, we also observe that the benefit of inverting measurements from the moving domain is only noticeable when the number of pressure sensors is relatively small (M = 9); this configuration of sensors is the more realistic one in practical settings. It is also worth noticing that, at the earliest observation times () when the front has not reached most pressure sensors, inverting measurements from the moving domain provides additional information on the log-permeability beyond that provided by pressure measurements alone.
As more observations (in time) are assimilated, the reduction of the uncertainty in terms of the ensemble variance can be observed from the plot of displayed in figure 13 (middle). From this plot we also note that the variance decreases as we increase the number of pressure sensors. The added value of measurements from the moving domain is also quite substantial and more noticeable for a small number of pressure sensors. In fact, a smaller uncertainty is achieved by inverting only the front (M = 0) than by inverting only pressure data from M = 9 sensors. Here we also find that, at the earliest observation times, the additional inversion of measurements of the moving domain results in further reductions of the uncertainty in comparison to the inversion of only pressure data. While this investigation was conducted with a realistic choice of measurement noise (2.5%), further studies should be conducted to understand the effect of the noise level on the uncertainty estimates of the log-permeability in the 2D case.
Finally, the computational cost (see expression (76)) of approximating the sequence of posteriors via REnKA is displayed in figure 13 (right). This cost is expressed in terms of the number of forward model evaluations which, in turn, correspond to solving the moving boundary problem from t = 0 to the last observation time t7. Furthermore, this cost has been normalised by the number of particles J used in REnKA. This normalisation enables us to provide a rough estimate of the scalable (with respect to J) computational cost of REnKA (76) if each evaluation of the forward map (see step 2(b) in algorithm 1) is conducted in parallel. As discussed in section 3.3, increasing the number of measurements results in more tempering distributions (i.e. iterations) in the REnKA scheme; therefore, the computational cost increases with the number of measurements. However, for a realistic choice of M = 9 pressure sensors, we note that the cost of inverting measurements of both the front location and the pressure sensors results (on average) in a scalable cost of 21 iterations. Since the number of particles that we use for REnKA is relatively low (J = 150), such scalability with respect to the number of particles is attainable with a high-end computer cluster, and the computation can be completed within a few minutes.
6. Summary and conclusions
In this work we studied the Bayesian inverse problem that arises from inferring physical properties in a setting for porous media flow with a moving boundary. Our investigation focused on the inference of the log-permeability from measurements of pressure and observations of the (moving) domain occupied by resin during the resin injection stage of RTM, relevant to the fabrication of composite materials. We adopted the infinite-dimensional Bayesian approach to inverse problems, where the aim is to characterise, at each observation time, the posteriors that arise from conditioning the log-permeability on pressure/front measurements. The simplicity of the 1D RTM model enabled us to show existence of the Bayesian posteriors in the aforementioned infinite-dimensional setting. These posteriors were then probed numerically with the dimension-independent SMC sampler for inverse problems from [22]. Our numerical experiments indicated that a very large number of particles was needed to accurately approximate the Bayesian posteriors. This resulted in a high computational cost, infeasible for practical RTM settings.
In order to reduce computational cost of Bayesian inversions for practical RTM settings, we proposed a regularising ensemble Kalman algorithm (REnKA) that we motivated from the adaptive tempering SMC sampler of [22]. The proposed REnKA is based on Gaussian approximations of the sequence of Bayesian posteriors and thus, in general, asymptotic convergence of posterior expectations cannot be ensured. Nevertheless, our numerical results demonstrated that REnKA is robust with respect to tuneable parameters and provides reasonably accurate estimates of the posterior mean and variance with a computational cost affordable for practical RTM processes.
While measurements have been widely used with ad hoc approaches to estimate the permeability of preforms in RTM, to the best of our knowledge this work constitutes the first investigation that uses a Bayesian inverse modeling framework for random moving boundary problems. From the numerical investigations that we conducted, some conclusions and recommendations can be made with relevant implications for practical RTM settings. In particular, our synthetic experiments indicated that, when a small number of sensors (5 sensors in 1D and 9 sensors in 2D) is used to measure pressure, observing the front/moving domain can substantially reduce the uncertainty (variance) of the estimates of the log-permeability. This is particularly relevant in real experiments, where the number of pressure sensors is usually limited. However, when the inversion is conducted with only measurements of the moving front, the main spatial features of the true permeability are not recovered accurately.
Our results also display the benefit of the proposed sequential approach in updating our knowledge of the log-permeability as soon as measurements become available. Indeed, inverting measurements of pressure and front frequently in time enabled us to reduce the uncertainty in the log-permeability. While the reduction of the uncertainty is mainly achieved within the region occupied by the resin at a given time, a decrease in the uncertainty (with respect to the prior) can also be observed in an unfilled region close to the front. Such a reduction of this uncertainty via the Bayesian posteriors can be valuable for decision-making purposes with the aim of active control of RTM in real time. Further, we note that the proposed sequential approach is also beneficial in terms of obtaining more accurate estimates with respect to the truth than those obtained if the algorithm is modified to (iteratively) invert data all at once. More precisely, we conducted numerical experiments where we inverted, simultaneously, the entire set of data collected during a given time window for a given synthetic experiment. Our results (not displayed) suggest that the sequential approach outperforms the all-at-once approach, producing estimates with smaller errors with respect to the truth and smaller variances of the log-permeability.
Finally, our numerical investigations show that the observational noise in pressure measurements and front location has a substantial effect on the estimates of the log-permeability and its uncertainties. Indeed, it comes as no surprise that more accurate measurements (e.g. ) result in estimates of the log-permeability concentrated around the truth. Again, given the limitations of deploying many pressure sensors within a real RTM scenario, it is essential to use high-precision pressure sensors to achieve sufficient confidence in the posterior uncertainties of the inferred permeability.
Although the context of this work is related to resin injection in RTM processes, the Bayesian framework at the core of the proposed methodology, and the corresponding Gaussian approximations that emerge from the proposed REnKA, are generic, flexible and thus transferable to a wide class of inverse problems constrained by PDEs with moving boundaries.
Acknowledgments
This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/K031430/1]. The authors are very grateful to Mikhail Matveev for useful discussions. MI wishes to thank Nikolas Kantas for helpful discussions on the computational aspects and implementation of SMC.
Appendix A. Proof of theorem 2.2
For the proof of theorem 2.2 we consider the dimensionless version of (19) and (20) that we derive in the following subsection.
A.1. Dimensional analysis
Let us consider the following change to dimensionless variables:
where tf is a reference time and u0 is a reference (constant) log-permeability. For simplicity we choose
which enable us to transform (19) and (20) into
where, with a slight abuse in the notation, the variables p, u, , x and t are now the dimensionless variables. Similarly, the dimensionless filling time (21) is given by
Let us note from (A.2) and (18) that
Recall that is the collection of observation times with and are the pressure measurement locations (). We now prove two technical lemmas which are needed for the proof of theorem 2.2.
A.2. Technical lemmas
For all , we define the norm
and denote by the corresponding solutions of the dimensionless moving boundary problem (A.2) and (A.3). Similarly, the filling time given by (A.4) is denoted by .
Lemma A.1. For all , there exists a constant such that
for all . The constant may depend only on and .
Proof. Let and . From (A.5) and (18) it is clear that is increasing and satisfies . Therefore,
Then, from (A.5) we have that
which implies
Similarly, note that
and so
for all . The mean value theorem combined with (A.11) yields (A.6). □
Lemma A.2. For all , there exists a constant , such that
for all and all . Moreover,
The constant may depend only on and .
Proof. From the mean value theorem it is not difficult to see that
where
It is also not difficult to see that
From (A.15) and (A.16), and the fact that , we have
From (A.11) we get
Therefore
We recall that and use Gronwall's inequality to conclude that
From (A.4) we see that
which we combine with (A.19) to obtain
where
Hence, (A.11) is proved.
Let us now consider the case and , then
for all . Assume now that , , and let us note that
Therefore, from (A.3), (A.11), (A.24), (A.15), (A.17) and (A.21), we find
Consider now the case , . From (A.3) and (A.24) it follows that . Therefore,
Since (recall ) and , it then follows from (A.16), (A.11) and (A.21) that
Hence, (A.12) is proved.
Finally, from (A.4) and (A.15), it is easy to see that
We combine (A.21), (A.23), (A.25)–(A.27) and then (A.12)–(A.14) follows with
□
A.3. Proof of theorem 2.2
Given fixed, we first establish continuity of the following map . Let and without loss of generality assume that . Note that
Since , the previous expression reduces to
We observe that we can write
Then by lemmas A.1 and A.2, we obtain
for all . We combine (A.31) with (A.28) and use lemma A.1 again to obtain
for all which establishes the continuity of and, in particular, of for all .
Similarly, we now prove the continuity of . We note that
From (A.3) it follows
for all . Using (A.31) as well as similar arguments to the ones used before, we obtain
We use (A.35), (A.11) and the fact that to rewrite (A.34) as follows
From similar arguments it is easy to see that
We use (A.36) and (A.37) and lemma A.2 to conclude that
which proves the continuity of for all . The continuity of then follows. □
Appendix B. SMC sampler and pcn-MCMC algorithm
In algorithm 2 we display the pcn-MCMC method that we use for the mutation step in the SMC sampler of [22] discussed in section 3.2 and summarized in algorithm 3 below.
Algorithm 2. pcn-MCMC to generate samples from a -invariant Markov kernel.
Select and an integer .
for do
Initialize
while do
(1) pCN proposal. Propose uprop from
(2) Set with probability and with probability , where
with ln,r defined in (46)
(3)
end while
end for
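A single step of the pCN mutation in algorithm 2 can be sketched as follows, assuming (as in the KL parametrization of section 2.2.1) that the prior is a standard Gaussian in the KL coordinates; all names are ours and the tempered negative log-likelihood is passed in as a black box.

```python
import numpy as np

def pcn_step(u, neg_log_lik, beta, rng):
    """One pCN step targeting a posterior with an N(0, I) prior on the KL
    coefficients. neg_log_lik: callable u -> tempered negative log-likelihood.
    Returns the new state and whether the proposal was accepted."""
    xi = rng.standard_normal(u.shape)                # prior draw (identity
                                                     # covariance in KL coords)
    u_prop = np.sqrt(1.0 - beta**2) * u + beta * xi  # prior-preserving proposal
    log_a = neg_log_lik(u) - neg_log_lik(u_prop)     # pCN acceptance ratio
    if np.log(rng.uniform()) < min(0.0, log_a):
        return u_prop, True
    return u, False
```

Because the proposal preserves the Gaussian prior, the acceptance probability involves only the likelihood, which is the dimension-independence property exploited by the SMC sampler of [22].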
Algorithm 3. SMC algorithm for high-dimensional inverse problems.
Let be the initial ensemble of J particles.
Define the tunable parameters Jthresh and .
for do
Set r = 0 and
while do
Compute the nth likelihood (29) (for )
Compute the tempering parameter :
if then
set .
else
compute such that using a bisection algorithm on .
end if
Compute weights from expression (44)
Resample. Let ,
Set and
Mutation. Sample via algorithm 2.
end while
Set . Approximate by
end for
Appendix C. Numerical implementation of the 1D forward model
In this section we discuss the key aspects of the numerical implementation of the dimensionless version of the 1D RTM forward model derived in appendix A.1.
Note that Fu defined in (18) can be written as
where
is the Heaviside function. In order to approximate (A.2) and (A.3), we discretize the domain with S subintervals with end points defined by , where and the centers of the cells are . Let us consider a piecewise constant approximation of the unknown u defined on the centers of the cells, i.e.
where . Therefore, (C.1) can be approximated by
where we have replaced with its smooth approximation (with r = 300) [6].
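For concreteness, one common smooth approximation of the Heaviside function with sharpness parameter r is the scaled tanh profile below; the specific functional form used in [6] may differ, so this is an assumed example.

```python
import numpy as np

def heaviside_smooth(x, r=300.0):
    """Smooth approximation of the Heaviside step: 0.5*(1 + tanh(r*x)).
    The sharpness r = 300 follows the value quoted in the text; the exact
    form used in [6] may differ."""
    return 0.5 * (1.0 + np.tanh(r * x))
```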
We consider a temporal domain discretized with K points , where . An implicit backward Euler scheme applied to the dimensionless version of (16) yields
where . For the approximation of (C.3), we use (C.2). The solution of the resulting nonlinear equation is implemented in MATLAB by means of the routine fzero. Once is computed, we evaluate the pressure field from
at the mesh points xs+1/2 defined earlier.
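The implicit time step for the front can be illustrated schematically: given a front-speed function, the backward Euler residual is solved with a simple bracketing root finder, mirroring the role of the MATLAB routine fzero above. The ODE used here is a generic stand-in for the dimensionless front equation (16), whose exact form involves the pressure solution; all names are ours.

```python
def implicit_front_step(x_prev, dt, speed, tol=1e-12):
    """One backward-Euler step for a front ODE dx/dt = speed(x): solve
    x - x_prev - dt*speed(x) = 0 by bisection (in place of fzero)."""
    g = lambda x: x - x_prev - dt * speed(x)  # residual of the implicit step
    lo, hi = x_prev, x_prev + 1.0             # bracket, assuming 0 < dt*speed < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# With constant unit speed the implicit step reduces to x_{k+1} = x_k + dt.
x1 = implicit_front_step(0.2, 0.05, lambda x: 1.0)  # x1 == 0.25
```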