Inverse radiative transfer with goal-oriented hp-adaptive mesh refinement: adaptive-mesh inversion

Shukai Du; Samuel N Stechmann

doi:10.1088/1361-6420/acf785

1. Introduction

This paper concerns the numerical solution of the inverse radiative transfer problem, which serves as the mathematical foundation for applications such as optical tomography [2, 4, 32], remote sensing [13, 14, 29, 34], and neutron transport [27, 28, 30]. Despite its wide range of applications, devising numerical methods for inverse radiative transfer is notoriously challenging because of the high dimensionality of the forward problem: standard discretizations with sufficient accuracy usually require a large amount of memory and can render the solver too slow to be practical. It is therefore of great interest to devise numerical methods that use fewer degrees of freedom (DOFs) or less memory while still meeting the accuracy requirement.

Among the various types of memory reduction methods, we are interested in hp-adaptive mesh refinement (hp-AMR) methods, for several reasons. First, the method is versatile in the sense that it can efficiently represent the solution in regions where it is smooth, while also capturing local features. This advantage is especially beneficial for applications such as optical tomography and remote sensing, for which it is common to observe a few local features embedded in a smooth background distribution (clouds in the sky for remote sensing; a narrow, Dirac-delta-like inflow laser beam for optical tomography; etc.). Another reason is the well-developed theoretical understanding of AMR and its success in solid and fluid mechanics. Indeed, the concept of hp-AMR traces back at least to the 1980s and the pioneering work of Babuška and his collaborators [18]; see also the review paper [5] for hp-finite element methods (FEM). It was shown that hp-adaptive FEM can achieve exponential convergence even when the solution presents singularities. This suggests the great potential of using hp-AMR to reduce the DOFs for radiative transfer. Our recent work [15] demonstrates this potential for the forward problem, while here we consider solving the inverse radiative transfer problem.

Before we proceed to the introduction of our proposed method, we first review some related existing work on inverse radiative transfer, or more generally on inverse transport problems. The well-posedness of inverse transport problems was studied in [6, 12, 37] by exploiting the singular decomposition of the albedo operators. The stability of the inverse problem under different scalings was then studied in [10, 11, 25, 26]. For the numerical discretization of the inverse problem, the time-independent inverse radiative transfer problem was considered in [1, 22–24], and the frequency-domain problem was considered in [33]. A review of numerical techniques for inverse transport problems in medical imaging is given in [32]. In [3], the effect of the numerical error on the quality of the reconstruction was discussed, and it was shown that the numerical error in the forward/adjoint solver can lead to significant errors in the reconstruction of the optical images.

Considering the efficiency of AMR in reducing numerical error, it is not surprising that AMR has also been applied to inverse problems. In [19, 20], the authors considered the diffuse optical imaging problem, where the connection between the forward/adjoint solver error and the reconstruction error was studied theoretically, and an AMR method was proposed based on their theoretical findings. In [35, 36], finite volume (FV)-based AMR methods for fluorescence/luminescence imaging were studied numerically. In [7, 8], an FEM-based AMR method was proposed for fluorescence-enhanced optical tomography, and large savings in DOFs were observed in numerical tests.

In this paper, we propose a goal-oriented hp-AMR method which differs from the existing AMR approaches to solving the inverse radiative transfer equation in the following aspects. First, we devise a novel goal-oriented error estimator, which provides more efficient mesh-refinement strategies compared to energy or L²-norm based error estimators; see [9, 39]. In addition, we use hp-AMR instead of only h-AMR with fixed p. The extra p-adaptivity allows a more efficient representation of the solution where it is smooth. Finally, we consider the full radiative transfer equation instead of its diffusion-type approximation. Solving the full radiative transfer equation is computationally more challenging because of the interplay between the advection and the scattering terms. Since we aim for maximal generality of our method across the different regimes of the radiative transfer equation, we use discontinuous Galerkin (DG) methods to approximate the forward and the adjoint equations. DG methods can handle both the advection-dominated and the diffusive regimes well, thanks to the flexible choice of numerical traces. For instance, upwind-type fluxes allow the schemes to suppress unphysical oscillations in the advection-dominated regime, while in the diffusive regime, HDG-type fluxes enable static condensation to save more DOFs. Moreover, the DG setting allows a straightforward implementation of hp-adaptivity, which lies at the center of our memory reduction method.

The development of our method is based on two key observations. The first is that there exists a goal function such that, by minimizing it, the discretized error landscape $\Phi_h(\sigma)$ becomes bounded above and below by constant multiples of the original landscape $\Phi(\sigma)$ in the PDE setting (see (4) and (13) for their definitions). This observation motivates the use of this goal function for devising error estimators for AMR. The second observation is that, when the numerical error is large, the adjoint solution, which is originally computed to update the optical parameters, can also be used to devise the goal-oriented error estimators for AMR. This suggests that goal-oriented AMR is especially suitable for the inverse problem setting, since the adjoint solution needs to be computed only once but serves two purposes, namely the gradient calculation and the error estimator calculation.

The rest of the paper is organized as follows. In section 2, we consider the forward and the inverse problems of radiative transfer in the PDE setting, and propose an algorithm for reconstructing the optical coefficients. This algorithm will serve as the 'PDE map' for the numerical discretization of the inverse problem. In section 3, we first introduce the DG discretization for the forward and the adjoint problems, and the gradient calculation. Then, we propose the goal function and devise the goal-oriented error estimators. We then combine the discretization and the error estimator and propose an algorithm for reconstructing the optical coefficients with hp-mesh adaptivity. Finally, in section 4, we present numerical experiments to test the performance of our proposed method.

2. Forward and inverse problems

In this section, we introduce the forward and the inverse problems of the radiative transfer equation. Then, an algorithm (algorithm 1) is proposed to reconstruct the optical coefficients in the PDE setting.

Algorithm 1. Reconstructing optical coefficients—PDE setting.
1: Set the maximal iteration count $N_\mathrm{max\_iter}$ and the error tolerance ε > 0
2: Initialize σ⁰
3: Set k = 0
4: while $k\leqslant N_\mathrm{max\_iter}$ do
5: Using σ^k, solve (7c) = 0 to obtain uⁱ
6: Using σ^k and uⁱ, solve (7b) = 0 to obtain vⁱ
7: Use σ^k, uⁱ, vⁱ, and (7a) to obtain the updating direction $\delta\sigma^k$
8: Update the optical coefficients by $\sigma^{k+1} = \sigma^k-\gamma\, \delta\sigma^k$ , where γ is determined by a line search
9: Update the iteration count: $k = k+1$
10: if $\|\Phi(\sigma^{k})-\Phi(\sigma^{k-1})\|/\|\Phi(\sigma^k)\|\lt\epsilon$ then
11: Break the loop
12: end if
13: end while
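The control flow of algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the names `reconstruct`, `armijo_step`, and `update_direction` are hypothetical, the line search is a plain Armijo backtracking (one possible choice), and the toy landscape `phi` is a small regularized least-squares problem standing in for the transport-based functional.

```python
import numpy as np

def armijo_step(phi, sigma, direction, gamma=1.0, shrink=0.5, c1=1e-4):
    """Backtracking line search: shrink gamma until phi decreases enough."""
    phi0 = phi(sigma)
    slope = float(direction @ direction)  # valid when direction is the gradient
    while gamma > 1e-14 and phi(sigma - gamma * direction) > phi0 - c1 * gamma * slope:
        gamma *= shrink
    return gamma

def reconstruct(phi, update_direction, sigma0, max_iter=500, eps=1e-12):
    """Outer loop of algorithm 1; update_direction bundles steps 5-7."""
    sigma = np.asarray(sigma0, dtype=float)
    k = 0
    while k <= max_iter:
        d = update_direction(sigma)                # steps 5-7
        gamma = armijo_step(phi, sigma, d)         # step 8: line search
        sigma_new = sigma - gamma * d              # step 8: update
        k += 1                                     # step 9
        change = abs(phi(sigma_new) - phi(sigma))  # step 10: relative change
        sigma = sigma_new
        if change < eps * abs(phi(sigma)):
            break                                  # step 11
    return sigma, k

# Hypothetical toy landscape with a known minimizer (not a transport problem).
B = np.array([[2.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, 3.0])
alpha = 0.1
phi = lambda s: float(np.sum((B @ s - y) ** 2) + alpha * np.sum(s ** 2))
grad = lambda s: 2.0 * B.T @ (B @ s - y) + 2.0 * alpha * s
sigma_rec, n_iter = reconstruct(phi, grad, np.zeros(2))
```

In the actual method, `update_direction` would hide one forward and one adjoint transport solve per test, which is where almost all of the cost lies.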

2.1. Forward problem

Let $\Omega\subset \mathbb{R}^d$ be a bounded Lipschitz domain, and S represent the unit sphere in $\mathbb{R}^d$ . Let $\partial\Omega$ be the boundary of Ω, and n the outward-pointing unit normal vector on the boundary. We denote by

$\begin{align*} \Gamma^{+}: = \left\{\left(\mathbf{x},\mathbf{s}\right)\in\partial\Omega\times S:\, \mathbf{n}\cdot\mathbf{s}\geqslant 0\right\},\quad \Gamma^{-}: = \left\{\left(\mathbf{x},\mathbf{s}\right)\in\partial\Omega\times S:\, \mathbf{n}\cdot\mathbf{s}\leqslant 0\right\}, \end{align*}$

the outflow and the inflow boundary of $\Omega\times S$ , respectively. We consider the following (time-independent) radiative transfer equation: find $u\in V$ such that

$\begin{align} \mathcal D\left(\sigma\right)\left[u\right]: = \mathbf{s}\cdot\nabla u + \left(\sigma_\mathrm a+\sigma_\mathrm s\right)u-\sigma_\mathrm s\int_S P\left(\mathbf{s},\mathbf{s}^{\prime}\right)u\left(\mathbf{s}^{\prime}\right)\mathrm ds^{\prime} &= f \qquad \mathrm{in}\,\,\Omega\times S, \end{align} \tag{ 1a }$

$\begin{align} \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad\quad\quad\! u &= g \quad \ \ \ \mathrm{on}\,\,\Gamma^{-}. \end{align} \tag{ 1b }$

In the above equation, $u(\mathbf{x},\mathbf{s})$ is the radiative intensity at spatial location x and along the direction $\mathbf{s}\in S$ , while $\sigma_\mathrm a(\mathbf{x})\geqslant0$ , $\sigma_\mathrm s(\mathbf{x})\geqslant0$ , $f(\mathbf{x},\mathbf{s})$ , and $g(\mathbf{x},\mathbf{s})$ are the absorption coefficient, the scattering coefficient, the source term, and the inflow radiation, respectively. We assume the scattering phase function $P(\mathbf{s},\mathbf{s}^{\prime})$ to have the form of the Henyey–Greenstein function:

$\begin{align} P\left(\mathbf{s},\mathbf{s}^{\prime}\right) = \frac{1-g^2}{c\left(1+g^2-2g\cos\theta\right)^{3/2}}, \end{align} \tag{ 2 }$

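where θ is the angle between $\mathbf{s}$ and $\mathbf{s}^{\prime}$ , g is the asymmetry parameter (not to be confused with the inflow radiation g in (1b)), and c is chosen such that $\int_SP(\mathbf{s},\mathbf{s}^{\prime})\mathrm ds^{\prime}\equiv1$ . Here V is the solution space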

$\begin{align*} V: = \left\{u\in L^2\left(\Omega\times S\right):\, \mathbf{s}\cdot\nabla u\in L^2\left(\Omega\times S\right)\right\},\qquad V^{\,g}: = \left\{u\in V:\, u = g\quad\mathrm{on}\quad \Gamma^-\right\},\qquad \qquad \end{align*}$

where V^g is the affine subspace of V whose elements satisfy the inflow condition (1b). Note that V⁰ is a subspace of V.
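As a small numerical illustration of the normalization constant c in (2), the sketch below integrates the unnormalized Henyey–Greenstein kernel over the unit sphere. It assumes d = 3 (so that S is the two-sphere and the integral reduces to one over μ = cos θ); in that case the computed constant should agree with the classical value 4π. The function names are hypothetical, and for d = 2 the constant would have to be recomputed on the unit circle.

```python
import numpy as np

def hg_unnormalized(cos_theta, g):
    """Henyey-Greenstein kernel of (2) without the constant c."""
    return (1.0 - g ** 2) / (1.0 + g ** 2 - 2.0 * g * cos_theta) ** 1.5

def hg_normalization_3d(g, n_quad=64):
    """Constant c such that the kernel integrates to 1 over the unit sphere (d = 3)."""
    mu, w = np.polynomial.legendre.leggauss(n_quad)  # Gauss-Legendre on [-1, 1]
    # Azimuthal symmetry: integral over S equals 2*pi times the integral over mu.
    return 2.0 * np.pi * float(np.sum(w * hg_unnormalized(mu, g)))

c_half = hg_normalization_3d(0.5)   # moderately forward-peaked scattering
c_iso = hg_normalization_3d(0.0)    # isotropic scattering
```

For strongly forward-peaked scattering (g close to 1) the kernel becomes sharply peaked, so `n_quad` would need to be increased for the quadrature to remain accurate.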

To render the presentation more concise, we rewrite (1) in a more compact form. Let $F = (f,g)^\mathrm T$ , and $\mathcal L(\sigma) = (\mathcal D(\sigma), \gamma^-)^\mathrm T$ , where γ⁻ is the trace operator to $L^2(\Gamma^-)$ , and σ represents the collection of both $\sigma_\mathrm a$ and $\sigma_\mathrm s$ , namely $\sigma(\mathbf{x}): = (\sigma_\mathrm a(\mathbf{x}),\sigma_\mathrm s(\mathbf{x}))$ . Then equation (1) becomes

$\begin{align} \mathcal L\left(\sigma\right)\left[u\right] = F. \end{align} \tag{ 3 }$

In equation (1) (or (3)), we use the square bracket $[\cdot]$ to indicate that $\mathcal D(\sigma)$ (or $\mathcal L(\sigma)$ ) depends linearly on u; the round bracket suggests that $\mathcal D$ (or $\mathcal L$ ) can depend non-linearly on σ. In the setting of the forward problem, the optical property represented by σ and the source/boundary term F are given as input data. The task is then to solve equation (1) (or (3)) for the radiative intensity u. We refer to [16] and the references therein for the well-posedness of the forward problem.

2.2. Inverse problem

In the setting of the inverse problem, we are given certain measurements of the solution, here denoted as $\mathcal Mu$ , while the task is to infer the distribution of the optical property σ or the source term f. The former task fits into the field of optical tomography, while the latter one is usually referred to as inverse source problems. We refer to [2, 6] for a review on these topics.

In this paper, we focus on the optical tomography problem. To be more specific, we consider the task of reconstructing the optical property of the medium σ, based on a set of tests determined by the source terms Fⁱ with $i = 1,\ldots,N_t$ , and a set of measurements on the outflow boundary $\Gamma^+$ , here denoted as $\mathcal M^j$ with $j = 1,\ldots,N_m$ . This problem can be formulated as an optimization problem. Suppose $\sigma^*$ is the target optical coefficient we would like to reconstruct. We define $y^{ij} = \mathcal M^ju^i = \mathcal M^j(\mathcal L(\sigma^*))^{-1}[F^i]$ as the measurement data. In many applications, the data are polluted by noise. In such a case we define $y^{ij}: = \mathcal M^ju^i+\delta_{ij}$ , where $\delta_{ij}$ represents the noise. The optical tomography problem becomes

$\begin{align} \min_{\sigma}\Phi\left(\sigma\right): = \sum_{i = 1}^{N_t}\sum_{j = 1}^{N_m}w_{ij}|\mathcal M^j\left(\mathcal L\left(\sigma\right)\right)^{-1}\left[F^i\right]-y^{ij}|^2+\alpha\mathcal R\left(\sigma\right), \end{align} \tag{ 4 }$

where w_ij are the weights associated with each test-measurement signal y^ij. A typical choice is $w_{ij} = \frac{1}{|y^{ij}|^2}$ ; see [32]. Here, $\alpha\mathcal R(\sigma)$ is the regularization term and α is the regularization parameter. Some widely used examples of regularization include the L²-regularization, for which we choose $\mathcal R(\sigma) = \|\sigma-\overline{\sigma}\|_{L^2(\Omega)}$ (where $\overline{\sigma}$ is a spatial average), and the H¹-regularization, for which we choose $\mathcal R(\sigma) = \|\nabla\sigma\|_{L^2(\Omega)}$ . For more on regularization techniques and the choice of regularization parameters, we refer to the monograph [17] and the references therein.

Note that $\Phi(\sigma)$ is nonlinear and there is no explicit expression for $\Phi(\sigma)$ in general cases. To proceed, we transform (4) into a PDE-constrained optimization problem:

$\begin{align} &\min_{\sigma,u}\Phi\left(\sigma,u\right): = \sum_{i = 1}^{N_t}\sum_{j = 1}^{N_m}w_{ij}|\mathcal M^ju^i-y^{ij}|^2+\alpha\mathcal R\left(\sigma\right), \end{align} \tag{ 5a }$

$\begin{align} &\textrm{subject to}\quad \mathcal L\left(\sigma\right)\left[u^i\right] = F^i\quad\textrm{for}\quad i = 1,\ldots,N_t. \end{align} \tag{ 5b }$

To solve the problem, we introduce the Lagrangian

$\begin{align} \Psi\left(\sigma,u,v\right) = \Phi\left(\sigma,u\right) + \sum_i\left\langle \mathcal L\left(\sigma\right)\left[u^i\right]-F^i,v^i\right\rangle, \end{align} \tag{ 6 }$

where vⁱ is the Lagrange multiplier. By the Karush–Kuhn–Tucker (KKT) necessary conditions [38], a solution of (5), if it exists, is also a first-order stationary point of Ψ in (6). By taking Fréchet derivatives of (6), we obtain

$\begin{align} \ \ \frac{\delta\Psi}{\delta\sigma}\left[\delta\sigma\right] & = \alpha\mathcal R^{\prime}\left(\sigma\right)\left[\delta\sigma\right]+\sum_i\left\langle \mathcal L^{\prime}\left(\sigma\right)\left[\delta\sigma\right]\left[u^i\right],v^i\right\rangle, \end{align} \tag{ 7a }$

$\begin{align} \frac{\delta\Psi}{\delta u^i}\left[\delta u^i\right] & = 2\sum_{j = 1}^{N_m}w_{ij}\left(\mathcal M^ju^i-y^{ij}\right)\mathcal M^j\delta u^i+\left\langle \delta u^i,\mathcal L\left(\sigma\right)^*\left[v^i\right]\right\rangle\qquad i = 1,\ldots,N_t, \end{align} \tag{ 7b }$

$\begin{align} \frac{\delta\Psi}{\delta v^i}\left[\delta v^i\right] & = \left\langle \mathcal L\left(\sigma\right)\left[u^i\right]-F^i,\delta v^i\right\rangle\qquad\qquad\qquad\qquad\qquad\qquad\quad\! i = 1,\ldots,N_t, \end{align} \tag{ 7c }$

where $\langle \cdot,\cdot\rangle$ is the trial-test bracket, manifested as the summation of the L²-inner product on $\Omega\times S$ and $\Gamma^-$ , and $\mathcal L(\sigma)^*$ is the adjoint operator of $\mathcal L(\sigma)$ .

Now, the optimization problem reduces to finding a stationary point $(\sigma^*,u^*,v^*)$ at which all the derivatives in (7) vanish. Based on (7), we propose algorithm 1 for finding $\sigma^*$ . In the algorithm, the forward and the adjoint problems associated with (7c) and (7b) are first solved to obtain uⁱ and vⁱ , which are then used to update σ by an updating direction $\delta\sigma^k$ and a step length γ > 0. The step length γ can be determined by a line search [31]. The updating direction can simply be chosen as the gradient given by (7a), or it can be obtained from a quasi-Newton method such as BFGS. We refer to [31] for more details.
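To make the roles of (7a)-(7c) concrete, the following sketch applies the same adjoint recipe to a small matrix problem. Everything here is an illustrative assumption rather than the paper's discretization: the operator is the toy model L(σ) = A0 + diag(σ), there is a single test, M is a matrix that selects two unknowns, and the regularization term is omitted (α = 0). With this choice the derivative of L with respect to the kth entry of σ is e_k e_kᵀ, so the gradient in (7a) reduces to the entrywise product of the forward and adjoint solutions.

```python
import numpy as np

def forward(A0, sigma, F):
    """Solve L(sigma) u = F with the toy operator L(sigma) = A0 + diag(sigma)."""
    return np.linalg.solve(A0 + np.diag(sigma), F)

def misfit(A0, sigma, F, M, w, y):
    """Data misfit sum_j w_j |(M u)_j - y_j|^2, i.e. (4) with alpha = 0."""
    r = M @ forward(A0, sigma, F) - y
    return float(np.sum(w * r ** 2))

def adjoint_gradient(A0, sigma, F, M, w, y):
    """Gradient of the misfit using one forward and one adjoint solve."""
    L = A0 + np.diag(sigma)
    u = np.linalg.solve(L, F)                 # forward problem: (7c) = 0
    rhs = -2.0 * M.T @ (w * (M @ u - y))      # right-hand side from (7b) = 0
    v = np.linalg.solve(L.T, rhs)             # adjoint problem
    return u * v                              # (7a) for this toy L(sigma)

n = 4
A0 = 2.0 * np.eye(n) - np.eye(n, k=-1)        # lower bidiagonal, transport-like
F = np.ones(n)
M = np.eye(n)[-2:]                            # observe the last two unknowns
w = np.array([1.0, 2.0])
sigma_true = np.array([0.5, 1.0, 0.2, 0.8])
y = M @ forward(A0, sigma_true, F)            # noise-free synthetic data
sigma = np.full(n, 0.4)
g_adj = adjoint_gradient(A0, sigma, F, M, w, y)
```

The cost of `adjoint_gradient` is two linear solves regardless of the number of parameters, whereas a finite-difference gradient would need one forward solve per entry of σ; this is the usual reason for preferring the adjoint approach.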

3. Numerical discretization

In this section, we introduce the numerical approximations to the inverse problem (4), manifested by using the discontinuous Galerkin spectral element (DGSE) methods to discretize the forward problem (7c ) = 0, the adjoint problem (7b ) = 0, and the gradient calculation (7a ). Then, we propose a goal-oriented error estimator which can be easily calculated based on the forward and the adjoint numerical solution. Finally, based on the error estimator, we introduce an hp-AMR algorithm to solve the inverse radiative transfer problem numerically.

3.1. DG

3.1.1. Approximation spaces.

We begin by considering the approximation spaces. For simplicity, we restrict ourselves to a 2D domain Ω discretized by rectangular meshes $\mathcal T_h$ . The generalization to 3D and polyhedral meshes is straightforward in the DG setting. Since any $K\in\mathcal T_h$ is rectangular, we can write $K = K^{\,x}\times K^{\,y}$ . We require that the mesh $\mathcal T_h$ satisfies the one-irregular condition, i.e. for any element $K\in\mathcal T_h$ , there are at most two neighbour elements connecting to K through each edge. Here h is both an index for the triangulations $\mathcal T_h$ and the mesh-size, which is defined as $h: = \max_{K\in\mathcal T_h}h_K$ . For each $K\in\mathcal T_h$ , we denote by $\mathcal T_h^{a,K}$ the triangulation of the unit sphere S associated with K. We also let $\mathcal F_K$ be the collection of the faces of K. Now we introduce the approximation space:

$\begin{align*} V_h: = &\left\{\sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}}\sum_{i,j,k} u_{i,j,k}\,\phi_i^{K^x}\left(x\right)\phi_j^{K^y}\left(y\right)\phi_k^{K^a}\left(\theta\right): \right.\nonumber \\ & \quad \left.\phi_i^{K^\star}\big|_{i = 1}^{p_{K^\star}+1}~\textrm{are a set of polynomial bases on}~K^\star \right\}, \end{align*}$

where $p_{K^\star}$ represents the polynomial degree and $\star\in\{x,y,a\}$ . Here we choose $\phi_i^{[a,b]}(x) = \phi_i\circ\mathcal F_{[a,b]}^{-1}(x)$ where φ_i , defined on $[-1,1]$ , is the Lagrange polynomial basis function associated with the ith Gauss–Legendre–Lobatto quadrature point, and $\mathcal F_{[a,b]}(\widehat{x}) = \frac{b-a}{2}(\widehat{x}+1)+a$ is the push-forward map. This choice of basis gives us spectral element methods. Note that the space V_h is completely determined by the mesh discretization $\mathcal T_h$ and the polynomial degrees $p_{K^x}$ , $p_{K^y}$ , and $p_{K^a}$ , which are associated with each spatial-angular element $K\times K^a$ . We will sometimes write $\mathcal T_{hp}$ to denote the mesh $\mathcal T_h$ together with the polynomial degree information. Finally, to incorporate the boundary condition, we introduce

$\begin{align*} V_h^g : = \left\{u_h\big|_{\left(\Omega^\circ\times S\right)\cup\Gamma^+}\in V_h:\, u_h\big|_{\Gamma^-} = g\right\}. \end{align*}$

As a special case, $V_h^{\,0}$ is a subspace of V_h with zero inflow radiation.
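The one-dimensional building block of V_h, namely the Lagrange basis at Gauss–Legendre–Lobatto (GLL) points mapped to an interval [a, b], can be sketched as follows. The function names are hypothetical, indices are zero-based (the text counts from 1), and the GLL points are computed from the standard characterization as the endpoints ±1 together with the roots of the derivative of the degree-p Legendre polynomial.

```python
import numpy as np
from numpy.polynomial import Legendre

def gll_nodes(p):
    """The p + 1 Gauss-Legendre-Lobatto points on [-1, 1] for degree p >= 1."""
    if p < 1:
        raise ValueError("polynomial degree must be at least 1")
    interior = Legendre.basis(p).deriv().roots()   # roots of P_p'
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

def lagrange_basis(nodes, i, x):
    """Value at x of the Lagrange polynomial that is 1 at nodes[i], 0 at the others."""
    x = np.asarray(x, dtype=float)
    val = np.ones_like(x)
    for m, xm in enumerate(nodes):
        if m != i:
            val = val * (x - xm) / (nodes[i] - xm)
    return val

def push_forward(xhat, a, b):
    """Map F_[a,b] from the reference interval [-1, 1] to [a, b]."""
    return 0.5 * (b - a) * (np.asarray(xhat, dtype=float) + 1.0) + a

def basis_on_interval(nodes, i, x, a, b):
    """phi_i composed with the inverse of F_[a,b], evaluated at x in [a, b]."""
    xhat = 2.0 * (np.asarray(x, dtype=float) - a) / (b - a) - 1.0
    return lagrange_basis(nodes, i, xhat)

nodes4 = gll_nodes(4)                              # five points for p = 4
physical_nodes = push_forward(nodes4, 2.0, 5.0)    # the same points on [2, 5]
```

A basis function on a spatial-angular element K × Kᵃ is then a product of three such one-dimensional functions, one for each of x, y, and θ, each with its own degree.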

For notational conciseness, we define

$\begin{align*} \left( f,g \right)_{K\times K^a}: = \int_K\int_{K^a} f\left(\mathbf{x},\mathbf{s}\right)g\left(\mathbf{x},\mathbf{s}\right)\mathrm ds\mathrm dx \end{align*}$

as the L² inner product on $K\times K^a$ , and we denote the associated norm as $\|\,f\,\|_{K\times K^a}: = (f,f\,)_{K\times K^a}^{1/2}$ . Also, when there are two elements $K^+$ and $K^-$ which share a common face F, we introduce the jump notation $[\hspace{-1pt}[ u_h ]\hspace{-1pt}]: = u_h^+\mathbf{n}^+ + u_h^-\mathbf{n}^-$ . If the face F is an exterior face, namely, if $F\subset\partial\Omega$ , then $[\hspace{-1pt}[ u_h ]\hspace{-1pt}]: = 0$ .

3.1.2. DG for the forward and adjoint problems.

Now we are ready to introduce the DG numerical discretization for the radiative transfer equations. We first define the following bilinear form:

$\begin{align} a_h(u_h,v_h)&: = \sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}} \left(\int_{\partial K}\int_{K^a} (\mathbf{s}\cdot\mathbf{n})\widehat{u}_hv_h -\int_K\int_{K^a}u_h\mathbf{s}\cdot\nabla v_h \right.\nonumber\\ &\quad \left.+\int_K\int_{K^a} (\sigma_a+\sigma_s) u_h v_h -\int_K\sigma_s\int_{K^a}\int_SP(\mathbf{s},\mathbf{s}^{\prime})u_h(\mathbf{s}^{\prime})\mathrm ds^{\prime}v_h(\mathbf{s})\mathrm ds\right), \end{align} \tag{ 8a }$

where we take the upwind numerical trace $\widehat{u}_h$ :

$\begin{align} \widehat{u}_h\big|_{\partial K}: = \left\{ \begin{array}{ll} u_h^\mathrm{nbr} & \textrm{if}\; \mathbf{s}\cdot\mathbf{n}_{F}\lt 0~\textrm{and } F \textrm{ is an interior face},\\ u_h\big|_{\partial K} & \textrm{otherwise},\\ \end{array} \right. \end{align} \tag{ 8b }$

and $u_h^\mathrm{nbr}$ is the restriction of u_h from the neighbour elements of K on F.
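The effect of the upwind trace (8b) is easiest to see in a stripped-down setting. The sketch below is an assumption-laden simplification, not the scheme of the paper: one space dimension, a single direction s > 0, no scattering, and piecewise-constant (p = 0) elements. In that case the cell balance obtained from (8a) only involves the value from the upstream neighbour, so the system can be solved by a single sweep from the inflow boundary.

```python
import numpy as np

def upwind_sweep(sigma_t, f, g_in, s=1.0, length=1.0):
    """p = 0 DG with upwind trace for s u' + sigma_t u = f on [0, length], s > 0.

    sigma_t, f : cell-wise constant total cross-section and source
    g_in       : inflow value at x = 0
    """
    sigma_t = np.asarray(sigma_t, dtype=float)
    f = np.asarray(f, dtype=float)
    n = sigma_t.size
    h = length / n
    u = np.empty(n)
    inflow = g_in                      # upwind trace on the first inflow face
    for k in range(n):
        # Cell balance: s * (u_k - inflow) + h * sigma_t[k] * u_k = h * f[k]
        u[k] = (h * f[k] + s * inflow) / (s + h * sigma_t[k])
        inflow = u[k]                  # becomes the upwind trace of the next cell
    return u

n_cells = 1000
u_abs = upwind_sweep(np.ones(n_cells), np.zeros(n_cells), g_in=1.0)
```

For pure absorption the exact solution is exp(-x), and the last cell value of `u_abs` approaches exp(-1) at first order in the mesh size, as expected for p = 0. With scattering the directions couple and a sweep alone no longer suffices.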

The DG approximation for the ith test of the forward problem (7c ) reads as

$\begin{align} \textrm{find}~u_h^i\in V_h^{g^i}\quad\textrm{such that}\quad a_h\left(u_h^i,v_*\right) = \int_\Omega\int_Sf^iv_*\quad\forall v_*\in V_h^{\,0}. \end{align} \tag{ 9 }$

On the other hand, the DG approximation for the ith test of the adjoint problem (7b ) reads as

$\begin{align} \textrm{find}~v_h^i\in V_h^{\,0}\quad \textrm{such that}\quad a_h\left(u_*,v_h^i\right) = M^i\left(u_h^i\right)\left[u_*\right]\quad\forall u_*\in V_h^{\,0}, \end{align} \tag{ 10 }$

where

$\begin{align} M^i\left(u_h^i\right)\left[u_*\right]: = 2\sum_jw_{ij}\left(y^{ij}-\mathcal M^ju_h^i\right)\mathcal M^ju_* \end{align} \tag{ 11 }$

is the weighted measurement-residual functional. For different tests, we choose different underlying hp-meshes. We denote by $\mathcal T_{hp}^i$ and $\mathcal T_{hp}^{\star-i}$ the underlying hp-meshes for the ith test of the forward and the adjoint problem, respectively. We shall assume that $\mathcal T_{hp}^{\star-i}$ is obtained by refining $\mathcal T_{hp}^i$ . For instance, $\mathcal T_{hp}^{\star-i}$ can be a one-level h- or p-refinement of $\mathcal T_{hp}^i$ . This extra refinement for the adjoint meshes will be essential for devising the error estimators for the inverse problem, and will be discussed in more detail in section 3.2.2.

3.1.3. Gradient calculation in the discrete setting.

For the rest of this subsection, we consider how to use the forward and the adjoint numerical solutions, namely $u_h^i$ and $v_h^i$ , to calculate the gradient and update the optical parameter in the DG setting. Namely, we consider the DG discretization for the gradient calculation (7a ).

We begin by introducing the approximation spaces for the optical parameter σ. Since we use the DG spectral element method as our solver for the radiative transfer equations, it is natural to choose the discretized optical parameter σ_h to live in a DG spectral element space. We denote by $\mathcal T_{hp}^\sigma$ the underlying mesh for this space, and by $\Sigma_h$ , or $\Sigma_h(\mathcal T_{hp}^\sigma)$ , the space itself.

Here we adopt the parametric reconstruction method [32] as a regularization to relieve the ill-posedness of the original inverse problem (4). This means that we choose $\mathcal T_{hp}^\sigma$ to be a coarse mesh with large mesh-size. In the extreme case, we choose $\mathcal T_{hp}^\sigma$ to be the one-element mesh covering the whole domain Ω. Then, the polynomial degree p can be adjusted in the sense that a lower degree p represents a stronger regularization effect (restriction to the low-frequency modes) and vice versa.

Now, the discretized version of (7a ) becomes

$\begin{align} \frac{\delta\Psi_h}{\delta\sigma_h}\left[\delta\sigma_h\right] &= \alpha\mathcal R^{\prime}\left(\sigma_h\right)\left[\delta\sigma_h\right]+ \sum_{i = 1}^{N_t} \sum_{K\in\mathcal T_{h}^i}\sum_{K^a\in\mathcal T_{h}^{a,K}} \left( \left(u_h^iv_h^i,\delta\sigma_{\mathrm a,h}\right)_{K\times K^a} \right. \nonumber\\ & \quad \left.+\left(u_h^iv_h^i-u_h^{i,s}v_h^i,\delta\sigma_{\mathrm s,h}\right)_{K\times K^a} \right), \end{align} \tag{ 12 }$

for any test function $\delta\sigma_h\in\Sigma_h$ , where $u_h^{i,s}: = \int_S P(\mathbf{s},\mathbf{s}^{\prime})u_h^i(\mathbf{s}^{\prime})\mathrm ds^{\prime}$ , and $u_h^i$ and $v_h^i$ are the solution of (9) and (10), respectively. By letting the test function $\delta\sigma_h$ go over all the basis polynomials of $\Sigma_h$ , we obtain a vector as the updating gradient for σ_h.

We conclude section 3.1 by the following algorithm 2.

Algorithm 2. Reconstructing optical coefficients—DG discretization.
1: Set the mesh for the optical parameter— $\mathcal T_{hp}^\sigma$
2: Set the mesh for each test i— $\mathcal T_{hp}^i$ for the forward and $\mathcal T_{hp}^{\star-i}$ for the adjoint problem
3: Set the maximal iteration count $N_\mathrm{max\_iter}$ and the error tolerance ε > 0
4: Initialize $\sigma_h^0$
5: Set k = 0
6: while $k\leqslant N_\mathrm{max\_iter}$ do
7: Using $\sigma_h^k$ , solve (9) to obtain $u_h^i$
8: Using $\sigma_h^k$ and $u_h^i$ , solve (10) to obtain $v_h^i$
9: Use $\sigma_h^k$ , $u_h^i$ , $v_h^i$ , and (12) to obtain the updating direction $\delta\sigma_h^k$
10: Update the optical coefficients by $\sigma_h^{k+1} = \sigma_h^k-\gamma\, \delta\sigma_h^k$ , where γ is determined by a line search
11: Update the iteration count: $k = k+1$
12: if $\|\Phi_h(\sigma_h^{k})-\Phi_h(\sigma_h^{k-1})\|/\|\Phi_h(\sigma_h^k)\|\lt\epsilon$ ( $\Phi_h$ defined in (13)) then
13: Break the loop
14: end if
15: end while

3.2. Goal-oriented error estimator

3.2.1. The goal function.

The numerical solver (9) and (10) can be rewritten in a compact form as follows:

$\begin{align*} u_h^i = \mathrm{DGSE}\left(\mathcal T_{hp}^i,\sigma\right)\left[F^i\right],\qquad v_h^i = \mathrm{DGSE}^*\left(\mathcal T_{hp}^i,\sigma\right)\left[M^i\left(u_h^i\right)\right], \end{align*}$

where $\mathrm{DGSE}$ and $\mathrm{DGSE}^*$ represent the forward and the adjoint solver, respectively. Using this notation, the discretized inverse problem (4) becomes

$\begin{align} \min_{\sigma,\mathcal T_{hp}^I}\Phi_h\left(\sigma,\mathcal T_{hp}^I\right) : = \sum_{i = 1}^{N_t}\sum_{j = 1}^{N_m}w_{ij} |\mathcal M^j\,\mathrm{DGSE}\left(\mathcal T_{hp}^i,\sigma\right)\left[F^i\right]-y^{ij}|^2+\alpha\mathcal R\left(\sigma\right), \end{align} \tag{ 13 }$

where $I: = \{1,\ldots,N_t\}$ is the index set for all tests. Note that the discretized error landscape functional $\Phi_h$ depends both on the optical property σ and the set of the hp-meshes $\mathcal T_{hp}^I$ . If we denote

$\begin{align*} \mathrm{Err}_h\left(\sigma\right)&\,: = \sum_{i = 1}^{N_t}\mathrm{Err}_h^i\left(\sigma\right): = \sum_{i = 1}^{N_t}\sum_{j = 1}^{N_m}w_{ij}|\delta_h^{ij}\left(\sigma\right)|^2,\\ \delta_h^{ij}\left(\sigma\right)&: = \mathcal M^j\left(\mathcal L\left(\sigma\right)\right)^{-1}\left[F^i\right]-\mathcal M^j\,\mathrm{DGSE}\left(\mathcal T_{hp}^i,\sigma\right)\left[F^i\right], \end{align*}$

then the error landscape $\Phi(\sigma)$ can be related to its discretized version $\Phi_h(\sigma,\mathcal T_{hp}^I)$ as follows:

$\begin{align*} &\Phi_h\left(\sigma,\mathcal T_{hp}^I\right) \leqslant 2\left(\Phi\left(\sigma\right) +\mathrm{Err}_h\left(\sigma\right)\right),\quad \Phi\left(\sigma\right) \leqslant 2\left(\Phi_h\left(\sigma,\mathcal T_{hp}^I\right) +\mathrm{Err}_h\left(\sigma\right)\right),\quad \forall \sigma,\quad\forall\mathcal T_{hp}^I. \end{align*}$

As a result, when $\mathrm{Err}_h(\sigma)\rightarrow 0$ , we have

$\begin{align*} \frac{1}{2}\Phi\left(\sigma\right)\leqslant \Phi_h\left(\sigma,\mathcal T_{hp}^I\right)\leqslant 2\Phi\left(\sigma\right). \end{align*}$

The above estimate suggests that, when the error $\mathrm{Err}_h(\sigma)$ goes to zero, the minimizer of $\Phi(\sigma)$ also minimizes $\Phi_h(\sigma,\mathcal T_{hp}^I)$ and vice versa. This observation suggests that we should devise numerical methods for which $\mathrm{Err}_h(\sigma)$ is reduced in the most efficient way. We next explain in more detail how this can be achieved.
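For completeness, the first of the two bounds above can be derived from the elementary inequality $|a|^2\leqslant 2|b|^2+2|a-b|^2$ ; the second follows by exchanging the roles of the two landscapes. The short derivation below is a sketch under two assumptions that are not stated explicitly in the text, namely $w_{ij}\geqslant 0$ and $\alpha\mathcal R(\sigma)\geqslant 0$ ; the symbols $a_{ij}$ and $b_{ij}$ are introduced here only for illustration.

```latex
% Assumptions: w_{ij} \geqslant 0 and \alpha\mathcal R(\sigma) \geqslant 0.
\begin{align*}
a_{ij} &:= \mathcal M^j\,\mathrm{DGSE}\left(\mathcal T_{hp}^i,\sigma\right)\left[F^i\right]-y^{ij},
\qquad
b_{ij} := \mathcal M^j\left(\mathcal L\left(\sigma\right)\right)^{-1}\left[F^i\right]-y^{ij},
\qquad
b_{ij}-a_{ij} = \delta_h^{ij}\left(\sigma\right),\\
|a_{ij}|^2 &\leqslant 2|b_{ij}|^2+2|\delta_h^{ij}\left(\sigma\right)|^2,\\
\Phi_h\left(\sigma,\mathcal T_{hp}^I\right)
&= \sum_{i,j}w_{ij}|a_{ij}|^2+\alpha\mathcal R\left(\sigma\right)
\leqslant 2\sum_{i,j}w_{ij}|b_{ij}|^2+2\,\mathrm{Err}_h\left(\sigma\right)+2\alpha\mathcal R\left(\sigma\right)
= 2\left(\Phi\left(\sigma\right)+\mathrm{Err}_h\left(\sigma\right)\right).
\end{align*}
```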

Toward this aim, for simplicity, write $u^i(\sigma): = (\mathcal L(\sigma))^{-1}[F^i]$ and $u_h^i(\sigma): = \mathrm{DGSE}(\mathcal T_{hp}^i,\sigma)[F^i]$ . Recall that we denote by $\sigma^*$ the minimizer of $\Phi(\sigma)$ so we have $y^{ij} = \mathcal M^ju^i(\sigma^*)$ by definition. Now, for each test i, we introduce the functional $J^i(\sigma)$ :

$\begin{align*} J^i\left(\sigma\right)\left[\varphi\right]: = 2\sum_jw_{ij}\mathcal M^j\left(u^i\left(\sigma\right)-u_h^i\left(\sigma\right)\right)\mathcal M^j\varphi. \end{align*}$

Then, it holds that

$\begin{align*} J^i\left(\sigma\right)\left[u^i\left(\sigma\right)-u_h^i\left(\sigma\right)\right] = 2\sum_{j = 1}^{N_m}w_{ij}|\delta_h^{ij}\left(\sigma\right)|^2 = 2\,\mathrm{Err}_h^i\left(\sigma\right). \end{align*}$

The above identity suggests that we can use the functional $J^i(\sigma)$ to devise error estimators to guide the mesh-refinement of $\mathcal T_{hp}^i$ . This approach fits into the well-known framework of goal-oriented error control and AMR; see [9]. To be more specific, the error estimators are obtained by solving the dual equation

$\begin{align*} a_h\left(u_*,z_h^i\right) = J^i\left(\sigma\right)\left[u_*\right]\quad\forall u_*\in V_h^{\,0}, \end{align*}$

which coincides with (10) except for the right-hand side. Now note that

$\begin{align*} J^i\left(\sigma\right)\left[\varphi\right] & = 2\sum_jw_{ij}\mathcal M^j\left(u^i\left(\sigma\right)-u_h^i\left(\sigma\right)\right)\mathcal M^j\varphi\\ & = M^i\left(u_h^i\left(\sigma\right)\right)\left[\varphi\right] +2\sum_jw_{ij}\left(\mathcal M^ju^i\left(\sigma\right)-y^{ij}\right)\mathcal M^j\varphi. \end{align*}$

Therefore, when $|y^{ij}-\mathcal M^ju^i(\sigma)|\lll|y^{ij}-\mathcal M^ju_h^i(\sigma)|$ , that is, when the numerical error dominates the error caused by the optical parameters, we have

$\begin{align*} J^i\left(\sigma\right)\left[\varphi\right] \approx M^i\left(u_h^i\left(\sigma\right)\right)\left[\varphi\right]. \end{align*}$

Thus, in this case, we can use $M^i(u_h^i(\sigma))$ as an alternative to $J^i(\sigma)$ to devise error estimators to guide the mesh refinement of $\mathcal T_{hp}^i$ . As a result, we have $z_h^i\approx v_h^i$ . This suggests that we do not need to solve another adjoint equation for $z_h^i$ but can simply use $v_h^i$ to devise the error estimators.
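The mechanism by which a dual solution turns a goal error into computable local quantities can be illustrated with plain linear algebra. The sketch below is a finite-dimensional analogue only, with hypothetical names and data: A plays the role of the bilinear form, j of the goal functional, and u_h is an arbitrary perturbed solution rather than a Galerkin approximation, so no orthogonality is used. The identity jᵀ(u - u_h) = zᵀ(f - A u_h), with Aᵀ z = j, then yields per-unknown indicators whose sum bounds the goal error.

```python
import numpy as np

def goal_error_indicators(A, f, u_h, j):
    """Dual-weighted residual representation of the goal error j^T (u - u_h)."""
    z = np.linalg.solve(A.T, j)      # dual (adjoint) solve
    r = f - A @ u_h                  # residual of the approximate solution
    eta = np.abs(z * r)              # local indicators, one per unknown
    return float(z @ r), eta

A = np.array([[3.0, -1.0, 0.0],
              [-2.0, 3.0, -1.0],
              [0.0, -2.0, 3.0]])     # nonsymmetric, invertible toy operator
f = np.array([1.0, 2.0, 3.0])
u = np.linalg.solve(A, f)            # "exact" solution
u_h = u + np.array([0.01, -0.02, 0.005])   # hypothetical approximate solution
j = np.array([0.0, 0.0, 1.0])        # goal: the last component of the solution

goal_error, eta = goal_error_indicators(A, f, u_h, j)
refine_first = int(np.argmax(eta))   # unknown with the largest indicator
```

In an adaptive loop, the entries of `eta` would be attached to mesh elements and the largest ones marked for refinement, which is the role played by the element-wise estimators in the DG setting.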

3.2.2. The error estimators.

For the rest of this subsection, we show how a goal-oriented error estimator can be devised based on a given functional J in the DG setting. Namely, we would like to control $|J(u-u_h)|$ by a posteriori error estimators. To make the presentation concise, we shall consider the forward problem (1), and its DG discretization formulated as follows: find $u_h\in V_h^g$ such that

$\begin{align} a_h\left(u_h,v_*\right) = \int_\Omega\int_Sf\,v_*\quad\forall v_*\in V_h^{\,0}. \end{align} \tag{ 14 }$

Before we present the main result, we first introduce two lemmas on the consistency and the Galerkin orthogonality of the DG bilinear form (8a ).

Lemma 3.1 (consistency of the DG bilinear form). Suppose u solves (1). Then

$\begin{align} a_h\left(u,v_*\right) = \int_\Omega\int_Sf\,v_*\qquad\forall v_*\in V. \end{align} \tag{ 15 }$

Proof. By multiplying (1a ) with $v_*$ and integrating on $K\times K^a$ , we obtain

$\begin{align*} &\int_{K^a}\int_{\partial K}\left(\mathbf{s}\cdot\mathbf{n}\right)u v_* -\int_{K^a}\int_K\mathbf{s}u\cdot\nabla v_* +\int_{K^a}\int_K\left(\sigma_a+\sigma_\mathrm s\right)uv_*\\ &\qquad -\int_{K^a}\int_K\sigma_\mathrm s\int_S P\left(\mathbf{s},\mathbf{s}^{\prime}\right)u\left(\mathbf{s}^{\prime}\right)\mathrm ds^{\prime}v_*\left(\mathbf{s}\right)\mathrm ds = \int_{K^a}\int_K fv_*. \end{align*}$

Since u is the exact solution, we have $(\mathbf{s}\cdot\mathbf{n})u = (\mathbf{s}\cdot\mathbf{n})\widehat{u}$ by the consistency of the numerical trace, see (8b ). Then we take summation for all $K\in\mathcal T_h$ and $K^a\in\mathcal T_h^{a,K}$ . This completes the proof.

Lemma 3.2 (Galerkin orthogonality). Suppose u solves (1) and u_h solves (14). Then

$\begin{align} a_h\left(u_h-u,v_*\right) = 0\qquad\forall v_*\in V_h^{\,0}. \end{align} \tag{ 16 }$

Proof. We simply choose $v_*\in V_h^{\,0}$ in (15) and then take the difference between (14) and (15). This completes the proof.

Consider the following dual equation: find $z\in V^{\,0}$ such that

$\begin{align} a_h\left(v,z\right) = J\left(v\right)\qquad\forall v\in V^{\,0}. \end{align} \tag{ 17 }$

Then we define

$\begin{align} \eta_{K\times K^a}: = \rho_{K\times K^a}^1w_{K\times K^a}^1+\rho_{K\times K^a}^2w_{K\times K^a}^2,\qquad\eta_K : = \sum_{K^a\in\mathcal T_h^{a,K}}\eta_{K\times K^a}, \end{align} \tag{ 18a }$

where

$\begin{align} \rho_{K\times K^a}^1 &: = \|f-\nabla u_h\cdot\mathbf{s}-\left(\sigma_\mathrm a+\sigma_\mathrm s\right)u_h+\sigma_\mathrm s\int_S P\left(\mathbf{s},\mathbf{s}^{\prime}\right)u_h\left(\mathbf{s}^{\prime}\right)\mathrm ds^{\prime}\|_{K\times K^a}, \end{align} \tag{ 18b }$

$\begin{align} w_{K\times K^a}^1 &: = \|z-\varphi_h\|_{K\times K^a}, \end{align} \tag{ 18c }$

$\begin{align} \rho_{K\times K^a}^2 &: = \|\left(\mathbf{s}\cdot\mathbf{n}\right)\left[\hspace{-1pt}\left[ u_h \right]\hspace{-1pt}\right]\|_{\partial K\times K^a}, \end{align} \tag{ 18d }$

$\begin{align} w_{K\times K^a}^2 &: = \|z-\varphi_h\|_{\partial K\times K^a}. \end{align} \tag{ 18e }$

In the above equations, $\varphi_h$ can be any function in $V_h^{\,0}$ .

Proposition 3.1. Let $u\in V^{\,g}$ be the solution of (1), let $u_h\in V_h^g$ be the solution of (14), and let $z\in V^{\,0}$ be the solution of the dual problem (17). Then, we have

$\begin{align*} |J\left(u-u_h\right)| \leqslant \sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}} \left(\rho_{K\times K^a}^1w_{K\times K^a}^1 +\rho_{K\times K^a}^2w_{K\times K^a}^2\right) = \sum_{K\in\mathcal T_h}\eta_K. \end{align*}$

Proof. Since $u_h\in V_h^g$ and $u\in V^{\,g}$ , we know $(u-u_h)\in V^{\,0}$ . Thus, we can let $v = u-u_h$ in (17), which gives us

$\begin{align*} J\left(u-u_h\right) = a_h\left(u-u_h,z\right) = a_h\left(u-u_h,z-\varphi_h\right)\qquad\forall \varphi_h\in V_h^{\,0}, \end{align*}$

where the Galerkin orthogonality identity (16) is used for the second equality. Then, by the definition of the bilinear form (8a) and its consistency property (15), we can proceed as follows:

$\begin{align*} |J\left(u-u_h\right)|& = |a_h\left(u,z-\varphi_h\right)-a_h\left(u_h,z-\varphi_h\right)|\\ & = \Big| \sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}}\int_K\int_{K^a}f\left(z-\varphi_h\right)\\ &\quad\, -\sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}} \left(\int_{\partial K}\int_{K^a} \left(\mathbf{s}\cdot\mathbf{n}\right)\left(\widehat{u}_h-u_h\right)\left(z-\varphi_h\right) +\int_K\int_{K^a}\left(\nabla u_h\cdot\mathbf{s}\right)\left(z-\varphi_h\right)\right.\\ &\quad\left.+\int_K\int_{K^a} \left(\sigma_\mathrm a+\sigma_\mathrm s\right) u_h \left(z-\varphi_h\right) -\int_K\sigma_\mathrm s\int_{K^a}\int_SP\left(\mathbf{s},\mathbf{s}^{\prime}\right)u_h\left(\mathbf{s}^{\prime}\right)\mathrm ds^{\prime}\left(z-\varphi_h\right)\left(\mathbf{s}\right)\mathrm ds\right)\Big|\\ &\leqslant \sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}}\Big|\left( f-\nabla u_h\cdot\mathbf{s}-\left(\sigma_\mathrm a+\sigma_\mathrm s\right)u_h+\sigma_\mathrm s\int_SP\left(\mathbf{s},\mathbf{s}^{\prime}\right)u_h\left(\mathbf{s}^{\prime}\right)\mathrm ds^{\prime},\, z-\varphi_h \right)_{K\times K^a}\Big|\\ &\quad +\sum_{K\in\mathcal T_h}\sum_{K^a\in\mathcal T_h^{a,K}}|\left( \left(\mathbf{s}\cdot\mathbf{n}\right)\left(u_h-\widehat{u}_h\right),\, z-\varphi_h \right)_{\partial K\times K^a}|. \end{align*}$

Now, note that

$\begin{align*} |\left( \left(\mathbf{s}\cdot\mathbf{n}\right)\left(u_h-\widehat{u}_h\right),\, z-\varphi_h \right)_{\partial K\times K^a}| & = |\sum_{F\in{\mathcal {F}}_K}\left( \left(\mathbf{s}\cdot\mathbf{n}\right)\left(u_h-\widehat{u}_h\right),\, z-\varphi_h \right)_{F\times K^a}|\\ & \leqslant \sum_{F\in\mathcal F_K}\|\left(\mathbf{s}\cdot\mathbf{n}\right)\left(u_h-u_h^\mathrm{nbr}\right)\|_{F\times K^a}\|z-\varphi_h\|_{F\times K^a}\\ & \leqslant \sum_{F\in\mathcal F_K} \|\left(\mathbf{s}\cdot\mathbf{n}\right)\left[\left[ u_h \right]\right]\|_{F\times K^a} \|z-\varphi_h\|_{F\times K^a}. \end{align*}$

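Applying the Cauchy–Schwarz inequality to the volume terms, and the discrete Cauchy–Schwarz inequality to the sum over the faces of K, completes the proof.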

Proposition 3.1 suggests that we need to solve the dual equation (17) to calculate the error estimators. Since solving for z exactly is not feasible, we approximate z by a discrete solution z_h . Here z_h should live in a more refined space than the space for u_h ; otherwise, we could choose $\varphi_h = z_h$ , which would make the weights, and hence the error estimators, vanish. To be more specific, we seek $z_h\in \tilde V_h^{\,0}$ such that

$\begin{align*} a_h\left(v_*,z_h\right) = J\left(v_*\right)\qquad\forall v_*\in \tilde V_h^{\,0}. \end{align*}$

Here $\tilde V_h$ is a more refined space than V_h . For instance, we can choose $\tilde V_h$ to be an h-refined or p-refined version of V_h . In this paper, we shall use the p-refined version since it is straightforward to implement. Once z_h is calculated, we choose $\varphi_h = \Pi_Vz_h$ where $\Pi_V$ is the L² projection onto the space V_h . Namely, the weighting terms become

$\begin{align*} w_{K\times K^a}^1 = \|z_h-\Pi_Vz_h\|_{K\times K^a},\quad w_{K\times K^a}^2 = \|z_h-\Pi_Vz_h\|_{\partial K\times K^a}. \end{align*}$
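As a minimal illustration of these weights, the following Python sketch works on the one-dimensional reference element [-1, 1] instead of the space-angle elements $K\times K^a$ used in this paper; this reduction and the function name `dual_weight_1d` are assumptions made only for illustration. If z_h is expanded in Legendre polynomials, the L² projection onto polynomials of degree at most p simply truncates the expansion, so the weight is the norm of the discarded modes.

```python
import numpy as np
from numpy.polynomial import legendre


def dual_weight_1d(z_coeffs, p):
    """Return ||z_h - Pi z_h|| in L2(-1, 1), where z_h = sum_k c_k P_k.

    Pi is the L2 projection onto polynomials of degree <= p. Because the
    Legendre polynomials are L2-orthogonal with ||P_k||^2 = 2 / (2k + 1),
    Pi z_h keeps the first p + 1 coefficients and drops the rest.
    """
    c = np.asarray(z_coeffs, dtype=float)
    k = np.arange(c.size)
    norms_sq = 2.0 / (2.0 * k + 1.0)
    discarded = slice(p + 1, None)
    return float(np.sqrt(np.sum(c[discarded] ** 2 * norms_sq[discarded])))


# Hypothetical dual solution of degree 3 (one degree above p = 2).
c = np.array([1.0, -0.5, 0.25, 0.125])
print(dual_weight_1d(c, p=2))      # nonzero: only the degree-3 mode contributes
print(dual_weight_1d(c[:3], p=2))  # 0.0: z_h already lies in the coarse space
```

The second call shows why z_h must live in a richer space than u_h: when z_h has no modes beyond degree p, the weight (and hence the estimator) is identically zero.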

For the rest of this section, we use the error estimators derived here to design an hp-AMR method to solve the inverse radiative transfer problem (4).

3.3. Goal-oriented mesh adaptation for the inverse problem

Up to this point we have introduced the forward solver (9), the adjoint solver (10), the gradient calculation (12), and the error estimators (18). Next, we show how they can be combined to solve the inverse problem (13) with mesh adaptation. We shall first go over the basic elements of the hp-AMR and then propose the full algorithm.

3.3.1. hp-AMR.

Given a mesh $\mathcal T_{hp}$ , we can refine it based on the error estimators introduced in (18). For h-refinement only, (18) is sufficient for an AMR algorithm. For hp-AMR, we additionally need a smoothness estimator to complete the algorithm. Since we use a more refined mesh for the dual solution v_h , the dominating error comes from u_h . Hence, here we use u_h for devising the smoothness estimator.

We consider the smoothness estimators proposed in [21]. These estimators are obtained by examining the decaying pattern of the coefficients obtained from a Legendre expansion of the numerical solution. To be more specific, given the forward solution u_h , we first calculate its mean intensity $u_h^\mathrm{mI}: = \int_S u_h$ . Then, for any element $K\in\mathcal T_h$ , we calculate the Legendre expansion coefficients of $u_h^\mathrm{mI}$ on K, here denoted as $a_{i,j}^K$ with $i = 0,\ldots,p_x$ and $j = 0,\ldots,p_y$ , where p_x and p_y are the polynomial degrees of the tensor-product basis function on K in the x and the y directions, respectively. Then, the smoothness estimator is defined as follows:

$\begin{align} &\eta_K^s: = \frac{1}{2}\left( \frac{\log\left(\frac{2p_x+1}{2|a_{p_x}^x|^2}\right)}{2\log\left(p_x\right)} +\frac{\log\left(\frac{2p_y+1}{2|a_{p_y}^y|^2}\right)}{2\log\left(p_y\right)} \right), \quad \mathrm{where} \quad \left(a_i^x\right)^2: = \sum_{j = 0}^{p_y}|a_{i,j}^K|^2\frac{2}{2j+1}, \nonumber\\ & \left(a_j^y\right)^2: = \sum_{i = 0}^{p_x}|a_{i,j}^K|^2\frac{2}{2i+1}. \end{align} \tag{ 19 }$
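The formula (19) can be transcribed directly into a few lines of Python. The sketch below is not taken from the authors' code: the function name `smoothness_estimator` and the input layout (a matrix `a[i, j]` of Legendre coefficients on one element) are assumptions, and the guard for p < 2 is added because $\log(p)$ vanishes at p = 1.

```python
import numpy as np


def smoothness_estimator(a):
    """Evaluate the smoothness estimator (19) on one element.

    a[i, j] holds the Legendre coefficients of the mean intensity on K,
    with i = 0..p_x and j = 0..p_y (layout assumed for this sketch).
    """
    a = np.asarray(a, dtype=float)
    px, py = a.shape[0] - 1, a.shape[1] - 1
    if px < 2 or py < 2:
        raise ValueError("log(p) = 0 for p = 1; this sketch requires p >= 2")
    wx = 2.0 / (2.0 * np.arange(px + 1) + 1.0)  # weights 2 / (2i + 1)
    wy = 2.0 / (2.0 * np.arange(py + 1) + 1.0)  # weights 2 / (2j + 1)
    ax_sq = np.sum(a[px, :] ** 2 * wy)          # (a^x_{p_x})^2
    ay_sq = np.sum(a[:, py] ** 2 * wx)          # (a^y_{p_y})^2
    if ax_sq == 0.0 or ay_sq == 0.0:
        return np.inf                           # highest mode vanishes exactly
    term_x = np.log((2 * px + 1) / (2.0 * ax_sq)) / (2.0 * np.log(px))
    term_y = np.log((2 * py + 1) / (2.0 * ay_sq)) / (2.0 * np.log(py))
    return 0.5 * (term_x + term_y)


# Synthetic coefficients: fast decay (smooth) versus slow decay (rough).
i, j = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
print(smoothness_estimator(np.exp(-3.0 * (i + j))))
print(smoothness_estimator(1.0 / (1.0 + i + j)))
```

Rapidly decaying coefficients give a larger value of $\eta_K^s$ than slowly decaying ones, which is what makes the quantity usable as an indicator of local smoothness.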

With the error estimator η_K (defined in (18)) and the smoothness estimator $\eta_K^s$ (defined in (19)), we propose algorithm 3 to refine a given mesh $\mathcal T_{hp}$ .

Algorithm 3. hp-adaptive mesh refinement.
1: Sort all elements $K\in\mathcal T_h$ in increasing order according to the error estimator η_K defined in (18).
2: Mark the fraction $r_\mathrm{ref}$ of the elements with the largest η_K for refinement, and denote this set by $\mathcal T_h^\mathrm{mark}$ .
3: For all $K\in\mathcal T_h^\mathrm{mark}$ , calculate the smoothness estimators $\eta_K^s$ defined in (19).
4: For all $K\in\mathcal T_h^\mathrm{mark}$ , perform the following refinement:
5: if $p_K+1\lt\eta_K^s-\frac{1}{2}$ then
6: Perform p-refinement for K.
7: else
8: Perform h-refinement for K.
9: end if

Note that for each test i, the error estimators $\eta_K^i$ are computed based on the forward solution $u_h^i$ and the adjoint solution $v_h^i$ ; see equation (18), where z is replaced by $v_h^i$ .
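The marking and the hp-decision of algorithm 3 can be sketched as follows. This is a schematic under stated assumptions rather than the implementation used for the experiments: the function name `mark_and_choose`, the array inputs `eta` and `p`, and the callable `smoothness` (which returns $\eta_K^s$ for a given element index) are hypothetical, and the actual h- and p-refinement of the mesh is not shown.

```python
import numpy as np


def mark_and_choose(eta, p, smoothness, r_ref=0.2):
    """Return {element index: 'p' or 'h'} following algorithm 3.

    eta[K]        : error estimator eta_K from (18)
    p[K]          : current polynomial degree p_K
    smoothness(K) : callable returning eta_K^s from (19); evaluated
                    only on marked elements, as in step 3
    """
    eta = np.asarray(eta, dtype=float)
    n_mark = max(1, int(np.ceil(r_ref * eta.size - 1e-12)))
    order = np.argsort(eta)          # step 1: increasing order
    marked = order[-n_mark:]         # step 2: largest estimators
    decisions = {}
    for K in marked:                 # steps 3-9
        K = int(K)
        if p[K] + 1 < smoothness(K) - 0.5:
            decisions[K] = "p"       # locally smooth: raise the degree
        else:
            decisions[K] = "h"       # locally rough: subdivide
    return decisions


# Synthetic example with ten elements of degree 2.
eta = [0.1, 0.5, 0.05, 0.9, 0.2, 0.3, 0.01, 0.02, 0.03, 0.04]
p = [2] * 10
eta_s = {3: 5.0, 1: 1.0}             # made-up smoothness values
print(mark_and_choose(eta, p, lambda K: eta_s[K]))
```

With $r_\mathrm{ref} = 0.2$ the two elements with the largest estimators are marked; the one with the larger smoothness value is p-refined and the other is h-refined.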

3.3.2. The full algorithm.

We conclude this section by proposing algorithm 4 for solving the discretized inverse problem (13) with mesh adaptation.

Algorithm 4. Reconstructing optical coefficients—DG discretization with hp-adaptive mesh refinement.
1: Set $\mathcal T_{hp}^\sigma$ , the mesh for the optical parameter
2: Set the maximal iteration count $N_\mathrm{max\_iter}$ , the maximal DOFs count $N_\mathrm{dofs}$ , and the error tolerance ε > 0.
3: For each test i, initialize $\mathcal T_{hp}^i$ , the mesh for solving the forward and the adjoint problems
4: Initialize the optical parameter $\sigma_h^0$ .
5: Set k = 0
6: while $k\leqslant N_\mathrm{max\_iter}$ and $\mathrm{DOFs}\leqslant N_\mathrm{dofs}$ do
7: Using $\sigma_h^k$ , solve (9) to obtain $u_h^i$
8: Using $\sigma_h^k$ and $u_h^i$ , solve (10) to obtain $v_h^i$
9: Using $\sigma_h^k$ , $u_h^i$ , $v_h^i$ , and (12), compute the updating direction $\delta\sigma_h^k$
10: Update the optical coefficients by $\sigma_h^{k+1} = \sigma_h^k-\gamma\, \delta\sigma_h^k$ , where γ is determined by a line search
11: Update the iteration counter: $k = k+1$
12: if $\|\Phi_h(\sigma_h^{k})-\Phi_h(\sigma_h^{k-1})\|/\|\Phi_h(\sigma_h^k)\|\lt\epsilon$ ( $\Phi_h$ defined in (13)) then
13: For each test i, refine the mesh $\mathcal T_{hp}^i$ by algorithm 3
14: Reinitialize the optical parameter $\sigma_h^k = \lambda\sigma_h^k+(1-\lambda)\sigma_h^0$
15: end if
16: end while

Note that previously in algorithm 2, we stop the iteration when the relative difference satisfies $|\Phi_h(\sigma_h^k)-\Phi_h(\sigma_h^{k-1})|/|\Phi_h(\sigma_h^k)|\lt\epsilon$ . Here in algorithm 4, we apply mesh refinement instead of stopping the iteration. This procedure is repeated until the maximal DOFs count $N_\mathrm{dofs}$ is reached, which prevents the algorithm from refining indefinitely. The constant $N_\mathrm{dofs}$ can be chosen according to the memory-occupation restriction or computational time restriction. The parameter λ determines how much information will be inherited from the optical parameter recovered in the previous mesh setting. If the previous mesh setting can provide a relatively good approximation of σ, then we can increase λ for a better initial guess. Otherwise, we choose λ = 0 to avoid potential bias from an inaccurate numerical solution. In the numerical tests we carried out in this paper, it only takes a few steps for $\sigma_h^k$ to converge, so we always choose λ = 0.
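The control flow of algorithm 4 can be summarized by the following Python skeleton. It is a sketch under several assumptions: the callables `phi`, `step`, `refine`, and `dofs` are hypothetical stand-ins for the objective (13), one optimization update (steps 7 to 10), algorithm 3, and the DOFs count, and the `history` list, which records the converged parameter on each mesh before it is reinitialized, is a bookkeeping addition of this sketch. The toy problem at the end is purely synthetic.

```python
import numpy as np


def adaptive_inversion(phi, step, refine, dofs, sigma0, mesh,
                       eps=1e-3, lam=0.0, max_iter=100, max_dofs=10**6):
    """Skeleton of algorithm 4 with user-supplied (hypothetical) callables."""
    sigma0 = np.asarray(sigma0, dtype=float)
    sigma = sigma0.copy()
    history = []                                  # (DOFs, converged sigma)
    k = 0
    while k <= max_iter and dofs(mesh) <= max_dofs:
        sigma_new = step(sigma, mesh)             # steps 7-10
        k += 1                                    # step 11
        phi_old, phi_new = phi(sigma, mesh), phi(sigma_new, mesh)
        sigma = sigma_new
        if abs(phi_new - phi_old) / abs(phi_new) < eps:       # step 12
            history.append((dofs(mesh), sigma.copy()))
            mesh = refine(mesh)                   # step 13
            sigma = lam * sigma + (1.0 - lam) * sigma0        # step 14
    return sigma, mesh, k, history


# Synthetic toy problem: the "mesh" is just a refinement level.
target = np.array([1.0, 2.0])
phi = lambda s, m: 0.5 * np.sum((s - target) ** 2) + 1.0
step = lambda s, m: s - 0.5 * (s - target)        # damped gradient step
sigma, mesh, k, history = adaptive_inversion(
    phi, step, refine=lambda m: m + 1, dofs=lambda m: 16 * 4**m,
    sigma0=np.zeros(2), mesh=0, max_dofs=1000)
print(mesh, k, [d for d, _ in history])
```

Because the loop exits right after a refinement, the returned `sigma` is the reinitialized value; the reconstructions of interest are the entries stored in `history`.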

4. Numerical experiments

In this section, we carry out numerical experiments to test the performance of our proposed numerical methods. In the first subsection, we test the goal-oriented AMR method for the forward problem. Then, in the second subsection, we test the performance of the method for the inverse problem.

4.1. Goal-oriented AMR for the forward problem

In this subsection, we consider the numerical approximation to the forward problem (1).

Before testing the goal-oriented AMR methods, we first test the convergence of the DGSE method without adaptivity. We consider a rectangular spatial domain $[0,L_x]\times[0,L_y]$ with the inflow radiation coming from the top and the left boundary:

$\begin{align*} u\left(\mathbf{x},\mathbf{s}\right) &= \frac{16}{\pi}\chi_{\left[\frac{7\pi}{4},\frac{7\pi}{4}+\frac{\pi}{16}\right]}\left(\theta\right),\quad\mathrm{where}\quad\mathbf{s} = \left(\cos\theta,\sin\theta\right),\\ & \quad \left(x,y\right)\in\left[0,L_x\right]\times\left\{L_y\right\}\cup \left\{0\right\}\times\left[0,L_y\right]. \end{align*}$

For the other sides of the boundary, we assume zero inflow radiation and zero source term f. A scatterer is placed at the center of the domain, so that the extinction coefficient $\sigma_\mathrm e$ is

$\begin{align} \sigma_\mathrm e = \frac{11}{ 1+\exp{\left(-2k_0\left(0.1L_y-r\left(x,y\right)\right)\right)}}. \end{align} \tag{ 20 }$

We choose the single scattering albedo to be $\tilde\omega = \frac{10}{11}$ , so the scattering coefficient is $\sigma_\mathrm s = \tilde\omega\sigma_\mathrm e$ and the absorption coefficient is $\sigma_\mathrm a = \sigma_\mathrm e-\sigma_\mathrm s$ . We choose $k_0 = 2$ . The asymmetry parameter in the scattering phase function in (1) is chosen as g = 0.8, which is a typical value for water clouds in the atmosphere.
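For concreteness, the coefficients of this test can be evaluated as in the sketch below. Two assumptions are made that are not stated in this section: $r(x,y)$ is taken to be the Euclidean distance from the center of the domain, and the default domain size $L_x = L_y = 1$ is a placeholder. The function name `optical_coefficients` is likewise hypothetical.

```python
import numpy as np


def optical_coefficients(x, y, Lx=1.0, Ly=1.0, k0=2.0, albedo=10.0 / 11.0):
    """Return (sigma_e, sigma_s, sigma_a) following (20).

    ASSUMPTION: r(x, y) is the distance to the domain center, and the
    default Lx = Ly = 1 is a placeholder domain size.
    """
    r = np.hypot(x - 0.5 * Lx, y - 0.5 * Ly)
    sigma_e = 11.0 / (1.0 + np.exp(-2.0 * k0 * (0.1 * Ly - r)))
    sigma_s = albedo * sigma_e        # scattering coefficient
    sigma_a = sigma_e - sigma_s       # absorption coefficient
    return sigma_e, sigma_s, sigma_a


# Compare the smooth (k0 = 2) and sharp (k0 = 10) transitions at the center.
for k0 in (2.0, 10.0):
    print(k0, optical_coefficients(0.5, 0.5, k0=k0))
```

Increasing $k_0$ steepens the logistic profile, so the scatterer boundary becomes sharper while the ratio $\sigma_\mathrm s/\sigma_\mathrm e$ stays fixed at the albedo.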

For this test we consider the DG method (9) with the polynomial degree $p_x = p_y = 1$ and also with $p_x = p_y = 2$ for the spatial discretization. To compare, we also consider an FV spatial discretization method, which is equivalent to the DG method with $p_x = p_y = 0$ . For all these methods, we uniformly partition the domain $[0,L_x]\times[0,L_y]$ into a collection of elements and successively decrease the element size. For the angular discretization, we apply a uniform partition into 32 pieces with P₀ elements (FV angular discretization). Figure 1 shows that the DG method has errors that converge as $\mathcal O(\mathrm{DOFs}^{-(p+1)/d})$ where the dimension d = 2. This is the expected optimal convergence order for the DG method and the FV method (upwind).
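A rate of the form $\mathcal O(\mathrm{DOFs}^{-(p+1)/d})$ can be checked from tabulated errors by a least-squares fit in log-log coordinates. The sketch below uses synthetic data generated from the expected rate, not the values behind figure 1, and the function name `observed_rate` is hypothetical.

```python
import numpy as np


def observed_rate(dofs, errors):
    """Least-squares slope of log(error) versus log(DOFs)."""
    slope, _intercept = np.polyfit(np.log(dofs), np.log(errors), 1)
    return slope


# Synthetic data obeying error = C * DOFs^{-(p+1)/d} with p = 1, d = 2.
p, d = 1, 2
dofs = np.array([1.0e3, 4.0e3, 1.6e4, 6.4e4])
errors = 3.0 * dofs ** (-(p + 1) / d)
print(observed_rate(dofs, errors))   # close to -(p + 1) / d = -1
```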

**Figure 1.** Convergence test for the forward problem, without AMR. Left: mean intensity of the radiative solution calculated by the DG-P₂ method. Right: L² error of the mean intensity calculated by the finite volume method, DG with P₁, and DG with P₂ polynomial approximation. All methods converge at the expected rates of $\mathcal O(\mathrm{DOFs}^{-(p+1)/d})$ with dimension d = 2, and the DG methods with larger p converge faster.

We next consider the forward problem with the goal-oriented AMR algorithm. We consider a rectangular spatial domain $[0,L_x]\times[0,L_y]$ with the inflow radiation coming from the top boundary:

$\begin{align*} u\left(\mathbf{x},\mathbf{s}\right) = \frac{16}{\pi}\chi_{\left[\frac{3\pi}{2},\frac{3\pi}{2}+\frac{\pi}{16}\right]}\left(\theta\right),\quad\mathrm{where}\quad\mathbf{s} = \left(\cos\theta,\sin\theta\right),\quad x\in\left[0,L_x\right],\quad y = L_y. \end{align*}$

For the other sides of the boundary, we assume zero inflow radiation and zero source term f. For this test, we also consider a scatterer placed at the center of the domain following the form of equation (20). To better test the AMR algorithm, here we consider the scatterer with a sharper boundary transition by taking $k_0 = 10$ . The other settings are the same as the previous test; namely, we choose $\widetilde\omega = 10/11$ and g = 0.8. See figure 2 for a visualization of $\sigma_\mathrm e$ .

**Figure 2.** Extinction coefficient $\sigma_\mathrm e$ , to represent an idealized cloud located at the center of the domain.

Our aim is to find an efficient numerical approximation to J(u), where u is the solution of (1). Here we set the goal function J to be (a smoothed version of) a point measurement on the outflow radiation, located at the right side of the domain:

$\begin{align*} J\left(u\right) &= \int_0^{L_y} u\left(L_x,y\right) h\left(y\right) \mathrm dy, \\ \mathrm{where}\quad h\left(y\right) &= \frac{1}{\left(L_y/20\right)\sqrt{2\pi}} \exp{\left(-0.5\left(y-\frac{15}{21}L_y\right)^2/\left(L_y/20\right)^2\right)}. \end{align*}$

Specifically, the (smoothed) point measurement is put at $(L_x,\frac{15}{21}L_y)$ . In the first test, we use the error estimator η_K defined in (18) to guide mesh refinement. The mesh refinement algorithm 3 is repeatedly applied with the refinement ratio $r_\mathrm{ref} = 0.2$ , until the DOFs for u_h reach the threshold $N_\mathrm{max\_dof} = 2\times 10^5$ . To better visualize where the meshes are refined, here we use a fixed polynomial degree p = 3.
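The smoothed point measurement J can be evaluated numerically as in the following sketch. It assumes that the outflow trace $y\mapsto u(L_x,y)$ is available as a Python callable (any angular dependence is taken to be already integrated out or fixed), uses Gauss-Legendre quadrature on $[0,L_y]$ , and takes $L_y = 1$ as a placeholder; the function names are hypothetical.

```python
import numpy as np


def goal_weight(y, Ly=1.0):
    """Gaussian weight h(y): mean (15/21) * Ly, standard deviation Ly / 20."""
    s = Ly / 20.0
    return np.exp(-0.5 * (y - 15.0 / 21.0 * Ly) ** 2 / s**2) / (s * np.sqrt(2.0 * np.pi))


def goal_functional(u_outflow, Ly=1.0, n_quad=200):
    """Approximate J(u) = int_0^Ly u(Lx, y) h(y) dy by Gauss-Legendre quadrature.

    u_outflow : callable y -> u(Lx, y) (assumed interface for this sketch)
    """
    x, w = np.polynomial.legendre.leggauss(n_quad)   # nodes on [-1, 1]
    y = 0.5 * Ly * (x + 1.0)                         # map to [0, Ly]
    return 0.5 * Ly * np.sum(w * u_outflow(y) * goal_weight(y, Ly))


# h has (almost) unit mass on [0, Ly], so J acts like a local average.
print(goal_functional(lambda y: np.ones_like(y)))
print(goal_functional(lambda y: y))
```

For a constant trace the result is close to that constant, and for a linear trace it is close to the value at $y = \frac{15}{21}L_y$ , which is the sense in which J is a smoothed point measurement.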

To compare, we also consider a standard jump estimator

$\begin{align} \eta_K^\mathrm{jump}: = \left( \frac{1}{|\partial K|}\int_{\partial K}\bigg|\left[\hspace{-1pt}\left[ \int_Su_h \right]\hspace{-1pt}\right]\bigg|^2 \right)^{1/2}. \end{align} \tag{ 21 }$

Since the exact solution u is not known, we calculate $|J(u_h)-J(\tilde u)|$ as an approximation to the true functional error $|J(u_h)-J(u)|$ , where $\tilde u$ is calculated on a more refined mesh than u_h . Here $\tilde u$ is calculated with an extra p-refinement for each element $K\in\mathcal T_h$ .

Figure 3 shows the numerical solutions and the meshes calculated based on the goal-oriented estimator η_K and the standard jump estimator $\eta_K^\mathrm{jump}$ . We observe that the goal-oriented estimator successfully guides the mesh to refine where the solution has a sharp gradient, namely, at the top of the scatterer. In addition, it refines along the path connecting the location of the sharp gradient and the location of the point measurement (on the right boundary at $y = \frac{15}{21}L_y$ ). In contrast, the standard estimator $\eta_K^\mathrm{jump}$ only refines where there are sharp gradients, without taking into account the effect of the measurement J.

**Figure 3.** Goal-oriented AMR versus standard AMR. The plots show the mean intensity of the radiant intensity *u_h* and the meshes. Left: using the goal-oriented error estimator η_K introduced in (18). Right: using the standard jump estimator $\eta_K^\mathrm{jump}$ defined in (21). The standard jump estimator induces refinement near sharp gradients, whereas the goal-oriented error estimator also induces refinement along the path between the scatterer and the measurement location on the boundary at $(x,y) = (L_x,(15/21)L_y)$ .

In figure 4, we plot the error $|J(u)-J(u_h)|$ . We observe that the goal-oriented estimator reduces the error efficiently while the standard estimator fails to decrease the error. This is consistent with our observation in figure 3, where the goal-oriented estimator refines both near the sharp gradient and along the path connecting to the measurement, while the standard estimator fails to refine the elements that matter for the measurement J. The experiment shows that the goal-oriented estimator can be much more efficient than the standard one when the goal function is not an L² or energy-based norm, but instead a measurement on the domain boundary. Note that this is exactly the case for applications such as optical tomography, which we shall consider in the next subsection.

**Figure 4.** Evaluation of the error $|J(u_h)-J(u)|$ by the goal-oriented and the standard error estimator. The goal-oriented estimator leads to a much smaller error.

4.2. Goal-oriented AMR for the inverse problem

In this subsection, we consider the numerical tests for the inverse problem (13). We aim to test the performance of algorithm 4 and compare it with other mesh-refinement methods. We consider a rectangular domain $[0,L_x]\times[0,L_y]$ where, on each side, we place 2 inflow laser beams and 20 measurements collecting the angularly-averaged outflow radiation. As a result, we have $N_\mathrm{tst} = 8$ and $N_\mathrm{msm} = 80$ in the formulation (4) or (13). The absorption coefficient is fixed as $\sigma_\mathrm a = 0.1$ and the asymmetry parameter is chosen as g = 0.1. The scattering coefficient $\sigma_\mathrm s$ is decomposed into a sum $\sigma_\mathrm s = \sigma_s^0+\tilde\sigma_s$ where $\sigma_s^0$ is a background state and $\tilde\sigma_s$ represents a perturbation. We assume the background state $\sigma_s^0$ and the absorption $\sigma_\mathrm a$ are known and we aim to recover the scattering coefficient $\sigma_\mathrm s$ . We shall consider two test cases with the same background state $\sigma_s^0 = 1$ but with different perturbations $\tilde\sigma_s$ . See the upper-left sub-figures of figures 6 and 9 for the true scattering coefficients $\sigma_\mathrm s$ for the first and the second test case, respectively.

To generate the data $y^{ij} = \mathcal M^j\tilde u_h^i$ , we solve the forward problem (9) on a very refined mesh, namely, an 8 by 8 mesh with the polynomial degree chosen as p = 9, so we can regard $\tilde u_h^i$ as the 'true' solution. The optical parameter σ_h is set to live on the one-element mesh with p = 19.

To solve the inverse problem, we apply an H¹-regularization, namely $\mathcal R(\sigma) = \|\nabla\sigma\|_{L^2(\Omega)}$ , with the regularization parameter chosen as $\alpha = 10^{-1}$ ; see (4) where α was first introduced. We implement algorithm 4, with $\epsilon = 10^{-3}$ , and $r_\mathrm{ref} = 0.2$ for the mesh refinement algorithm 3. For the iteration method, we use the limited memory BFGS algorithm to calculate the updating direction $\delta\sigma_h^k$ . The parameter γ is determined by a line search method, for which we use the backtracking algorithm: it starts with the candidate step length µ = 1 and then tries $\mu = 2^{-1},2^{-2},2^{-3},\ldots,$ until the Armijo condition

$\begin{align*} \Phi\left(\sigma_h^k-\mu\delta\sigma_h^k\right)\leqslant \Phi\left(\sigma_h^k\right)-c_1\mu\nabla\Phi\left(\sigma_h^k\right)\delta\sigma_h^k \end{align*}$

is satisfied, where $c_1 = 10^{-4}$ . For more details on the BFGS method and the line search algorithm, we refer to [31].
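The backtracking rule just described can be written compactly as follows. This is a generic sketch of Armijo backtracking with the sign convention of algorithm 4 (the update is $\sigma - \mu\,\delta\sigma$ ); the function name `backtracking_step`, the cap `max_halvings`, and the quadratic test objective are assumptions introduced for illustration.

```python
import numpy as np


def backtracking_step(phi, grad_phi, sigma, delta, c1=1e-4, max_halvings=50):
    """Return the first mu in {1, 1/2, 1/4, ...} satisfying the Armijo condition

        phi(sigma - mu * delta) <= phi(sigma) - c1 * mu * grad_phi(sigma) . delta
    """
    phi0 = phi(sigma)
    slope = np.dot(grad_phi(sigma), delta)
    mu = 1.0
    for _ in range(max_halvings):
        if phi(sigma - mu * delta) <= phi0 - c1 * mu * slope:
            return mu
        mu *= 0.5
    raise RuntimeError("no admissible step length found")


# Synthetic ill-conditioned quadratic with a steepest-descent direction.
A = np.diag([1.0, 10.0])
phi = lambda s: 0.5 * s @ A @ s
grad_phi = lambda s: A @ s
sigma = np.array([1.0, 1.0])
mu = backtracking_step(phi, grad_phi, sigma, delta=grad_phi(sigma))
print(mu)
```

In the experiments the direction $\delta\sigma_h^k$ comes from limited memory BFGS rather than from the plain gradient used in this toy example.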

We test the performance of algorithm 4 by comparing it with other mesh-refinement methods. We set the initial mesh $\mathcal T_{hp}^i$ ( $i = 1,\ldots,N_\mathrm{tst}$ ) to be a 4 by 4 mesh with p = 2. To be more specific, we shall compare algorithm 4 (referred to as hp-AMR-goal below) with other methods, for which, instead of applying the AMR algorithm 3 in algorithm 4, we apply (1) h-refinement (h-Ref), for which we partition the domain into more elements (uniformly divided), (2) p-refinement (p-Ref), for which we increase the polynomial degree on each element, (3) h-AMR-goal, for which we only apply h-refinement in algorithm 3, and (4) hp-AMR-standard, for which we use the standard jump error estimator (21) to guide the mesh refinement in algorithm 3. We remark that the h-refinement method with p = 2 is similar to an FV method as mesh resolution is increased, while the p-refinement method with an increased polynomial order p is similar to a spectral method in the sense that an exponential (spectral) convergence can be achieved when the solution is sufficiently smooth.

Now we show the results for the first test case. Figure 5 shows that the hp-AMR-goal method is the most efficient method in reducing the error $\|\sigma_h-\sigma_h^*\|_{L^2(\Omega)}$ ; namely, it uses the fewest DOFs to achieve the best recovery. The h-AMR-goal method is overall less efficient than the hp-AMR-goal method, but their performances are similar. For the hp-AMR-standard, h-Ref, and p-Ref methods, we observe that their errors first decrease and then increase, and at all DOFs levels their recovered optical property is poorer than the one recovered by the hp-AMR-goal method. The error for the hp-AMR-standard method decreases quickly at the beginning but then increases back to its original value, and even in its decreasing phase the method is less efficient than the hp-AMR-goal method. This supports our claim that the goal-oriented estimator is necessary for inverse problems. For the h-Ref and p-Ref methods, we observe that they decrease the errors much more slowly than hp-AMR-goal. In summary, the hp-AMR-goal method performs better than the other methods in almost every measure, and the h-AMR-goal method performs somewhat worse than, but similarly to, hp-AMR-goal.

**Figure 5.** Total DOFs of the forward and the adjoint solutions (the summation of all tests) versus the L² error of the optical property $\|\sigma_h-\sigma_h^*\|_{L^2(\Omega)}$ for the first test case.

To see this more clearly, we plot in figure 6 the scattering coefficients recovered by the different refinement methods. To make a fair comparison, for all the refinement methods, we take the snapshot at the refinement step such that the DOFs are closest to 10⁶. The results agree with what we observe in figure 5. Namely, the hp-AMR-goal method gives the best approximation to the true scattering coefficient, while using the fewest DOFs.

Looking at figure 6 in more detail, we observe that the h-Ref method and the hp-AMR-standard method both fail to provide a satisfactory recovery of the true distribution (upper-left sub-figure). The p-Ref, h-AMR-goal, and hp-AMR-goal methods all successfully recover the correct location and the approximate shape of the perturbation $\tilde\sigma_s$ . However, the h-AMR-goal and hp-AMR-goal methods provide a more accurate recovery of the amplitude of the perturbation than the p-Ref method. Note that here we chose a fixed regularization parameter α for all refinement methods for a fair comparison. We remark that if the regularization parameter α were adapted to each refinement algorithm, the recovery for p-Ref, h-AMR-goal, and hp-AMR-goal might improve further.

To visualize how the meshes are refined by the different methods, we plot the forward solution and the corresponding meshes in figure 7. For the goal-oriented hp-AMR method, we observe that the mesh is refined both where the forward solution has a sharp gradient and at the left boundary where the measurement is located. In contrast, the standard hp-AMR method only refines the mesh where the sharp gradients are located, but fails to refine near the measurements. The goal-oriented h-AMR method refines both at the sharp gradients and at the boundary where the measurements are located. However, it does not use a high-order approximation as in the hp-AMR cases. This could explain why its overall performance is somewhat worse than that of hp-AMR-goal. Therefore, the goal-oriented hp-AMR is the only method that supports both (1) a balanced refinement strategy between resolving the forward and the adjoint solution, and (2) an automatic high-order approximation by p-adaptivity. Considering this, it is not surprising to see that the hp-AMR-goal method recovers the optical coefficient with the best quality in figure 6.

**Figure 6.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the first test case. L² errors are printed at the upper-right corner of each sub-figure.

**Figure 7.** Top row: forward solution $u_h^1$ for the first test case, by goal-oriented hpAMR (left), goal-oriented hAMR (middle), and standard hpAMR (right) at the last step of the refinement. Bottom row: the corresponding meshes $\mathcal T_{hp}^1$ , where the colorbar represents the polynomial degrees.

To test the reliability of our proposed method, we conduct another experiment (the second test case) with the same setting used for figure 5 but with a different scattering coefficient $\sigma_\mathrm s$ to be recovered; see the upper-left sub-figure of figure 9 for a visualization of $\sigma_\mathrm s$ . From figures 8 and 9, we again observe that our proposed goal-oriented hp-AMR method has the best performance compared to all the other refinement approaches. Namely, it uses the fewest DOFs but provides the best quality of the recovered scattering coefficient $\sigma_\mathrm s$ .

**Figure 8.** Total DOFs of the forward and the adjoint solutions (the summation of all tests) versus the L² error of the optical property $\|\sigma_h-\sigma_h^*\|_{L^2(\Omega)}$ for the second test case.

**Figure 9.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the second test case. L² errors are printed at the upper-right corner of each sub-figure.

To test the robustness of the proposed goal-oriented hp-adaptive algorithm, we conduct additional numerical tests with noise added to the signals y^ij. Namely, instead of using y^ij, we use $\tilde y^{ij}: = y^{ij}(1+\delta X_{ij})$ where X_ij are independent, identically distributed random variables from a uniform distribution on $[-1,1]$ , and δ represents the noise level. We consider two cases of $\delta = 1\%$ and $\delta = 10\%$ .
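The multiplicative noise model above can be reproduced with a few lines of NumPy. The sketch below is illustrative only: the function name `add_relative_noise`, the fixed seed, and the placeholder data array of shape $(N_\mathrm{tst}, N_\mathrm{msm}) = (8, 80)$ are assumptions, not the data used in the experiments.

```python
import numpy as np


def add_relative_noise(y, delta, rng=None):
    """Return y * (1 + delta * X) with X i.i.d. uniform on [-1, 1).

    NumPy samples the half-open interval [-1, 1); the difference from the
    closed interval [-1, 1] has probability zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    X = rng.uniform(-1.0, 1.0, size=y.shape)
    return y * (1.0 + delta * X)


# Placeholder measurements: 8 tests with 80 measurements each.
y = np.ones((8, 80))
y_noisy = add_relative_noise(y, delta=0.10, rng=np.random.default_rng(0))
print(np.max(np.abs(y_noisy / y - 1.0)))   # bounded by the noise level delta
```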

Figure 10 shows that when the noise level is $\delta = 1\%$ , the performance of the different refinement methods is similar to the noise-free case (figure 5). We again observe that the hp-AMR-goal method gives the smallest error while using the fewest DOFs. When the noise level δ increases to $10\%$ , we observe an increase of the errors for most refinement methods. Despite this, the hp-AMR-goal method still provides the overall smallest error compared to the other refinement approaches, especially at the last refinement level. In figures 11 and 12, we plot the scattering coefficient recovered by the different refinement methods with the noise level $1\%$ and $10\%$ , respectively. The results are consistent with what we observe in figure 10.

**Figure 11.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the first test case. L² errors are printed at the upper-right corner of each sub-figure. $1\%$ noise is added to y^ij.

**Figure 12.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the first test case. L² errors are printed at the upper-right corner of each sub-figure. $10\%$ noise is added to y^ij.

We repeat the noise test for the second test case (with the true scatterer shown in the upper-left sub-figure of figure 9). In figure 13, we observe that for both noise levels, $\delta = 1\%$ and $\delta = 10\%$ , the hp-AMR-goal method provides the overall smallest error. We plot in figures 14 and 15 the recovered scattering coefficients. The results are consistent with what we observe in figure 13 in the sense that the hp-AMR-goal method provides the best quality of recovery at both noise levels.

**Figure 14.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the second test case. L² errors are printed at the upper-right corner of each sub-figure. $1\%$ noise is added to y^ij.

**Figure 15.** Recovered scattering coefficients $\sigma_\mathrm s$ for the different mesh refinement methods, with the total DOFs at around 10⁶ for the second test case. L² errors are printed at the upper-right corner of each sub-figure. $10\%$ noise is added to y^ij.

5. Conclusion

In this paper we propose a goal-oriented hp-AMR method to solve the inverse radiative transfer problem. The method is based on a novel goal-oriented error estimator, which is obtained by connecting two kinds of duality arguments from different fields, namely, (1) duality-based mesh adaptivity for goal-oriented error minimization, and (2) adjoint-based inversion techniques for solving inverse problems. The numerical tests suggest that, among the refinement methods compared, the proposed method recovers the optical coefficients with the best quality while using the fewest DOFs. While our method is proposed here for inverse radiative transfer, the general principles for devising the error estimators and designing the refinement algorithms should extend naturally to enable adaptive-mesh inversion for other types of inverse problems, such as the Calderón problem or inverse scattering.

Potential future work includes combining the method with more involved regularization strategies, and applications to remote sensing problems.

Acknowledgments

This research was partially supported by Grant NSF DMS 2324368 and by the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin–Madison with funding from the Wisconsin Alumni Research Foundation. The authors thank two anonymous reviewers whose suggestions led to improvements of the paper.

Data availability statement

No new data were created or analysed in this study.
