Quantitative cone-beam CT reconstruction with polyenergetic scatter model fusion

Jonathan H Mason; Alessandro Perelli; William H Nailon; Mike E Davies

doi:10.1088/1361-6560/aae794

1. Introduction

X-ray scatter is a large source of errors in CT, where scattered x-rays corrupt the line of sight attenuation models used during reconstruction. In CBCT, due to its large field of view, the magnitude of these interfering x-rays is commonly of the same order of magnitude as the signal of interest and can even be considerably higher (Siewerdsen and Jaffray 2001). With this, artifacts and inaccuracies in the reconstruction are inevitable unless it can be correctly compensated. In Mason et al (2017c), we introduced a polyenergetic quantitative reconstruction technique allowing the direct inference of electron density—known as Polyquant. This method can incorporate a prior estimate of the additive scatter, but this does not exploit any of the polyenergetic information or indeed the electron density available during iterations.

The problem then remains of how one should generate this estimate of scatter, which is an active area of research with many possible approaches (Rührnschopf and Klingenbeck 2011). A fundamental reason why estimating scatter is difficult is that unlike modelling attenuation, requiring a single path from source to detector for each measurement, one must consider every possible path a photon can take through various numbers of scattering events to get the full picture. This will therefore not only be dependent on the attenuating materials and intensity along a single pencil beam, but it will be dependent on the full projection fluence and the complete structure of the specimen. Not only would calculating all these paths be computationally exhaustive, but one does not even have knowledge of the structure of the specimen prior to reconstruction, since this is the task of the reconstruction itself. This therefore presents a dependency problem, where one requires the image to estimate scatter, but cannot form the image accurately because of scatter contamination.

To overcome this limitation, a compelling approach is to integrate the scatter estimation and reconstruction processes, so that they are performed simultaneously. This notion is exploited to some degree by efficient Monte Carlo scatter estimation methods such as (Zbijewski and Beekman 2006, Poludniowski et al 2009a, Sisniega et al 2015, Xu et al 2015), where one requires a preliminary reconstruction to estimate the scatter, which can then be iteratively refined. Another approach in Lee and Chen (2015) used an analytic single scatter model to generate scatter estimates during reconstruction. A drawback of either Monte Carlo or analytic scatter models is their computational cost, which would make the already high cost of iterative reconstruction huge.

There are also hardware approaches to scatter estimation, which usually allow very cheap computational estimation, but typically at the loss of detector utilisation, unless additional acquisitions are taken (Rührnschopf and Klingenbeck 2011). One such approach proposed by Siewerdsen et al (2006) directly measures the scatter that hits the edge of the detector, in the collimator shadow, and interpolates this over an entire projection, through low-order polynomial fitting and filtering. With this, one has a trade off between the field of view size, and accuracy at the center of the detector. A similar method, beam stop array (BSA), measures the scatter behind an array of beam blockers distributed over the entire detector area (Ning et al 2004), and uses interpolation between these sparse points to form the estimate. By moving the beam blockers throughout the acquisition, then separate scans need not be taken (Zhu et al 2005). In this work, we will be primarily be focussing on methods that utilize the entire detector area and with no additional hardware, but we will compare against a BSA approach in section 5.1.

Another family of scatter estimation approaches that are purely measurement based are convolutional methods, known as scatter kernel superposition (SKS) (Sun and Star-Lack 2010) or beam scatter kernels (BSK) (Rührnschopf and Klingenbeck 2011). Here, the scatter is estimated as a convolution between a stationary point spread function, usually calculated or measured for a slab of material, and a forward scatter factor (Ohnesorge et al 1999) derived from the small angle scatter magnitude hitting the line-of-sight detector element. SKS methods are typically very fast, due to their ability to exploit the fast Fourier transform (FFT) based calculation of convolution, but are naturally limited in their inability to account for inhomogeneities in the scatter paths. These methods may however be supplemented with heuristic perturbations to compensate for the most significant inhomogeneous effects such as fringing (Maltz et al 2008), varying thickness (Sun and Star-Lack 2010), effects at the edge of an object (Sun and Star-Lack 2010), scatter from the couch (Sun et al 2011), and more general spatial inhomogeneities (Sun et al 2014).

From a preliminary study comparing scatter estimation methods when used with statistical iterative reconstruction (Mason et al 2017b), we found that SKS approaches matched the performance of sub-sampled Monte Carlo estimates, whilst being significantly faster. For this reason, we focus on the integration of SKS models into the polyenergetic reconstruction algorithm, Polyquant, and see if their performance can be enhanced further from information available during the iterations.

In this work, we derive a polyenergetic SKS (PolySKS) model in terms of attenuation and electron density projections, include perturbations to account for the specimen position and inhomogeneities in scatter paths, and integrate this into the Polyquant framework. We denote this integrated reconstruction method as Polyquant–PolySKS. While polyenergetic kernel convolution has been used for radiotherapy dose calculation (Papanikolaou et al 1993), we provide the analysis and derivation in this article for CT imaging. We then propose a fast algorithm for its implementation using accelerated ordered subsets as in Kim et al (2015), able to include sparsity exploiting regularization such as total variation (TV). Finally, we demonstrate the method on both numerical and real CBCT data, and analyse its performance in terms of relative electron density accuracy.

2. Background

2.1. X-ray attenuation model and Polyquant

In this section, we will review x-ray attenuation and the Polyquant model as presented in Mason et al (2017c) for direct electron density imaging.

Considering a polyenergetic x-ray source, the intensity after transmission through a specimen can be modelled using an approximate Poisson distribution in the discrete domain (Chang et al 2014) as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:poly-poiss} y_i \sim {\rm Poisson}\left\{\sum_{j=1}^{N_\xi}b_i(\xi_j)\exp\left(-[\boldsymbol{\Phi}\boldsymbol{\mu}(\xi_j)]_i\right)+s_i \right\} {{\rm ~for~}} i=1,...,N_{\rm ray}, \nonumber \end{align} \tag{ 1 }$

where $N_\xi$ are the number of energies, $\boldsymbol{b}\in\mathbb{R}^{N_{\rm ray}}$ is the spectrally dependent source intensity, $\boldsymbol{\Phi}\in\mathbb{R}^{N_{\rm ray}\times N_{\rm vox}}$ is a system matrix describing the line-of-sight path of each ray, $\boldsymbol{\mu}(\xi)\in\mathbb{R}^{N_{\rm vox}}$ is the energy dependent x-ray attenuation, and $\boldsymbol{s}\in\mathbb{R}^{N_{\rm ray}}$ is the expectation of scatter reaching the detector. N_ray and N_vox represent the number of measurements (rays) and voxels in the discretized volume, respectively.

The Polyquant model allows the parameterization of this attenuation process as a piecewise linear function of the electron density. This is shown in Mason et al (2017c) to fit well to a range of biological tissues such as soft tissues and bone, but also metallic materials for implants such as titanium. This linear fit is expressed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:gen_fit} \hat{\boldsymbol{\mu}}(\boldsymbol{\rho}_e,\xi) = \sum_{l=1}^{N_f}\boldsymbol{f}_l\odot(\alpha_l(\xi)\boldsymbol{\rho}_e+\beta_l(\xi)), \nonumber \end{align} \tag{ 2 }$

where $\odot$ represents an element-wise multiplication, N_f is the number of linear intervals—typically N_f = 2 for biological tissues and N_f = 3 with the inclusion of metal—and $\boldsymbol{\rho}_e\in\mathbb{R}^{N_{\rm vox}}$ is the relative electron density defined as (Schneider et al 1996)

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:red} \rho_e = \frac{\rho N_g}{\rho_{\rm water}N_{g,\mathrm{water}}}, \nonumber \end{align} \tag{ 3 }$

where ρ is the mass density, N_g is the electrons per unit volume, and the subscript indicates associated values for liquid water at standard temperature and pressure. α and β are fitted parameters to form a connected piecewise linear function to x-ray attenuation data such as from ICRP Publication 89 (2002). To ensure vacuum results in no attenuation, the fit is constrained such that $\beta_1=0$ . The thresholding vector $\boldsymbol{f}_l$ is a mask to isolate voxels in the image corresponding to the lth section of the linear fit as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:threshold} f_{l,i} = \left\{\begin{array}{@{}ll@{}} 1 & {{\rm if~}} k_{l-1} \leqslant \rho_{e,i} < k_l \nonumber \\ 0 & {{\rm otherwise}} \end{array}\right. {{\rm ~for~}} l=1,\dots,N_f; {{\rm ~for~}} i=1,\dots,N_{\rm ray}, \nonumber \end{align} \tag{ 4 }$

where k are the 'knee' transition points between linear intervals, with k₀ = 0, and (4) is updated at each iteration. By using the substitution

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:psi} \psi_i(\boldsymbol{\rho}_e,\xi) \equiv b_i(\xi)\exp\left(-[\boldsymbol{\Phi}\boldsymbol{\hat{\mu}}(\boldsymbol{\rho}_e,\xi)]_i\right) {{\rm ~for~}} i=1,...,N_{\rm ray}, \nonumber \end{align} \tag{ 5 }$

the negative log-likelihood (NLL) of the noise model in (1) can then be expressed in terms of the relative electron density as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:nll} \mathrm{NLL}(\boldsymbol{\rho}_e;\boldsymbol{y}) = \sum_{i=1}^{N_{\rm ray}}\sum_{j=1}^{N_\xi}\psi_i(\boldsymbol{\rho}_e,\xi_j)+s_i - y_i\log\left(\sum_{j=1}^{N_\xi}\psi_i(\boldsymbol{\rho}_e,\xi_j)+s_i\right), \nonumber \end{align} \tag{ 6 }$

which is differentiable, and can be minimized through gradient descent with appropriate regularization in an algorithm such as the 'Prox-Polyquant' given in Mason et al (2017c).

One significant discrepancy between the model in (1) and reality however, is that the scatter term $\boldsymbol{s}$ is not independent. Instead it is highly dependent on the attenuation of the specimen $\boldsymbol{\mu}$ , the intensity and spectrum of the source $\boldsymbol{b}$ , the system geometry $\boldsymbol{\Phi}$ , and also the electron density $\boldsymbol{\rho}_e$ as this determines the probability of a scatter event to occur (Jackson and Hawkes 1981). With this, a more statistically meaningful expression for the transmission model is

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:scat-poiss} y_i \sim {\rm Poisson}\left\{\sum_{j=1}^{N_\xi}b_i(\xi_j)\exp\left(-[\boldsymbol{\Phi}\boldsymbol{\mu}(\xi_j)]_i\right)+s_i(\boldsymbol{b},\boldsymbol{\Phi},\boldsymbol{\mu},\boldsymbol{\rho}_e) \right\} {{\rm ~for~}} i=1,...,N_{\rm ray}. \nonumber \end{align} \tag{ 7 }$

Due to this dependency, an appropriate model of scatter is required to perform Polyquant like reconstruction on the associated NLL function. This is what we develop in section 3.1.

2.2. Convolutional scatter estimation

A simple form of an SKS convolutional scatter model is represented in the discrete domain as (Ohnesorge et al 1999)

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:simple_conv} \boldsymbol{s}\{k\}(u,v) = \boldsymbol{p}\{k\}(u,v)*\boldsymbol{g}(u,v) {{\rm ~for~}} k=1,...,N_{\rm proj}, \nonumber \end{align} \tag{ 8 }$

where we use the notation $\{k\}(u, v)$ to indicate the kth projection, with coordinates in the 2D detector plane $(u, v)$ as in figure 1, N_proj is the number of projections, and $*$ represents a 2D convolution across the $(u, v)$ dimensions. With this, the scatter is modelled as some intensity function $\boldsymbol{p}$ filtered by a spatially invariant kernel $\boldsymbol{g}$ . The assumption implied by this spatial invariance is that the scatter paths are through some homogeneous slab media, instead of the true inhomogeneous object, as is illustrated in figure 1.

The scatter kernel function $\boldsymbol{g}$ in (8) often takes the form of the sum of two 2D Gaussian functions (Ohnesorge et al 1999, Sun and Star-Lack 2010) as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle g(u,v) = a_\mathcal{N}\exp\left(-\frac{(u-u_0)^2+(v-v_0)^2}{c_\mathcal{N}^2}\right) + a_\mathcal{B}\exp\left(-\frac{(u-u_0)^2+(v-v_0)^2}{c_\mathcal{B}^2}\right), \nonumber \end{align} \tag{ 9 }$

where $(u_0, v_0)$ are the coordinates at the center of the detection plane, and a and c are the amplitude and width of the two Gaussian functions. We use the subscripts $\mathcal{N}$ and $\mathcal{B}$ to indicate the narrow and broad functions, respectively.

The scatter intensity function $\boldsymbol{p}$ in (8) has some physical meaning, as it is the amount of scattered photons arriving along the initial trajectory of the ray. With reference to figure 1, the derivation of this factor according to Ohnesorge et al (1999) may be made in the continuous domain for a monoenergetic beam and ignoring multiple scattering. To begin with, the photon flux at point U after travelling through the object is expressed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle I_U = I_O\exp\left(-\int_O^U \boldsymbol{\mu}(\ell) \, d\ell \right). \nonumber \end{align} \tag{ 10 }$

At this point, photons will have a probability of scattering at any angle by electrons in the object. We will denote the proportion of photons scattered at an angle of $0^{\circ}$ as $K\mu(U)$ , which is assumed proportional to the attenuation coefficient at that point. In reality, there will be a dependence on the spectrum of the source and the electron density, and we will explore this when developing the PolySKS model in section 3.1. For this to be detected at the point D, with coordinates $(u_0, v_0)$ in this case, it must travel through the rest of the object. The expectation of detected scatter from this single point is therefore

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:scat_prob} I_D = I_UK\mu(U)\exp\left(-\int_U^D \boldsymbol{\mu}(\ell) \, d\ell \right). \nonumber \end{align} \tag{ 11 }$

The total scatter factor is the summation of all the individual scatter factors from O to D, which can be expressed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:for_deriv_mono} p &= \int_O^D I_O\exp\left(-\int_O^{U'} \boldsymbol{\mu}(\ell) \, d\ell \right)K\mu(U')\exp\left(-\int_{U'}^D \boldsymbol{\mu}(\ell) \, d\ell \right) \,dU' \nonumber \\ &= I_O\exp\left(-\int_O^D \boldsymbol{\mu}(\ell) \, d\ell \right) K \int_O^D\mu(\ell) \, d\ell. \nonumber \end{align} \tag{ 12 }$

By now adopting discrete notation, this may be expressed with respect to the system matrix $\boldsymbol{\Phi}$ as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:classic_for} p_i = KI_{\rm in}\exp(-[\boldsymbol{\Phi\mu}]_i)[\boldsymbol{\Phi\mu}]_i {{\rm ~for~}} i=1,...,N_{\rm ray}, \nonumber \end{align} \tag{ 13 }$

where $I_{\rm in} = \sum_{j=1}^{N_\xi}b(\xi)$ is the total incident flux of the source. Now with (13), it is possible to calculate the scatter factor at each iteration according to this monoenergetic model, and we will explore this concept with the Int-fASKS method tested in the experimental section. In practice, however, one often performs the estimation as a preprocessing operation before reconstruction, when one has no knowledge of $\boldsymbol{\mu}$ . By assuming the linearity from a monoenergetic source, an estimate is possible through calculating

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:pre_forward} p_i = Ky_i^{h_1}\left[-\log\left(\frac{y_i}{I_{\rm in}}\right)\right]^{h_2} {{\rm ~for~}} i=1,...,N_{\rm ray}, \nonumber \end{align} \tag{ 14 }$

where the empirical parameters h₁ and h₂ provide a better fit to a real source, and selected to match observation (Ohnesorge et al 1999). They may be fitted by using a non-linear least squares approach such as Newton's method to experimental or simulated kernels. The effect of these parameters may compensate for some of the approximations including: the monoenergetic model, and only single scattering events from the forward scatter model.

Along with the before-mentioned approximations of this model, there are a couple of limitations in calculating forward scatter using (14) instead of (13). Firstly, it is calculated using the physical measurements $\boldsymbol{y}$ , which will themselves be contaminated by the scatter one is attempting to estimate, and will lead to a biased estimate. In an attempt to mitigate this, scatter estimation may be performed iteratively, where $\boldsymbol{y}$ is first corrected for scatter, then used as the basis for the next estimate as in Sun and Star-Lack (2010). This process then acts as a deconvolution, which is known to be non-exact with the presence of noise. This iterative correction also presents a second problem in cases when the scatter expectation is larger that the measurements themselves, which is possible with CBCT (Siewerdsen and Jaffray 2001), and where the $\log(\cdot)$ will be undefined. Applying thresholding to the corrected $\boldsymbol{s}$ to ensure it remains positive as in Sun and Star-Lack (2010), Chang et al (2014) and will allow the calculation of (14), but at the expense of introducing a bias.

Due to the high dimensionality of the CBCT measurements, one can perform the convolution in (8) efficiently through use of the FFT. With this, the order of complexity can be reduced from $\mathcal{O}(N_{\rm proj}N_u^2N_v^2)$ to $\mathcal{O}(N_{\rm proj}N_uN_v\log(N_uN_v))$ for a detector with dimensions $N_u \times N_v$ . The re-expression of (8) employing FFT operations is

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{s}\{k\}(u,v) = \mathcal{F}^{-1}\left[\mathcal{F}\left(\boldsymbol{p}\{k\}(u,v)\right)\odot\mathcal{F}\left(\boldsymbol{g}(u,v)\right)\right] {{\rm ~for~}} i=1,...,N_{\rm proj}, \nonumber \end{align} \tag{ 15 }$

where $\mathcal{F}[\cdot]$ and $\mathcal{F}^{-1}[\cdot]$ are FFT and inverse FFT (IFFT) operations. To avoid aliasing and to allow an efficient FFT decomposition, some degree of zero padding is necessary, which may typically double the effective number of samples.

It was demonstrated in Sun and Star-Lack (2010) that even for homogeneous slabs of water, both the parameters of the forward scatter calculation K, h₁ and h₂, and the width of the Gaussian kernels $c_\mathcal{N}$ and $c_\mathcal{B}$ are dependent on the thickness of the slab for a typical energy spectrum. To account for this and still enable fast convolution, they decomposed the projection into several thickness groups, each with a different set of parameters. On top of this, they also include factor to account for inhomogeneities at edges, and linearly modulate the kernel amplitude with the object thickness to approximate more realistic scatter media. In combination, this is known as the fast adaptive SKS (fASKS) and can be expressed as (Sun and Star-Lack 2010)

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:fasks} \boldsymbol{s}\{k\}(u,v) = &(1-\gamma\boldsymbol{\tau}\{k\}(u,v))\odot\mathcal{F}^{-1}\left[\sum_{j}^{N_\mathcal{R}}\mathcal{F} \left(\boldsymbol{\mathcal{R}}_j\{k\}(u,v)\odot\boldsymbol{c}_j\{k\}(u,v)\odot\boldsymbol{p}_j\{k\}(u,v)\right)\odot\mathcal{F}\left(\boldsymbol{g}_j(u,v)\right)\right] \nonumber \\ &+\gamma\mathcal{F}^{-1}\left[\sum_{j}^{N_\mathcal{R}}\mathcal{F}\left(\boldsymbol{\tau}\{k\} (u,v)\odot\boldsymbol{\mathcal{R}}_j\{k\}(u,v)\odot\boldsymbol{c}_j\{k\}(u,v)\odot\boldsymbol{p}_j\{k\}(u,v)\right)\odot\mathcal{F}\left(\boldsymbol{g}_j(u,v)\right)\right] \nonumber \end{align} \tag{ 16 }$

where $N_\mathcal{R}$ are the number of thickness groups, $\boldsymbol{\mathcal{R}}_j\{k\}(u, v)$ are binary masking functions to isolate the projection elements belonging to the appropriate thickness interval, $\boldsymbol{c}_j\{k\}(u, v)$ are factors to linearly attenuate the scatter kernels close to edges, γ is a scalar constant to weight the strength of the thickness modulation, and $\boldsymbol{\tau}\{k\}(u, v)$ is the water equivalent thickness of the projection given as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{\tau}\{k\}(u,v) = -\frac{1}{m_w}\log\left(\boldsymbol{y}\{k\}(u,v)\oslash\boldsymbol{I}_{\rm in}\{k\}(u,v)\right) {{\rm ~for~}} i=1,...,N_{\rm proj}, \nonumber \end{align} \tag{ 17 }$

where m_w is the mass attenuation coefficient of water at some nominal photon energy, and $\oslash$ denotes the element-wise division. To reduce the bias associated in calculating (14), (16) is iteratively applied to the measurements four times in Sun and Star-Lack (2010).

3. Methodology

3.1. Polyenergetic scatter modelling

3.1.1. Polyenergetic forward scatter derivation

The derivation of the forward scatter factor for SKS for preprocessed estimation in section 2.2 made the assumption that the incident beam was monoenergetic, and was in terms of just the linearized attenuation coefficient μ. However, one can derive this in a more accurate polyenergetic fashion, where the proportion of scattered x-ray at the point D in figure 1 at a given energy ξ is

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle I_D(\xi) = I_U(\xi)K(\xi)\rho_e(U)\exp\left(-\int_U^D \boldsymbol{\mu}(\xi,\ell) \, d\ell \right). \nonumber \end{align} \tag{ 18 }$

There are a couple of differences here from (11). These are the explicit dependence on source energy ξ, and the dependence on the energy independent electron density $\rho_e$ . The reason for this is that the attenuation through scattering events may be expressed in terms of the associated interactive cross sections as (Jackson and Hawkes 1981)

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \mu_{\rm scatter}(\xi) = \rho N_g(\sigma_{\rm incoh}(\xi)+\sigma_{\rm coh}(\xi)), \nonumber \end{align} \tag{ 19 }$

where $\sigma_{\rm incoh}(\xi)$ is incoherent Compton scatter, and $\sigma_{\rm coh}(\xi)$ are coherent scatter events. From the definition of relative electron density in (3) the proportionality to relative electron density can be seen. Furthermore, one can find the scalar constant $K(\xi)$ is related to the physics as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle K(\xi) \propto \frac{\sigma_{\rm incoh}(\xi)+\sigma_{\rm coh}(\xi)}{\rho_{\rm water}N_{g,\mathrm{water}}}, \nonumber \end{align} \tag{ 20 }$

with the constant of proportionality being related to the size and position of the detector element. With this, having the scatter intensity proportional to μ as in (11) can be seen as an approximation and will implicitly include the non-scattering photoelectric interactions.

Similarly to the monoenergetic derivation in (12), the forward scatter factor can be calculated by integrating all the individual scatter factors from O to D, which can be expressed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:for_deriv_poly} p(\xi) &= \int_O^D I_O(\xi)\exp\left(-\int_O^{U'} \boldsymbol{\mu}(\xi,\ell) \, d\ell \right)K(\xi)\rho_e(U')\exp\left(-\int_{U'}^D \boldsymbol{\mu}(\xi,\ell) \, d\ell \right) \,dU' \nonumber \\ &= I_O(\xi)\exp\left(-\int_O^D \boldsymbol{\mu}(\xi,\ell) \, d\ell \right) K(\xi) \int_O^D\boldsymbol{\rho}_e(\ell) \, d\ell, \nonumber \end{align} \tag{ 21 }$

where the combination of the two attenuation integrals can only be made since photons do not change energy with scattering events with no deflection (Davisson and Evans 1952, Jackson and Hawkes 1981). Since in practice one will have a finite detector size, there will be some small change in photon energy from Compton scatter, but the approximation in (21) is still reasonable since the angles will be small. In any case, the energy dependency of the scatter factor $K(\xi)$ is retained to allow any compensation for this, despite the zero angle scattering not having a dependency on energy (Klein and Nishina 1929).

As in (13), one can express this in a discretized setting for all measurements as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:forward_scatter} p_i(\xi) = K(\xi)b_i(\xi)\exp(-[\boldsymbol{\Phi\mu}(\xi)]_i)[\boldsymbol{\Phi\rho}_e]_i {{\rm ~for~}} i=1,...,N_{\rm ray}. \nonumber \end{align} \tag{ 22 }$

A convenient property of (22) when used within Polyquant reconstruction is that the energy dependent attenuation $\boldsymbol{\mu}(\xi)$ and the electron density $\boldsymbol{\rho}_e$ are readily available at each iteration without need for further computation. With this, given that scatter kernels are consistent within a given energy, then the same scatter model will be able to be used for any spectrum, given that is it known.

As with the conventional SKS derivation, (22) does ignore multiple scattering that rejoins the line-of-sight path, and we will analyse in the next section how well this assumption holds.

3.1.2. Polyenergetic water Kernel analysis

To build a convolutional model to utilize the polyenergetic scatter decomposition in (22), we must investigate the nature of kernels from different energy sources. For this, we used the Monte Carlo particle simulation toolbox Gate (Jan et al 2011), to calculate scatter profiles through slabs water of varying thickness but uniform cross sectional size of $80\times 80$ cm. We positioned a slab 50 cm from the planar detector and 100 cm from the point source. Some examples of these kernels are shown in figure 2.

Initial observations from figure 2 are that there is indeed a strong link between the shape of the scatter profile and energy of the photon, and it appears that the width of the kernel decreases with an increase in energy as in figure 2(a). As expected, there is also a strong link between thickness and kernel shape as shown in figure 2(b), though it appears the width is roughly preserved. This suggests that instead of using a multi thickness decomposition for different kernel shapes as in (16), one could instead decompose in terms of energy. This would fit with the polyenergetic forward model derived in (22).

We have also plotted the relationship between the amplitude of the measured scatter kernels and the polyenergetic scatter factor with K = 1 as calculated in (22), which are shown in figure 3.

It is clear from figure 3(a), that our analytic model fits extremely well to the narrow Gaussian, where one can perform fitting by simply finding an appropriate proportionality constant $K(\xi)$ . On the other hand, it does not agree entirely with the broad Gaussian as shown in figure 3(b).

However, by employing the same adjustment parameters $h_1, h_2$ as in classical scatter factor calculation in (14) (Ohnesorge et al 1999), we are able to obtain a very good linear fit also, which is shown in figure 3(b). This then takes the form

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle p_{\mathcal{B},i}(\xi) = K(\xi)b_i(\xi)\exp(-[\boldsymbol{\Phi\mu}(\xi)]_i)^{h_1}[\boldsymbol{\Phi\rho}_e]_i^{h_2} {{\rm ~for~}} i=1,...,N_{\rm ray}. \nonumber \end{align} \tag{ 23 }$

Due to properties of exponentials, this may be rewritten as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle p_{\mathcal{B},i}(\xi) = K(\xi)b_i(\xi)\exp(-h_1[\boldsymbol{\Phi\mu}(\xi)]_i+h_2\log[\boldsymbol{\Phi\rho}_e]_i) {{\rm ~for~}} i=1,...,N_{\rm ray}, \nonumber \end{align} \tag{ 24 }$

where the parameters become scaling factors, lending itself to faster computational implementation. The fact that our derived single scatter forward model fits so well to the narrow Gaussian amplitude may be indicative that these result from single scatter events, whilst the broad signal is attributed to multiple scatters. However, through analysing the scattering order from our simulations, we have found this not to hold entirely, with single scatter also contributing significantly to the broad peak.

With these and other observations, we note that double Gaussian kernels from various energies and thickness of slab have the following properties:

The double Gaussian decomposition fits well for each energy and water thickness.
The narrow field has a width that is dependent on the energy of the incident photons, and an amplitude that is directly proportional to the polyenergetic forward scatter factor in (22).
The broad field has a width which is independent of photon energy, and is proportional to the polyenergetic forward scatter factor in (22) with appropriate fitting parameters $h_1, h_2$ .
With the above factors taken into account, there appears no additional dependence on thickness, suggesting the polyenergetic decomposition takes account for the change in shape observed in Sun and Star-Lack (2010).

These observations are key to the implementation details of our scatter model given in the next section.

3.1.3. Basic polyenergetic convolutional model

Using the polyenergetic forward model in (22) and separate narrow and broad components, a basic form of double Gaussian polyenergetic scatter kernel accurate for a static slab may be written as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{s}\{k\}(u,v) = \sum_{j=1}^{N_\xi}\boldsymbol{p}_\mathcal{N}\{k\}(u,v,\xi_j)*\boldsymbol{g}_\mathcal{N}(u,v,\xi_j) \nonumber \end{align} \tag{ 25 }$

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle +\boldsymbol{p}_\mathcal{B}\{k\}(u,v,\xi_j)*\boldsymbol{g}_\mathcal{B}(u,v,\xi_j). \nonumber \end{align} \tag{ 26 }$

Since we have found that the broad field is roughly independent of energy, this may be simplified to

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{s}_\mathcal{B}\{k\}(u,v) = \left(\sum_{j=1}^{N_\xi}\boldsymbol{p}_\mathcal{B}\{k\}(u,v,\xi_j)\right)*\boldsymbol{g}_\mathcal{B}(u,v), \nonumber \end{align} \tag{ 27 }$

which requires only a single convolution.

As with the preprocessed SKS, the convolution operations may be replaced by point-wise multiplication in the spatial frequency domain, whereby the calculation of scatter may be made as follows

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{s}\{k\}(u,v) = \mathcal{F}^{-1}\left[\boldsymbol{S}_\mathcal{N}\{k\}(\omega_u,\omega_v) + \boldsymbol{S}_\mathcal{B}\{k\}(\omega_u,\omega_v)\right] {{\rm ~for~}} i=1,...,N_{\rm proj}, \nonumber \end{align} \tag{ 28 }$

where $(\omega_u, \omega_v)$ are the spatial frequencies along the dimensions of the detector plane $(u, v)$ , and

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ4:narrow_scat_fast} \boldsymbol{S}_\mathcal{N}\{k\}(\omega_u,\omega_v) = \sum_{j=1}^{N_\xi}\mathcal{F}\left[\boldsymbol{p}_\mathcal{N}\{k\}(u,v,\xi_j)\right] \odot\mathcal{F}\left[\boldsymbol{g}_\mathcal{N}(u,v,\xi_j)\right], \nonumber \end{align} \tag{ 29 }$

and

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ4:broad_scat_fast} \boldsymbol{S}_\mathcal{B}\{k\}(\omega_u,\omega_v) = \mathcal{F}\left[\sum_{j=1}^{N_\xi}\boldsymbol{p}_\mathcal{B}\{k\} (u,v,\xi_j)\right]\odot\mathcal{F}\left[\boldsymbol{g}_\mathcal{B}(u,v)\right]. \nonumber \end{align} \tag{ 30 }$

We note that with this, the computational cost of a full scatter estimate involves N_proj IFFT operations and $N_{\rm proj}(N_\xi+1)$ FFT operations, since the Fourier transform of the Gaussian kernels need not be explicitly computed.

3.2. Accounting for detector distance

We already highlighted the fact that the forward scatter scalar constant $K(\xi)$ will in practice depend on the distance to the detector due to a finite detector size. On top of this, the width of the kernels will also vary as the distance to detector is varied, due to a magnification effect. In traditional preprocessing scatter calculation such as fASKS, these factors cannot be easily accounted for, since one does not know the position of the object. In our proposed integrated model, however, this information is available from the current iterate and may be exploited to further enhance the estimation accuracy.

With reference to figure 4, we can derive these correction factors based on the relative location of the object's center of mass. This makes the assumption that the mean scatter originates from the center of mass, which implies independence on the position of the source. As the object moves closer to the detector, we expect a decrease in Gaussian width, and an increase in amplitude due to the increase in area covered by the central pixel. By introducing a magnification factor ζ, defined as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \zeta = \frac{\ell_{OD}-\ell_s}{\ell_{OD}}, \nonumber \end{align} \tag{ 31 }$

one would expect the width to change by a factor of ζ, and the amplitude by a factor of $1/\zeta^2$ . In practice, this shift distance $\newcommand{\e}{{\rm e}} \ell_s$ will vary depending on the projection angle due to the change in scatter direction relative to the fixed coordinate system of the specimen. This is compensated for by rotating the coordinate system with

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \ell_s = x_0 \cos\theta - y_0 \sin\theta, \nonumber \end{align} \tag{ 32 }$

where $(x_0, y_0)$ are the coordinates at the center of the specimen and θ is the angle of the gantry.

**Figure 4.** Scatter shift diagram illustrating magnification effect assuming the scattered photons originate from the center of the specimen.
Download figure:
Standard image High-resolution image

To test if this model holds, we evaluated the change in scatter width and forward scatter factors for the narrow and broad components. The experiment consisted of the Monte Carlo simulation of a 10 cm slab of water shifted axially from its original position 50 cm away from the detector. We have plotted the resulting factors in figure 5 for a 60 keV source, and have found the same trend holds throughout other energies.

It can be observed from figure 5 that the amplitudes of both a narrow and broad Gaussian scatter components indeed fit $1/\zeta^2$ very well. The narrow Gaussian width factor also follows the expected value very well, though the broad factor appears to have a weaker dependence on shifts in the object. We note, however, that it empirically fits well to a modified factor of $\sqrt{\zeta}$ , which we adopt in practice.

3.3. Compensation for object edges

A critical effect from inhomogeneous objects is the reduction of the scatter magnitude towards edges, as observed in Maltz et al (2008), Sun and Star-Lack (2010) and Rührnschopf and Klingenbeck (2011). This occurs since photons scattered away from an edge into a lower attenuating material such as air are far less likely to undergo further scattering events. Using a slab model such as SKS at the edges leads to an overestimation of scatter behind the thicker parts of the object, which will typically have the highest scatter to primary ratio. The consequences of this are not only that the attenuation through the line-of-sight measurements will be overestimated, but photon starvation and streak artifacts are likely to occur in the reconstruction (Chang et al 2014) after its compensation.

In the fASKS work of Sun and Star-Lack (2010), the authors observe an exponential drop in kernel magnitude towards an edge, which they approximate with the linear weighting factor $\boldsymbol{c}\{k\}(u, v)$ that can be applied between interfaces of thickness groups.

We employ a similar mechanism for use in our polyenergetic scatter model that instead produces a continuous modification of the kernel amplitude. From simulations on water slabs and elliptical tubes, we have found that the reduction in kernel amplitude is only significant on the broad Gaussian, which takes the following exponential form

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:edge_factor} \tilde{\boldsymbol{p}}_\mathcal{B} = \boldsymbol{p}_\mathcal{B}\exp\left(-\frac{\boldsymbol{t}_u^2+\boldsymbol{t}_v^2} {c_\mathcal{B}^2}\right), \nonumber \end{align} \tag{ 33 }$

where $\boldsymbol{t}_u$ and $\boldsymbol{t}_v$ are continuous edge weights in the detector plane coordinates, and $\tilde{\boldsymbol{p}}_\mathcal{B}$ is the compensated broad forward scatter factor.

In figure 6, we have shown the least squares fitting for $\boldsymbol{t}_u$ according to the model in (33). The data was generated from Monte Carlo simulations of an elliptical tube with semi-major axis of 15 cm semi-minor axis of 7.5 cm and length of 80 cm, where in each run the object was laterally shifted relative to the detector plane axis u. We are interested in the elliptical tube object since it crudely approximates the shape of a human body lying on a couch. With this, we also plotted the factor $\boldsymbol{c}\{k\}(u, v)$ as given in Sun and Star-Lack (2010), though within the exponential model of (33).

From the nature of the kernel factors as observed in figure 6, the increasing kernel attenuation factor towards the edge of the elliptical object can be seen. While this trend is somewhat caught by the linear edge factor, $\boldsymbol{c}\{k\}(u, v)$ from Sun and Star-Lack (2010), it does not capture the right functional form. Other observations from figure 6 are that there is not a strong dependence on the incident energy, that there is a very small deviation from the slab model at the center of the object, and the trend in $\boldsymbol{t}_u$ is approximately linear.

Motivated by this linear trend, we opt to use the following edge weighting function

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:shift_factor} \boldsymbol{t}_u\{k\}(u,v) = k_{\rm edge}\boldsymbol{\tau}_e\{k\}(u,v)\frac{\partial\boldsymbol{\tau}_e\{k\}(u,v)}{\partial u}, \nonumber \end{align} \tag{ 34 }$

where $\boldsymbol{\tau}_e$ is the electron density projection as $\boldsymbol{\tau}_e = \boldsymbol{\Phi}\boldsymbol{\rho}_e$ , and $\frac{\partial\boldsymbol{\tau}_e\{k\}(u, v)}{\partial u}$ is the spatial derivative of the electron density projection that can be calculated in practice as a discrete gradient. The factor k_edge is a scalar parameter to set the strength of the edge compensation effect. For isotropy, the factor $\boldsymbol{t}_v\{k\}(u, v)$ is calculated in the same way but with the derivative with respect to $v$ .

It is easy to show that for a simple ellipse model, the scatter estimate in (34) gives the correct linear trend observed in figure 6. Figure 7 shows an ellipse with major and minor semi axes lengths of a and b on x and y respectively. The thickness at a given point along x is denoted by $\tau(x)$ . It can be shown that the variation in thickness with respect to x is expressed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \frac{d \tau(x)}{d x} = -\frac{b^2x}{a^2\tau(x)}. \nonumber \end{align} \tag{ 35 }$

**Figure 7.** Diagram of ellipse for edge factor derivation.
Download figure:
Standard image High-resolution image

Using the form of (34) in this case becomes

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle k_{\rm edge}\tau(x)\frac{d \tau(x)}{d x} \propto \frac{b^2}{a^2}x, \nonumber \end{align} \tag{ 36 }$

which is now linear with the distance towards the edge, and dependent on the squared ratio between height and width of the ellipse. This second feature is also consistent with the experimental observation in Sun and Star-Lack (2010), where they find an increase in edge effect with thicker objects.

To provide robustness of this method to rapidly varying projections, we found smoothing the electron density thickness $\boldsymbol{\tau}_e\{k\}(u, v)$ is beneficial. In the experimental sections, we employ a 2D Gaussian filter with standard deviation 1.5 cm. This smoothing also allows the edge compensation to be active towards sharp edges.

3.4. Complete polyenergetic scatter estimation model and fitting

Combining the various elements, we note our complete scatter model—PolySKS—may be formed using the factors

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:narrow_for} \boldsymbol{S}_\mathcal{N}\{k\}(\omega_u,\omega_v) = \sum_{j=1}^{N_\xi}\mathcal{F}\left[\frac{1}{\zeta^2}\boldsymbol{p}_\mathcal{N}\{k\}(u,v,\xi_j)\right] \odot\mathcal{F}\left[\exp\left(-\frac{\boldsymbol{u}^2+\boldsymbol{v}^2}{\zeta^2c_\mathcal{N}^2(\xi_j)}\right)\right], \nonumber \end{align} \tag{ 37 }$

and

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:broad_for} \boldsymbol{S}_\mathcal{B}\{k\}(\omega_u,\omega_v) = &\mathcal{F}\left[\frac{1}{\zeta^2}\sum_{j=1}^{N_\xi}\boldsymbol{p}_\mathcal{B}\{k\}(u,v,\xi_j)\exp\left(-\frac{\boldsymbol{t}_u^2+\boldsymbol{t}_v^2}{c_\mathcal{B}^2}\right)\right] \nonumber \\ &\odot\mathcal{F}\left[\exp\left(-\frac{\boldsymbol{u}^2+\boldsymbol{v}^2}{\zeta c_\mathcal{B}^2}\right)\right] \nonumber \end{align} \tag{ 38 }$

and as before the estimate is then

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{s}\{k\}(u,v) = \mathcal{F}^{-1}\left[\boldsymbol{S}_\mathcal{N}\{k\}(\omega_u,\omega_v) + \boldsymbol{S}_\mathcal{B}\{k\}(\omega_u,\omega_v)\right] {{\rm ~for~}} i=1,...,N_{\rm proj},. \nonumber \end{align} \tag{ 39 }$

In calculating (37), the exact form of forward scatter factor used is

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ:fast_scat_fac} \boldsymbol{p}_\mathcal{N}\{k\}(u,v,\xi_j) = K_\mathcal{N}(\xi_j)\boldsymbol{b}\{k\}(u,v,\xi_j)\odot \exp(-[\boldsymbol{\Phi\mu}(\xi_j)]\{k\})\odot[\boldsymbol{\Phi\rho}_e]\{k\}, \nonumber \end{align} \tag{ 40 }$

whilst the broad component of (38) utilizes the fitting parameters as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{p}_\mathcal{B}\{k\}(u,v,\xi_j) = K_\mathcal{B}(\xi_j)\boldsymbol{b}\{k\}(u,v,\xi_j)\odot\exp(-h_1(\xi_j)[\boldsymbol{\Phi\mu}(\xi_j)]\{k\}+h_2(\xi_j)\log([\boldsymbol{\Phi\rho}_e]\{k\})). \nonumber \end{align} \tag{ 41 }$

In order to use this model, one requires to find the energy dependent parameters c_N, $K_\mathcal{N}$ , $K_\mathcal{B}$ , h₁ and h₂, as well as the energy independent parameters c_B and k_shift from (34) to calculate $\boldsymbol{t}_u$ and $\boldsymbol{t}_v$ . Apart from the edge factor parameters, these were fitted to data from simulating scatter through water slabs having a thickness range of 1 cm to 40 cm with 1 cm increment, and for a number of different photon energies. In each case, we used a detector response function derived from the energy absorption coefficient of Cesium Iodine (CsI) with a thickness of 0.6 mm to replicate a photo absorption scintillator, and a detector of $40\times 40$ cm with resolution $256\times 256$ . We then minimized the nonlinear least squares error against this data to our model using Newton's method, assuming a fixed broad width of 35 cm. These parameters are shown in table 1.

Table 1. PolySKS parameters from water kernel fitting used throughout the experimental sections.

${{\rm Energy~(keV)}}$	$K_\mathcal{N}$	$c_\mathcal{N}~{{\rm (cm)}}$	$K_\mathcal{B}$	h₁	h₂	$c_\mathcal{B}~{{\rm (cm)}}$	$k_{\rm edge} {{\rm (\,full~fan)}}$	$k_{\rm edge} {{\rm (half~fan)}}$
40	$1.38\times 10^{-6}$	6.91	$3.51\times 10^{-7}$	0.876	1.10	35	2.35	1.57
60	$1.40\times 10^{-6}$	4.77	$3.16\times 10^{-7}$	0.828	1.15	—	—	—
80	$1.39\times 10^{-6}$	3.62	$2.97\times 10^{-7}$	0.804	1.20	—	—	—
100	$1.40\times 10^{-6}$	2.90	$2.79\times 10^{-7}$	0.791	1.25	—	—	—
120	$1.40\times 10^{-6}$	2.44	$2.66\times 10^{-7}$	0.785	1.29	—	—	—

From table 1, one can see typical parameters from our fitting. It is interesting to see that although we included the energy dependence on $K_\mathcal{N}$ to account for finite detector effects, there is almost no dependence from our fitting, which agrees with theory from Klein and Nishina (1929). Also interesting is the significant deviation of the narrow width and broad forward model parameters throughout energy. A critical factor to consider is that the scatter scaling factors $K_\mathcal{N}$ and $K_\mathcal{B}$ will be dependent on the area of a detector element, and they should be appropriately scaled against our detector element area of 0.0244 cm² for other resolutions.

For the perturbation factor k_edge, we calculated the average factor across energies for the elliptical tube experiment presented in figure 6 according to the model in (34). When we repeated this for an offset detector, we found a reduction of around 33%. This may be due to wider cone angles of the geometry having less of an effect from edges perpendicular to the detector. For the other parameters, however, we found they hold in both full fan or half fan acquisitions.

4. Incorporating the polyenergetic scatter model in Polyquant

Although the polyenergetic model as derived in section 3 is a general model for scatter that may be used in various algorithms or applications, its exact formulation works particularly well in the Polyquant framework, Polyquant–PolySKS. To derive an algorithm, we start with the gradient term of the negative log-likelihood NLL function with respect to the electron density, which has the parameter

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ4:polysks_diff} \boldsymbol{d}(\boldsymbol{\rho}_e) = \boldsymbol{y}\oslash\left(\sum_{j=1}^{N_\xi}\boldsymbol{\psi} (\boldsymbol{\rho}_e,\xi_j)+\boldsymbol{s}(\boldsymbol{\rho}_e)\right) - \mathbf{1}, \nonumber \end{align} \tag{ 42 }$

where $\boldsymbol{s}(\boldsymbol{\rho}_e)$ is the updated scatter estimate based on the current iterate using (40), and $\boldsymbol{\psi}(\boldsymbol{\rho}_e, \xi_j)$ is defined in (5). This is then used to calculate the derivative as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \label{equ4:polysks_deriv} \frac{\partial\mathrm{NLL}(\boldsymbol{\rho}_e;\boldsymbol{y})}{\partial\boldsymbol{\rho}_e} \approx \sum_{i=1}^{N_f}\boldsymbol{f}_i(\boldsymbol{\rho}_e) \odot\boldsymbol{\Phi}^T \left[\sum_{j=1}^{N_\xi}\alpha_i(\xi_j)\boldsymbol{\psi}(\boldsymbol{\rho}_e,\xi_j) \odot\boldsymbol{d}(\boldsymbol{\rho}_e)\right], \nonumber \end{align} \tag{ 43 }$

in the same manner as the regular Polyquant method (Mason et al 2017c). We note a subtle difference is the dependence of the scatter term with $\boldsymbol{\rho}_e$ . With this, the exact formulation of the gradient would include a term from the derivative of the scatter w.r.t. the current estimate, which for convolutional models would be an extra filtering term. We have not included this in (43), since we have found it only has a weak dependence on the NLL and hence can be ignored. For these reasons, we have ignored this dependence and have symbolized this non-exactness by the ' ≈ ' sign.

Calculating the term $\boldsymbol{s}(\boldsymbol{\rho}_e)$ requires from (22) the terms $\boldsymbol{\Phi \mu}(\xi)$ for each energy and $\boldsymbol{\Phi \rho}_e$ . For the first term, we note from the Polyquant formulation in (2) we have

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{\Phi} \hat{\boldsymbol{\mu}}(\xi_j) = \sum_{i=1}^{N_f}\alpha_i(\xi_j)\boldsymbol{\Phi}[\boldsymbol{\,f}_i(\boldsymbol{\rho}_e) \odot\boldsymbol{\rho}_e]+\beta_i(\xi_j)\boldsymbol{\Phi}\boldsymbol{f}_i(\boldsymbol{\rho}_e), \nonumber \end{align} \tag{ 44 }$

where the expensive factors to compute $\boldsymbol{\Phi}[\boldsymbol{\,f}_i(\boldsymbol{\rho}_e)\odot\boldsymbol{\rho}_e]$ and $\boldsymbol{f}_i(\boldsymbol{\rho}_e)$ are already available at each iteration. Additionally, due to the properties of the threshold functions given in (4), we can find that

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{\Phi}\boldsymbol{\rho}_e = \sum_{i=1}^{N_f}\boldsymbol{\Phi}[\boldsymbol{f}_i(\boldsymbol{\rho}_e)\odot\boldsymbol{\rho}_e], \nonumber \end{align} \tag{ 45 }$

which also only requires projections that are already available at each iteration anyway.

4.1. Fast scatter fused reconstruction algorithm

From the new formulation with the fusion of a convolutional scatter model in (43) with (42), Polyquant–PolySKS, can be treated with the same Prox-Polyquant algorithm as given in Mason et al (2017c). In order to increase the efficiency of the algorithm—important for large CBCT data sets—we consider here an accelerated variant of the original Polyquant. For this, we employ the concept of combining 'ordered subsets' with momentum as in Kim et al (2015) and Wang et al (2015), given in algorithm 1. We highlight that this is essentially an implementation of FISTA (Beck and Teboulle 2009) applied to our objective function, though that each calculation of a gradient operates on some subset $\mathcal{S}^k$ of the full NLL.

Algorithm 1. Fast-Polyquant.

Initialization: Subset selection scheme for $\mathcal{S}^k$ $\mathcal{S}^k$ , regularization constant λ, Lipschitz approximation L₀, number of iterations N_iter, number of subset divisions N_split, and starting point $\boldsymbol{\rho}_e^1$ $\boldsymbol{\rho}_e^1$ .

$\boldsymbol{\rho}_e^1 \leftarrow \mathbf{1}$ $\boldsymbol{\rho}_e^1 \leftarrow \mathbf{1}$

$\boldsymbol{z}^0 \leftarrow \boldsymbol{\rho}_e^1$ $\boldsymbol{z}^0 \leftarrow \boldsymbol{\rho}_e^1$

$\delta \leftarrow 1.9N_{\rm split}/L_0$ $\delta \leftarrow 1.9N_{\rm split}/L_0$

for $k = 1, 2, \ldots, N_{\rm iter}$ $k = 1, 2, \ldots, N_{\rm iter}$ do

$\boldsymbol{z}^{k} \leftarrow \mathbf{prox}_{\delta\lambda R}\left(\boldsymbol{\rho}_e^{k}-\delta\frac{\partial\mathrm{NLL}(\boldsymbol{\rho}_e^{k};\boldsymbol{y}\{\mathcal{S}^k\}, \mathcal{S}^k)}{\partial\boldsymbol{\rho}_e}\right)$ $\boldsymbol{z}^{k} \leftarrow \mathbf{prox}_{\delta\lambda R}\left(\boldsymbol{\rho}_e^{k}-\delta\frac{\partial\mathrm{NLL}(\boldsymbol{\rho}_e^{k};\boldsymbol{y}\{\mathcal{S}^k\}, \mathcal{S}^k)}{\partial\boldsymbol{\rho}_e}\right)$

$t^{k+1} \leftarrow \frac{1+\sqrt{1+4(t^k){}^2}}{2}$ $t^{k+1} \leftarrow \frac{1+\sqrt{1+4(t^k){}^2}}{2}$

$\boldsymbol{\rho}_e^{k+1} \leftarrow \boldsymbol{z}^{k}+\left(\frac{t^k-1}{t^{k+1}}\right)(\boldsymbol{z}^{k}-\boldsymbol{z}^{k-1})$ $\boldsymbol{\rho}_e^{k+1} \leftarrow \boldsymbol{z}^{k}+\left(\frac{t^k-1}{t^{k+1}}\right)(\boldsymbol{z}^{k}-\boldsymbol{z}^{k-1})$

end for

return $\boldsymbol{\rho}_e^{N_{\rm iter}}$ $\boldsymbol{\rho}_e^{N_{\rm iter}}$

In algorithm 1, each iteration uses a gradient calculated on only a subset of the NLL, defined as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \mathrm{NLL}(\boldsymbol{x}^{k};\boldsymbol{y},\mathcal{S}^k) = \sum_{i\in\mathcal{S}^k}\sum_{j=1}^{N_\xi}\psi_i(\boldsymbol{x},\xi_j)+s_i - y_i\log\left(\sum_{j=1}^{N_\xi}\psi_i(\boldsymbol{x},\xi_j)+s_i\right), \nonumber \end{align} \tag{ 46 }$

given some subset $\mathcal{S}^k\subset[1, \dots, N_{\rm ray}]$ , where the projection indices array is defined as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \boldsymbol{\iota}^k = [l^k:N_{\rm split}:(N_{\rm proj}-N_{\rm split}+l^k)], \nonumber \end{align} \tag{ 47 }$

where N_split is the number of subset divisions, and $\boldsymbol{\iota}$ is a subset index array that selects a subset as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle \hat{\boldsymbol{y}}_{j} = [\boldsymbol{y}\{\boldsymbol{\iota}^k\}]_j {{\rm ~for~}} j = 1,\dots,\frac{N_{\rm proj}}{N_{\rm split}}, \nonumber \end{align} \tag{ 48 }$

with $\hat{\boldsymbol{y}}$ as the subset of the full set of measurements $\boldsymbol{y}$ . A simple consecutive ordered selection of the subsets can be performed as $l^k = (k {\rm mod} N_{\rm split})+1$ . Selection of l^k has been shown to have impact of the robustness of these types of approaches (Kim et al 2015), for which we have adopted the 'bit reversal ordering' of Herman and Meyer (1993). This has the effect of reducing the correlation and error accumulation between consecutive iterations (Kim et al 2015). The number of iterations N_iter is set to a multiple of the subset divisions such that $N_{\rm iter} = N_{\rm epoch}N_{\rm split}$ , where N_epoch is the number of effective data passes (epochs), which is selected to allow for empirical convergence.

The term $\mathbf{prox}_{\delta R}$ is the proximal operator defined as

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\argmin}{\mathop{\rm argmin}\limits} \displaystyle \label{equ:prox} \mathbf{prox}_{\delta R}(\boldsymbol{w}) = \argmin_{\boldsymbol{\rho}_e\in\mathcal{C}} \frac{1}{2} \|\boldsymbol{w}-\boldsymbol{\rho}_e\|_2^2 + \delta\lambda R(\boldsymbol{\rho}_e), \nonumber \end{align} \tag{ 49 }$

where $\boldsymbol{w}$ is a temporary input, $R(\cdot)$ is a convex regularization function, λ is the regularization constant, and $\mathcal{C}$ is a constraint set on electron density. In practice, we use a box constraint so that $0\leqslant \rho_{e, i}\leqslant \zeta {{\rm ~for~}} i=1, ..., N_{\rm vox.}$ , where ζ is the maximum allowable density value (possibly $\zeta=\infty$ ), and the constraint set ensures non-negative density values. Without loss of generality we opt for TV regularization for $R(\cdot)$ . By using TV, which promotes piecewise smoothness—something expected with homogeneous slabs of biological tissue—the term (49) may be solved as outlined in Beck and Teboulle (2009) for example.

As with the original 'Prox-Polyquant' algorithm in Mason et al (2017c), we use an estimate of the Lipschitz constant L₀, computed as

$\begin{align} \newcommand{\e}{{\rm e}} \displaystyle L_0 = \left\|\boldsymbol{\Phi}^T\left[\left(\sum_{j=1}^{N_\xi}\alpha_1^2(\xi_j)\boldsymbol{b} (\xi_j)\right)\odot\boldsymbol{\Phi1}\right]\right\|_\infty, \nonumber \end{align} \tag{ 50 }$

where $\|\cdot\|_\infty$ is the infinity norm that selects the maximum of the diagonal of the Hessian at point $\boldsymbol{0}$ .

Kim et al (2015) also introduced a second variant of acceleration using the algorithm of Nesterov (2005) that provided better performance in Wang et al (2015). However, we found the FISTA variant performed better with our model and with TV regularization.

5. Evaluation from simulation

In the first experiment, we evaluate the accuracy of our proposed integrated scatter model, Polyquant–PolySKS, with comparison to precomputed scatter. To synthesize the data, we performed a Monte Carlo simulation of a CBCT system using the software gate (Jan et al 2011) according to the method outlined in Mason et al (2017a).

We generated two acquisitions of the female adult reference computational phantom (ARCP) (ICRP Publication 110 2009) from the pelvis and head region, using half and full fan scans respectively. In both cases, we used a planar detector of $40\times 30$ cm, placed 150 cm from the source and 50 cm from the center of rotation. For the half fan pelvis case, we offset the detector by 16 cm and used a shaped source to replicate the effect of a bow-tie filter. We also used a appropriate source profile for the full fan head case. For both scans, we simulated a total of $1\times 10^{11}$ photons over 160 projections and with a detector resolution of $256\times 128$ . The image resolutions were those of the original ARCP at $299\times 137$ and 60 slices of 0.484 cm thickness. We do not include any scatter grid in our simulation.

All methods under test use the Polyquant reconstruction technique (Mason et al 2017c), with identical TV regularization, allowing a fair comparison between scatter estimation approaches. We will thereby refer to the reconstruction method Polyquant–PolySKS only by the scatter estimation method PolySKS. The spectra used were equivalent to 100 kVp and 120 kVp tube potentials for the head and pelvis cases respectively, which we equally sampled into 21 energies. The spectra were generated using the SpekCalc software (Poludniowski et al 2009b) for a tungsten anode at $30^{\circ}$ . The system operators $\boldsymbol{\Phi}$ and $\boldsymbol{\Phi}^T$ were implemented as separable footprint projections (Long et al 2010) within the Michigan image reconstruction toolbox (Fessler 2018) in Matlab. The methods we will use for comparison are:

Scatter free: we have entirely removed all scatter from the numerical data, which represents the gold standard of any scatter mitigation method.
No correction: scatter is included in the measurements, but no attempt is made to correct it, to show the extent of scatter induced artifacts and represents the 'worst case'.
Pre-fASKS: Polyquant with pre-calculated fASKS calculated according to (16), applied iteratively to the measurements ten times. For fairness, we refit all parameters to the same simulated water slabs as were used to inform the PolySKS parameters in table 1. For the perturbation factors, we used the same edge factor as presented in the original publication (Sun and Star-Lack 2010), and used $\gamma=0.04$ and $\gamma=0.03$ for the head and pelvis cases respectively, selected to give the best fitting against our elliptical shift data used to fit k_edge.
Int-fASKS: an identical scatter model and parameters as in Pre-fASKS, but performed at each iteration using the forward scatter factor (13) in place of its approximation in (14). This does not exploit either the spectral information nor the electron density.
PolySKS: our proposed polyenergetic integrated scatter model as detailed in section 3.1 to give Polyquant–PolySKS, according to parameters given in table 1.

Additionally, we provide a comparison to a hardware based estimation approach using a beam stop array which is detailed in section 5.1. For each method, we performed ten epochs in the head case and 20 epochs for the pelvis. It seemed the lower amount of attenuating material in the head case meant these methods reach convergence significantly faster.

To evaluate the performance of the methods, we have shown reconstructed images in figures 8 and 9. We also quantified the accuracy through the root-mean-squared-error (RMSE) through slices 18–42—corresponding to slices in the cone angle—and these are tabulated in table 2. The various slices selected for display were those exhibiting significant structure or artifacts. The relative quantitative performance of the methods under test was found to be consistent throughout all well sampled slices.

**Figure 8.** Visual results of slice 36 from the head CBCT data with display window $[0.5, 1.4]$ to aid visualization of soft tissue and reconstruction artifacts; absolute errors have display window $[0, 0.3]$ . (a) Scatter free. (b) No correction. (c) Pre-fASKS. (d) Int-fASKS. (e) PolySKS. (f) Scatter free error. (g) No estimate error. (h) Pre-fASKS error. (i) Int-fASKS. (j) PolySKS error.
Download figure:
Standard image High-resolution image

**Figure 8.** Visual results of slice 36 from the head CBCT data with display window $[0.5, 1.4]$ to aid visualization of soft tissue and reconstruction artifacts; absolute errors have display window $[0, 0.3]$ . (a) Scatter free. (b) No correction. (c) Pre-fASKS. (d) Int-fASKS. (e) PolySKS. (f) Scatter free error. (g) No estimate error. (h) Pre-fASKS error. (i) Int-fASKS. (j) PolySKS error.
Download figure:
Standard image High-resolution image

**Figure 9.** Visual results of slices 43 and 30 from the pelvis CBCT data with display window $[0.5, 1.4]$ to aid visualization of soft tissue and reconstruction artifacts; absolute errors have display window $[0, 0.3]$ . (a) Scatter free, slice 43. (b) Scatter free, slice 30. (c) Scatter free error, slice 30. (d) No correction, slice 43. (e) No correction, slice 30. (f) No correction error, slice 30. (g) Pre-fASKS, slice 43. (h) Pre-fASKS, slice 30. (i) Pre-fASKS error, slice 30. (j) Int-fASKS, slice 43. (k) Int-fASKS, slice 30. (l) Int-fASKS error, slice 30. (m) PolySKS, slice 43. (n) PolySKS, slice 30. (o) PolySKS error, slice 30.
Download figure:
Standard image High-resolution image

Table 2. Quantitative results: relative electron density ( $\rho_e$ ) RMSE of slices 18–42.

	Polyquant			FDK
Scheme	Head RMSE	Pelvis RMSE	Run time^a (s)	Head RMSE	Pelvis RMSE	Run time (s)
${{\rm Scatter~free}}$	0.0112	0.0322	150	0.0631	0.0683	1.2
${{\rm No~estimate}}$	0.0413	0.122	150	0.0707	0.102	1.2
${{\rm BSA}}$	0.0165	0.0430	150	0.0639	0.0695	1.5
${{\rm Pre\mbox{-}fASKS}}$	0.0258	0.0479	170	0.0686	0.0906	21
${{\rm Int\mbox{-}fASKS}}$	0.0183	0.0435	200	—	—	—
${{\rm PolySKS}}$	$\mathbf{0.0152}$	$\mathbf{0.0398}$	240	—	—	—

^aPolyquant reconstruction times given for head case.

For the scatter contaminated head phantom reconstruction, shown illustratively in figures 8(b) and (g), the scatter leads to visible shading artifacts, especially around the edges and close to the bones, as well as a significant underestimation in the bony structures. Of the scatter correction methods, both the integrated scatter estimation approaches Int-fASKS and PolySKS perform similarly, whilst the Pre-fASKS still exhibits shading artifacts and has a higher level of noise. Quantitatively, from the second column in table 2, this trend is replicated, with all correction methods considerably closing the gap between 'scatter free' and 'no estimate', with the PolySKS giving the highest accuracy. The difference between the fASKS methods is especially interesting as it implies one must incur some loss from the deconvolution and using (14) in place of (13).

The pelvis reconstructions are then shown in figure 9. Here, scatter is much more of a critical factor, with the 'no estimate' in figures 9(d)–(f) showing huge errors throughout the image. In this case, both Pre-fASKS and Int-fASKS exhibit strong shading artifacts also, especially in the region between the bones. Although PolySKS does have a number of overestimated regions in the fat regions visible in the error plot in figure 9(o), it is generally a good match to the 'scatter free' reconstruction. Quantitatively from table 2, the same trend as the head case can be seen, with the Int-fASKS giving roughly a 9% drop in error over Pre-fASKS, and PolySKS giving another 9% reduction over that.

For computational comparison, we have indicated the run times of the various approaches in table 2 for the same hardware. In this case, all methods are run in Matlab on a commodity PC with an 8 core Intel Xeon E5-1660 v3 processor running at 3 GHz with 32 GB of RAM. To show the cost of the iterative reconstruction, we have also included times for the Feldkamp–Davis–Kress (FDK) method (Feldkamp et al 1984), for the preprocessing scatter estimation approaches. Although the cost of FDK is considerably lower than Polyquant, there is also a vast loss in accuracy. Between the Polyquant approaches, all are at the same order of computational time, but PolySKS does have a 60% higher cost over no estimation in this case. Potential approaches for mitigating this additional cost are discussed in section 7.3.

5.1. Beam stop array comparison

We have also quantitatively evaluated the relative performance of a hardware based scatter estimation technique, beam stop array (BSA). For our implementation of BSA, we replicated the experimental set up in Ning et al (2004) through simulation. We positioned the array between source and specimen, and extracted the detected values behind their shadows as a sparse sampling of scatter. We then used smoothing and cubic B-spline interpolation to cover the entire detector. Unlike in Ning et al (2004), where the BSA was subsampled at a low dose, we used a separate fully sampled acquisition, to show a best case performance for this method. With this, the effective radiation dose increased by ∼98% over other tested methods.

The performance of BSA is competitive, with a relative error of 8% against our proposed PolySKS, and stands as the second best method under test, although this gap will likely broaden if evaluated at the same dose. A notable benefit of BSA is its rapid computational time, which is found to be a fraction of the cost of an FDK reconstruction. With this, a trade off between dose, computation and accuracy exists, and an appropriate choice will depend on the application. For maximum quantitative performance at a minimal dose, Polyquant–PolySKS is seen to be the strongest method under test.

6. Validation from real data

In this section, we tested our method on real data from CBCT scans of physical phantom objects. For this, we have two quantitative CIRS (Computerized Imaging Reference Systems, Inc., Norfolk, VA (USA)) phantoms with acquisitions on a Varian^® TrueBeam^TM On-Board Imager^® (Varian Medical Systems, Inc., Palo Alto, CA (USA)).

In each of the two phantom studies we compare these three approaches:

No estimate: Polyquant reconstruction with no scatter estimate. This is included to indicate the magnitude of the scatter artifacts and to represent a worst case.
Pre-fASKS: Polyquant reconstruction is performed using the estimate of scatter generated by the TrueBeam system as preprocessing. This is an instance of fASKS shown in (16), but with the manufacturers parameters, and with a constant tailed triangle modulation of the kernel shapes to account for the scatter grid on the scanner, the inclusion of a detector glare model (Sun and Star-Lack 2010) and a heuristic to improve upon scatter inaccuracies from the couch as in Sun et al (2011).
PolySKS: our proposed integrated quantitative scatter model as outlined in section 3.1. In this case, we use the very same model and parameters from table 1 as used in the simulated section, though with the inclusion of a similar modulation for the presence of a scatter grid as with Pre-fASKS.

For every method, we ran Polyquant for 50 epochs. Unlike the simulated experiments, we did not evaluate the Int-fASKS approach since we did not have means to modify the system's scatter approach, and we have already demonstrated the superiority of PolySKS over this in the numerical results. Additionally, we were unable to evalutate BSA, since the hardware was unavailable for our system.

6.1. CIRS insert phantom

In the first case, we investigate reconstructions of a CIRS electron density insert phantom. The acquisition consisted of 896 projections over $360^{\circ}$ with a 100 kVp tube potential and a current of 20 mA over a 15 ms pulse duration.

The phantom itself has a cross sectional diameter of 18 cm and a thickness of 5 cm. This poses a challenge for scatter estimation due to its thinness, since it naturally deviates significantly from the underlying assumed slab model.

Table 3 shows material information for the tissue inserts in the phantom. Of particular interest is the relative electron densities $\rho_e$ , since these will act as the ground truths against which to compare our quantitative estimates. To calculate the estimates from the reconstructions, we took the mean value from a circular region of radius 1 cm of each insert. Both the body of the phantom and the central left and right inserts are manufactured from 'water equivalent' resin. Since we are pursuing the quantitatively accurate estimation of soft tissues, we have not evaluated these. However, since we would like to investigate the shading effect of scatter on homogeneous slabs, we do assess the deviation with the body structure.

Table 3. Material definitions of insert phantom.

${{\rm Label}}$	${{\rm A}}$	${{\rm B}}$	${{\rm C}}$	${{\rm D}}$	${{\rm E}}$	${{\rm F}}$
${{\rm Material}}$	${{\rm Adipose}}$	${{\rm Muscle}}$	${{\rm Soft~bone}}$	${{\rm Muscle}}$	${{\rm Breast}}$	${{\rm Liver}}$
Density (g cm⁻³)	0.96	1.06	1.16	1.06	0.99	1.07
${{\rm RED,~}} \rho_e$	0.949	1.043	1.117	1.043	0.949	1.052

Reconstructions from the insert phantom are shown in figure 10. To supplement this, we have also plotted a line profile horizontally through their centers in figure 11. From observation, all three methods produce similar reconstructions. In the case of no estimate in figure 10(a), there is a visible shading towards the center of the object also visible in its profile, which is very typical of a scatter artifact. Whilst both the Pre-fASKS and PolySKS compensate for this, PolySKS appears to generate more of a uniform slab estimation, with the former producing a raised estimate at the center and edges of the profile in figure 11. This overestimation is likely caused by overestimating scatter, perhaps arising the thinness of the phantom.

**Figure 10.** Illustrations of reconstructed volumes of insert phantom of central slice with display window $[0.8, 1.2]$ . (a) No estimate. (b) Pre-fASKS. (c) PolySKS.
Download figure:
Standard image High-resolution image

**Figure 11.** Insert location labels, along with slab ROI mask, and line profiles through center of reconstructions. (a) Regions of interest. (b) Profiles through centre of reconstructions.
Download figure:
Standard image High-resolution image

Table 4 then shows the quantitative analysis of the reconstructions. It is clear that Pre-fASKS produces an overestimation of the electron densities across the board, which in many cases represents a worse performance than from no estimate. PolySKS on the other hand performs very well in most cases, having an error within the stated material manufacturing tolerance of 1%, with the exception of the breast and bone materials. Although it may appear disconcerting that the no estimate has the most accurate performance in the bone and breast cases, we note that these are still overestimations. Since in reality there will be some non-zero scatter behind these structures, and the addition of any scatter compensation will raise the estimation of density, it is unsurprising that both Pre-fASKS and PolySKS overestimate this further, so we suggest this is unlikely arising from a deficit in scatter estimation. Possible causes for the discrepancy may be inaccurate spectrum modelling or effects from objects outside the field of view such as the couch.

Table 4. Quantitative results: relative electron density accuracy of insert phantom.

	Tissue accuracy (% error)						Slab deviation (std)
${{\rm Scheme}}$	${{\rm A}}$	${{\rm B}}$	${{\rm C}}$	${{\rm D}}$	${{\rm E}}$	${{\rm F}}$	${{\rm ROI}}$	${{\rm Run~time~(min)}}$
${{\rm No~estimate}}$	−2.73	−1.08	$\mathbf{1.72}$	−1.04	$\mathbf{1.70}$	−1.75	$7.74\times 10^{-3}$	78
pre-fASKS	6.23	2.32	8.01	2.14	5.07	1.69	$5.38\times 10^{-3}$	78^a
${{\rm PolySKS}}$	$\mathbf{0.0718}$	$\mathbf{0.390}$	3.91	$\mathbf{0.171}$	3.56	$\mathbf{-0.277}$	$\mathbf{4.77\times 10^{-3}}$	120

^aComputation time of fASKS unknown but likely small relative to cost of iterative reconstruction.

The other reported value in table 4 is the slab deviation, which is the standard deviation of all pixels within the ROI in figure 11(a). Here, we note that either scatter modelling strategy offers a large increase in uniformity by at least 30% over 'no estimate', with a further 11% increase from Pre-fASKS to PolySKS. This result is unsurprising considering the flatness of the profiles in figure 11.

Due to the higher dimensionality of the data, the run times as shown in table 4 are significantly longer than those from the simulated cases in table 4, but the relative additional computation of incorporating PolySKS is roughly preserved.

In summary from this experiment: PolySKS results in the most uniformity throughout homogeneous structures, and is the only variant offering highly quantitatively electron density estimation, apart from the bone and breast that are overestimated in every case.

6.2. STEEV head phantom

In the second case, we performed reconstruction of the CIRS STEEV head phantom. In this case, the phantom consists of complex resin structures to mimic the attenuation and form of a human head. Again, we wish to analyse the quantitative accuracy and uniformity of the homogeneous materials. The acquisition used the same source settings as the insert phantom, and with 501 projections over a $200^{\circ}$ gantry rotation.

Reconstructions of the head phantom are shown in figure 12. Observations are that from the 'no estimate' in figure 12(a) exhibits similar shading artifacts in the soft tissue regions as the simulated head case in figure 8(b). Between the two scatter estimation approaches, both appear to similarly correct for inhomogeneities from scatter, though the Pre-fASKS does show higher intensities than PolySKS. Although we are unsure of the exact reason why fASKS leads to a consistent overestimation of scatter in our real CBCT data experiments, we should stress that the method is designed only to enhance uniformity, subject to a subsequent calibration of the image, unlike our approach of direct quantification.

**Figure 12.** Illustrations of reconstructed volumes of STEEV phantom with display window using slice 77 $[0.7, 1.4]$ . (a) No estimate. (b) Pre-fASKS. (c) PolySKS. (d) Tissue ROIs.
Download figure:
Standard image High-resolution image

To quantitatively assess these reconstructions, we calculated the mean and standard deviation of homogeneous tissue types: brain, soft tissue and soft bone. We isolate regions from these materials with a simple threshold based segmentation of the Pre-fASKS reconstruction, and have shown the colored regions in green, red and blue respectively in figure 12(d). These are then reported in table 5.

Table 5. Quantitative results on soft tissues slice 77 of STEEV head phantom.

Material	Soft tissue		Brain		Soft bone		Total
${{\rm ROI~color}}$	${{\rm Green}}$		${{\rm Red}}$		${{\rm Blue}}$		${{\rm All}}$
$\rho_e$	${{\rm 1.028}}$		${{\rm 1.039}}$		${{\rm 1.157}}$		—

	$\rm{\%\ mean}$	${{\rm RMSE}}$	$\rm{\%\ mean}$	${{\rm RMSE}}$	$\rm{\%\ mean}$	${{\rm RMSE}}$	${{\rm RMSE}}$

${{\rm No~estimate}}$	−2.26	0.0430	−1.41	0.0158	−3.14	0.0409	0.0345
${{\rm Pre\mbox{-}fASKS}}$	1.44	$\mathbf{0.0201}$	1.57	0.0173	3.92	0.0478	0.0253
${{\rm PolySKS}}$	$\mathbf{0.0544}$	0.0209	$\mathbf{0.439}$	$\mathbf{0.007\, 09}$	$\mathbf{0.687}$	$\mathbf{0.0169}$	$\mathbf{0.0162}$

From the quantitative results in table 5 it can be seen that similarly to the insert phantom, the PolySKS scatter integration leads to the most quantitatively accurate reconstruction. Also similar to the insert case, is the trend with the no estimate leading to an underestimation of electron density, and Pre-fASKS overcompensating to an over estimation. With the exception of the RMSE on the soft tissue, the PolySKS is the best performing method in every metric. In this case however, the Pre-fASKS is the best performing approach due to a lower variation throughout this tissue. It could be that one of the additional correction factors such as the asymmetric modulation, detector glare or couch scatter, not included in our tested PolySKS implementation, may be worthwhile including in our approach also. We also expect the parameters in table 1 may be improved through fitting to real measurements of scatter. Even with this discrepancy though, PolySKS shows a significant advantage over the whole image, with a 36% drop in total RMSE over Pre-fASKS.

7. Discussion

There a several important aspects of our method, about which we would like to discuss: its extension with other perturbations, the computational impact of the new scatter model, and its use in other models besides Polyquant.

7.1. Extension of the PolySKS model

Due to the non-exactness of convolutional scatter estimations, there are many heuristics that may enhance their practical performance under certain conditions. In our presentation of PolySKS, we have tried to present a simple model, whilst employing a couple of empirical perturbations that were found to give significant performance gains for minimal computational cost. Naturally, one could employ several other extensions also, and we will discuss some of these options.

Firstly, we have not included the asymmetric modulation of fASKS in our model—active when $\gamma>0$ in (16). This could certainly be applied both on the total scatter factors, or separately on each energy signal. We have performed preliminary experiments in both of these cases, but did not find any significant gain in performance for our numerical data, whilst increasing the computational cost by at least $2\times$ . Instead, we found just the edge factor derived in section 3.3 gave good compensation of inhomogeneities with our data. It could be that objects with more complicated variations in thickness such as the torso, with arms and lung regions may benefit from its inclusion.

Another factor that may be worth considering is the effect of objects outside the field of view, such as the patient couch or scatter from other hardware. In Sun et al (2011), the authors do report on this having significant effects on scatter estimation, and its inclusion may widen the gap between PolySKS and the commercial fASKS, where this is employed, in our experimental section. In our numerical tests however, we do not have any additional hardware, allowing us to fairly characterize the relative effectiveness of the methods under test.

We have focussed on convolutional scatter estimation due to their quick run time with FFT implementations. However, a new method using the linear Boltzmann transport equation promises both speed and accuracy (Maslowski et al 2018, Wang et al 2018). Due to the wealth of information available during the iterates of Polyquant, we believe this method may be readily integrated into the algorithm as with Polyquant–PolySKS, but the gain this may offer remains to be demonstrated. In any case, it has been shown that a similar model may be used to perturb convolutional models and increase their accuracy in Sun et al (2014), which may also offer benefit to our method.

Another potential avenue of research is attempting to apply convolutional neural network learning to the polyenergetic scatter factor developed here. A related approach was recently published by Maier et al (2018) during the development of this work, but this was a preprocessing correction for monoenergetic assumptions. It is possible that such a network could adapt to complex scattering structures, instead of opting for simple perturbations from a slab model, such as the edge factor in section 3.3.

7.2. Comparison with hardware scatter estimation

An advantage of convolutional scatter estimation is that it requires no additional hardware, and allows use of the full detector for measurements. Beam block solutions to scatter measurement such as collimator shadow or BSA may also offer accurate estimation, and will typically allow much faster processing times as demonstrated in table 2.

These two approaches may also be combined. Similarly to the perturbation of fASKS by the Boltzmann transport equation in Sun et al (2014), a convolutional estimate may be adjusted to match measurements from beam blocker shadows.

7.3. Computational feasibility

A critical issue with scatter estimation methods is their computational complexity, especially when they are performed at every iteration of reconstruction, such as with Polyquant–PolySKS. Since the motivation behind maintaining a spatially invariant convolutional approach is fast implementation, it is certainly worth directly addressing this point.

As we highlighted in section 3.1, calculating the convolution itself bears a cost of N_proj IFFT operations and $N_{\rm proj}(N_\xi+1)$ FFT operations per data pass. Comparatively, the fASKS method has a convolution cost of $2N_{\rm proj}N_{\rm iter}$ IFFTs and $2N_{\rm proj}N_{\rm thick}N_{\rm iter}$ FFTs, where N_thick are the number of thickness groups and N_iter are the number of iterations, which are set to 3 and 4 respectively in Sun and Star-Lack (2010). If one set $N_{\rm thick} = 3$ and $N_\xi = 21$ as we have in our experiments, then we note that one could only perform 1.4 epochs for the same cost of fASKS. Therefore, for the 50 data passes we use in our real data experiments, PolySKS carries a notably higher cost. However, we believe that sub-sampling the energy bins is unlikely to have a vast impact of its accuracy, whilst providing big speed ups.

Although when compared to fASKS our estimation approach is likely to be more expensive, we highlight the fact that relative to the cost of performing iterative reconstruction, it is still reasonable. In both the simulated and real data tests, the practical additional cost was 60%, with no attempt to sub-sample any of the scatter calculation. Our implementation made in Matlab is also likely to be sped up significantly by rewriting in a compiled language such as C++, for which (Sun and Star-Lack 2010) saw a $4.7\times$ speed up for a single thread with their fASKS—an implementation on GPGPU hardware is likely to be many times faster still. Additionally, over methods such as Monte Carlo or single scatter models, PolySKS is still likely to be many times faster.

7.4. Application of PolySKS in other methods

Although we have developed this scatter model specifically for the Polyquant framework, it is certainly available for alternative methods also. Firstly, due to the reported advantage of Int-fASKS over Pre-fASKS, one may use this scatter integration concept with any iterative method, and expect to realize a significant gain by also fusing the scatter estimation with reconstruction.

With the PolySKS scatter model, we believe it may be incorporated into most polyenergetic likelihood models, though with concessions. For methods that quantify a monoenergetic attenuation or mass density rather than electron density such as IMPACT (De Man et al 2001) or Poly-SIR respectively (Elbakri and Fessler 2002), the information $\boldsymbol{\Phi\rho}_e$ is not available at each iteration. To overcome this, one could either perform a non linear conversion from the projection information $\boldsymbol{\Phi\mu}(\xi)$ , or indeed use directly $\boldsymbol{\Phi\mu}(\xi)$ in place of $\boldsymbol{\Phi\rho}_e$ . This would in fact be directly proportional for water, with the constant of proportionality being its mass attenuation coefficient, but is likely to lose accuracy over diverse material classes.

As a preprocessing method before an analytic reconstruction such as FDK (Feldkamp et al 1984) however, the polyenergetic convolutional model can not be directly applied, due to not having an energy decomposition of the attenuation available. One possible way to exploit it though may be to use an approximate water–bone beam hardening model such as (Joseph and Spital 1978) to approximate the spectral properties. Additionally, from dual energy (or spectral) CT measurements, such a decomposition may be inferred and the model may be applicable. In both cases, one would likely have to apply the model iteratively as a deconvolution of the contaminated primary measurements as in Pre-fASKS (Sun and Star-Lack 2010) and it is not clear this would offer any gain.

8. Conclusions

We have introduced a polyenergetic convolutional scatter model, and integrated it into a quantitative iterative reconstruction method, denoted Polyquant–PolySKS. Additionally, have developed a fast algorithm for its implementation. The advantages of this are twofold: we have demonstrated through using the same convolutional model, fASKS, that one achieves better reconstruction performance when integrated rather than used in preprocessing; and we have demonstrated that one can exploit polyenergetic information available during Polyquant to gain a higher accuracy still. Due to its spatial invariance, the scatter calculation can be done rapidly through FFT operations, and contributes only a fraction of additional computational time over an iterative reconstruction. From our experiments with both numerical and real data, we have demonstrated superior electron density accuracy with this method, which promises to enhance quantitative medical procedures such as radiation therapy dose calculation.

Acknowledgments

The authors would like to sincerely thank Dr Adam Wang from Varian Medical Systems, for preprocessing our CBCT data, providing invaluable information and advice about the imaging system, and making the CIRS insert phantom data available to us. We would also like to thank Andiappa Sankar for his assistance in performing the CBCT acquisition. This work was supported by the Maxwell Advanced Technology Fund, EPSRC DTP studentship funds and ERC project: C-SENSE (ERC-ADG-2015-694888). MD is also supported by a Royal Society Wolfson Research Merit Award.

Quantitative cone-beam CT reconstruction with polyenergetic scatter model fusion

Article metrics

Permissions

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Background

2.1. X-ray attenuation model and Polyquant

2.2. Convolutional scatter estimation

3. Methodology

3.1. Polyenergetic scatter modelling

3.1.1. Polyenergetic forward scatter derivation

3.1.2. Polyenergetic water Kernel analysis

3.1.3. Basic polyenergetic convolutional model

3.2. Accounting for detector distance

3.3. Compensation for object edges

3.4. Complete polyenergetic scatter estimation model and fitting

4. Incorporating the polyenergetic scatter model in Polyquant

4.1. Fast scatter fused reconstruction algorithm

5. Evaluation from simulation

5.1. Beam stop array comparison

6. Validation from real data

6.1. CIRS insert phantom

6.2. STEEV head phantom

7. Discussion

7.1. Extension of the PolySKS model

7.2. Comparison with hardware scatter estimation

7.3. Computational feasibility

7.4. Application of PolySKS in other methods

8. Conclusions

Acknowledgments

Quantitative cone-beam CT reconstruction with polyenergetic scatter model fusion

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Background

2.1. X-ray attenuation model and Polyquant

2.2. Convolutional scatter estimation

3. Methodology

3.1. Polyenergetic scatter modelling

3.1.1. Polyenergetic forward scatter derivation

3.1.2. Polyenergetic water Kernel analysis

3.1.3. Basic polyenergetic convolutional model

3.2. Accounting for detector distance

3.3. Compensation for object edges

3.4. Complete polyenergetic scatter estimation model and fitting

4. Incorporating the polyenergetic scatter model in Polyquant

4.1. Fast scatter fused reconstruction algorithm

5. Evaluation from simulation

5.1. Beam stop array comparison

6. Validation from real data

6.1. CIRS insert phantom

6.2. STEEV head phantom

7. Discussion

7.1. Extension of the PolySKS model

7.2. Comparison with hardware scatter estimation

7.3. Computational feasibility

7.4. Application of PolySKS in other methods

8. Conclusions

Acknowledgments