Paper The following article is Open access

A backprojection kernel (KRNL3D) for very-wide-aperture 3D tomography applied to PET with Multigrid for precise use of time-of-flight data

Published 26 November 2021 © 2021 The Author(s). Published on behalf of Institute of Physics and Engineering in Medicine by IOP Publishing Ltd
, , Citation Keith Miller 2021 Phys. Med. Biol. 66 235007 DOI 10.1088/1361-6560/ac320a

0031-9155/66/23/235007

Abstract

In 'KRNL3D' we derive a kernel function K(y1, y2, φ) whose backprojections from all directions (θ, φ) in the spherical band $| \varphi | \lt {\bar{\varphi }}_{\max }$ on the celestial sphere, when integrated with respect to solid angle, yield ρ, the 3D Gaussian point response function (PRF) of radius 1. This K, when convolved against line integral data from an unknown density function f, yields an integral formula for the 'mollification' ff = ρf, which is a slightly blurred version of f, and which stabilizes the mild ill-posedness. Applied to positron emission tomography that backprojection reconstruction occurs stochastically and one emission event at a time, after needed data corrections. We describe Octave (≈Matlab) codes to tabulate K and to test its use with a large aperture ${\bar{\varphi }}_{\max }=\pi /3$ or π/6. 'KRNL3D-TOF' truncates backprojection to a cylindrical patch about the TOF approximate location of each event. These 'backplacements' decrease the computational cost and limit noise and streaking in one region from contaminating the reconstruction in more distant regions. They also retain the ability to count emission events in an isolated blob despite very low event counts, a valuable feature for dynamic studies of metabolic processes. 'Multigrid' allows further reduction in the radius and lengths of the cylinders, thereby enabling even more precise use of the TOF information. This precision should be especially important as researchers decrease the TOF uncertainty in newer generation scanners. Finally, we discuss 'further work' that needs to be done. Our codes are being made freely available at https://github.com/keithmillerberkeley/PET-codes.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

In positron emission tomography (PET) one seeks to image the concentration f(x, y, z) of a positron-emitting radioisotope accumulated in the body of the subject. When a nucleus of the isotope decays and emits a positron that positron is almost immediately annihilated by a free electron. That annihilation emits a pair of 511 keV gamma ray photons in almost exactly opposite directions along a straight line through the original nucleus. The directions of these lines are randomly equidistributed with respect to spherical solid angle.

A typical detector geometry is shown in figure 1. The subject lies in the region of interest (ROI) aligned with the z axis, and closely spaced detectors wrap in a cylindrical band about the ROI. The raw data are a list of coincidence events (containing true, scattered, and random events), near-simultaneous detection (within 6–12 nanoseconds) of annihilation photons by a pair of detectors. Each true event represents a line in space connecting the two detectors, with cylindrical coordinates (v1, v2), $({\bar{v}}_{1},{\bar{v}}_{2})$, along which the positron-emission occurred (the line of response (LOR)). Our simulations assume only true events and an idealized such scanner, with no granularity in the cylindrical coordinates and no gaps between detector modules.

Figure 1.

Figure 1. A PET system with a wide angle cylindrical band of detectors. A positron annihilation event at the 'source' point src in the region of interest (ROI) emits a pair of 511 keV photons in almost exactly opposite directions. These are detected nearly simultaneously by detectors (v1, v2) and $({\bar{v}}_{1},{\bar{v}}_{2})$ on the line of response (LOR). The 'shifted source' ssrc is any point along that LOR. TOF information yields the approximate 'presumed source' location psrc along that LOR.

Standard image High-resolution image

We would like to be able to reconstruct the pointwise value of f at any given point (x, y, z) in the ROI; however, that is a mildly ill-posed problem. Instead, we seek reconstruction of the value of the convolution ff = ρf at that point, or array of such points, where the desired point response function (PRF) ρ is a small-radius Gaussian peak of volume integral one. This 'mollification' of f is a slightly blurred version of f and stabilizes the mild-ill-posedness of image reconstruction.

Before proceeding, we point out that our notation for longitude and latitude (from the equator) is the (θ, φ) used by mathematicians, as opposed to the (φ, θ) used by physicists. Also, the 'increments' μ, dh, dΩ, d α in sections 2 and 3 are meant to be 'infinitesimally small' in the intuitive but nonrigorous sense of Leibnitz and early practitioners of 'the infinitesimal calculus'; our apologies to mathematical purists.

Our units of length, volume, etc are not in centimeters, but in terms of the 'radius one' dimension of our Gaussian PRF ρ of (2.3). The full width half max (FWHM) of this ρ is ≈ 2.35 in these normalized units; it would be ≈0.6 cm if one were to choose 1 cm = 4 normalized units. Likewise, the FWHM of our standard deviation TOFu = 10 Gaussian position uncertainty distribution in (11.1) would be ≈ 23.5 in normalized units and ≈6 cm; this would correspond to an ≈400 ps time-of-flight (TOF) uncertainty distribution.

We point out that there are no voxels in our method, no attempt to rebin the data into 'sinograms', no reprojections, and no fast-${ \mathcal F }$ transforms It is based upon stochastic evaluation of a 5D integral formula (4.4) for the value of ff at any single point (x, y, z) or array of such points. In KRNL3D that stochastic evaluation occurs one event at a time by backprojecting the kernel function K(y1, y2, φ) for (4.4) along the LOR of the event. In sections 24 we derive this K by a complicated averaging of rotated 2D backprojection kernels. In fact, we derive a whole family of such K because the weighting factor $\eta (\bar{\varphi })$ in (4.1) is arbitrary. Section 5 describes its use in backprojection, and 'the probability of detection Q-factor' for some of the needed data corrections as applied to PET. Sections 68 describe codes in Octave (a freeware version of Matlab) to tabulate K and to test its use in backprojection with a large aperture ${\bar{\varphi }}_{\max }=\pi /3$.

Section 9 reveals an important diagnostic feature of the method; its reconstructed ff can be used to count the number of events in an isolated blob of events despite a very low event count. This feature, as well as the increased geometric sensitivity from the large aperture ${\bar{\varphi }}_{\max }$, plus the increased localization and increased signal-to-noise ratio from use of TOF data, should combine with newer PET scanners of large axial length to allow dynamic collection of metabolic and chemical target information simultaneously from multiple organs (brain, liver, ...).

Section 10 discusses the stabilizing effect of mollification. Section 11 explains KRNL3D-TOF, which adds TOF information to localize the position of each emission event along its LOR. This approximate method allows one to limit backprojection to a cylindrical patch about each presumed source location psrc from TOF. This 'backplacement' of patches reduces streaking from more distant regions and thereby increases the signal-to-noise ratio; it decreases computational cost, and it also fortunately maintains the ability to count emission events.

Section 12 introduces 'Multigrid' to further reduce the radius and lengths of the backplacement cylinders, thereby enabling even more precise use of TOF data to reduce the streaking distance. Sections 1316 discuss the relation of KRNL3D to previous analytical (noniterative) methods, ways to lessen its great computational demands, and the potential advantages of KRNL3D-TOF and Multigrid. Section 17 discusses the 'further work' which remains to be done.

In this paper, we have assumed that the scanner is 'as long as it needs to be' in order to see all LORs whose oblique angle φ is less than the desired universal aperture ${\bar{\varphi }}_{\max }$ (all LORs whose φ is greater than that are ignored). This condition would severely limit the permissible aperture for most scanners. Continuous-bed-motion (CBM) would allow one to largely remedy that restriction for medium-length scanners; we suggest that the subject be uniformly translated into, through, and beyond the scanner so that all parts of the ROI receive the same amount of time in the scanner. The required long subject bed could be supported before and after the scanner, rather than cantilevered. Our github codes include two Multigrid + CBM methods that perform well under simulation. However, this paper is already overly long, so we defer discussion of this CBM topic until a short later paper.

To visualize our computational results we will use the Octave (Matlab) commands 'mesh', 'plot', or 'contour'. These give 3 different modes to visualize a 2D matrix of numbers. We tend to prefer 'mesh' because it provides the most intuitive detail in a single 'wire frame' type figure. In most of our figures we have normalized so that $\max {\text{}}{ff}=1;$ sometimes we then display not ff but $\min (0.1,{\text{}}{ff})$. All figures are with a dxx = 1 grid unless otherwise noted.

2. The 3D kernel K(y1, y2, φ) in parallel-beam geometry

We now begin derivation of our backprojection kernel function K for 3D parallel-beam tomography. See figure 2. For the sake of our exposition let us imagine that this celestial sphere is of very large radius and that the ROI is situated inside near its center. Perpendicular to each direction we imagine a planar detection/backprojection screen, tangent at (θ, φ) on that sphere. Its coordinates (y1, y2) parametrize the outprojection line integral data g(y1, y2, θ, φ) from parallel beams at that direction. This detection screen is also the backprojection screen on which we construct our kernel function K(y1, y2, φ). The north pole is on the positive z axis, θ is the longitude, φ is the latitude (from the equator). The y1 is in the direction of increasing θ, and y2 is in the direction of increasing φ toward the north pole.

Figure 2.

Figure 2. Great circles ${{\rm{\Gamma }}}_{\overline{\theta },\overline{\varphi }}$ (highest point $(\overline{\theta },\overline{\varphi })$) filling the ' $\bar{\varphi }$ -band' of points (θ, φ) on the celestial sphere. The imagined flat detection/backprojection screen at one such (θ, φ) direction, with screen coordinates (y1, y2). Note the two of these circles through (θ, φ) with 'tilt angles' ±β between the y1 axis and the circles. An infinitesimal 'sliver' of these circles. The outprojected screen coordinates $({y}_{1}^{* },{y}_{2}^{* })$ of a single LOR.

Standard image High-resolution image

We will speak of 'outprojection' when information is passed on parallel beams from the ROI onto the screen, and of 'backprojection' when information is passed on parallel beams from the screen onto the ROI.

Shown in figure 2 is a particular great circle Γ through a particular (θ, φ) point, with its highest point $B=(\bar{\theta },\bar{\varphi })$ and its equatorial point $A=(\bar{\theta }-\pi /2,0)$. We denote by β the 'tilt angle' on the screen between the y1 axis and the tangent to Γ; see (2.5) and (3.8) later. Also indicated in figure 2 is a 'sliver' of these great circles adjacent to the given Γ; see (3.3) and (3.4) later. Because of rotational symmetry K will be independent of θ. Our final formula (4.4) will use the line integral outprojection data g(y1, y2, θ, φ) from all directions (θ, φ) in the 'spherical band' $| \varphi | \lt {\bar{\varphi }}_{\max }$. Of course with PET we do not have complete projection data; instead from each coincidence event we have a single LOR whose intersection with the imagined detector screen at that direction yields the 'screen coordinates' ${y}_{1}^{* },{y}_{2}^{* }$ of that single event.

We wish to construct a kernel function K(y1, y2, φ) such that

Equation (2.1)

Here ρ is the '3D Gaussian peak of radius 1 and volume integral = 1',

Equation (2.2)

the dΩ indicates integration with respect to solid angle on the celestial sphere, and the '∼' indicates 'is proportional to' (we will not attempt to keep track of all the constant multiplicative factors of ${(2\pi )}^{-\tfrac{1}{2}}$ etc which occur). The BPθ,φ indicates parallel 'backprojection' of the K function on the 2D (y1, y2, θ, φ) screen onto the 3D ROI.

Derivation of such a 3D kernel function K involves averaging formulas from 2D parallel-beam tomographic reconstruction. In that 2D setting one has the reconstruction kernel function

Equation (2.3)

whose graph, for $\hat{\gamma }(| \xi | )=\exp (-| \xi {| }^{2}/2)$, is shown in figure 3(a). This kernel comes from the 'filtered inverse 2D Radon transform' via Fourier transforms. The '∣ξ∣' in (2.3) enters because of a change of variable for a 2D integral in Fourier space from Cartesian to polar coordinates. It is often known in the literature as 'the ramp filter' and corresponds to the Dirac delta function as the PRF. The $\hat{\gamma }(| \xi | )$ in (2.3) we will refer to as 'the blurring filter'; it is the radially symmetric 2D Fourier transform of the desired positive radially symmetric 2D PRF γ(x, y). This kernel, when backprojected from every angle α on the unit circle, and the results averaged, yields the desired PRF γ. This formula for H(t) is well-known, but see for example 3.1–3.11 in Miller (2006) for a short derivation. In the Gaussian case it yields the desired PRF $\rho \sim \exp (-.5({x}^{2}+{y}^{2}))$.

Figure 3.

Figure 3. (a) The 2D kernel H(t) with its positive and negative lobes. Backprojection onto (x,y) space from all angles −π < θ < π produces (b) the 2D Gaussian peak $\gamma =\exp (-{r}^{2}/2)$ shown. (c) The 2D kernel HC4(t) whose backprojections produce (d) the more compact peak $\gamma =\exp (-{r}^{4}/4)$ shown. 'Overlap cancellation' of the positive and negative lobes on neighboring angles results in approximately zero values away from the peak.

Standard image High-resolution image

When these backprojections are done on only a discrete set of angles about the origin, with equiangular spacing Δα, the approximation to this 2D Gaussian (in particular the approximation to almost zero value at points well away from the center) is extremely good at radii r such that rΔα is substantially less than 1. This is because the negative and positive lobes of H(t) on neighboring backprojection rays overlap and cancel each other out. But at radii r with rΔα > 1 that cancellation fails and the nakedly nonoverlapping backprojections produce 'streaking'. When the discrete projection angles are determined stochastically (as would occur in a 2D PET machine with only one ring of detectors) their distribution would have to be fine enough and uniform enough to cause this 'overlap cancellation' out to the desired radii. See section 11 for further explanation.

In the 3D parallel-beam setting, if one multiplies this 2D kernel H(t) by $\exp (-{z}^{2}/2)$ then averages its 3D backprojections from around the equatorial great circle this yields the desired 3D Gaussian of (2.2). However, one need not be restricted to the equatorial great circle. Consider any single great circle ${{\rm{\Gamma }}}_{\overline{\theta },\overline{\varphi }}$, labeled by its highest point $(\bar{\theta },\bar{\varphi })$ on the celestial sphere, and parametrized by the arclength α as shown in figure 2. We then have

Equation (2.4)

where Hβ is the 'tilted' function

Equation (2.5)

Here (t, s) are the β-rotated coordinates on the y1, y2 screen at (θ, φ), with the t axis aligned in the direction of increasing α on this great circle, the s axis perpendicular to that t axis, and β is the 'tilt angle' on the sphere between the great circle and the direction of increasing θ, as shown on figure 2. Here θ, φ, and β are functions of $\bar{\theta }$, $\bar{\varphi }$, and α.

Because the d α in (2.4) is an increment of arclength on the great circle, if one imagines that circle to be slightly widened by a constant infinitesimal width μ, one can consider the integral with respect to d α in (2.4) to be actually a singular form of integral with respect to the solid angle dΩ = μ d α on the sphere. In other words, it is a singular version of the desired form (2.1).

3. Derivation of the kernel for a single $\bar{\varphi }$-band

In particular, for any fixed $\bar{\varphi }$ with $0\lt \bar{\varphi }\lt {\bar{\varphi }}_{\max }$ there are all the $\bar{\theta }$-rotated great circles ${{\rm{\Gamma }}}_{\overline{\theta },\overline{\varphi }}$. These are shown in figure 2 and fill up the '$\bar{\varphi }$ -band' of points on the sphere with $| \varphi | \leqslant \bar{\varphi }$. Notice from the figure that through each point (θ, φ) in this band there are two such great circles with tilt angles ±β. For fixed $\bar{\varphi }$ the ±β at point (θ, φ) depend only on φ and are independent of θ. Thus it seems that a suitable candidate, on the (y1, y2) backprojection screen, for the kernel K(y1, y2, φ) in formula (2.1) would be

Equation (3.1)

However, it turns out that this term needs to be multiplied by the width-factor of

Equation (3.2)

We now give an explanation of the reason for this: we note that our trial code errs significantly without it, especially for large ${\bar{\varphi }}_{\max }$ such as π/3, but works perfectly with it included; see figures 5(A) and (B).

Recall that $\bar{\varphi }$ is fixed. Consider now only the right-leaning great circles in figure 2. It turns out that two of these great circles with $\bar{\theta }$ and $\bar{\theta }+d\bar{\theta }$ have a perpendicular distance apart of

Equation (3.3)

Thus the 'sliver' $S(\bar{\theta },\bar{\theta }+d\bar{\theta })$ of points (θ, φ) between these two great circles, see figure 2, is not a ribbon of constant width, but instead its width gets smaller and smaller as φ approaches $\bar{\varphi }$. The increment of solid angle between $\bar{\theta }$ and $\bar{\theta }+d\bar{\theta }$ and α and α + d α is thus

Equation (3.4)

Let us take the integration in (2.1) not over the whole spherical band, but instead over a single sliver. We would like the resulting integral with respect to dΩθ,φ to also be a formula of type (2.1), that is, the result to be proportional to the desired 3D Gaussian ρ(x, y, z). But by (3.4), ${wf}(\varphi ,\bar{\varphi })d{\rm{\Omega }}=d\bar{\theta }\,d\alpha $. Hence multiplying the kernel Hβ in (3.1) by ${wf}(\varphi ,\bar{\varphi })$ gives us an integral with respect to d α,

Equation (3.5)

which by (2.4) is proportional to ρ(x, y, z) times $d\bar{\theta }$.

Now the spherical $\bar{\varphi }$-band of points (θ, φ) is covered by the collection of all these nonoverlapping slivers; thus the integral over this band becomes

Equation (3.6)

which is proportional to the desired ρ(x, y, z).

Finally, considering also the left-leaning great circles in figure 2 and their left-leaning slivers, one can replace Hβ in (3.6) by Hβ + Hβ . We see therefore that a kernel function for a formula of type (2.1), using integration with respect to solid angle dΩθ,φ on only the single $\bar{\varphi }$ -band, would be

Equation (3.7)

We will later take a weighted average of these ${{HH}}_{\bar{\varphi }}$ kernels, for $0\lt \bar{\varphi }\lt {\bar{\varphi }}_{\max }$, to get the final kernel function K.

It also turns out that

Equation (3.8)

For the sake of brevity we skip the formula manipulations used to derive both (3.3) and (3.8).

4. Weighted average of the $\bar{\varphi }$-band formulas for $\bar{\varphi }\lt {\bar{\varphi }}_{\max }$

The kernel function ${{HH}}_{\bar{\varphi }}$ of (3.7), for a single $\bar{\varphi }$-band of great circles with $\bar{\varphi }\lt {\bar{\varphi }}_{\max }$, if set to zero for $| \varphi | \gt | \bar{\varphi }| $, would be a candidate for the desired kernel function K in formula (2.1). Note however that the width factor $w(\varphi ,\bar{\varphi })$ has a $1/{(\bar{\varphi }-\varphi )}^{1/2}$ type singularity for φ near $\bar{\varphi }$, and a $1/\bar{\varphi }$ type singularity for $\bar{\varphi }$ near zero. Instead, one can obtain a much smoother function of (y1, y2, φ), and one which takes advantage of all the desired great circles through (θ, φ), by taking a weighted integral with respect to $\bar{\varphi }$ of those ${{HH}}_{\bar{\varphi }}$ functions, with a smooth weighting factor $\eta (\bar{\varphi })$. This η can be chosen rather arbitrarily, except that one needs η(0) = 0 to handle the singularity in $w(\varphi ,\bar{\varphi })$ for $\bar{\varphi }$ near zero. Thus our final desired backprojection kernel function for formula (2.1) will be

Equation (4.1)

This yields a whole family of K's, because of the arbitrariness of η. Notice that, for a given φ, this integral runs only for $\bar{\varphi }\gt | \varphi | $ since ${{HH}}_{\bar{\varphi }}$ is zero for $\bar{\varphi }\leqslant | \varphi | $.

Our favorite choice for η is

Equation (4.2)

which is just (4π)−1 times the solid angle of the $\bar{\varphi }$-band.

Moreover, calculations with the trial code indicate that as ${\bar{\varphi }}_{\max }$ approaches π/2 the kernel K(y1, y2, φ) of (4.1) and (4.2) becomes a function of the radius $r={({y}_{1}^{2}+{y}_{2}^{2})}^{1/2}$ alone, and is independent of φ. This is revealed by graphics such as figure 4(D) where ${\bar{\varphi }}_{\max }=\pi /2-0.0001$.

Figure 4.

Figure 4. Four 2D slices of the 403 table for the kernel K(y1, y2, φ). These indicate that K is a smooth function of the compressed variables (u1, u2, u3) of (6.1). (A)–(C) are with ${\bar{\varphi }}_{\max }=\pi /3$. (A) 'mesh' view of (: , : , 1) slice. (B) 'plot' of that (: , : , 1) slice. (C) 'plot' of the (: , 1, :) slice. (D) repeat of (C) but with ${\bar{\varphi }}_{\max }=\pi /2-.0001$.

Standard image High-resolution image

In the parallel-beam geometry one would obtain the desired slightly blurred image

Equation (4.3)

by convolving K in each 2D (θ, φ) screen against the outprojected line integral data g(y1, y2, θ, φ) on that screen, then backprojecting that Kg(y1, y2) from the screen, weighted by solid angle dΩθ φ , onto the 3D ROI.

We have, therefore, derived the following in sections 24. For each single point (x, y, z) in the ROI, the exact value of the blurred ff = ρf is given by the 4D integral

Equation (4.4)

Here, for each direction (θ, φ) in the ${\bar{\varphi }}_{\max }$-spherical band, the y1, y2 are the screen coordinates on the detection/backprojection screen, the sy1, sy2 are the screen coordinates of the (x, y, z) point or array of points projected out onto the screen, and g(y1, y2, θ, φ) are the outprojected line integrals of our density function f on the lines with this direction through the points (y1, y2) on the screen. In fact, this is actually a 5D integral since each data value g is itself a line integral. The constant of proportionality $c({\bar{\varphi }}_{\max })$ in (4.4) is not really important and we ignore it for most of our computations. However, it could be easily evaluated with a deterministic 2D numerical integration over the ${\bar{\varphi }}_{\max }$-band of our computed K(y1, y, φ) of section 6, by letting f be the Dirac delta function centered at the origin.

5. Application to PET, the ' Q-factor' for data corrections

For PET the backprojection reconstruction occurs by stochastic evaluation of that 5D integral, one coincidence event at a time. From the geometry of the detectors (v1, v2) and $({\bar{v}}_{1},{\bar{v}}_{2})$ in figure 1 one determines (dir,ssrc) where dir is the unit direction vector of the LOR and ssrc is a shifted source point somewhere along that LOR (for example the center point). Since each event represents one unit of position-emission activity along its LOR, every backprojection of K should be given equal weight.

However, in practice there are many needed data corrections. The raw data include true, random, and scattered detected coincident events. Correction methods for random and scatter are complicated, but fortunately the distribution of this contamination is quite broad and relatively featureless. We refer to the review of Meikle and Badawi (2005), with emphasis on low-resolution corrections by Monte Carlo techniques.

We consider only true detected events. Each such event has survived the possibility of being attenuated or of being not detected due to insensitivity or geometry of the detectors. Let us imagine hypothetically that each discrete detector at location (v1, v2) represents a rectangular tile in the cylindrical coordinates (with no gaps between tiles). Let $Q({v}_{1},{v}_{2};{\bar{v}}_{1},{\bar{v}}_{2})$ denote the probability of detection for any true annihilation event whose LOR should touch both the (v1, v2) tile and the $({\bar{v}}_{1},{\bar{v}}_{2})$ tile. Thus each detected event is a proxy for 1/Q detected and undetected events, and one should therefore backproject not K but 1/Q times K.

Local subfactors of Q involve geometry and angle of incidence of the LOR on each detector. The Q subfactor for attenuation is usually discovered on combined 3D PET/CT machines by using a 3D CT scan to integrate the expected attenuation probability through the patient's bones and tissues along each LOR. Moreover, the radioisotope activity decays exponentially in time and this should be corrected for if one wants f to indicate the radioisotope density at an initial time T = 0. In all our trial computations we assume Q = 1.

This Q-factor also cannot accurately account for missing data from the z-gaps between the detector modules on many scanners; we add this topic to the list of needed further work in section 17.

6. A code to tabulate the K(y1, y2, φ) kernel

The kernel K of (4.1) and (4.2) is symmetric with respect to y1, y2 and φ. We would like to compute it for all ∣y1∣ and ∣y2∣ in [0, 2000) and all $| \varphi | \lt {\bar{\varphi }}_{\max }$. Because of the asymptotic decay of H(t) (asymptotic to −t−2 for large t) we can compress the ∣y1∣, ∣y2∣, ∣φ∣ variables into a [0, 1]3 cube, and compute and store K once and for all on a fine uniformly spaced discretization of that cube. One can then compute K for arbitrary ∣y1∣, ∣y2∣, ∣φ∣ by table lookup and interpolation (or rounding).

We describe a code implemented in Octave.

mkTBLK.m—this main code (make table of K values) sets up a table TBLK (usually 100 × 100 × 40) of grid points y1, y2, φ on which to accumulate K. We map $\left[0,2000\right]\times [0,2000]\times [0,{\bar{\varphi }}_{\max }]$ into the unit (u1, u2, u3) cube by the compression

Equation (6.1)

Then, for each fixed φ in the table we do the integral with respect to $d\bar{\varphi }$ in (4.1). But, we use integration by a change of variable $\bar{\varphi }=\varphi +{\bar{v}}^{2}$ to handle the $1/{({\bar{\varphi }}_{\max }-\bar{\varphi })}^{1/2}$ type singularity at $\bar{\varphi }$ near ${\bar{\varphi }}_{\max }$.

Figures 4(A)–(C) show 3 views of 2D slices of the computed 3D table for K with ${\bar{\varphi }}_{\max }=\pi /3$. These suggest that K is a smooth function of the compressed variables u1, u2, u3 of (6.1), suitable for accurate interpolation. Figure 4(D) is a repeat of 4C except with ${\bar{\varphi }}_{\max }$ changed to π/2 − 0.0001.

The Gaussian PRF $\exp (-{r}^{2}/2)$ implemented here could easily be replaced by a more compact PRF. For example, the kernel H(t) in (2.5) can be replaced by the kernel HC4(t) whose backprojections produce the 2D PRF $\exp (-{r}^{4}/4);$ see figures 3(C), (D). One then replaces the $\exp (-{s}^{2}/2)$ in (2.5) by $\exp (-{s}^{4}/4)$ and proceeds through (4.1) and (4.2). The result is a kernel K(y1, y2, φ) whose more 'compact' PRF is a weighted average of β-rotations of this squared-off response function. Because this PRF produces an ff with somewhat more jagged oscillations for a given number Nsrc of sources, we prefer henceforth to work only with the Gaussian PRF.

7. A backprojection PET code to use the kernel K

We describe a trial code to use the previously computed table TBLK.

randPET3D.m—this main code (randomized PET in 3D) begins by setting up a 3D cubic grid of reconstruction points, with grid spacing dxx, on which to accumulate the Gaussian-blurred reconstructed image ff = ρf of (4.3) and (4.4). That backprojection accumulation of ff occurs stochastically, one beam or event at a time, with randomized data supplied by onebeam.m.

onebeam.m—computes a 3D unit vector 'dir', randomly oriented with respect to solid angle, with oblique angle φ such that $| \varphi | \lt {\bar{\varphi }}_{\max }$. It then computes a 3D emission source point 'src' randomly uniformly distributed in a test 2a × 2b × 2c 'cube' (or in a similarly sized sphere or truncated ellipsoid). It's output is (dir,ssrc), where the 'shifted source ssrc' is any point along the beam.

vecmap.m—computes the out-projected screen coordinates ${y}_{1}^{* },{y}_{2}^{* }$ of this (dir,ssrc) beam and also the out-projected screen coordinates sy1, sy2 of the array of reconstruction points.

lookupK.m—gives the backprojection along this beam of the kernel K onto the grid. One then uses interpolation on u3 and rounding on u1, u2 on the TBLK table to get $K({sy}1-{y}_{1}^{* },{sy}2-{y}_{2}^{* },\,\varphi )$ on the grid.

We do this for Nsrc sources to accumulate our approximate Gaussian-blurred ffρf, where f(x, y, z) is the uniformly distributed emission event density on the cube or ellipsoid. We prefer cubes rather than ellipsoids because their corners serve to more clearly reveal the spatial resolution of our methods.

8. Some test results from the trial code

Initial test results will be computed with ${\bar{\varphi }}_{\max }=\pi /3$ (changed from section 11 onward to π/6) with a 100 × 100 × 40 table TBLK, and with 999 values of $\bar{v}$ for the change of variables integration mentioned following (6.1). One can afford to be extravagant and inefficient in computing this table because it (taking about 30 cpu seconds) is computed once and saved for millions of lookups.

To begin to test our stochastic approximation to the integral with respect to solid angle dΩθ,φ in (4.4), with K given by (3.7), (4.1), and (4.2), we place all the event sources src at the origin. Figure 5(A) is a plot of the resulting Gaussian PRF on the (x, 0, z) slice, with Nsrc = 104 events and dzz = 0.25. We choose this slice because it tends to better show 'undershoot' errors than do (x, y, 0) slices. This very small negative undershoot at the base of the peak is not discernable in figure 5(A) because with Nsrc = 104 or 105 it has values of −0.0027 or −0.0016.

Figure 5.

Figure 5.  KRNL3D with ${\bar{\varphi }}_{\max }=\pi /3$. (A) (x, 0, z) slice of reconstructed ff of $\exp (-{r}^{2}/2)$ PRF by Nsrc = 104 events at the origin, with dzz = 0.25. (B) repeat of (A) but without width-factor of (3.2); apparent 2.2% undershoot shown is actually 29% in terms of volume integrals. (C) (x, y, 0) slice of ff from Nsrc = 106 sources in 4 cubes of size 16 × 16 × 4. Three 'visible' cubes have (0.5, 1.0, 0.1) × Nsrc events; a 4th 'hidden' cube with 1.0 × Nsrc events is centered at (0, 0, − 10). (D) plot of (C). For each of the three in (C), the volume integral of ff surrounding the cube is ≈28.3 ×the number of events in the cube, thus permitting counting of the number of events.

Standard image High-resolution image

However, if the width-factor ${wf}(\varphi ,\bar{\varphi })$ of (3.2) is eliminated, this undershoot becomes a significant −0.022, as is seen in figure 5(B). It might seem that this 2% undershoot is not sufficient to rule out this ff as a somewhat acceptable PRF; however, in terms of volume integrals the undershoot is actually 29%, since the integral of the negative part of ff is −0.29 times that of ff. This would lead to considerable ringing at internal edges.

9. Counting emission events in an isolated 'blob'

We have so far not attempted to keep track of the factors of ${(2\pi )}^{-\tfrac{1}{2}}$ etc which creep into formulas such as (2.3) derived from Fourier integrals. Instead we now calibrate TBLK by computations in randPET3D.m, with ${\bar{\varphi }}_{\max }=\pi /3$ still, with all sources at the origin. With Nsrc = 103 or 104 one gets a Gaussian peak ff whose volume integral is ≈ 28.3 × Nsrc. This holds true for grid size dxx = 1, or 0.5, or 0.25. Fortunately, this proportionality remains true for f corresponding to much more general distributions of events, and even with very low event counts, as is seen below.

In figures 5(C), (D) we show some computations with sources randomly equidistributed in '4 cubes', each of size 16 × 16 × 4. Three 'visible' cubes have their centers at positions (20, −14, 0), (12, 12, 0), (−14, 20, 0) with (0.5, 1.0, 0.1) × Nsrc sources in each. Our grid is 77 × 77 × 11 centered at (0, 0, 0), with dxx = 1.0. The 4th 'hidden' cube is centered at (0, 0, −10), with 1.0 × Nsrc sources.

For each of the three visible cubes, the integral of ff over the cube is ≈ 28.3× the number of events in the cube (to within 2% or 3%). This holds true for Nsrc = 106 here, and also with Nsrc = 105 or with the very low count Nsrc = 104.

We point out that the extreme lumpiness seen in some of our low-event-count examples is not due to errors in the data (there are none, only the insufficient stochastic sampling of the sources and their directions).

10. Stabilizing the mild ill-posedness by 'mollification'

Mollification involves approximating a given function f by its 'mollification' or 'blurring' ff = ρf with a smooth positive PRF ρ of small radius and volume integral 1. It has several desirable approximation properties: it is smooth; it preserves nonnegativity, integrals over any isolated blob, derivatives; it does not increase any Lp norm (1 ≤ p). It was introduced as a theoretical tool in the theory of generalized derivatives for partial differential equations (PDEs), Friedman (1969). However, it has proved to be a practical tool for ill-posed inverse problems, provided that the ill-posedness is not too strong. In this section we examine the difference between strongly ill-posed and mildly ill-posed problems.

The author has been involved in the development of stabilized methods for ill-posed inverse problems since the early years of that field, mostly within the setting of PDEs, where the ill-posedness can often be stabilized by a known global bound on the unknown solution. Our general method of partial eigenfunction expansion (MPEE) was introduced in Miller (1964) and in more detail in Miller and Viano (1973); this seems to be known these days as a truncated singular value decomposition method. Our general least squares method followed in Miller (1970).

These PDE problems are typically strongly ill-posed, with the restored stability being typically only Hölder stability, and that only on compact subsets of the global solution domain. For example, consider solutions of a heat equation in divergence form ut = ∇ · cu with time-independent conductivity c(x, y) in two space dimensions and zero-flux boundary conditions. If the difference between two such solutions is bounded in L2 norm by 1.0 at t = 0 and by epsilon at t = T > 0, then (because $\mathrm{log}\parallel u(t)\parallel $ is convex) it is bounded by epsilont/T at 0 < t < T. However, that restored stability can totally fail if the conductivity is allowed to depend also on t. We constructed an example of backward nonuniqueness with a time-dependent conductivity c(x, y, t); the solution can be made to decay at faster and faster exponential rates and to finally vanish to zero at a finite T and thereafter; see Miller (1974).

In 1977 the problem of 2D x-ray tomography attracted our attention. Because of some surprising nonuniqueness results, Smith et al (1977), we initially thought it to be strongly ill-posed and applied our MPEE to it (Miller 1978). That grossly inefficient MPEE expansion in terms of Hermite polynomials was never published, except for that short preliminary report in the 1978 AMS Notices. Because this problem was ill-posed, we also considered stability bounds for the mollification ff = ρf.

Murio applied the 'mollification method' and variants to stabilize many practical inverse problems in his thesis Murio (1981) and thereafter Murio (1993).

It turns out that 2D image reconstruction of f(x, y) from equispaced projections gi (t) is only mildly ill-posed. Here the attempted exact reconstruction of f involves multiplying each 1D Fourier transform ${\hat{g}}_{i}(\xi )$ by the 'ramp filter' ∣ξ∣ before taking the inverse transform and backprojecting. This problem is 'mollifiable'; if one seeks instead to reconstruct the blurred ρf with a Gaussian ρ(x, y) of small radius σ, the ∣ξ∣ above is replaced by $| \xi | \hat{\rho }(| \xi | )$ where $\hat{\rho }(| \xi | )$ is the radially symmetric 2D transform of ρ, which is itself a Gaussian of radius 1/σ. Thus the Gaussian blurring of mollification serves to dampen the amplification by the ∣ξ∣ of spatial high frequency components in the data error. As the mollification radius σ decreases, the amplification of high frequency components in the data error increases.

On the other hand, the problem of backward solution, from data for u(x, 0), of the 1D heat equation ut = uxx on − < x < is easily seen to be strongly ill-posed. The transform $\hat{u}(\xi ,t)$ of u(x, t) satisfies $\hat{u}^{\prime} =-| \xi {| }^{2}\hat{u}$. Hence $\hat{u}(\xi ,t)={e}^{-| \xi {| }^{2}t}\cdot \hat{u}(\xi ,0)$ and going backward to a time t < 0 would involve multiplying the high frequency components of any data error for u(x, 0) by a disastrous ${e}^{+| \xi {| }^{2}| t| }$. This problem is not mollifiable.

Despite the mild ill-posedness of image reconstruction from projections, one should never forget that the problem is ill-posed and that, therefore, one should not attempt to reconstruct the actual pointwise values of f. Methods based on voxels attempt to reconstruct the average value of f on each cubic voxel; this can be considered as seeking the mollified value ρf at each voxel center, where ρ is 1/(Δx)3 times the characteristic function of the voxel. One sometimes reads in the literature the complaint that amplification of data noise increases as the voxel size Δx decreases, and also that for some iterative methods the error seems to grow as the iterations converge toward better and better fit of the approximate model to the data. One should keep in mind that for ill-posed problems it is not sufficient to compute a model which gives a close fit to the data.

We mention that there would be little point in trying to choose the mollification radius of our Gaussian PRF smaller than the size of the detector crystals in the scanner; also, it should be chosen in dependence on the error statistics of our data.

11.  KRNL3D-TOF, backplacements onto cylindrical patches

TOF information allows an approximation which serves to decrease both the computational cost and the streaking-at-a-distance of backprojection. This involves truncation of the backprojection to a cylindrical patch of radius R1t and half-length LG1t about each 'presumed source psrc' from the TOF information. These 'backplacement' truncations are a form of 'confidence weighting' which prevent data errors and streaking from contaminating the image reconstruction in more distant regions.

For our simulations we compute, on-the-fly, for each event

Equation (11.1)

where src is the true source, TOFu is the 'TOF uncertainty', randn is the 'random normal' Octave function which computes a random number with Gaussian distribution of mean 0 and standard deviation 1, and dir is the unit direction vector for the LOR. After considerable experimentation, we choose

Equation (11.2)

In real-life practice, one would assume a similar Gaussian distribution, where TOFu is a known upper bound for the standard deviation of the TOF uncertainty. For our simulations, we usually use TOFu = 10 and R1t = 20. The schematics of figure 6 help to illustrate the reasons for the choices of 1.25 and 2.0 in (11.2). There we have backprojections from Nsrc = 25 sources at the origin. Unfortunately, it would be difficult to illustrate the behavior of 3D computations with (x, y, 0) slices because any backprojection with φ ≈ 0, but not = 0, would eventually dip above or below the (x, y, 0) plane and thus not show its long-distance streaking. Therefore, these four examples are 2D PET examples, obtained by using the 2D kernel H(t) and generating only LOR's with φ = 0.

Figure 6.

Figure 6.  KRNL3D-TOF versus KRNL3D (non-TOF). See section 11. Nsrc = 25 sources at origin, with 2D PET, to explain choice (11.2) for half-length of backplacement cylinders. (A) is non-TOF with equiangular spacing; note disc of near-zero 'overlap cancellation'. (B) is non-TOF with random spacing; much less cancellation, but each backprojection does add to Gaussian peak at origin. (C) is TOF with too-short cylinders; many do not even lap the origin and add to the peak; 'excess positivity' builds up away from the peak. (D) is TOF with good choice (11.2) of truncation lengths. These are contour plots of $\min (.1,\_);$ plateaus have no contour lines and show as white space.

Standard image High-resolution image

With equiangular spacing Δα the positive and negative lobes of H(t) (see figure 3) on neighboring backprojection rays overlap and cancel each other out to almost zero value at points away from the peak. The radius of the zone of near-zero cancellation is proportional to the number Nsrc of backprojection angles. But at radii r with rΔα > 1 that cancellation fails and the nakedly nonoverlapping backprojections produce long-distance 'streaking'. This is well illustrated in figure 6(A). Notice also that at the origin the positive lobes pile up to produce the positive Gaussian peak (normalized to a maximum of 1 in all our figures).

However, when the discrete projection angles are determined randomly that zone of near-zero cancellation does not occur; instead the amplitude of the overlapping streaking tends to fall off like $1/\sqrt{{Nsrc}}$ (because each backprojection has a perpendicular cross-section with zero mean-value). The positive lobes do still pile up at the origin to produce the Gaussian peak; this is illustrated in figure 6(B).

In the first place, we choose the tube truncation radius = 20 and the 1.25 factor by experimenting with TOFu = 0 in (11.2). The resulting f1t (ff of blur radius 1, truncated) seems visually to do as well as the full untruncated ff in producing the desired Gaussian PRK. However, the truncation does introduce a bit of 'excess positivity' to the backprojections since H(t) is asymptotic to −t−2 and its truncated integral is no longer zero but is asymptotic to +(R 1 t)−1. (In the 3D case our kernel K(y1, y2, φ) is an average of these rotated H(t), so is asymptotic to −r−3 with $r={({y}_{1}^{2}+{y}_{2}^{2})}^{1/2}$, and its truncated cross-sectional integral is still asymptotic to +(R 1 t)−1.)

Figure 6(C) illustrates the result if one chooses a too-short half-length LG1t for the cylindrical backplacement patches. There, with R1t = 10 and LG1t = TOFu = 20, we see the errors that occur. Notice that several of the backplacements do not lap the origin at all; therefore they do not add to the buildup of the Gaussian peak at the origin. Moreover, because of their excessive positivity with R1t = 10 they contribute to a buildup of positivity away from the base of the peak. See also the excess positivity in figure 7(D).

Figure 7.

Figure 7.  KRNL3D-TOF with ${\bar{\varphi }}_{\max }=\pi /6$, with '4 cubes' of size 16 × 16 × 4, with (0.5, 1.0, 0.1) × 106 events in 3 visible cubes, 1.0 × 106 in 4th hidden cube. (A)–(C) are f1t with TOFu = 10, with good cylinder radius R1t = 20 and half-length LG 1 t = 45 ala (11.2). (D) is f1t with TOFu = 20 and a too-narrow R1t = 10; note buildup of 'excess positivity'. In (A)–(C) the integral of f1t about each visible cube is ≈ 13.2 × number of events.

Standard image High-resolution image

Figure 6(D) instead illustrates the result if one chooses LG1t with the 2.0 in (11.2). With this '2-sigma' choice, all but 4.5% of the psrc will be close enough to the true src that the backplacement will lap the src by at least a length of 1.25 · R1t. Thus these backplacements do add to the buildup of the peak at the true src.

These considerations about (11.2) carry over from the 2D case to the 3D case because the 3D kernel K(y1, y2, φ) is an average of rotated 2D kernels.

Very large incident angles such as φ = π/3 suffer from problems such as detector efficiency, scatter, etc. We switch henceforth for the examples of this paper to a more reasonable ${\bar{\varphi }}_{\max }=\pi /6$.

Figure 7 returns to the '4 cubes problem' of section 9 and figure 5(C) except with the f1t and TOFu additions of (11.1) and (11.2). Figures 7(A)–(C) are mesh, contour, and plot views with a large Nsrc = 106. Figure 7(D) is with a 'too-small' R1t = 10 and shows 'excess positivity'.

Once again, as in section 9 and figures 5(C), (D), we are able to count the number of events in an isolated blob of events. As there, we first calibrate by computations with Nsrc sources at the origin; with Nsrc = 103 or 104 one gets a Gaussian peak f1t whose volume integral is ≈ 13.2 × Nsrc. We then find that the volume integral of f1t around each of the three visible cubes is ≈ 13.2 × the number of events in the cube. This holds for Nsrc = 106, ..., 103.

12.  Multigrid, tuning for TOF

With 'Multigrid' we tackle the problem, first encountered in section 11, that the tails of the kernel H(t) in 2D and of K(y1, y2, φ) in 3D die out quite slowly. For a given ${\bar{\varphi }}_{\max }$, let K1, K2, K4 denote the 3D backprojection kernels which produce Gaussian PRF's of blur radius 1, 2, 4 and volume integral = 1, with the $\eta (\bar{\varphi })$ of (4.2). Consider the integral formula (2.3) for H(t) with $\hat{\gamma }| \xi | =\exp (-.5| \xi {| }^{2})$. Successive integration by parts yields the asymptotic expansion for large ∣t

Equation (12.1)

Because K1 is an average of rotations of these H(t)'s, it has an asymptotic expansion for each fixed φ in terms of −r−3, −r−5, −r−7, .... Since K2(y1, y2, φ) = (1/8) · K1(0.5y1, 0.5y2, φ), its asymptotic expansion has the same leading term. This is clearly evident in figure 8(A) with (K1, K2, K4) times r3 (left to right), evaluated at K(r, 0, φ), and with the three values of φ = (0.0, 0.5, 0.75) × π/6 (bottom to top). Therefore, for each fixed φ we have that K12 ≡ (K1K2) decays like + r−5 (also evident in graphics of K12 × r5). Truncating K12 at r = R12 then leads to a truncated kernel K12t whose area integral is proportional to −(R12)−3 instead of +(R1t)−1.

Figure 8.

Figure 8.  Multigrid. (A) shows asymptotic −r−3 behavior of kernels K1, K2, K4 for blur radii 1, 2, 4 (left to right) for three values of φ (bottom to top). (B)–(D) show three-level Multigrid for '4 cubes problem' of figure 7, with R12 = 8 and the 'tuned' fac = 0.8200. The f124fill here behaves essentially the same as f1t in figures 7(A)–(C) with its radius R1t = 20; moreover its smaller backplacement cylinders make sharper use of TOF information and involve many fewer backprojection grid points than f1t. (D) suggests that kernel K12t ≡ (K1fac · K2)t which produces f12t here could be used locally to sharpen resolution of an f2 of blur radius 2.

Standard image High-resolution image

Multigrid, with three multigrid levels, then involves the following succession of approximations:

Equation (12.2)

In its first form, in (a) f1 is the desired ff obtained by backprojection using K1; f12 = f1f2 by using K12 = K1K2; f24 using K24 = K2K4; and f4 using K4. In (b) f12t uses the kernel K12t truncated at radius R12, with backplacement on a cylinder of radius R12 and half-length LG12 = 1.25 · R12 + 2.0 · TOFu. The f24t uses K24t, truncated at radius R24 = 2 · R12, with LG24 = 1.25 · R24 + 2.0 · TOFu. The f4 uses the untruncated K4. In (c) f12t is accumulated on a grid with dxx = 1; f24tfill is accumulated on a grid with dxx = 2, then finally filled in to the dxx = 1 grid by trilinear interpolation; f4fill is accumulated on a grid with dxx = 4, then filled in to the dxx = 1 grid by trilinear interpolation.

We applied this first form to the '4 cubes problem' of section 13 and its figure 7 and its ${\bar{\varphi }}_{\max }=\pi /6$. The results, with Nsrc = 106, TOFu = 10, and the choices R12 = 10, LG12 = 32.5, and LG24 = 45, are quite satisfactory, essentially the same as figures 7(A)–(C) with its R1t = 20.

A second form of Multigrid involves recognizing that the main reason for combining K1 and K2 into K12t = (K1K2)t is to get a truncated kernel whose y1, y2 area integrals (and thus the perpendicular cross sections of its backprojections) are ≈ 0, without having to make the cylinder radius R12 very large. Instead, one can accomplish this more directly by making K12t = (K1fac · K2)t where the factor 'fac' is chosen to make its area integrals (for $| \varphi | \lt {\bar{\varphi }}_{\max })\approx 0$. Fortunately, that value of fac seems to be independent of φ and of ${\bar{\varphi }}_{\max };$ for ${\bar{\varphi }}_{\max }=\pi /6$ and R12 = 20, 08, or 05 we get values of fac = 0.9888, 0.9053, or 0.7880. The succession 12.2abc then proceeds as before, except that f12t uses K12t = (K1fac · K2)t, f24t uses K24t = fac · (K2fac · K 4)t, and f4 uses (fac)2 · K4.

We applied this second form to the '4 cubes' problem, with R12 = 20, 8, or 5. Surprisingly, even the case with the extremely small R12 = 5 performs quite well.

A third and tuned form of Multigrid involves tuning the factor 'fac' to make even more precise use of the TOF data, provided that we know an approximate value of the TOF uncertainty parameter TOFu. We do simulations with Nsrc = 103 or 104 events at the origin, with ${\bar{\varphi }}_{\max }=\pi /6$ and a large dxx = 1 3D grid. We notice that f1, f1t, and f124fill all yield a decent approximation to the desired Gaussian near the origin, but produce a hazy cloud of very low-amplitude far-field oscillations. We then compute the 'far-to-near ratio' fton= (integral on r > 6)/(integral on r < 6).

For f1, with any TOFu, that ratio is always ≈ 0. For f1t, with any R1t and any TOFu, that ratio is always positive. For example, with R1t = 20 and TOFu = 10 (as was used in figures 7(A)–(C), one gets the large positive fton ≈ 0.42; it is even much larger for larger TOFu, but goes down to a tolerable 0.02 for TOFu = 3.

For f124fill with the second form of fac, that ratio is always somewhat negative; for example, with R12 = 8, TOFu = 10, and fac = 0.9035, one gets fton ≈ −0.12. However, with fac = 0, one would have f124 = f1t with its positive ratio; hence, there exists an intermediate value with an approximately zero ratio. We therefore home in to find such a 'tuned' value of fac such that ∣fton∣ < 0.01.

For ${\bar{\varphi }}_{\max }=\pi /6$, R12 = 5, and TOFu = 10, our tuned value is fac = 0.7110. For ${\bar{\varphi }}_{\max }=\pi /6$, R12 = 8, and TOFu = 10 or 3, our tuned values are fac = 0.8200 or 0.6120, which are used in figures 8 and 9.

Figure 9.

Figure 9.  KRNL3D versus Multigrid versus KRNL3D-TOF. Four comparisons on a 199 × 7 × 1 grid, with Nsrc events in an 8 × 8 × 32 cube, with Nsrc = 105 or 106 and TOF uncertainty of 10 or 3. From the bottom in each are plots of $\min (0.1,\underline{\ \ })$ for f1 (KRNL3D untruncated), f124fill (Multigrid with R12 = 8 and 'tuned' fac) and f1t (KRNL3D-TOF with R1t = 20). The streaking range (which limits contamination of more distant regions) is smallest for f124fill. This more precise use of TOF data becomes particularly useful as researchers decrease the TOF uncertainty in newer generation scanners.

Standard image High-resolution image

In figure 8(D) we also show the f12t from this same computation. This figure suggests that our kernel K12t could be used locally, with the same list of events, to sharpen the resolution of a previously computed f2 of blur radius 2 and grid size dxx = 2.

This f12t might be useful in its own right (without adding on f2) to reveal internal edges of organs or lesions. Moreover, f12t should be much less affected (than f1 or f2) by the largely featureless cloud of false LORs and their psrcs from randoms and scatters. Thus, corrections for those data contaminants could concentrate on f2.

We mention that the truncations involving R12, LG12, R24, and LG24 can be 'feathered' by replacing the 'sign' function by the 'erf' function. This we include as an option in our Octave codes and they then produce essentially unchanged results.

We mention also that (x, 0, z) slices of similar '4 cubes' computations (replacing the role of y with z) show the same good results. Trials with Nsrc reduced from 106 to 105 also show quite good results. Likewise with TOFu changed from 10 to 5 or 20.

This Multigrid approach for PET is reminiscent of the highly successful multigrid methods for accelerating iterative solution of finite-difference equations for elliptic (diffusion-dominated) PDEs. There, a few local iterations on a fine mesh quickly damp out the short Fourier wave length components of the error; the residual is then passed to a coarser mesh where local iterations quickly damp out somewhat longer wavelength components of the error; etc. One finally reaches a mesh sufficiently coarse that it is efficient to solve the residual equations by direct methods such as Gaussian elimination. See Briggs et al (2000).

The situation here with our three-level Multigrid for PET is similar. We reach a final f4 on a dxx = 4 grid which is much much cheaper to compute than is f1 on a dxx = 1 grid. It involves many fewer grid points. Its highly blurred f4 requires fewer events to sweep oscillations out to and is much more susceptible to 'clustering' of events, as will be discussed in section 15. Of course, one could increase from three Multigrid levels to four, computing f1248fill with a final extremely blurred f8 to be computed on a dxx = 8 grid, etc.

13. Relation of KRNL3D and KRNL3D-TOF to previous analytical methods

The field of 3D PET image reconstruction is very large and active. Several excellent review articles on the current state of the field focus mostly on iterative methods and on needed corrections to the data (attenuation, scatter, random events, detector efficiency, noncolinearity, patient motion, ...). See Budinger et al (1978), Ter-Pogossian et al (1981), Tomitani (1981), Budinger (1983), Kao (2008), Li et al (2015), Yu et al (2015), Li et al (2016), Cherry et al (2018), Zhang et al (2018), Zhang et al (2020), Tong et al (2010), Iriarte et al (2016), Gullberg et al (2010), Surti and Karp (2016), Zhang et al (2017), pp. 2465–2485), Jones and Townsend (2017), see their section 6 on reconstruction algorithms), and Vandenberghe et al (2016).

Papers on analytical (noniterative) 3D methods from the early days of fully 3D PET include Cho and Burger (1977), Colsher (1980), Snyder et al (1981), Ra et al (1982), Cho et al (1983), Budinger and Gullberg (1974), Huesman et al (1974), Kinahan et al (1988), Defrise et al (1989), Defrise et al (1993), and (van Velden et al 2008). Also from those early days is the oft-cited 3DRP method of Kinahan and Rogers (1989) which involves two passes of an analytic FBP step with an intermediate reprojection step in order to use all detected events with medium-length scanners. Later papers include Defrise et al (1995) and Defrise et al (1997), Defrise et al (2005), Erlandsson et al (2006), and Matej et al (2016). Similar to 3DRP, but working in Fourier domain, is the 3D-FRP method of Matej and Lewitt (2001).

The method which seems most pertinent to KRNL3D is the true three-dimensional reconstruction method (TTR) of Ra et al (1982). They also seek to fully utilize projection data from all directions (θ, φ) in a 'truncated sphere' (our ${\bar{\varphi }}_{\max }$-spherical band). We paraphrase their exposition into our terminology. For each great circle Γ in that truncated sphere they note that one can use 2D FBP (filtered backprojection) on the projection data sets g(y1, y2, θ, φ), with all the directions (θ, φ) in Γ, to get a formula for f(x, y, z). Their final formula for f is a weighted average of these formulas over all such great circles, weighted by 'the number of slice orientations which commonly share the same projection data set' (see their p. 42); this is perhaps a discrete version of our choice (4.2) for the arbitrary weight $\eta (\bar{\varphi })$. Since the same data set g(y1, y2, θ, φ) is used by all Γ passing through (θ, φ), they realize that this weighted average can be realized in Fourier-space by taking that weighted average of all those rotated ramp filters to yield a 'composite filter'.

There are many similarities between TTR and KRNL3D, but the notation and terminology are quite different. Our β-tilted ${H}_{\beta }=H(t)\cdot \exp (-{s}^{2}/2)$ in our (2.5), the kernel for backprojection along a single great circle Γ, corresponds roughly to their $h(x^{\prime} ,z^{\prime} )$ in their (4), (5). Two differences are that their h is missing the $\exp (-{s}^{2}/2)$ factor and is also based on the β-rotated ramp filter $| {\omega }_{x^{\prime} }| $ which seeks to reconstruct the unblurred f. In fact, in use with discrete F-transforms their filter, see their figure 5, has to be truncated at the Nyquist cutoff frequency, i.e. it is multiplied by a rectangular 'apodizing window' (a 'blurring filter' in our terminology of (2.3)) which leads to Gibbs-like oscillations. Even if the rectangular window is replaced by a smoother blurring window such as Gaussian or Hamming the blurring occurs only in the 2 directions parallel to the great circle, with no blurring in the perpendicular 's' direction. It is not clear to the author whether their h takes account of the 'width factor ${wf}(\bar{\varphi })$' of our (3.2).

Watson (2007) sets up a 2D TOF-PET framework for simulating and evaluating five different TOF filters with respect to their ability to decrease image noise variance. He finds that the CW (confidence weighted) filters yield the best results. These filters operate only in the TOF direction; in the perpendicular direction, they use the ramp filter, attempting to reconstruct the unblurred unknown image f. But, of course, the intricacies of the discrete-${ \mathcal F }$-transform implementation does lead to blurring in that perpendicular direction also. We wonder whether the resulting method leads to a positive point-response function.

Our development in sections 24 of the KRNL3D kernel is in many respects similar to the work of Kinahan et al (1988). Details are quite different, but the intent is the same. We develop a tabulated kernel K(y1, y2, φ) with compressed coordinates to be precomputed and used for backprojections of all events with $| \varphi | \lt {\bar{\varphi }}_{\max }$. They develop a tabulated kernel (which we will rename as W) with equispaced coordinates.

Their trials involve a 41 × 41 × 7 table W, with ∣φ∣ < 10°, used to reconstruct a uniform density sphere f(x, y, z) on a cubic grid of 40 × 40 × 40 voxels. They specifically start to construct their W with the Dirac delta-function as their PRF; but they then average W over each pixel and then apply a low-pass filter to the resulting W. Their results, using complete line integral data sets from 60 × 7 (θ, φ) directions, show that the sphere is faithfully reproduced except for a 17% undershoot near the z-axis just outside the sphere's radius. They conjecture that this artifact is due to an error in computing a certain unspecified numerical integral. We note that such an undershoot is what would be expected from a PRF such as that depicted in figure 5(B).

Comparing these two methods with their K and W kernels shows the advantage of seeking a smoothly-blurred PRF (in order to permit good tabular resolution of the essential positive and negative lobes near the origin) and of a compressed tabular grid as in (6.1) (to permit resolution of the kernel for really large radii).

Defrise et al (1995) consider analytic methods in great generality. Their section 3 on 'nontruncated parallel beam geometry' is particularly relevant to our present work. Their section 3.1 describes 'filtered-backprojection' FBP in general. Our kernel K would be one example of their general (but not specific) kernel h. Their section 3.4 describes 'backprojection-filter' BPF in general. Note that the BP step requires voxels on which to accumulate the infinitely thin backprojection rays. The convolutions in both FBP and BPF are of course usually performed in Fourier-transform space.

With list-mode data the backprojections in BPF can be formed rather cheaply event-by-event. However, with FBP the slowly decaying tail of any possible kernel h makes event-by-event FBP extremely expensive. They thus conclude, 'With list-mode data therefore, the BPF approach is the only practical one.' In section 15 we discuss how our 'backplacements' onto cylinders for KRNL3D-TOF or for Multigrid should greatly decrease the computational cost of the event-by-event FBP approach.

14.  KRNL3D versus Multigrid versus KRNL3D-TOF

We have applied KRNL3D, KRNL3D-TOF with R1t = 20, and Multigrid with R12 = 5 or 8 to the '4 cubes problem' of figures 5, 7, and 8 . The resulting f1, f1t, and f124fill are visually almost indistinguishable from each other, except for the low amplitude streaking. These were moderately sized problems on a 77 × 77 × 1 grid. The major difference in these methods lies in their far-field streaking; for this comparison we now turn to larger grids. We compare them with respect to their ability to limit streaking-at-a-distance and thereby prevent noise and streaking in one region from contaminating the reconstruction in more distant regions. This becomes especially important as researchers manage to decrease the TOF uncertainty in a newer generation of scanners. See for example the 127 picosecond coincidence timing resolution of Xie et al (2019).

In figure 9 we consider four trials with ${\bar{\varphi }}_{\max }=\pi /6$, with dxx = 1 on a 199 × 7 × 1 grid, with Nsrc events in an 8 × 8 × 32 cube, with medium or high event counts (Nsrc = 105 or 106) and with standard or aspirational TOF uncertainties (TOFu = 10 or 3). To better compare the streaking-at-a-distance we plot seven x-slices from the (x, y, 0) slice, with y = −3, −2, ..., +3. From bottom to top in each are the three reconstructions: with f1 (KRNL3D untruncated); with f124fill (Multigrid with the tuned values fac = 0.8200 or 0.6120, f12t truncated at R12 = 8, f24t truncated at R24 = 16, and f4 untruncated); and with f1t (KRNL3D-TOF truncated at R1t = 20).

Notice that the streaking range for f124fill is considerably less than that of f1t in all cases (A)-(D), but especially so in figure (C) and (D) with the small TOFu = 3. Remember that this streaking is really a 3D spray; thus if the streaking range is doubled then it affects 8 times the volume in the surrounding 3D reconstruction grid.

15. Computational demands, possible 'clustering' of LORs

Analytic 3D methods are inherently extremely computationally intensive. This is especially true of the backprojections or backplacements of this paper, one-event-at-a-time from an events list. The KRNL3D computation for figures 5(C), (D), as vectorized (but otherwise highly nonoptimized) in Octave on our DELL Optiplex 7050 SFF quad-core solid state drive computer required 1.7 cpu hours. This involved backprojection of 2.60 × 106 events onto a 77 × 77 × 11 grid. The KRNL3D-TOF computation of figures 7(A)–(C) required 2.5 cpu hours; however, the potential computational advantage of the radius R1t = 20 backplacements was not taken advantage of. Instead the truncated kernel K1t with all its zeros was backplaced onto the entire grid (as was the case for all the examples of this paper.

The author believes that event-by-event backplacement would yield the most precise use of the available LOR and TOF data. One way to make these more efficient would be to parallellize them on GPUs with 16 bit half-precision arithmetic and with a smaller TBLK. In this paper we have used a 100 × 100 × 40 table, but computations reveal that a 25 × 25 × 11 table is quite adequate. The 25 × 25 φ-slices (and probably the whole TBLK) would then fit in L1 cache for very fast table lookup. Note also that the highly local nature of these event-by-event backplacements would allow its use in specialized local reconstruction for a few specific organs (liver, lungs, brain, ...).

A second strategy for reducing the computational demands is possible 'clustering' or grouping of LORs for reconstruction on large grids, especially in high event count situations. Note that with a truly wide aperture ${\bar{\varphi }}_{\max }$ situation such as that shown in figure 1 there will be many more possible detector pairs (v1, v2) and $({\bar{v}}_{1},{\bar{v}}_{2})$ than there are events, and many detector pairs will record few or no events. LORs with nearly equal (θ, φ) directions and nearly equal psrc positions could be handled with a single backplacement. By 'nearly equal' we mean 'within a certain tolerance'; the TOF information would help greatly to relax those tolerances.

There could be even further clustering of the data. Matej et al (2009) developed the DIRECT method (Direct Image Reconstruction for TOF) which involves (1) angular grouping of TOF events to a set of views as given by the angular sampling requirements of TOF, and (2) deposition of the grouped events and correction data into a set of 'histo-images', one per view. This 2009 paper involved iterative reconstruction. That DIRECT data partitioning framework was extended by Matej et al (2016) to an analytic TOF PET reconstruction; they note that TOF reduces the number of views needed by an order of magnitude. More recently Zhang et al (2020) employed an extreme such data grouping for the 2 meter total-body EXPLORER and iterative reconstruction.

16. Some potential advantages of the Multigrid or KRNL3D-TOF approach

The method employs the data as it is actually presented to us by Nature, one coincidence event at a time, from random sources within the subject's body and from random directions (θ, φ) equidistributed with respect to solid angle.

K(y1, y2, φ) is a smooth function of its variables, and, under the compression (6.1), an even smoother function of the table index variables (u1, u2, u3) in [0, 1]3. Thus it is susceptible to accurate table lookup. The tabulated values can be considered to be essentially exact because they are computed accurately once-and-for-all, using accurate numerical integration, using change of variables to integrate past singularities as $\bar{\varphi }$ approaches 0 or ${\bar{\varphi }}_{\max }$, using the asymptotic expansion of H(t) in powers of t−2 for large t, .... All the mollification by the 3D Gaussian ρ, and all the possible singularities and numerical difficulties are smoothed out in the zealously-accurate once-and-for-all computation of K.

The backprojections using this K are extremely local. One can backproject onto any desired array of reconstruction points (x, y, z), for example onto a discretized cubic or tetrahedral grid covering the liver, with arbitrarily large or small grid size.

The local nature of KRNL3D just discussed involves the ability to chose any 'blob' of (x, y, z) reconstruction points. But it seems that this localness extends also to the collection of LOR's needed in reconstructing that blob, provided that the event count is sufficiently high. Simulations suggest that one can ignore all LORs whose grazing distance from the blob is ≥ 3.

The backprojections using this K are extremely simple. The corrections to the data for detector insensitivity, for attenuation on the LOR, etc, are incorporated into the Q data correction factors of section 5 before backprojection by (1/Q)K. Single precision, or even half-precision, should be adequate.

The feature of sections 9 and 11, with their figures 5(C), (D) and 7(A)–(C), counting emission events in an isolated blob, should be quite useful for dynamic (time dependent) studies. This desire to efficiently obtain certain spatially integrated values over a certain blob (organ, liver), and their statistics, despite a very low count of events, is more fully handled by Huesman's method (Huesman 1984) as applied to standard iterative methods.

The fact that computation of the K table, and backprojection using K, remain in the physical domain (with physical variables t, y1, y2, φ, x, y, z, etc), rather than in the Fourier domain (with variables on a discretized uniform spacing mesh), means that one can use compressed variables which concentrate grid points into regions of high variation of K(y1, y2, φ) and which efficiently represent the −t−2 asymptotic tail of H(t) or the asymptotic decay of K at large values of y1 and y2.

Because there are no reprojections, as in some iterative or analytical methods, there is no need to accurately model the parallax in such a reprojection step. Of course, one would need to understand how such parallax adds further blurring to the reconstructed image.

17. Conclusions and needed further work

In section 2 we briefly reexamined the filtered backprojection method FBP of 2D tomographic image reconstruction and emphasized that the 'blurring filter' there should be the radially symmetric 2D Fourier transform of the desired positive radially symmetric 2D PRF; see (2.3) and figure 3.

The kernel K(y1, y2, φ) of KRNL3D is the analog, for wide aperture 3D tomography, of the backprojection kernel H(t) for the Gaussian-blurred inverse Radon transform of 2D tomography. It allows one to fully utilize all projection data from very large oblique angles $| \varphi | \lt {\bar{\varphi }}_{\max }$, with ${\bar{\varphi }}_{\max }$ set at any value in (0, π/2).

Multigrid and KRNL3D-TOF restrict backprojection to a cylindrical patch about the TOF presumed source psrc of each event. These 'backplacements' prevent noise and streaking in one region from contaminating the reconstruction in more distant regions; they should also greatly decrease the computational cost of backprojection. They also can be employed with continuous-bed-motion CBM for medium-length scanners; two such methods are included in our github codes and will be discussed in a short future paper.

The advantages discussed in section 16 should be highly useful in dynamic studies. Moreover, these attributes should improve the detection of lesions in a clinical patient.

Much 'further work' needs to be done before KRNL3D could become a usable PET tool. First, there are simulations that should be done; these include:

  • The backprojections and backplacements should be parallelized, preferably at half-precision on GPUs. In particular, an efficient implementation is needed for the backplacements on cylinders. In the farther future, those backplacements might be done on inexpensive specially faricated chips, each containing very many simple arithmetic units to perform the simple half-precision arithmetic and lookups required.
  • As mentioned in section 9, there are no data errors, except for insufficient stochastic sampling, in our simulated examples. The method with its 'probability of detection' Q factors of section 5 should be tested against realistic models of statistical data errors for attenuation and for efficiency of the detectors as a function of incident angle. Simulation should introduce data correction for randoms and scatter.
  • The method should be compared, with a given number Nsrc of events, for accuracy and for computational efficiency, against well-established analytic and iterative reconstruction methods.
  • Our trial computations have assumed no granularity of the detector banks and no gaps between detector modules. How does the granularity produce further blurring, combined with that of our Gaussian-blurred PRF, and what blurring radius should be chosen for our PRF to deal with that granularity and with the statistics of the data errors? The problem of the detector module z-gaps such as occur in the EXPLORER scanner could be mitigated by slight z-translations of the subject bed (super sampling) with further correction by comparing local volume integrals of computed ff for point sources placed at various z and radial positions near the gap.

Second, the method should be tested on real-life data from currently available large axial length scanners, and also on medium-length scanners with CBM. This would require that the method be integrated with the geometry, list-mode data format, Q corrections to the data, ...for the particular machine.

This 'further work' will not be done by the author. He has been retired since 2005 and intends to return to that retirement. Instead the Octave codes for most of the trial computations of this paper are being made freely available, in the hopes that other interested researchers will carry on with this KRNL3D approach. These codes are at https://github.com/keithmillerberkeley/PET-codes.

Glossary

H(t)—the kernel, for Gaussian-blurred 2D filtered backprojection, whose backprojections from all angles θ in (0, 2π) produce the 2D PRF $\gamma \sim \exp (-{r}^{2}/2)$. See (2.3) and figure 3.

(θ, φ)—longitude and latitude (from equator) of a direction on the celestial sphere; see figure 2.

$\bar{\varphi }$-band—the set of directions on that sphere with $| \varphi | \lt \bar{\varphi }$.

$\bar{\varphi }$—the high point $(\bar{\theta },\bar{\varphi })$ of each of the great circles ${{\rm{\Gamma }}}_{\bar{\theta },\bar{\varphi }}$ which fill up that $\bar{\varphi }$-band. See figure 2.

${\bar{\varphi }}_{\max }$—we use projection data from all directions (θ, φ) with $| \varphi | \lt {\bar{\varphi }}_{\max }$.

(y1, y2)—coordinates on the imagined 2D detector/backprojection screen at each direction (θ, φ). See figure 2.

PRF—the desired point response function of our mathematical reconstruction algorithm when it is given exact data for a sufficient number of events concentrated at a single point. This is in contrast to the point spread function PSF, sometimes discussed in the literature, which deals with the spread in such response due to parallax thickening of the LORs from the nonzero size of the detector elements in a particular scanner; see for example Matej et al (2016) and Schmall et al (2016).

K(y1, y2, φ)—the kernel function whose backprojections from all directions (θ, φ) in the ${\bar{\varphi }}_{\max }$-band, when integrated with respect to solid angle over that band, produce the desired 3D Gaussian PRF $\rho \sim \exp (-{r}^{2}/2)$. See sections 24.

f(x, y, z)—concentration of a positron emitting radioisotope accumulated in the body of the patient. We wish to reconstruct it from the line integral projection data g(y1, y2, θ, φ) at all directions (θ, φ) with $| \varphi | \lt {\bar{\varphi }}_{\max }$. In PET that data arrives randomly one event at a time.

Mollification—we stabilize this mildly ill-posed problem by seeking to reconstruct not the exact f but instead its slightly blurred 'mollification' ff = ρf. See section 10.

TBLK—the 3D table of K(y1, y2, φ) values, with the (∣y1∣, ∣y2∣, ∣φ∣) compressed into variables (u1, u2, u3) in [0, 1]3. See section 6.

Nsrc—number of sources with $| \varphi | \lt {\bar{\varphi }}_{\max }$ in a particular cube or blob, for our simulations.

Q—the probability of detection for any annihilation event whose LOR should touch the detector 'tiles' at both detectors (v1, v2) and $({\bar{v}}_{1},{\bar{v}}_{2})$. It is < 1 because of attenuation, detector sensitivity and geometry, .... We thus backproject not K but (1/Q)K for each actually detected event. See section 5 and figure 1.

psrc—the TOF presumed source of the event along the LOR. See figure 1.

backplacement—for KRNL3D-TOF and Multigrid, we truncate backprojections to a cylinder centered on psrc.

R1t, LG1t—cylinder dimensions in KRNL3D-TOF for computing f1t by backplacement with the truncated kernel K1t. See section 11.

R12, LG12, fac—cylinder dimensions and tuning parameter in Multigrid for computing f12t by backplacement with the truncated kernel K12t = (K1fac · K2)t. See section 12.

Acknowledgments

I am greatly indebted to Thomas Budinger for numerous useful discussions, for his extensive knowledge of the field, for much helpful editing, and for pressing me to look into adding TOF information to KRNL3D. I thank Grant Gullberg and Jinyi Qi for many suggested references. I am indebted to the two reviewers of a previous version of this work for suggestions to improve its exposition, for suggesting leads towards more modern references on analytic reconstruction, and for in one instance suggesting correctly that my initial approach to using TOF data might lead to systematic errors. Many thanks to Faye Yeager for quick and accurate technical typing of many not-so-legible versions of this manuscript.

Please wait… references are loading.