Single projection driven real-time multi-contrast (SPIDERM) MR imaging using pre-learned spatial subspace and linear transformation

Pei Han; Junzhou Chen; Jiayu Xiao; Fei Han; Zhehao Hu; Wensha Yang; Minsong Cao; Diane C Ling; Debiao Li; Anthony G Christodoulou; Zhaoyang Fan

doi:10.1088/1361-6560/ac783e

1. Introduction

Image-guided radiation therapy (IGRT) is a technology that incorporates frequent imaging during the course of radiation therapy. It can improve the accuracy and precision of dose delivery and allows adaptive radiotherapy to account for temporal variations of the tumor in, for example, shape, volume, and location (Dawson and Jaffray 2007, De Los Santos et al 2013). This is particularly important for the abdominal site, which is often subject to breathing motion and filling effects. In recent years, MR-guided radiation therapy (MRgRT) has gained growing interest since the introduction of MR-Linac that integrates an MR scanner and a medical linear accelerator into one system (Mutic and Dempsey 2014, Raaymakers et al 2009, 2017). For abdominal external beam RT, on-board MR imaging during daily treatment provides unique advantages over the conventional cone-beam CT available on routine Linac systems: (a) superior visualization of the tumor and many organs-at-risk (OARs) based on versatile soft-tissue contrast, (b) real-time tomographic images for target tracking that permits respiratory-gated dose delivery, and (c) no ionizing radiation exposure (Otazo et al 2020). To achieve reasonable spatiotemporal resolution, tumor tracking with commercial MR-Linac systems is currently limited to real-time 2D imaging that continuously acquires a single slice or 2–3 orthogonal slices (Fast et al 2019, Witt et al 2020). However, due to potentially complicated motion trajectories in the abdomen, real-time 3D (volumetric) imaging is more desirable for precision medicine.

Different methods have been proposed for fast real-time 3D imaging in the context of MRgRT. The motion-model based approach is commonly used, in which a high-quality 3D reference image is first acquired, and motion fields (often called deformation vector fields, or DVFs) are estimated from continuous k-space acquisitions and then applied to the reference to generate real-time images (Stemkens et al 2016, Otazo et al 2020, Huttinga et al 2020, 2021). To acquire the motion-free reference image, breath-holding is needed and thus poses a restriction on achievable spatial resolution. Also, an inverse problem needs to be solved online to estimate the motion field, which limits the temporal resolution or latency in real-time imaging. Another class of approaches exploits artificial intelligence to reconstruct real-time images from highly undersampled k-space data. A patient-specific model based on, for example, principal component analysis (Dietz et al 2017) or convolutional neural network (Dietz et al 2019), can be trained prior to treatment. The undersampling factor, however, is very limited with existing techniques, which makes it difficult to achieve high temporal resolution. Recently, a signature matching technique was proposed whereby real-time images are selected from an image dictionary (Feng et al 2020). The dictionary is built through a 4D-MR pre-scan that generates a group of 3D images based on respiratory amplitude binning of k-space data. Despite superior image quality and latency, there is a limited number of bins in the dictionary to choose from. More importantly, all methods above rely on continuous steady-state acquisitions for reference or template images, thus resulting in one single and fixed tissue contrast weighting (commonly T1-weighting) in the final real-time images. This contrast weighting is unsuitable for some tumor targets (Zhang et al 2018).

In this work, we developed a novel technique named Single ProjectIon DrivEn Real-time Multi-contrast (SPIDERM) MR to provide real-time 3D images with flexible contrast weightings and a low latency. SPIDERM exploits the separability of spatial and dynamic information in a low-rank/partial separability model (Liang 2007). Briefly, a 'prep' scan, also serving as a pre-beam simulation scan, is first performed to learn a subject-specific model (a spatial subspace and a linear transformation from navigator data to subspace coordinates). A 'live' scan for beam-on real-time imaging is then performed by repeatedly acquiring the central k-space line only to dynamically determine subspace coordinates and generate 3D multi-contrast images on the fly utilizing the pre-learned model. We demonstrated its technical feasibility on a digital phantom and volunteers.

2. Theory

2.1. Spatiotemporal decomposition

As in Liang (2007), a 4D image $I\left({\bf{x}},t\right)$ can be modeled as low-rank using partially separable functions. Using a matrix expression, an image of the following form

$\begin{eqnarray}{\bf{A}}=\left[\begin{array}{ccc}a\left({{\bf{x}}}_{1},{t}_{1}\right) & \cdots & a\left({{\bf{x}}}_{1},{t}_{{N}_{t}}\right)\\ \vdots & \ddots & \vdots \\ a\left({{\bf{x}}}_{J},{t}_{1}\right) & \cdots & a\left({{\bf{x}}}_{J},{t}_{{N}_{t}}\right)\end{array}\right]\end{eqnarray} \tag{ 1 }$

can be decomposed as

$\begin{eqnarray}{\bf{A}}={{\bf{U}}}_{{\bf{x}}}{{\boldsymbol{\Phi }}}_{{\rm{rt}}}=\left[\begin{array}{ccc}{{\boldsymbol{u}}}_{1} & \cdots & {{\boldsymbol{u}}}_{L}\end{array}\right]\left[\begin{array}{ccc}{{\boldsymbol{\phi }}}_{1} & \cdots & {{\boldsymbol{\phi }}}_{{N}_{t}}\end{array}\right],\end{eqnarray} \tag{ 2 }$

where $J$ is the total voxel number, ${N}_{t}$ is the number of time points (or k-space lines); ${{\bf{U}}}_{{\bf{x}}}\in {{\mathbb{C}}}^{J\times L}$ contains $L$ spatial basis functions, and ${{{\boldsymbol{\Phi }}}_{{\rm{rt}}}\in {\mathbb{C}}}^{L\times {N}_{t}}$ contains real-time temporal weighting functions (depicting relaxation, motion, contrast changes, etc). At a specific time point $t={t}_{s},$ the real-time image ${{\boldsymbol{a}}}_{{t}_{s}}$ is a linear combination of the spatial basis functions, weighted by a vector ${{\boldsymbol{\phi }}}_{{t}_{s}}={\left[\begin{array}{ccc}{\phi }_{1,{t}_{s}} & \cdots & {\phi }_{L,{t}_{s}}\end{array}\right]}^{{\rm{T}}}:$

$\begin{eqnarray}&&{{\boldsymbol{a}}}_{{t}_{s}}=\left[\begin{array}{c}a\left({{\bf{x}}}_{1},{t}_{s}\right)\\ \vdots \\ a\left({{\bf{x}}}_{J},{t}_{s}\right)\end{array}\right]={\phi }_{1,{t}_{s}}{{\boldsymbol{u}}}_{1}+\ldots +{\phi }_{L,{t}_{s}}{{\boldsymbol{u}}}_{L}=\displaystyle \sum _{i=1}^{L}{\phi }_{i,{t}_{s}}{{\boldsymbol{u}}}_{i}.\end{eqnarray} \tag{ 3 }$

In practice, ${\bf{A}}$ as a whole can be reconstructed by recovering ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ and ${{\bf{U}}}_{{\bf{x}}}$ in a two-step approach (Liang 2007, Pedersen et al 2009, Christodoulou et al 2014, Biswas et al 2015). Typically, temporal weighting functions ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ are first recovered using only 'navigator data' ${{\bf{D}}}_{{\rm{nav}}}\in {{\mathbb{C}}}^{M\times {N}_{{\rm{nav}}}},$ i.e. the central k-space line which is frequently sampled in time, where $M$ is the number of points sampled per k-space line, and ${N}_{{\rm{nav}}}$ is the total number of navigator lines. ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ is often extracted by calculating the singular value decomposition of the frequently sampled navigator data ${{\bf{D}}}_{{\rm{nav}}}$ and selecting the $L$ most significant right singular vectors.
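As a rough illustration of the SVD step described above, the following NumPy sketch builds a synthetic low-rank navigator matrix and keeps the $L$ most significant right singular vectors as the temporal basis. The matrix sizes, the random data, and all variable names are assumptions made for this illustration and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N_nav, L = 64, 500, 4  # assumed sizes: samples per line, navigator lines, model rank


def crandn(*shape):
    """Complex Gaussian samples (illustrative stand-in for measured data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Synthetic navigator matrix D_nav (M x N_nav): rank-L signal plus weak noise.
D_nav = crandn(M, L) @ crandn(L, N_nav) + 1e-3 * crandn(M, N_nav)

# Economy SVD: D_nav = W @ diag(s) @ Vh, singular values in descending order.
W, s, Vh = np.linalg.svd(D_nav, full_matrices=False)

# Rows of Vh are the conjugate-transposed right singular vectors;
# the first L of them form the temporal basis Phi_rt (L x N_nav).
Phi_rt = Vh[:L, :]

# Fraction of the navigator signal energy captured by the rank-L model.
energy = np.sum(s[:L] ** 2) / np.sum(s ** 2)
print(Phi_rt.shape, float(energy))
```

In this sketch the rows of `Phi_rt` are orthonormal by construction, and the captured energy fraction gives one possible (assumed) criterion for choosing $L$.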

Spatial basis functions ${{\bf{U}}}_{{\bf{x}}}$ are then reconstructed by solving the following problem:

$\begin{eqnarray}&&{\hat{{\bf{U}}}}_{{\bf{x}}}={{\rm{argmin}}}_{{{\bf{U}}}_{{\bf{x}}}}{\parallel {{\bf{D}}}_{{\rm{im}}}-{\rm{\Omega }}\left({\bf{E}}{{\bf{U}}}_{{\bf{x}}}{{\boldsymbol{\Phi }}}_{{\rm{rt}}}\right)\parallel }_{F}^{2}+\lambda R\left({{\bf{U}}}_{{\bf{x}}}\right),\end{eqnarray} \tag{ 4 }$

where ${{\bf{D}}}_{{\rm{im}}}\in {{\mathbb{C}}}^{M\times {N}_{{\rm{im}}}}$ denotes the 'imaging data', which is acquired from the entire k-space with sparse sampling schemes, such as randomized Cartesian or golden-angle radial trajectories ( ${N}_{{\rm{im}}}$ is the total number of imaging lines, ${N}_{{\rm{nav}}}+{N}_{{\rm{im}}}={N}_{t}$ ), ${\bf{E}}$ is the signal encoding operator, ${\rm{\Omega }}$ is the (k-t)-space undersampling operator, and $R$ is a regularization functional to exploit compressed sensing. Both steps of this reconstruction process are non-causal, and are therefore appropriate for a 'prep' scan but not a real-time, on-the-fly 'live' scan.

2.2. Image generation using pre-learned spatial subspace and linear transformation

Given ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ extracted from the right singular vectors of ${{\bf{D}}}_{{\rm{nav}}},$ there exists a linear transformation ${\bf{T}}$ that maps ${{\bf{D}}}_{{\rm{nav}}}$ to ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}.$ For an individual time point ${t}={t}_{s},$ the navigator data ${{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}{\in {\mathbb{C}}}^{M\times 1}$ can therefore be transformed into ${{\boldsymbol{\phi }}}_{{t}_{s}}$ with ${{\boldsymbol{\phi }}}_{{t}_{s}}={\bf{T}}{{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}.$ Accordingly, the entire 3D image at $t={t}_{s}$ can be generated with a simple matrix multiplication:

$\begin{eqnarray}&&{{\boldsymbol{a}}}_{{t}_{s}}={{\bf{U}}}_{{\bf{x}}}{{\boldsymbol{\phi }}}_{{t}_{s}}={{\bf{U}}}_{{\bf{x}}}{\bf{T}}{{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}.\end{eqnarray} \tag{ 5 }$

For successive scans using the same sequence, ${{\bf{U}}}_{{\bf{x}}}$ and ${\bf{T}}$ are assumed to remain constant throughout the acquisition process, unless abrupt body motion or unexpectedly introduced contrast mechanisms force the new images outside the range of ${{\bf{U}}}_{{\bf{x}}}.$ Therefore, we developed the SPIDERM technique based on the constant nature of ${{\bf{U}}}_{{\bf{x}}}$ and ${\bf{T}}.$ The 'prep' scan is first applied to learn ${{\bf{U}}}_{{\bf{x}}}$ and ${\bf{T}}.$ Specifically, ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ and ${{\bf{U}}}_{{\bf{x}}}$ are reconstructed in the two-step approach, and ${\bf{T}}$ is calculated as:

$\begin{eqnarray}&&{\bf{T}}={{\boldsymbol{\Phi }}}_{{\rm{rt}}}{{\bf{D}}}_{{\rm{nav}}}^{+},\end{eqnarray} \tag{ 6 }$

where ${{\bf{D}}}_{{\rm{nav}}}^{+}$ is the pseudo-inverse of ${{\bf{D}}}_{{\rm{nav}}}.$ Afterwards, the 'live' scan is performed: only ${{\boldsymbol{d}}}_{{\rm{nav}}}$ is acquired, and real-time 3D images can be generated on the fly by applying the matrix multiplication process according to equation (5) (figure 1).
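Equations (5) and (6) can be sketched in a few lines of NumPy. The example below is a minimal illustration on synthetic data, not the authors' implementation: the sizes, the random stand-ins for ${{\bf{U}}}_{{\bf{x}}}$ and ${{\bf{D}}}_{{\rm{nav}}}$, and the helper name `live_image` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
J, M, N_nav, L = 200, 32, 300, 3  # assumed sizes: voxels, samples per line, navigators, rank


def crandn(*shape):
    """Complex Gaussian samples (illustrative stand-in for measured data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# --- 'prep' stage (all quantities are synthetic stand-ins) ---
U_x = crandn(J, L)                                    # spatial basis, J x L
D_nav = crandn(M, L) @ crandn(L, N_nav) + 1e-2 * crandn(M, N_nav)
_, _, Vh = np.linalg.svd(D_nav, full_matrices=False)
Phi_rt = Vh[:L, :]                                    # temporal basis, L x N_nav

# Equation (6): T = Phi_rt D_nav^+ (Moore-Penrose pseudo-inverse), L x M.
T = Phi_rt @ np.linalg.pinv(D_nav)


# --- 'live' stage ---
def live_image(U_x, T, d_nav):
    """Equation (5): map one navigator line to an image vector of length J."""
    phi = T @ d_nav   # subspace coordinates, length L
    return U_x @ phi  # linear combination of the spatial basis functions


# Reuse a 'prep' navigator line as a stand-in for a newly acquired one.
k = 17
a_live = live_image(U_x, T, D_nav[:, k])
```

Applying `T` before `U_x` costs roughly $LM+JL$ multiplications per frame, compared with $JM$ for a precomputed product ${{\bf{U}}}_{{\bf{x}}}{\bf{T}}$, which is the cheaper order whenever $L$ is much smaller than $M$.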

**Figure 1.** The workflow of the SPIDERM technique. A 'prep' scan is first performed to learn and store the subject-specific model, including the spatial subspace ${{\bf{U}}}_{{\bf{x}}}$ and the transformation ${\bf{T}}$ from navigator data ${{\boldsymbol{d}}}_{{\rm{nav}}\,}\,$ to subspace coordinates ${\boldsymbol{\phi }};$ a 'live' scan is then performed to acquire a single central k-space line for tracking dynamic information, which is adequate to generate on-the-fly 3D images, given the pre-learned model from the 'prep' scan. Contrast-frozen images can be generated from contrast-varying images in the 'live' scan, allowing real-time imaging with a contrast weighting at user's discretion.

2.3. Contrast regeneration: towards multi-contrast real-time imaging

In non-steady-state sequences with a periodic signal evolution (e.g. sequence with inversion recovery, saturation recovery, or T2 preparation module, etc), the temporal weighting functions (or the temporal subspace) ${{\boldsymbol{\Phi }}}_{{\rm{rt}}}$ contain the information not only about respiratory motion, but also contrast changes. There is a need in the 'live' scan to separate the motion information and contrast information, so that real-time images can be displayed with a stable contrast weighting, e.g. T1-weighted (T1w), T2-weighted (T2w), or proton-density-weighted (PDw), while maintaining true motion states. This can be achieved by a data-driven image contrast regeneration method, as described below.

With respiratory binning, a multi-bin temporal subspace tensor ${{\rm{\Phi }}\in {\mathbb{C}}}^{L\times {N}_{{\rm{seg}}}\times {N}_{p}}$ can be recovered from the navigator data ${{\bf{D}}}_{{\rm{nav}}}$ with low-rank tensor completion (Christodoulou et al 2018), where ${N}_{{\rm{seg}}}$ is the number of sampling points in a signal evolution cycle, and ${N}_{p}$ is the number of respiratory motion phases.

As the 'live' image comes in, its respiratory phase $p$ is identified with the liver-dome position using nearest-neighbor matching to the 'prep' data, and ${{{\boldsymbol{\Phi }}}_{p}\in {\mathbb{C}}}^{L\times {N}_{{\rm{seg}}}}$ is extracted from ${\rm{\Phi }}$ at phase $p.$ Then the target image contrast that would correspond to time $t={t}_{c}^{({\rm{target}})}$ can be synthesized in the 'live' scan by replacing ${{\boldsymbol{\phi }}}_{{t}_{s}}$ in equation (5) with ${\tilde{{\boldsymbol{\phi }}}}_{{t}_{s}}:$

$\begin{eqnarray}&&{\tilde{{\boldsymbol{\phi }}}}_{{t}_{s}}\left({t}_{c}^{({\rm{target}})}\right)={{\boldsymbol{\phi }}}_{{t}_{s}}+{\rm{\Delta }}{{\boldsymbol{\phi }}}_{p,{t}_{s}}\left({t}_{c}^{\left({\rm{target}}\right)}\right)\end{eqnarray} \tag{ 7 }$

$\begin{eqnarray}&&{\rm{\Delta }}{{\boldsymbol{\phi }}}_{p,{t}_{s}}\left({t}_{c}^{\left({\rm{target}}\right)}\right)={{\boldsymbol{\Phi }}}_{p}\left(:,{t}_{c}^{\left({\rm{target}}\right)}\right)-{{\boldsymbol{\Phi }}}_{p}\left(:,{t}_{c}^{\left({\rm{original}}\right)}\right),\end{eqnarray} \tag{ 8 }$

where ${t}_{c}^{\left({\rm{original}}\right)}$ refers to the original time point within the signal evolution cycle corresponding to the absolute time point ${t}={t}_{s},$ while ${t}_{c}^{\left({\rm{target}}\right)}$ refers to the time point with targeted contrast of interest, e.g. T1w, T2w, or PDw. This new term subtracts the contribution of the current contrast weighting, ${{\boldsymbol{\Phi }}}_{p}\left(:,{t}_{c}^{\left({\rm{original}}\right)}\right)\in {{\mathbb{C}}}^{L\times 1},$ and replaces it with the desired contrast weighting, ${{\boldsymbol{\Phi }}}_{p}(:,{t}_{c}^{\left({\rm{target}}\right)})\in {{\mathbb{C}}}^{L\times 1}.$

Thus, equation (5) can be adapted as follows for contrast-frozen, motion-maintained real-time imaging:

$\begin{eqnarray}&&{\tilde{{\boldsymbol{a}}}}_{{t}_{s}}={{\bf{U}}}_{{\bf{x}}}{\tilde{{\boldsymbol{\phi }}}}_{{t}_{s}}={{\bf{U}}}_{{\bf{x}}}\left({\bf{T}}{{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}+{\rm{\Delta }}{{\boldsymbol{\phi }}}_{p,{t}_{s}}\left({t}_{c}^{\left({\rm{target}}\right)}\right)\right).\end{eqnarray} \tag{ 9 }$

Note that the original 'live' image ${{\boldsymbol{a}}}_{{t}_{s}}={{\bf{U}}}_{{\bf{x}}}{\bf{T}}{{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}$ is generated directly from the navigator data without binning, as in equation (5). The binning process is only used to generate the contrast update term ${\rm{\Delta }}{{\boldsymbol{\phi }}}_{p,{t}_{s}},$ which ensures better separation of motion and contrast changes.
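The contrast update of equations (7)–(9) amounts to one column lookup and one vector addition per frame. The sketch below illustrates this on random data; the array sizes, the bin positions, the helper `nearest_phase`, and the idealized choice of live coordinates are assumptions made for illustration only.

```python
import numpy as np


def nearest_phase(position, bin_positions):
    """Index of the bin whose reference position is closest (hypothetical helper)."""
    return int(np.argmin(np.abs(np.asarray(bin_positions) - position)))


def contrast_frozen_coords(phi_ts, Phi_bins, p, tc_original, tc_target):
    """Equations (7)-(8): swap the current contrast contribution for the target one.

    Phi_bins has shape (L, N_seg, N_p); p indexes the respiratory phase.
    """
    Phi_p = Phi_bins[:, :, p]                                # L x N_seg
    delta_phi = Phi_p[:, tc_target] - Phi_p[:, tc_original]  # equation (8)
    return phi_ts + delta_phi                                # equation (7)


rng = np.random.default_rng(2)
L, N_seg, N_p, J = 3, 50, 6, 100  # assumed sizes
Phi_bins = rng.standard_normal((L, N_seg, N_p)) + 1j * rng.standard_normal((L, N_seg, N_p))
U_x = rng.standard_normal((J, L)) + 1j * rng.standard_normal((J, L))

bin_positions = np.linspace(0.0, 20.0, N_p)  # assumed liver-dome position per bin (mm)
p = nearest_phase(7.3, bin_positions)
tc_original, tc_target = 12, 40

# Idealized case: the live coordinates coincide with the binned model at (p, tc_original).
phi_ts = Phi_bins[:, tc_original, p]
phi_frozen = contrast_frozen_coords(phi_ts, Phi_bins, p, tc_original, tc_target)
a_frozen = U_x @ phi_frozen  # equation (9)
```

In this idealized case the frozen coordinates reduce exactly to the target column of ${{\boldsymbol{\Phi }}}_{p}$; with measured data, ${{\boldsymbol{\phi }}}_{{t}_{s}}={\bf{T}}{{\boldsymbol{d}}}_{{\rm{nav}},{t}_{s}}$ would take the place of `phi_ts`.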

When using a pulse sequence in which multiple image contrasts are present along the signal evolution, SPIDERM is able to generate multi-contrast real-time images using several different values of ${t}_{c}^{\left({\rm{target}}\right)}.$ In this work, three different values of ${t}_{c}^{\left({\rm{target}}\right)}$ were chosen to represent T1w, T2w, and PDw, respectively, as indicated by black arrows in figure 2(b).

**Figure 2.** (a) K-space sampling pattern and (b) sequence diagram of the T1/T2 Multitasking sequence. (a) The k-space is continuously sampled using a stack-of-stars FLASH sequence with golden angle ordering in the x–y plane and Gaussian-density randomized ordering in the z-direction, interleaved with navigator data (central k-space line along z-direction, ${k}_{x}={k}_{y}=0$ ) every 10th readout. (b) A saturation recovery (SR) preparation and T2 preparation (T2-Prep) are used to generate T1-weighted (T1w) and T2-weighted (T2w) signals, respectively, during each magnetization evolution cycle. A gap of 700 ms is intended to facilitate magnetizations' full recovery and thus minimize T1 weighting in subsequent PDw and T2w acquisitions.

For simplicity, ${{\boldsymbol{a}}}_{{t}_{s}}$ generated with equation (5) are denoted as contrast-varying (CV-SPIDERM) images, while ${\tilde{{\boldsymbol{a}}}}_{{t}_{s}}$ generated with equation (9) are denoted as contrast-frozen (CF-SPIDERM) images.

3. Methods

3.1. MRI protocol

Built upon the partial separability model (Liang 2007), the recently proposed MR Multitasking technique (Christodoulou et al 2018) can generate multi-contrast MR images with 3D coverage and high spatiotemporal resolution, without the assistance of external devices for gating or triggering (Hu et al 2020, Wang et al 2020, Han et al 2021). We tested the SPIDERM technique using an abdominal T1/T2 MR Multitasking sequence (figure 2) (Deng et al 2019). A saturation recovery preparation and a T2 preparation are used to generate T1 and T2 contrast weightings, respectively, during each magnetization evolution cycle (figure 2(b)). A gap of 700 ms is intended to facilitate full recovery of the magnetization and minimize T1 weighting in subsequent PDw and T2w acquisitions. The k-space is continuously sampled with fast low-angle shot (FLASH) readouts using a stack-of-stars acquisition with golden angle ordering in-plane and Gaussian-density randomized ordering in the partition direction (figure 2(a)). The 'imaging data' is interleaved with 'navigator data' (central k-space line along the partition direction, ${k}_{x}={k}_{y}=0$ ) every 10th readout. The acquired data contain three overlapping dynamics: respiratory motion, T1 relaxation, and T2 relaxation. General imaging parameters for both phantom and volunteer studies were: axial orientation, TR/TE = 6.0/3.1 ms, flip angle = 5° (following SR preparation) and 10° (following T2 preparation), bandwidth = 762 Hz/pixel, water-excitation for fat suppression, BIREF T2 preparation of 42 ms. The time per scan was 8 min.

3.2. Digital phantom study

The feasibility of SPIDERM was evaluated using an open-source digital phantom in MATLAB (https://github.com/SeiberlichLab/Abdominal_MR_Phantom) (Lo et al 2019). First, k-space data of 60 000 readouts (corresponding to a total scan time of 8 min) were simulated using the T1/T2 Multitasking sequence and the sampling pattern shown in figure 2. Additional imaging parameters included: matrix size = 320 × 320 × 40, and voxel size = 1.7 × 1.7 × 6.0 mm³. To demonstrate that our technique does not assume or rely on strictly periodic respiratory cycles, time-varying breathing patterns were simulated with pseudo-randomly interleaved normal (∼4.2 s), long (∼6.3 s) and short (∼3.2 s) respiratory cycles.

The data of 60 000 readouts were viewed as the 'prep' scan, from which the spatial basis ${{\bf{U}}}_{{\bf{x}}},$ the linear transformation matrix ${\bf{T}},$ and the multi-bin temporal subspace tensor ${\rm{\Phi }}$ were generated. Then, 1000 additional time points of navigator data were simulated with variable respiratory motion positions and contrasts. These data were viewed as a portion of the 'live' scan and processed with the SPIDERM technique to generate real-time multi-contrast 3D images.

3.3. In-vivo studies

The in-vivo study was approved by the local Institutional Review Board and written informed consent was obtained from all participating subjects. Experiments were performed in eight healthy subjects on a 3.0 T clinical scanner (Biograph mMR, Siemens Healthineers, Erlangen, Germany) equipped with an 18-channel phased-array body coil. Additional imaging parameters included: matrix size = 320 × 320 × 52, field-of-view (FOV) = 550 × 550 × 312 mm³, voxel size = 1.7 × 1.7 × 6.0 mm³.

For each subject, two identical T1/T2 MR Multitasking scans were performed successively, serving as a 'prep' scan and a 'live' scan, respectively. Volunteers were instructed to breathe normally during the two scans. Although repetitive acquisition of the same single k-space projection as navigator data is the only essential need in the 'live' scan, we adopted the same sequence as used in the 'prep' scan for the following purposes: (a) the navigator data acquired every 10th readout were used to generate CV-SPIDERM images with equation (5), and CF-SPIDERM images with equation (9), using the proposed SPIDERM technique; (b) the data of the entire scan, including both navigator data and imaging data, were reconstructed retrospectively as in the 'prep' scan, to generate 'reference' real-time contrast-varying (CV-ref) images. Figure 3 illustrates the experimental design of in-vivo studies.

**Figure 3.** *In-vivo* experimental design. Two identical 8 min scans were performed successively to serve as a 'prep' scan and a 'live' scan, respectively. Both navigator data and imaging data were acquired in the 'live' scan. SPIDERM images (CV-SPIDERM and CF-SPIDERM) were generated using navigator data from the 'live' scan, while reference images (CV-ref) were generated by retrospective reconstruction using both navigator data and imaging data from the 'live' scan.

Image reconstruction was performed offline in MATLAB 2018a on a Linux workstation equipped with two 2.7-GHz 12-core Intel Xeon CPUs, one NVIDIA Quadro K6000 GPU, and 256 GB RAM.

3.4. Data analysis

3.4.1. Digital phantom study

3.4.1.1. Accuracy of motion depiction

The respiratory motion-induced displacement of the liver dome, measured as its distance from the top of FOV, was determined in 1000 arbitrarily selected CV-SPIDERM images acquired during normal, long, or short respiratory cycles. Linear regression analysis was used to determine the agreement in the dome displacement between CV-SPIDERM and the ground-truth. Results were reported for time points in normal (603 time points), long (224 time points) and short (173 time points) respiratory cycles respectively to investigate the impact of breathing patterns on motion depiction.
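The slope, intercept, and R² reported for this kind of agreement analysis can be obtained from an ordinary least-squares fit. The sketch below assumes the ground-truth displacement is the independent variable; the displacement values are hypothetical numbers chosen for illustration and are not data from the study.

```python
import numpy as np


def regress(x, y):
    """Least-squares line y = slope * x + intercept and coefficient of determination."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2


# Hypothetical displacements (mm): ground truth vs. values measured on generated images.
truth = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0])
measured = 0.9 * truth + 1.5  # exact linear relation, for illustration only
slope, intercept, r2 = regress(truth, measured)
print(slope, intercept, r2)
```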

3.4.1.2. Geometric accuracy

T1-weighted CF-SPIDERM images at three arbitrarily selected time points corresponding to end-of-expiration (EOE), end-of-inspiration (EOI), and a medium phase (MED) were used to assess the organ-level geometric variation from the ground-truth. The pancreas was manually contoured on the CF-SPIDERM images and the temporally corresponding ground-truth images by a clinical medical physicist with 15 years' experience using VelocityAI™ (Varian Medical Systems, Palo Alto, CA). The Dice similarity coefficient and the mean surface distance (the mean of the shortest distances from the surface voxels of one structure to the surface of the other) (Chalana and Kim 1997) were then determined.
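For reference, the Dice similarity coefficient between two binary contours $A$ and $B$ is $2|A\cap B|/(|A|+|B|)$. The following sketch computes it for two hypothetical masks; the mask shapes and names are assumptions for illustration, and the mean surface distance is not covered here.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient 2|A and B| / (|A| + |B|) for two boolean masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())


# Two hypothetical 2D contours: a 4 x 4 square and the same square shifted by one column.
truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True
estimate = np.zeros((8, 8), dtype=bool)
estimate[2:6, 3:7] = True

print(dice(truth, estimate))  # 0.75 (overlap of 12 voxels, 16 voxels in each mask)
```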

3.4.2. In-vivo studies

3.4.2.1. Image quality

To assess the image quality of SPIDERM images, CV-SPIDERM and CV-ref images were compared using the following quantitative metrics. The mean values of these metrics among all 6000 time points in the live scan were reported for each subject. The average metrics of each signal evolution cycle were also measured and then plotted as a function of the signal evolution cycle to illustrate their temporal stability over the 8 min scan.

(1) Normalized root mean square error (NRMSE)

$\begin{eqnarray}&&{\rm{NRMSE}}=\sqrt{\frac{{\sum }_{i}{\left({I}_{{\rm{SPIDERM}}}-{I}_{{\rm{ref}}}\right)}^{2}}{{\sum }_{i}{I}_{{\rm{ref}}}^{2}}},\end{eqnarray} \tag{ 10 }$

where ${I}_{{\rm{SPIDERM}}}$ and ${I}_{{\rm{ref}}}$ denote the magnitude pixel values of CV-SPIDERM and CV-ref images of the 'live' scan respectively.

(2) Peak signal-to-noise ratio (PSNR)

$\begin{eqnarray}&&{\rm{PSNR}}=-10{\rm{lg}}\left(\frac{{\rm{MSE}}}{{I}_{{\rm{\max }}}^{2}}\right),\end{eqnarray} \tag{ 11 }$

where ${I}_{{\rm{\max }}}$ is the maximum magnitude pixel value, and ${\rm{MSE}}$ denotes the mean squared error.

(3) Structural similarity index (SSIM)

$\begin{eqnarray}&&{\rm{SSIM}}=\frac{\left(2{\mu }_{{\rm{ref}}}\cdot {\mu }_{{\rm{SPIDERM}}}+{c}_{1}\right)\left(2{\sigma }_{{\rm{ref}},{\rm{SPIDERM}}}+{c}_{2}\right)}{\left({\mu }_{{\rm{ref}}}^{2}+{\mu }_{{\rm{SPIDERM}}}^{2}+{c}_{1}\right)\left({\sigma }_{{\rm{ref}}}^{2}+{\sigma }_{{\rm{SPIDERM}}}^{2}+{c}_{2}\right)},\end{eqnarray} \tag{ 12 }$

where ${\mu }_{\left(\cdot \right)}$ and ${\sigma }_{\left(\cdot \right)}^{2}$ denote the mean and variance respectively, ${\sigma }_{{\rm{ref}},{\rm{SPIDERM}}}$ is the covariance, and ${c}_{1}=0.01,$ ${c}_{2}=0.03.$
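Equations (10)–(12) translate directly into NumPy. The sketch below rests on assumptions the text does not specify: ${I}_{{\rm{\max }}}$ is taken from the reference image, the MSE is averaged over all pixels, and the SSIM statistics are computed globally over the whole image rather than in local windows. The tiny test image is hypothetical.

```python
import numpy as np


def nrmse(img, ref):
    """Equation (10): root of summed squared error over summed squared reference."""
    return np.sqrt(np.sum((img - ref) ** 2) / np.sum(ref ** 2))


def psnr(img, ref):
    """Equation (11), with I_max taken from the reference image (assumption)."""
    mse = np.mean((img - ref) ** 2)
    return -10.0 * np.log10(mse / ref.max() ** 2)


def ssim_global(img, ref, c1=0.01, c2=0.03):
    """Equation (12) with whole-image means, variances and covariance (assumption)."""
    mu_r, mu_s = ref.mean(), img.mean()
    var_r, var_s = ref.var(), img.var()
    cov = np.mean((ref - mu_r) * (img - mu_s))
    numerator = (2 * mu_r * mu_s + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_s ** 2 + c1) * (var_r + var_s + c2)
    return numerator / denominator


# Hypothetical 2 x 2 magnitude images: the test image is the reference plus a 0.1 offset.
ref = np.array([[1.0, 2.0], [3.0, 4.0]])
img = ref + 0.1
print(nrmse(img, ref), psnr(img, ref), ssim_global(img, ref))
```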

3.4.2.2. Accuracy of motion depiction

The respiratory motion-induced displacement of the liver dome was determined in the same coronal view of SPIDERM images (CV-SPIDERM, CF-SPIDERM T1w, and CF-SPIDERM PDw) and CV-ref images. CF-SPIDERM T2w images were not included in this comparison because the banding artifact at the liver dome caused by field inhomogeneity during T2 preparation made it difficult to accurately measure the distance (see Discussion for more details). The first 1000 time points of the 8 min 'live' scan were selected in each volunteer for linear regression analysis.

4. Results

4.1. Digital phantom study

Figure 4 shows the comparison of true reference, CV-SPIDERM, and CF-SPIDERM images of the digital phantom. Representative time points shown in figures 4(a) and (b) corresponded to end-of-expiration and end-of-inspiration, respectively. The respiratory motion-induced displacement was visually comparable between SPIDERM images and the true reference. CF-SPIDERM T1w, T2w and PDw images each showed the expected contrast. As shown in figure 5, the displacements of the liver dome measured from CV-SPIDERM images and reference images were strongly correlated for all three types of respiratory cycle (normal cycles: slope = 0.90, intercept = 1.48, R² = 0.984; long cycles: slope = 0.90, intercept = 1.43, R² = 0.991; short cycles: slope = 0.88, intercept = 2.35, R² = 0.983).

**Figure 5.** (a)–(c) Linear regression analysis of the motion displacement in reference images and CV-SPIDERM images for the digital phantom study. Time points from normal, long, and short respiratory cycles showed an R² of 0.984, 0.991 and 0.983 respectively.

The pancreas contour analysis showed a Dice similarity coefficient of 0.91, 0.84 and 0.85 for EOE, MED and EOI time points respectively. Mean surface distances were reported as 0.57, 0.95 and 0.87 mm for EOE, MED and EOI respectively.

4.2. In-vivo studies

The average elapsed time from the input of the central k-space line to the generation of real-time contrast-frozen 3D images was approximately 45 ms. Given that the navigator data (one central k-space line) can be acquired in 6 ms, a real-time display latency of 55 ms or less can be reached. This was achieved using MATLAB 2018a on a Linux workstation equipped with two 2.7-GHz 12-core Intel Xeon CPUs.

The NRMSE, PSNR and SSIM values between CV-SPIDERM and CV-ref images for each volunteer are shown in table 1. The average NRMSE, PSNR and SSIM among eight volunteers were 0.141, 30.12 and 0.88 respectively.

Figure 6 shows the time courses of various quantitative image quality metrics during the 8 min live scan. Compared with the first 25 signal evolution cycles, the average NRMSE, PSNR and SSIM changed by +9.1%, −3.2% and −0.9% respectively in the last 25 signal evolution cycles, indicating a slight metric degradation. Abrupt increases in NRMSE or decreases in PSNR/SSIM were visible in some volunteers (such as Volunteer 3), which presumably arose from sudden deep breaths.

Figure 7 shows the comparison between reference images and SPIDERM images in in-vivo studies. As shown in figures 7(a) and (b), CV-SPIDERM images had comparable image quality with CV-ref images, and CF-SPIDERM images demonstrated appropriate image contrasts of T1w, T2w and PDw. Banding artifacts caused by imperfect T2 preparation and main field inhomogeneity were visible at the liver dome area, making CF-SPIDERM T2w images' quality suboptimal (see Discussion for more details). A movie showing several respiratory cycles of the same volunteer can be found in supplementary video S1 (available online at stacks.iop.org/PMB/67/135008/mmedia).

Table 1. Mean NRMSE, PSNR and SSIM values between CV-SPIDERM and CV-ref images for each volunteer.

| Volunteer | NRMSE | PSNR (dB) | SSIM |
|---|---|---|---|
| 1 | 0.127 ± 0.062 | 31.16 ± 4.34 | 0.89 ± 0.04 |
| 2 | 0.113 ± 0.043 | 32.33 ± 4.74 | 0.93 ± 0.03 |
| 3 | 0.120 ± 0.058 | 32.07 ± 4.85 | 0.91 ± 0.04 |
| 4 | 0.186 ± 0.057 | 27.56 ± 4.98 | 0.88 ± 0.03 |
| 5 | 0.127 ± 0.065 | 30.13 ± 4.64 | 0.85 ± 0.06 |
| 6 | 0.117 ± 0.035 | 31.43 ± 4.86 | 0.90 ± 0.04 |
| 7 | 0.108 ± 0.026 | 31.96 ± 5.33 | 0.92 ± 0.04 |
| 8 | 0.233 ± 0.038 | 24.36 ± 5.41 | 0.77 ± 0.07 |
| Mean | 0.141 | 30.12 | 0.88 |

Figure 8 displays CV-ref images and CF-SPIDERM (T1w, T2w and PDw) images of another volunteer within an arbitrarily selected respiratory cycle. Motion states were consistent between CF-SPIDERM images and CV-ref images, while image contrasts remained stable in CF-SPIDERM images among different motion states.

As shown in figure 9, the displacements of the liver dome measured from SPIDERM images and reference images were strongly correlated (CV-SPIDERM images: slope = 0.98, intercept = 1.29, R² = 0.986; CF-SPIDERM T1w images: slope = 0.98, intercept = 1.32, R² = 0.983; CF-SPIDERM PDw images: slope = 0.98, intercept = 1.59, R² = 0.983).

5. Discussion

In this work, we developed a novel technique, named SPIDERM, for real-time multi-contrast 3D imaging with a latency of 55 ms or less. Our initial digital phantom and healthy volunteer studies demonstrated the technical feasibility of SPIDERM.

Major innovations of the SPIDERM framework are as follows. First, a very low imaging latency can be achieved. The low-rank/partial separability model is used in SPIDERM, as in many previous subspace reconstruction frameworks (Pedersen et al 2009, Zhao et al 2012, Lam and Liang 2014, Christodoulou et al 2018, Dong et al 2020). However, in SPIDERM, all parameter estimation steps that involve inverse or optimization problems are carried out once, after the 'prep' scan. The constant spatial subspace ${{\bf{U}}}_{{\bf{x}}}$ and the constant linear transformation ${\bf{T}}$ are thereby prepared for the 'live' scan. Therefore, only the navigator data needs to be sampled in the 'live' scan, and real-time images are generated with matrix multiplication and vector addition, which are both simple forward processes. The latency of 55 ms to generate real-time 3D images is much shorter than those afforded by previous methods, such as 170 ms in MR-MOTUS (Huttinga et al 2022), ∼300 ms in MRSIGMA (Feng et al 2020), or 476 ms in Stemkens et al (2016). Second, simultaneous multiple contrast weightings are available in a real-time imaging setting. Given the overlapping dynamics within the T1/T2 MR Multitasking sequence, images of different contrast weightings can be generated simultaneously using the contrast regeneration algorithm. This allows the end users to select the most appropriate contrast weighting or even synthesize a unique contrast weighting for target tracking and delineation. It provides more flexibility than previous techniques, in which only a single fixed image contrast weighting based on a steady-state acquisition is available, usually T1-weighted (Feng et al 2020, Huttinga et al 2020) or T2/T1-weighted (Stemkens et al 2016, Dietz et al 2017).

In this work, we assessed the performance of SPIDERM in motion depiction and geometric accuracy in phantom and/or in-vivo studies. Both digital phantom and in-vivo results demonstrated excellent correlation between SPIDERM images and reference images in liver dome displacement (R² ≥ 0.98). Geometric analysis also showed small variation in the digital phantom (mean surface distance ≤0.95 mm). In digital phantom studies, the ground truth was available as the reference. However, there was no true reference in in-vivo studies. Ideally, real-time 2D images from fully sampled data should be used as a reference for accurate comparison. In previous work, 2D real-time projections were also used for this purpose (Feng et al 2020). However, such real-time 2D references are not available in this work because the single k-space line acquisition scheme was used in SPIDERM and inclusion of 2D references would interrupt contrast evolution. Therefore, reference images were generated by retrospective reconstruction using both navigator data and imaging data from the live scan (figure 3). This is a limitation of our evaluation study, as geometric artifacts shared by the reference images and SPIDERM images would not be detected by the comparison here.

SPIDERM, involving 'prep' and 'live' scans, is compatible with the current MRgRT workflow. A typical MRgRT procedure starts with a pre-treatment phase, in which brief MR scans are used for position confirmation and replanning purposes, followed by the treatment phase guided by real-time imaging tracking. The 'prep' scan in the proposed technique can fit into the on-board pre-treatment phase to provide multi-contrast and motion-resolved 3D images while learning the spatial subspace and temporal linear transformation. During the beam-on treatment phase, the 'live' scan can generate real-time multi-contrast (e.g. T1w, T2w, and PDw) 3D images on the fly. Hence, an imaging framework based on SPIDERM possesses the potential to serve as a standalone package for MRgRT.

The T1/T2 MR Multitasking sequence (Deng et al 2019) was chosen to evaluate and validate the SPIDERM method. The sequence itself is a free-breathing volumetric body imaging technique based on MR Multitasking. It can provide spatially co-registered T1w, T2w, and PDw images, and respiratory phase-resolved 3D images with one single scan. Therefore, reconstructed images from the 'prep' scan can be used for pre-treatment planning. Furthermore, it uses a k-space acquisition pattern ('navigator data' + 'imaging data') specifically designed for the partial separability model, for which the spatial subspace and linear temporal transform are already available as byproducts of image reconstruction and do not need to be calculated as a post-processing step. However, it is worth noting that the core idea of SPIDERM is not limited to this specific Multitasking sequence for 'prep' scans. Variations of SPIDERM for real-time image generation and contrast regeneration are feasible with different 'prep' methods, as long as spatial basis functions are calculated and stored after the 'prep' scan and frequently sampled navigator data can be easily transformed to temporal weighting functions. We also note that in the current T1/T2 MR Multitasking sequence, a modified golden-angle stack-of-stars trajectory was used to sample the imaging data, with golden-angle ordering in-plane and Gaussian-density randomized ordering in the partition direction. The isotropy of spatial resolution still needs to be improved for coronal or sagittal views. However, the SPIDERM scheme is not limited to this specific sampling pattern. Other sampling patterns, such as the rotating Cartesian k-space (ROCK) pattern (Han et al 2017), may also be applied.

We used an image processing algorithm based on thresholding to estimate the real-time diaphragm position. Several factors may contribute to the errors in this estimate. First, the real-time images are of changing contrast and have a slice thickness of 6 mm. Second, the use of T2 preparation pulses can cause banding artifacts around the liver dome area at a series of subsequent time points, even after contrast regeneration. Those artifacts present a challenge for automated position measurement algorithms. Therefore, some obvious outliers in figure 9, which indicate errors of up to 2 cm, do not necessarily mean that the real error for SPIDERM is 2 cm. Even with these outlier position measurements, we note that the tracked liver dome positions are still highly correlated between the real-time images and the reference.

Currently, SPIDERM assumes that the spatial subspace and the linear transformation from navigator data to temporal weighting functions are constant throughout the acquisition process. Under this assumption, a pre-learned spatial subspace and linear transformation can be used as constant operators, so only the navigator data is needed to update real-time images. In some scenarios, however, this assumption may not be satisfied. For instance, a bulk body movement would reduce the spatiotemporal correlation of the signal evolution, thus destabilizing both the spatial subspace and linear transformation. In real radiotherapy settings, immobilization devices are typically used to minimize rigid body motion, which may help lessen the likelihood of such movement. Another challenging scenario is an abnormal respiratory pattern, such as a sudden deep breath, which can produce an image that lies outside the fixed, pre-learned spatial subspace, as shown in supplementary information figure S1. To address this, a sudden deep breath could be recognized by setting an acceptance range for $\tilde{\boldsymbol{\phi}}_{1,:}$, the first component of the contrast-frozen temporal weighting functions ($\tilde{\boldsymbol{\phi}}$ in equation (7)). If the generated $\tilde{\phi}_{1,t_s}$ falls outside the acceptance range at a specific time point $t_s$, the image at $t_s$ should be rejected for display, as in supplementary information figure S2. Automatic detection of abrupt motion may trigger beam-off until the respiratory pattern returns to normal. Further, internal organ motion and gradual displacement due to non-cyclic organ motion (such as peristaltic motion) may also break the basic assumption. This would be an interesting topic to explore in our future work.
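The acceptance-range idea for rejecting frames during a sudden deep breath can be sketched as follows. The sketch assumes that the first contrast-frozen temporal weight is available as a real-valued scalar per time point and that the range is derived from values observed during the 'prep' scan with a fractional margin. How the range is actually set is not specified here, so `acceptance_range`, `accept_frame`, the `margin` parameter, and the sample values are all hypothetical.

```python
def acceptance_range(prep_values, margin=0.1):
    """Return (low, high) bounds from 'prep'-scan values of the first temporal weight.

    Assumption: the bounds are the observed min/max widened by a fraction
    `margin` of the observed span; this rule is illustrative only.
    """
    low, high = min(prep_values), max(prep_values)
    pad = margin * (high - low)
    return low - pad, high + pad


def accept_frame(phi1_ts, bounds):
    """Return True if the weight at time point t_s lies inside the acceptance range."""
    low, high = bounds
    return low <= phi1_ts <= high


# Hypothetical first-component values seen during normal breathing in the 'prep' scan.
prep_phi1 = [-0.8, -0.3, 0.1, 0.6, 0.9]
bounds = acceptance_range(prep_phi1, margin=0.1)

# Hypothetical 'live' values; the last one mimics a sudden deep breath.
for t_s, phi1 in enumerate([0.2, 0.7, 1.6]):
    status = "display" if accept_frame(phi1, bounds) else "reject"
    print(t_s, status)
```

In a treatment setting, a rejected frame could additionally be used as the signal for the beam-off trigger mentioned above; that coupling is outside the scope of this sketch.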

The current SPIDERM technique has several limitations. First, the reconstruction time for stack-of-stars sampling-based MR Multitasking is currently several hours, which is acceptable for retrospective reconstruction but would be prohibitively long for the 'prep' scan in practical applications. This will be addressed from two angles in the future. On one hand, direct reconstruction acceleration can be pursued based on the current MR Multitasking technique, including optimization of the sampling trajectory with Cartesian acquisition (Chen et al 2021), improvement of the iterative reconstruction process, as well as translation of the code from MATLAB to, for example, C++. On the other hand, given enough data acquired with this sequence, deep learning reconstruction may be introduced to further reduce the reconstruction time (Chen et al 2019). Second, banding artifacts caused by B₀ field inhomogeneities and imperfect T2 preparation pulses may still be visible at the liver dome, degrading the image quality of T2w images, as shown in figure 7. As a feasibility study, all experiments in this work were performed at 3 T, which made the quality of B₀ field shimming and T2 preparation refocusing pulses suboptimal, particularly for a large field of view. This problem may be alleviated with the adoption of advanced pulse designs, higher-order shimming technologies, or lower field strength. Third, during the binning process based on the liver-dome position for contrast regeneration, exhale and inhale portions of the breathing cycle were not differentiated, which could lead to errors. This may be addressed in the future by modifying the binning procedure to separate these portions as different phases.

6. Conclusion

SPIDERM is a novel imaging technique for real-time multi-contrast 3D imaging with low latency. An imaging framework based on SPIDERM can potentially become a standalone package for MRgRT.

Acknowledgments

This work was supported in part by NIH R01 EB029088, R01 EB028146, and R21 CA234637.
