Real-time 4D MRI using MR signature matching (MRSIGMA) on a 1.5T MR-Linac system

Objective. To develop real-time 4D MRI using MR signature matching (MRSIGMA) for volumetric motion imaging in patients with pancreatic cancer on a 1.5T MR-Linac system. Approach. Two consecutive MRI scans with 3D golden-angle radial stack-of-stars acquisitions were performed on ten patients with inoperable pancreatic cancer. The complete first scan (905 angles) was used to compute a 4D motion dictionary including ten pairs of 3D motion images and signatures. The second scan was used for real-time imaging, where each angle (275 ms) was processed separately to match it to one of the dictionary entries. The complete second scan was also used to compute a 4D reference to assess motion tracking performance. Dice coefficients of the gross tumor volume (GTV) and two organs-at-risk (duodenum-stomach and small bowel) were calculated between signature matching and reference. In addition, volume changes, displacements, center of mass shifts, and Dice scores over time were calculated to characterize motion. Main results. Total imaging latency of MRSIGMA (acquisition + matching) was less than 300 ms. The Dice coefficients were 0.87 ± 0.06 (GTV), 0.86 ± 0.05 (duodenum-stomach), and 0.85 ± 0.05 (small bowel), which indicate high accuracy (high mean value) and low uncertainty (low standard deviation) of MRSIGMA for real-time motion tracking. The center of mass shift was 3.1 ± 2.0 mm (GTV), 5.3 ± 3.0 mm (duodenum-stomach), and 3.4 ± 1.5 mm (small bowel). The Dice scores over time (0.97 ± [0.01–0.03]) were similarly high for MRSIGMA and reference scans in all the three contours. Significance. This work demonstrates the feasibility of real-time 4D MRI using MRSIGMA for volumetric motion tracking on a 1.5T MR-Linac system. The high accuracy and low uncertainty of real-time MRSIGMA is an essential step towards continuous treatment adaptation of tumors affected by real-time respiratory motion and could ultimately improve treatment safety by optimizing ablative dose delivery near gastrointestinal organs.


Introduction
According to a recent study (Mee et al 2021), MR-guided radiotherapy has the potential to benefit approximately 16% of all radiotherapy procedures in the UK. Extrapolating this proportion beyond the UK, it is estimated that approximately 1.5 million patients worldwide would benefit from MR-guided radiotherapy each year (Keall et al 2022). The MR-Linac, which combines an MRI scanner and a linear accelerator (Linac), provides an ideal platform for MR-guided radiotherapy. Compared to conventional onboard cone-beam computed tomography (CBCT), onboard MRI provides superior soft-tissue contrast and intra-fraction motion imaging capabilities for adaptive radiotherapy. Recent studies have demonstrated the advantages of MR-guided radiotherapy at various tumor sites, such as primary and metastatic tumors in the liver (Boldrini et al 2021, Stanescu et al 2022, pancreatic cancer (Michalet et al 2022, Stanescu et al 2022, high-risk lung tumors (Finazzi et al 2020), and prostate cancer (Cuccia et al 2022, Ristau et al 2022.
Physiological processes, such as respiration, heartbeat, digestion, bowel filling, and others, introduce various types of internal organ movements in the thorax, abdomen, and pelvis, leading to either underdosing the tumor or overdosing adjacent healthy tissue (Caillet et al 2017, Gargett et al 2019, Keall et al 2020. The need for realtime motion management is increasingly recognized in radiotherapy to further improve local tumor control and minimize treatment-related toxicity. A recent survey by the European society for therapeutic radiology and oncology (ESTRO) indicates that 71% of the radiotherapy centers in 41 countries stated their wish to implement real-time respiratory motion management (Anastasi et al 2020). The ideal strategy would be to obtain real-time 3D motion images of the tumor and organs-at-risk (OARs), and continuously reshape the radiation beam using the multileaf collimator (MLC) on an MR-Linac (Paganelli et al 2019, Keall et al 2022. MLC tracking would maximize dose in the tumor and minimize dose in the OARs, which would potentially improve treatment outcomes. 2D cine MRI with fast imaging of three orthogonal slices (sagittal, coronal, and axial views) remains the stateof-the-art clinical method for real-time imaging (Kurz et al 2020). However, motion of the tumor and OARs results in volumetric deformations that are not completely captured by 2D cine MRI in three orthogonal views (Yuan et al 2019). This limitation has recently been addressed by utilizing beam-eye-view 2D cine imaging with tumor volume projection adapted to each radiation beam (gantry) angle, enabling intra-fractional monitoring of tumor motion and verification of beam conformality (Nie et al 2021). A subsequent study proposed a predictive strategy to minimize latency in identifying motion-matched tumor volume while accounting for hysteresis, facilitating real-time assessment of beam-to-tumor conformality (Nie and Li 2022). While the beam-eye-view approach shows promising results for MR-guided radiotherapy, its application has only been demonstrated in lung cancer patients using a 3T MRI scanner. The study solely evaluated peripheral lung tumors, which exhibit stable tumor volume and shape during respiration. However, this does not accurately represent abdominal tumors, such as pancreatic cancer, where substantial deformation occurs during respiration for both the tumor target and OARs. Additionally, acquiring the high-resolution static images necessary for time-resolved 4D MRI super-resolution image reconstruction required breath-hold scans of 10-20 s each, which may lead to inconsistent results and pose feasibility challenges for certain patients.
Despite recent advances, real-time 4D MRI to guide radiation therapy of organs affected by respiratory motion remains a technically challenging task since both acquisition and reconstruction need to be performed with minimal latency. For organs affected by respiratory motion, a total latency of about 300 ms, including imaging and motion tracking, is required to achieve appropriate treatment guidance (Keall et al 2021, Huttinga et al 2022. Considerable efforts have been made to achieve this goal by introducing accelerated image acquisition techniques and fast reconstruction algorithms to reduce the total latency while maintaining adequate image quality. For example, real-time 3D motion was estimated with a latency of 170 ms from undersampled k-space data using low-rank MR-MOTUS reconstruction (Huttinga et al 2021(Huttinga et al , 2022. The latest work from the same group indicates that a much lower latency (14 ms or 69 Hz) could be achieved based on Gaussian processes for motion prediction (Huttinga et al 2023). Another study attempted to estimate real-time 3D motion with high spatiotemporal resolution and relatively low latency (less than 500 ms) using multi-resolution neural networks (Terpstra et al 2021). In addition, a real-time multi-contrast 4D MRI approach based on multi-tasking and a single projection was proposed to achieve a latency of 55 ms for MR-guided radiotherapy (Han et al 2022). However, these techniques were only implemented on digital phantom or healthy subjects and have not been demonstrated in clinical patients treated on 1.5T MR-Linac systems. Moreover, none of these techniques provide a method to validate motion tracking in real-time.
Magnetic Resonance SIGnature MAtching (MRSIGMA) is a recent technique that shifts the acquisition and reconstruction burden to a motion learning step and can achieve a total imaging latency (including acquisition and reconstruction) of less than 300 ms (Feng et al 2020). Another advantage of MRSIGMA is the calculation of 4D reference directly from the real-time acquisition data that can be used to retrospectively evaluate the performance of real-time motion tracking (Kim et al 2021). While the MRSIGMA methodology has been previously developed, it has only been implemented on 3 T MRI scanners. MR-Linac systems have inherent limitations due to their special design to perform imaging while delivering radiation, resulting in lower gradient performance and signal compared to traditional 3 T MRI scanners. The Elekta Unity 1.5T MR-Linac system, for example, has a lower maximum gradient amplitude (34 mT m −1 ) and maximum slew rate (120 mT m −1 ms −1 ) compared to 3T MRI scanners (45 mT m −1 and 200 mT m −1 ms −1 , respectively). Consequently, the lower gradient performance leads to longer repetition time, resulting in increased imaging latency for each angle and extended total scan time. Additionally, the signal-to-noise ratio (SNR) scales approximately linearly with the main magnetic field (B 0 ) within the clinically-used field strength range. This means that the SNR of images obtained from the 1.5T MR-Linac is approximately only half of that from the 3T MRI scanners. Thus, there is an unmet need to demonstrate the feasibility of the MRSIGMA technique for real-time 4D MRI with a latency lower than 300 ms on a 1.5T MR-Linac, which is the target clinical system.
Previous studies have demonstrated the feasibility of utilizing MRSIGMA on a small cohort of patients with liver metastatic tumors using 3T MRI scanners (Feng et al 2020, Kim et al 2021. However, in the case of pancreatic cancer, the tumor target is near OARs, such as duodenum, stomach, and small bowel. These organs are significantly affected by respiratory motion, resulting in substantial movements, with up to 9.8 mm for the duodenum and 11.4 mm for the stomach (Umezawa et al 2021), and an average of 10 ± 5 mm for the small bowel (Hallman et al 2012). As a result, there is a pressing need for advanced motion management techniques with real-time motion tracking, such as MRSIGMA, to effectively manage tumor motion and minimize radiation-induced complications on pancreatic cancer. This work develops an implementation of MRSIGMA on a 1.5T MR-Linac and demonstrates the feasibility of real-time 3D motion imaging in patients with inoperable pancreatic cancer. The performance of MRSIGMA was assessed quantitatively using the Dice coefficients between tumor and OAR contours defined on MRSIGMA and the 4D reference.

Overview of MRSIGMA
MRSIGMA shifts the acquisition and reconstruction burden to a learning step where pairs of 3D motion states and motion signatures are computed and leaves fast signature acquisition and matching for real-time motion tracking in a subsequent step that uses the learned information (figure 1). The learning step continuously acquires stack-of-stars radial data for about 5 min and reconstructs a 4D motion dictionary with entries given by pairs of 3D motion states and motion signatures. Motion signatures are given by the motion range along the z dimension, which is the main direction of motion in the body. After motion learning is completed, MRSIGMA moves to a real-time signature matching step, where the same acquisition is employed, but for each angle, a motion signature is computed and matched to one of the motion signatures and its corresponding 3D motion state in the pre-computed 4D motion dictionary. The signature matching step runs with fast matching operations without the need of image reconstruction, enabling low-latency real-time 4D MRI.

4D motion dictionary
The stack-of-stars k-space data of the training scan was transferred from the scanner to a high-performance computer and a 4D motion dictionary (10 motion states of 3D volume) was generated in Matlab (MathWorks, Figure 1. The MRSIGMA framework consists of the following steps: (1) offline dictionary learning (based on the XD-GRASP technique): 3D golden-angle radial stack-of-starts data (hundreds of angles) are continuously acquired for about 5 min; the center of k-space is used to compute a z projection for each angle and to estimate the respiratory motion signal; motion signatures (S n 1-) are generated using amplitude binning of the respiratory motion signal; k-space data are sorted into n 3D motion states; and a motion dictionary (m n 1-) is computed using temporal CS (compressed sensing) reconstruction; (2) online signature matching: the same acquisition is performed during real-time motion tracking, but each angle is processed separately to compute the position (P) within the respiratory cycle and match it to one of the motion signatures.
(3) retrospective self-evaluation: a reference dictionary (m n 1 ¢ -) is generated using all the angles acquired during online signature matching for retrospective evaluation of motion tracking performance. Natick, MA) using the XD-GRASP algorithm, as described previously (Feng et al 2016, Feng et al 2020, Kim et al 2021. First, a respiratory signal was extracted from the center of k-space data (kx = ky = 0) along the kz direction using fast Fourier transform, principal component analysis (PCA), and coil clustering (Feng et al 2016, Zhang et al 2016. Next, the k-space data was sorted according to the amplitude of the respiratory signal and subsequently binned into 10 motion states from end-expiration to end-inspiration. Finally, respiratory motionresolved 4D MRI reconstruction was performed using temporal compressed sensing followed by denoising using the Matlab Deep Learning Toolbox to generate the final 4D motion dictionary, in which each motion state (m n 1-) is uniquely identified by a motion signature (S n 1-). It is important to note the motion signature in the 4D motion dictionary is not a time-averaged signature, but rather a real-time signature that corresponds to a specific motion state within the dictionary. In other words, the data acquired at each angle has a motion signature which corresponds to one of the motion states in the dictionary. This distinction arises from the fact that XD-GRASP (which is used for motion dictionary learning) sorts data in k-space to compute a motion-resolved 4D motion dictionary, as opposed to the typical sorting performed in the image domain for the generation of 4D computed tomography (CT). Additionally, the displacement of the diaphragm along the z-direction can be captured by a znavigator and used to navigate respiratory motion. While the displacement along z is used for navigation, 3D images in the motion dictionary include deformations along the three spatial dimensions.

Real-time signature matching
Signature matching uses the same stack-of-stars trajectory, but a signature is computed for each angle, matched to the closest signature in the 4D motion dictionary and the corresponding 3D motion state is selected as realtime output. After all the kz lines for an angle are acquired, a z projection is computed by taking the FFT along the center k-space position. The new z projection is added to the matrix of projections used during dictionary learning and a new respiratory signal is computed using PCA. Given the position of the newly acquired signature P, the index of the motion signature (or motion state) in the 4D motion dictionary that best matches P is performed by the following: where S i (i = 1, 2, K 10) is the ith motion signature that is associated with the ith motion state (m i ) in the 4D motion dictionary.

Motion tracking evaluation
The performance of real-time motion tracking was evaluated retrospectively by using all the data acquired during signature matching to compute a new 4D motion dictionary, called the 4D reference dictionary (m n 1 ¢ -).
The 3D motion states in 4D reference dictionary were retrospectively assigned to each real-time data point and used as the 4D reference. The performance of real-time motion tracking was evaluated by measuring the degree of overlap between the real-time MRSIGMA contours and reference contours using the Dice coefficient as follows: where C 1 and C 2 are the MRSIGMA and reference contours, respectively. The Dice coefficient is in the range of [0, 1], and the two contours completely overlap when Dice = 1. Figure 2(a) schematically illustrates the calculation of the Dice coefficient of two contours (C 1 and C 2 ).

Patient study
Inclusion criteria for patients were: (a) patients with unresectable pancreatic tumors treated on a 1.5T MR-Linac system, (b) age greater than 18 years old, and (c) no MRI contraindications (such as metal implants, claustrophobia, etc). With approval from the local institutional review board (IRB), ten patients were prospectively enrolled in the study. The pancreatic tumors were measured as 3.5 ± 1.8 cm (the largest dimension) and were located in the head (n = 6), body (n = 3), and tail (n = 1) of the pancreas. Table 1 displays a more detailed description of the patient and tumor characteristics. All patients were treated with ablative stereotactic body radiation therapy (SBRT) dose of 50 Gy in five fractions. Four of the patients were scanned twice at two of their five fractions and others were scanned only once, resulting in a total of 14 scans. Written informed consent was obtained from each patient prior to the MRI scan.

MRSIGMA data acquisition
Two MRSIGMA scans (learning and real-time) were performed for each patient on a 1.5T Unity MR-Linac system (Elekta AB, Stockholm, Sweden) after the treatment session. MRSIGMA datasets were acquired under free-breathing condition. A T1-weighted 3DVANE sequence (Philips Healthcare, Best, The Netherlands) was modified for 3D golden-angle radial stack-of-stars acquisitions. A training dataset was first acquired for the generation of the 4D motion dictionary, and a second scan with the same sequence parameters was performed for online signature matching and the data was also used as a reference for retrospective self-evaluation. The MR-Linac system is equipped with 4-channel anterior and 4-channel posterior receive coils. The two MRSIGMA scans were positioned to cover the lung-liver interface (upper side) and most part of the kidneys (lower end) and were performed with the following sequence parameters: repetition time (TR) = 5.0 ms, echo time (TE) = 2.1 ms, flip angle = 12°, field-of-view (FOV) = 370 × 370 × 200 mm 3 , voxel size = 1.5 × 1.5 × 4.0 mm 3 , number of radial spokes (angles) = 905, bandwidth = 720 Hz pixel −1 , acquisition time = 5:30 min for each scan. The imaging latency of acquiring a single motion signature (one angle) was about 275 ms.

Contouring
Contouring was performed by two board-certified radiation oncologists using the MIM software (MIM Software Inc., Cleveland, OH). First, the two sets of 4D MRI images (training and real-time) were converted to DICOM image format and then imported to the MIM software for contouring. The gross tumor volume (GTV) of the pancreas and two OARs (duodenum-stomach and small bowel) were manually defined on the first motion state (end-expiration) for the training and real-time scans independently. These contours were then automatically propagated from the first motion state to all remaining motion states using the deformable propagation algorithm provided by the MIM software. Finally, manual adjustment of the propagated contours was performed if needed. The MIM software (version 7.2.8) utilizes the VoxAligh Deformation Engine, a deformable image registration algorithm, for deformable contour propagation in 4D image analysis. The VoxAligh Deformation Engine employs a coarse-to-fine multi-resolution approach to determine corresponding locations in the target image based on predefined control points from the source image (Rich et al 2022). It minimizes intensity differences using an optimized imaging matching metric and incorporates regularization to ensure smooth deformations without tears or folds. The accuracy of the automatic propagation algorithm has been validated using standard volumetric and distance-based similarity metrics, as described in previous studies (Calusi et al 2019, Rich et al 2022. These validations establish the reliability and performance of the automatic propagation algorithm within the MIM software. Figure 3 illustrates the performance of the algorithm for automatic contour propagation along the motion states dimension. The three contours on the first phase (P1) were manually defined by radiation oncologists, followed by an automatic propagation to other phases (P2-P10) using deformable image registration. It is noteworthy that the size and shape of the GTV (red) remain relatively stable across the phases, while significant changes are observed from P1 to P10 for the duodenum-stomach (blue) and small bowel (green).

MRSIGMA motion assessment
In addition to the Dice coefficient in equation (2), the following parameters were measured for each patient to show the volume changes and displacements of the GTV and OARs (duodenum-stomach and small bowel) in real-time MRSIGMA: volume (ml), volume change (ml), displacements of the centroid (Dx, Dy, and Dz) in the x, y, and z directions, the center of mass shift (COMS, mm), and the Dice scores over time (i.e. the moving Dice score between two consecutive time points) for the MRSIGMA and reference scans. Figure 2(b) provides a schematic representation of the moving Dice coefficient between two consecutive time points, allowing for a better understanding of the temporal evolution of the Dice score over time for both scans. The measurements were automatically performed in the MIM Software based on the 3D contours.

Results
The MRI scans were successfully performed, and the 4D motion and reference dictionaries were generated for all patients. Minor streaking artifacts originating from the arms were observed in one of the patient scans but had duodenum-stomach, green: small bowel) are defined in the first phase (P1) by radiation oncologists, followed by an automatic propagation to other phases (P2-P10) using deformable image registration. The size and shape of the GTV (red) remain relatively stable across the phases, while significant changes are observed from P1 to P10 for the duodenum-stomach (blue) and small bowel (green) due to respiratory motion. no impact on the contouring and motion assessment in the regions of interest (i.e. GTV and OARs). The realtime imaging latency includes the time required to acquire the k-space data for a single angle, which consists of 50 spokes along kz. With a repetition time (TR) of 5 ms, the total time to acquire the k-space data for one angle is approximately 250 ms (50 × TR = 250 ms). Additionally, a spectral presaturation with inversion recovery (SPIR) module with duration of 20 ms is added for each angle to achieve fat suppression. Furthermore, a 5 ms delay time is included for parallel data transmission. As a result, the overall imaging latency amounts to approximately 275 ms. The total latency during real-time signature matching was less than 300 ms, including ∼275 ms for data acquisition and transmission corresponding to one angle and less than 25 ms to match the new signature to one of the entries in the 4D motion dictionary. Figure 4 shows the real-time 4D MRI images and contours from a 69 year old patient with a tumor in the pancreatic head and neck. Real-time MRSIGMA and reference contours are overlaid on three representative motion states (end-expiration, middle state, and end-inspiration). The anatomy and overlaid contours are shown in three axial slices (first three rows) and one sagittal slice (bottom row). The video in supplemental material S1 shows the real-time motion of this patient in 25 s (90 out of 905 angles). Figure 5 illustrates that the Dice coefficients (first row) between real-time MRSIGMA and reference contours of the patient are 0.95 ± 0.01 (GTV), 0.84 ± 0.01 (duodenum-stomach), and 0.89 ± 0.01 (small bowel), indicating high accuracy (high mean value) and low uncertainty (low standard deviation) of the MRSIGMA technique for real-time motion tracking. The 3D volume over time (GTV: 60.1 ± 1.4 ml, duodenum-stomach: 184.8 ± 2.6 ml, small bowel: 350.3 ± 8.9 ml) and the center of mass shifts (GTV: 2.4 ± 2.1 [0-5.3] mm, duodenum-stomach: 3.6 ± 3.0 [0-8.1] mm, small bowel: 2.4 ± 1.9 [0-5.6] mm) are shown in the second and third rows to demonstrate real-time motion of the GTV and OARs, where the center of mass shifts are presented as mean ± std [min-max] mm. Figure 6 illustrates the real-time MRSIGMA and reference contours on three motion states from a 50 year old patient with a tumor in the pancreatic tail. The video in supplemental material S2 shows the real-time motion of this patient in 25 s (90 out of 905 angles). Figure 7 shows the Dice coefficients (first row) between the real-time MRSIGMA and reference contours of the patient: 0.86 ± 0.01 (GTV), 0.90 ± 0.01 (duodenum-stomach), and 0.85 ± 0.01 (small bowel), the 3D volume over time (second row): 32.3 ± 0.4 ml (GTV), 272.6 ± 2.0 ml (duodenum-stomach), 394.9 ± 15.5 ml (small bowel), and center of mass shifts (third row): 1.4 ± 1.9 [0-4.4] mm (GTV), 1.3 ± 1.6 [0-3.9] mm (duodenum-stomach), and 1.1 ± 1.3 [0-3.4] mm (small bowel). Figure 8 illustrates the respiratory signals from the motion learning and real-time MRSIGMA scans of patient 1 (figures 4 and 5) and patient 2 (figures 6 and 7). The first patient exhibits relatively stable breathing, although different breathing patterns are observed between the motion learning (figures 8(a), (c)) and real-time (figures 8(b), (d)) scans. On the other hand, the second patient displays a more irregular breathing pattern, as indicated by the green arrows, including variations between motion learning (figure 8(e)) and signature matching (figure 8(f)) scans. Nevertheless, the motion range remains consistent between the motion learning and real-time scans for both patients, as indicated by the green and yellow dashed lines for patient 1 (figures 8(a), (b)) and patient 2 (figures 8(e), (f)), respectively. This demonstrates the robustness of MRSIGMA in managing irregular and different respiratory motion patterns between motion learning and real-time signature matching. Table 2 summarizes the real-time MRSIGMA parameters (mean ± standard deviation) of the pancreas tumor and OARs for all patients. The Dice coefficients between MRSIGMA and reference scans are 0.87 ± 0.06 (GTV), 0.86 ± 0.05 (duodenum-stomach), and 0.85 ± 0.05 (small bowel), which suggests overall good performance of MRSIGMA for real-time motion tracking. For all the three contours, the displacements are mainly in the z direction, Dz = 2.8 ± 1.9 mm (GTV), 4.6 ± 2.9 mm (duodenum-stomach), and 2.3 ± 1.1 mm (small bowel), which are significantly larger than the displacements in the x and y directions (<2 mm). The average center of mass shifts are 3.1 ± 2.0 mm (GTV), 5.3 ± 3.0 mm (duodenum-stomach), and 3.4 ± 1.5 mm (small bowel), and the maximum center of mass shifts are 6.6 mm (GTV), 12.7 mm (duodenum-stomach), and 6.3 mm (small bowel), respectively. The Dice scores over time (0.97 ± [0.01-0.03]) are similarly high for MRSIGMA and reference scans in all the three contours.

Discussion
This work has demonstrated the feasibility of using MRSIGMA for online motion tracking on a 1.5T MR-Linac, but further steps are required for its complete integration into the current MR-guided radiotherapy (MRgRT) online workflow. First, simultaneous data acquisition and transmission are necessary for real-time signature generation. To address this, there is ongoing work to develop a prototype software that enables parallel acquisition and transmission of k-space raw data, minimizing latency, and sending the data to an external highperformance computer for processing . Second, fast generation of the motion dictionary and efficient contouring of the tumor target and OARs are crucial to reduce the pre-treatment time. Future work will explore the use of a deep learning-based neural network called 'movienet' for rapid motion-resolved 4D MRI reconstruction, leveraging space-time-coil correlations and motion consistency, achieving reconstruction within a few seconds (Murray et al 2023). Last, a customized software is under development to control the Figure 6. Real-time MRSIGMA images and contours from a 50-year-old patient with a tumor in the pancreatic tail on a 1.5T MR-Linac machine. The motion of the anatomy and three contours are shown in two axial (first two rows), one sagittal (third row), and one coronal (bottom row) views on three representative motion states (from left to right: end-expiration, middle state, and endinspiration). Real-time MRSIGMA contours are shown in solid lines in the gross tumor volume (red), duodenum-stomach (pink), and small bowel (green). The reference contours are shown in yellow dotted lines. There is a high concordance between MRSIGMA and reference contours in this patient. movement of the MLC to adapt the radiation beam to the 3D shape and position for real-time motion tracking. The development and validation of a deep learning-based auto-segmentation model and real-time MLC tracking are currently underway and will be reported in future work. By combing all these components, the MRSIGMA technique will be seamlessly integrated into the clinical workflow for adaptive radiotherapy with real-time motion tracking.

Same acquisition for training and signature matching
Even though signature matching only requires acquiring a single kz line (1D kz acquisition) or navigator echoes to compute the new signature, which would minimize latency, MRSIGMA uses the same acquisition trajectory for both dictionary training and signature matching. The main reason is to compute a 4D reference using the complete dataset acquired during real-time signature matching to retrospectively evaluate the performance of MRSIGMA for real-time motion tracking, which is not possible when using 1D kz acquisition or relying on navigator echoes. Evaluation of real-time imaging performance is very challenging since it requires to acquire the reference in parallel. By using the same acquisition trajectory, the same technique used to compute the motion dictionary can be used to compute the reference and assess the performance of real-time signature matching. Future work will also address the case of respiratory motion baseline drifts by periodically updating the motion dictionary, which will require to maintain the same acquisition strategy during signature matching.

Real-time motion tracking latency and accuracy
The latency of the current MRSIGMA implementation for real-time motion tracking is about 300 ms, which is sufficient to capture respiration-induced motion in most cases. Lower latencies might be required in cases when the patient's breathing frequency is irregularly high or when additional time is required to modify the radiation beam for real-time adaptative radiotherapy. One way to further reduce latency would be to decrease the number of kz positions for each angle, which can be achieved using partial Fourier (e.g. a factor of 0.75 would reduce imaging acquisition time by 25%) or parallel imaging to undersample the kz dimension. An extreme case to reduce latency would be to acquire a single kz line (head-foot direction) or navigator echoes along three orthogonal directions. However, this would remove the ability to compute a reference for retrospective evaluation of motion tracking performance. (a) and (e) represent the respiratory signals during motion learning, while (b) and (f) depict the respiratory signals during real-time signature matching. (c) and (d) display zoomed-in respiratory signals from the first minute of the motion learning and real-time scans, respectively. The first patient exhibits relatively stable breathing, with noticeable differences in breathing patterns between the motion learning (a), (c) and real-time (b), (d) scans. The second patient displays a more irregular breathing pattern, as indicated by the green arrows, including variations between motion learning (e) and signature matching (f) scans. Despite these differences in breathing patterns, the motion range remains consistent between the motion learning and real-time scans for both patients. This consistency is depicted by the green and yellow dashed lines for patient 1 (a), (b) and patient 2 (e), (f), respectively. 4.5 ± 3.3 3.5 ± 1.9 11.2 ± 9.3 Dx (mm) 0.9 ± 0.8 1.4 ± 0.9 1.9 ± 1.3 Dy (mm) 0.8 ± 0.6 1.9 ± 1.4 1.9 ± 1.0 Dz (mm) 2.8 ± 1.9 4.6 ± 2.9 2.3 ± 1.1 COMS (mm) 3.1 ± 2. Differences between adjacent motion states in the 4D motion dictionary are small only for motion states close to end-expiration. However, for phases close to the center or near end-inspiration, the motion difference between adjacent states tends to be higher than 1 mm, which will have an impact if real-time signature matching is not accurate. We observed that the average respiratory shifts for the GTV, duodenum-stomach, and small bowel were 3.1 ± 2.0 mm, 5.3 ± 3.0 mm, and 3.4 ± 1.5 mm, respectively, which indicates the range of motion during a full respiratory cycle exceeds 5 mm for all three contours (maximum center of mass shifts were 6.6 mm, 12.7 mm, and 6.3 mm, respectively). This range of motion cannot be safely covered by the 3-5 mm margin typically used in clinical practice (Reyngold et al 2019, which requires the use of real-time motion imaging. Furthermore, it is important to recognize that real-time motion does not necessarily follow the same order as the motion states in the 4D motion dictionary. Instead, it depends on the real-time respiratory position, which means that real-time signatures may match to non-continuous motion states in the dictionary. Considering the reported accuracy of the MLC for motion tracking of within ±1 mm (Klein et al 2009), this work assumes that as long as the signature mismatching results in a motion difference within the ±1 mm tolerance, there should be no direct impact on the tracking accuracy.

Irregular breathing and baseline drift
The MRSIGMA technique relies on the assumption that the motion amplitude range during dictionary training will be similar to that during signature matching (Feng et al 2020). Although, there were no significant baseline drifts observed in the patient cohort of this study, it is not uncommon that intra-fractional baseline drift occurs in some patients due to changes of stress or relaxation levels before and during the treatment procedure or changes between abdominal and thoracic breathing. It is reported that baseline drift of the external respiratory surrogate signal in the abdomen was observed in 40% of the patients (Quirk et al 2013), but this observation may not directly apply to the internal respiratory surrogate signal (i.e. self-gating from the k-space data) used in this work since the external-internal correlation remains unclear.
A few strategies can be potentially used to overcome the challenges due to baseline drift. First, we can temporarily pause radiation delivery when a baseline drift is detected and wait until the respiratory motion amplitude returns to the learned range to resume the treatment. Ideally, the reliability of real-time motion tracking should be estimated concurrently to ensure safe treatment, such as the uncertainty (or confidence) maps used in the work by (Huttinga et al 2023). In this work, the standard deviation of the Dice coefficient between the MRSIGMA and reference contours over time was relatively small (0.05-0.06), which indicates a low uncertainty or high precision of the MRSIGMA technique for real-time motion tracking. Second, the 4D motion dictionary can be updated to reflect the new breathing position if the baseline drift persists, which requires a more time efficient re-learning process, such as using deep learning-based image reconstruction. Third, a visual feedback system can be used to guide the patient's breathing pattern so that it can be kept relatively consistent during the treatment (Swamidas et al 2021).
It should be noted that the real-time scan was performed immediately after the training scan to minimize the patient table time, which also explains why no datasets exhibited respiratory motion drift in this work. In practice, however, a few minutes might be needed to reconstruct the 4D motion dictionary and to prepare for the contours before radiation delivery with real-time motion tracking. It is possible that there is a breathing pattern change or baseline drift from the training scan to the real-time scan that compromises the accuracy of real-time motion tracking, which can be potentially addressed using the strategies discussed earlier when baseline drifts are identified.
In the event of a sudden change in the patient's breathing level during real-time motion tracking, the MRSIGMA framework will detect an out-of-range motion signature within a timeframe of less than 300 ms. This detection will trigger an automatic pause in radiation delivery until the motion returns to its previous level. Moreover, in cases where there is persistent baseline drift, the motion dictionary will continuously update with the newly acquired data to accurately reflect the new breathing positions. As a result, the MRSIGMA framework demonstrates to be a robust solution to handle sudden changes in respiratory motion or persistent motion drift during real-time motion tracking.

Contouring
Although automatic deformable propagation was used along the dimension of respiratory motion states, minor manual adjustment was needed in cases of low-contrast boundaries to correct the misaligned contours. In addition, the GTV and OAR contours were manually delineated on the first motion state (end-expiration) of the training and real-time scans by two physicians. This is a time-consuming and labor-intensive process. An attempt was made to directly transfer the clinical contours used during treatment to our MRSIGMA images, but some manual adjustments were still needed due to differences in patient geometry (e.g. the use of compression belt) and image contrast (T2-weighted versus T1-weighted images). Recent studies have shown great promise of using auto-segmentation for MRI-guided online adaptive radiotherapy (Zhang et al 2020, Ding et al 2022. The combination of MRSIGMA with auto-segmentation of tumors and OARs provides an efficient clinical implementation for real-time adaptive radiotherapy.
The total latency calculation did not account for the time required for contour propagation. However, it is important to note that contour generation and propagation are conducted on the 3D images within the motion dictionary prior to real-time imaging and signature matching. Therefore, the time taken for contour propagation is considered part of the motion learning process and does not contribute to the latency during real-time motion tracking. Future work will explore the use of a deep learning-based auto-segmentation approach to expediate the contour preparation step. This advancement is expected to add only a few seconds to the motion learning phase and will not increase the overall latency during real-time motion tracking.

Image contrast
The training and real-time scans were T1-weighted due to short TR and TE, and the use of a strong spoiling gradient at the end of each excitation to eliminate residual signals in the transverse plane. The T1-weighted scan with fat suppression provides sufficient image contrast for the delineation of the OARs (duodenum-stomach and small bowel), but it remains difficult to accurately define the GTV for pancreatic cancer in some cases due to the low contrast between the tumor and adjacent tissue. In clinical practice, both T1-weighted and T2-weighted image contrasts were used synergistically to define the GTV during treatment planning (Tringale et al 2022). However, it is time inefficient to acquire T2-weighted image contrast for real-time motion tracking with the MRSIGMA framework due to the need for long TR and TE.
In fact, the current clinical approach for real-time motion tracking is to use 2D cine MRI in orthogonal views with a balanced steady-state free precession (bSSFP) sequence due to its high SNR efficiency. 3D golden-angle radial stack-of-stars sampling can be combined with bSSFP to achieve T2/T1-weighted image contrast for improved tumor delineation, particularly on the 1.5T MR-Linac system where bSSFP is less sensitive to susceptibility variations compared to 3 T MRI scanners (Goncalves et al 2012). Another approach is to first register the 3D T2-weighted images to the first motion state of the T1-weighted 4D motion dictionary, and then to deform the post-registered 3D T2-weighted images using the deformation vector fields (DVFs) extracted from the T1-weighted 4D motion dictionary, yielding a new 4D dictionary with T2-weighted image contrast . The T2-weighted 4D dictionary can be used collectively with the T1-weighted 4D dictionary for improved tumor and OAR visualization and contouring. This approach does not need additional time for data acquisition and contouring and the same latency can be preserved since signature acquisition and matching are based solely on T1-weighted images.

Dose accumulation
Accurate volumetric mapping of dose accumulation in the tumor target and OARs are important for adaptive radiotherapy and could potentially enable safe escalation of the radiation dose to the tumor target. The MRSIGMA technique not only allows real-time guidance of radiation delivery, but also enables intra-fractional dose accumulation analysis based on 3D motion and dose distribution of the GTV and OARs. A recent study reported intra-fractional dose accumulation analysis with 4D MRI in patients with liver metastases on the Unity MR-Linac system (van de Lindt et al 2022), but the accumulated dose was calculated from four separate 4D MRI scans in a fraction and therefore it was only an approximation of the actual delivered dose. In this work, realtime 3D motion is available during radiation delivery, which would enable a more accurate estimation of the accumulated dose.

Comparison to 2D orthogonal Cine MRI
Even though 2D orthogonal cine MRI was acquired during treatment, it cannot be used as an independent reference to assess the MRSIGMA motion signature generation and matching. First, treatment (and 2D cine MRI) at our institution was performed using a compression belt to minimize organ motion, while MRSIGMA scans were performed during free breathing after removing the compressed belt. Second, it is worth noting that 2D motion may not capture the full extent of organ motion when they move in and out of the imaging slice. Therefore, caution should be exercised when directly comparing the motion observed in 3D MRSIGMA scans with 2D orthogonal cine MRI scans. Instead, the data acquired during real-time signature matching, utilizing the same sampling trajectory and sequence parameters as the motion learning used in this study, provide a unique way to retrospectively evaluate the performance of MRSIGMA for real-time motion tracking. In previous studies, it has been reported that lung tumor motion is primarily characterized by translation, with minimal deformation during normal respiration (Wu et al 2009, Li et al 2013. Pulmonary nodules have been observed to exhibit small volumetric changes at different respiratory phases (Li et al 2013). However, it is important to note that relative changes in the volumes of the GTV and OARs in the abdomen may not accurately reflect the uncertainties associated with volume determination. This consideration is particularly relevant for abdominal tumors such as pancreatic cancer and OARs such as duodenum-stomach and small bowel. These organs experience substantial deformation due to respiratory motion and the interconnecting nature of the tumor with its surrounding structures. Additionally, it is worth acknowledging that the measured volume may not represent the entire organ, such as the small bowel, due to the limited spatial coverage in the superior-inferior direction, which is necessary to achieve relatively low imaging latency. Consequently, the conservation of volume may not hold even under normal respiration.

Limitations
This work has several limitations that need discussion. First, image reconstruction of the 4D motion dictionary was relatively slow (tens of minutes) due to iterative compressed sensing reconstruction. A deep learning-based 4D MRI image reconstruction model with GPU acceleration would significantly speed up the generation of the 4D motion dictionary, as shown in recent studies (Freedman et al 2021. Second, this work aims to demonstrate the feasibility of MRSGIMA for real-time motion imaging on a 1.5T MR-Linac system, but the application of the output of MRSIGMA to shape the radiation beam in real time using the MLC is yet to be demonstrated. While we acknowledge that reconstruction, signature matching, and motion tracking were performed in a non-real-time scenario, this represents a crucial step towards establishing a true online workflow for real-time motion tracking. It is important to highlight that we are actively working on implementing parallel data acquisition and transmission, developing deep learning-based 4D motion dictionary generation, as well as enhancing real-time signature matching and motion tracking capabilities . The outcomes of these ongoing efforts will be reported in future publications. Third, although the standard deviation of the Dice coefficient reveals the precision or uncertainty of real-time motion tracking, it is only available retrospectively. A more robust and instantaneous mechanism is needed to continuously monitor the patient respiratory pattern during treatment, so that any baseline drift or bulk patient motion can be immediately detected, and an action (e.g. to pause or resume treatment) can be taken during radiation delivery. Finally, the current MRSIGMA framework is limited to handle respiration-induced organ motion and therefore needs modification to the imaging concept in the context of irregular motion, such as peristalsis of the digestive system.

Conclusion
This work demonstrated the feasibility of real-time 4D MRI for volumetric motion imaging with latency less than 300 ms on a 1.5T MR-Linac system using the MRSIGMA technique. The retrospective self-evaluation demonstrated MRSIGMA with high accuracy and low uncertainty for real-time motion tracking of the GTV and OARs in patients with inoperable pancreatic cancer. This work is an essential step towards continuous treatment adaptation of tumors affected by respiratory motion and could ultimately improve treatment safety by optimizing ablative dose delivery near gastrointestinal organs.