Focus on Machine Learning Models in Medical Imaging

Guest Editors

Dr Giorgos Papanastasiou, University of Essex, UK
Dr Alba García Seco de Herrera, University of Essex, UK
Dr Chengjia Wang, University of Edinburgh, UK
Prof Heye Zhang, Sun Yat-sen University, China
Dr Guang Yang, Imperial College London, UK
Prof Ge Wang, Rensselaer Polytechnic Institute, USA

Scope

Computational medical imaging techniques aim to enhance the diagnostic performance of visual assessments in medical imaging, improving the early diagnosis of various diseases and helping to obtain a deeper understanding of physiology and pathology.

Machine Learning (ML) models have revolutionised multiple tasks in medical image computing, such as image segmentation, registration and synthesis, through the extensive analysis of big imaging data. Although ML models outperform classic approaches on these tasks, they remain, to a large extent, implicit in terms of describing the data under investigation. This limits ML model interpretability, which is one of the main barriers to ML-based pathology detection and to generalised single- or multi-modal ML analysis in medical imaging. In modern clinical practice, detailed explanations of model behaviour are increasingly required to support reliability and improve clinical decision making. Moreover, while explainability is one of the most promising topics in ML/medical imaging research, the main challenge in developing explainable models is to offer insights and rationales whilst maintaining high learning performance.

With this joint focus issue between Physics in Medicine and Biology and the multidisciplinary open access journal Machine Learning: Science and Technology, we aim to attract original high-quality research and survey articles that reflect the most recent advances in ML models in medical imaging (MRI, CT, PET, SPECT, Ultrasound and others), investigating novel methodologies through interpreting algorithm components and/or exploring algorithm-data relationships.

Articles will be published in one of two participating journals and we are happy to let authors choose which journal to submit to based on the criteria below. If an article submitted to one journal is found to be unsuitable for consideration, but suitable for the other, the assessing Editor will offer the author an opportunity to transfer their article. This means that duplication of peer review effort can be largely eliminated as a service to our authors.

Physics in Medicine and Biology encourages the submission of papers that focus on medical interpretation, clinical impact, applications and modalities, while Machine Learning: Science and Technology encourages the submission of papers that focus on the methodology and physics-based interpretation of the technical aspects of machine learning models.

Topics:

We welcome researchers from academia, clinics and industry to present their state-of-the-art scientific developments covering all aspects of ML models in medical imaging.

Potential topics include but are not limited to:

  • Development and interpretation of ML models in single- or multi-modal (MRI, CT, Ultrasound, PET, SPECT) imaging
  • Multi-task learning on multi-modality medical images
  • Solidifying explainability in cross-domain image synthesis between different imaging modalities or sequences (e.g. between different MRI sequences, or between MRI and CT)
  • Transfer learning for single- or multi-modality medical images
  • ML model explainability in semi-supervised, weakly-supervised and unsupervised learning in medical imaging
  • Enhancing explainability through ML models that detect or predict pathological versus healthy status
  • Improving explainability by combining ML with biophysical modelling and/or visual assessments from additional/complementary imaging modalities (e.g. multiple sequences in MRI, or combining MRI with Ultrasound, CT, PET or SPECT)
  • Improving explainability by combining ML with other types of "reference standard" input data (e.g. clinical data, electrophysiology signals, molecular analysis, invasive methods) that can enhance ML interpretability in medical imaging
  • Explaining the strengths and weaknesses of ML models through quantitative evaluation and interpretation of algorithm performance, especially mechanisms of adversarial attacks and associated solutions

Webinar

A webinar for this focus issue, 'Focus on machine learning models in medical imaging', is available for viewing.

Papers

Physics in Medicine and Biology

Deep residual-SVD network for brain image registration

Kunpeng Cui et al 2022 Phys. Med. Biol. 67 144002

Objective. Medical image registration aims to find the deformation field that can align two images in a spatial position. Medical image registration methods based on the U-Net architecture have recently been proposed. However, the U-Net architecture has few training parameters, which leads to weak learning ability, and it ignores the adverse effects of image noise on registration accuracy. This article addresses the problems of weak network learning ability and the adverse effects of noisy images on registration. Approach. Here we propose a novel unsupervised 3D brain image registration framework, which introduces a residual unit and a singular value decomposition (SVD) denoising layer into the U-Net architecture. The residual unit solves the problem of network degradation, that is, registration accuracy becomes saturated and then degrades rapidly as network depth increases. The SVD denoising layer uses the estimated model order for SVD-based low-rank image reconstruction. We use the Akaike information criterion to estimate the appropriate model order, which is used to remove noise components. We use the exponential linear unit (ELU) as the activation function, which is more robust to noise than its peers. Main results. The proposed method is evaluated on the publicly available brain MRI datasets Mindboggle101 and LPBA40. Experimental results demonstrate that our method outperforms several state-of-the-art methods in terms of Dice score. The mean number of folding voxels and the registration time are comparable to state-of-the-art methods. Significance. This study shows that the Deep Residual-SVD Network can improve registration accuracy. It also demonstrates that the residual unit can enhance the learning ability of the network, the SVD denoising layer can denoise effectively, and the ELU is more robust to noise.
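
As a rough illustration of the SVD denoising idea described above, the sketch below reconstructs a low-rank image and picks the rank with an AIC-style criterion. This is a minimal sketch, not the authors' implementation; the paper's exact AIC formulation may differ.

```python
# Minimal sketch: low-rank SVD denoising with an AIC-style
# model-order estimate (simplified, illustrative form).
import numpy as np

def svd_denoise(image: np.ndarray) -> np.ndarray:
    """Low-rank reconstruction of a 2D slice with an estimated rank."""
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    n = s.size
    aic = np.empty(n - 1)
    for k in range(1, n):
        # Residual variance of discarded components (data fit),
        # penalized by the number of retained components.
        resid = np.mean(s[k:] ** 2)
        aic[k - 1] = n * np.log(resid) + 2 * k
    k_opt = int(np.argmin(aic)) + 1
    return (U[:, :k_opt] * s[:k_opt]) @ Vt[:k_opt]

noisy = np.random.rand(64, 64)  # stand-in for a noisy brain MRI slice
denoised = svd_denoise(noisy)
```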

Dynamic imaging using motion-compensated smoothness regularization on manifolds (MoCo-SToRM)

Qing Zou et al 2022 Phys. Med. Biol. 67 144001

Objective. We introduce an unsupervised motion-compensated reconstruction scheme for high-resolution free-breathing pulmonary magnetic resonance imaging. Approach. We model the image frames in the time series as deformed versions of a 3D template image volume. We assume the deformation maps to be points on a smooth manifold in high-dimensional space. Specifically, we model the deformation map at each time instant as the output of a CNN-based generator, with the same weights for all time frames, driven by a low-dimensional latent vector. The time series of latent vectors accounts for the dynamics in the dataset, including respiratory motion and bulk motion. The template image volume, the parameters of the generator, and the latent vectors are learned directly from the k-t space data in an unsupervised fashion. Main results. Our experimental results show improved reconstructions compared to state-of-the-art methods, especially in the context of bulk motion during the scans. Significance. The proposed unsupervised motion-compensated scheme jointly estimates the latent vectors that capture the motion dynamics, the corresponding deformation maps, and the reconstructed motion-compensated images from the raw k-t space data of each subject. Unlike current motion-resolved strategies, the proposed scheme is more robust to bulk motion events during the scan.
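
The core idea, one shared generator mapping a per-frame latent vector to a deformation field that warps a common template, can be sketched as follows. This is a 2D toy with invented layer sizes, not the authors' 3D architecture.

```python
# Minimal 2D sketch of latent-driven deformation: shared generator,
# one latent code per time frame, template warped by the output field.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformationGenerator(nn.Module):
    def __init__(self, latent_dim=2):
        super().__init__()
        # Map the latent code to a coarse 2-channel (dx, dy) field.
        self.fc = nn.Linear(latent_dim, 2 * 8 * 8)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(2, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 2, 4, stride=2, padding=1),
        )

    def forward(self, z):
        field = self.fc(z).view(-1, 2, 8, 8)
        return self.up(field)  # (B, 2, 32, 32) displacement field

def warp(template, disp):
    # Sampling grid = identity grid + displacement, then resample.
    B, _, H, W = disp.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(B, -1, -1, -1)
    grid = grid + disp.permute(0, 2, 3, 1)  # channels assumed (dx, dy)
    return F.grid_sample(template, grid, align_corners=True)

gen = DeformationGenerator()
z_t = torch.randn(1, 2)              # latent code for one time frame
template = torch.rand(1, 1, 32, 32)  # shared template image
frame_t = warp(template, gen(z_t))   # deformed frame at time t
```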

Open access
Assessment of data consistency through cascades of independently recurrent inference machines for fast and robust accelerated MRI reconstruction

D Karkalousos et al 2022 Phys. Med. Biol. 67 124001

Objective. Machine Learning methods can learn how to reconstruct magnetic resonance images (MRI) and thereby accelerate acquisition, which is of paramount importance to the clinical workflow. Physics-informed networks incorporate the forward model of accelerated MRI reconstruction in the learning process. With increasing network complexity, robustness is not ensured when reconstructing data unseen during training. We aim to embed data consistency (DC) in deep networks while balancing the degree of network complexity. While doing so, we assess whether explicit or implicit enforcement of DC in varying network architectures is preferable for optimizing performance. Approach. We propose a scheme called Cascades of Independently Recurrent Inference Machines (CIRIM) to assess DC through unrolled optimization. Herein we assess DC both implicitly by gradient descent and explicitly by a designed term. Extensive comparison of the CIRIM to compressed sensing as well as other Machine Learning methods is performed: the End-to-End Variational Network (E2EVN), CascadeNet, KIKINet, LPDNet, RIM, IRIM, and UNet. Models were trained and evaluated on T1-weighted and FLAIR contrast brain data, and T2-weighted knee data. Both 1D and 2D undersampling patterns were evaluated. Robustness was tested by reconstructing 7.5× prospectively undersampled 3D FLAIR MRI data of multiple sclerosis (MS) patients with white matter lesions. Main results. The CIRIM performed best when implicitly enforcing DC, while the E2EVN required an explicit DC formulation. Through its cascades, the CIRIM was able to score higher on structural similarity and PSNR compared to other methods, in particular under heterogeneous imaging conditions. In reconstructing MS patient data, prospectively acquired with a sampling pattern unseen during model training, the CIRIM maintained lesion contrast while efficiently denoising the images. Significance. The CIRIM showed highly promising generalization capabilities, maintaining a favourable trade-off between reconstructed image quality and fast reconstruction times, which is crucial in the clinical workflow.
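
An explicit data-consistency step of the kind interleaved with learned updates in unrolled schemes can be written as a gradient step on the k-space fidelity term. The single-coil masked-FFT forward model below is an illustrative simplification, not the CIRIM itself, which wraps recurrent inference machine cascades around such steps.

```python
# Minimal sketch: one explicit DC gradient step on ||M F x - y||^2,
# with the forward model A modelled as mask * FFT (single coil).
import torch

def dc_gradient_step(x, y, mask, step=1.0):
    """One gradient step toward k-space consistency for complex image x."""
    kspace = mask * torch.fft.fft2(x, norm="ortho")
    grad = torch.fft.ifft2(mask * (kspace - y), norm="ortho")
    return x - step * grad

H = W = 64
mask = (torch.rand(H, W) < 0.3).to(torch.complex64)  # undersampling mask
truth = torch.randn(H, W, dtype=torch.complex64)
y = mask * torch.fft.fft2(truth, norm="ortho")       # measured k-space
x = torch.fft.ifft2(y, norm="ortho")                 # zero-filled init
for _ in range(10):                                  # unrolled iterations
    x = dc_gradient_step(x, y, mask)
```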

An inception network for positron emission tomography based dose estimation in carbon ion therapy

Harley Rutherford et al 2022 Phys. Med. Biol. 67 194001

Objective. We aim to evaluate a method for estimating 1D physical dose deposition profiles in carbon ion therapy via analysis of dynamic PET images using a deep residual learning convolutional neural network (CNN). The method is validated using Monte Carlo simulations of 12C ion spread-out Bragg peak (SOBP) profiles, and demonstrated with an experimental PET image. Approach. A set of dose deposition and positron annihilation profiles for monoenergetic 12C ion pencil beams in PMMA are first generated using Monte Carlo simulations. From these, a set of random polyenergetic dose and positron annihilation profiles are synthesised and used to train the CNN. Performance is evaluated by generating a second set of simulated 12C ion SOBP profiles (one 116 mm SOBP profile and ten 60 mm SOBP profiles), and using the trained neural network to estimate the dose profile deposited by each beam and the position of the distal edge of the SOBP. Next, the same methods are used to evaluate the network using an experimental PET image, obtained after irradiating a PMMA phantom with a 12C ion beam at QST's Heavy Ion Medical Accelerator in Chiba facility in Chiba, Japan. The performance of the CNN is compared to that of a recently published iterative technique using the same simulated and experimental 12C SOBP profiles. Main results. The CNN estimated the simulated dose profiles with a mean relative error (MRE) of 0.7% ± 1.0% and the distal edge position with an accuracy of 0.1 mm ± 0.2 mm, and estimated the dose delivered by the experimental 12C ion beam with an MRE of 3.7%, and the distal edge with an accuracy of 1.7 mm. Significance. The CNN was able to produce estimates of the dose distribution with comparable or improved accuracy and computational efficiency compared to the iterative method and other similar PET-based direct dose quantification techniques.
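
A 1D residual CNN that maps an annihilation profile to a dose profile, in the spirit of the method above, might look like the sketch below; the channel widths and block counts are guesses, not the published architecture.

```python
# Minimal sketch: 1D residual CNN mapping a positron-annihilation
# profile to a dose-deposition profile. Layer sizes are illustrative.
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(ch, ch, 5, padding=2), nn.ReLU(),
            nn.Conv1d(ch, ch, 5, padding=2),
        )
    def forward(self, x):
        return torch.relu(x + self.body(x))  # residual connection

class ActivityToDose(nn.Module):
    def __init__(self, ch=32, n_blocks=4):
        super().__init__()
        self.head = nn.Conv1d(1, ch, 5, padding=2)
        self.blocks = nn.Sequential(*[ResBlock1d(ch) for _ in range(n_blocks)])
        self.tail = nn.Conv1d(ch, 1, 1)
    def forward(self, activity):
        return self.tail(self.blocks(self.head(activity)))

net = ActivityToDose()
profile = torch.rand(1, 1, 256)  # stand-in PET annihilation profile
dose = net(profile)              # estimated 1D dose profile
```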

Open access
A deep learning and Monte Carlo based framework for bioluminescence imaging center of mass-guided glioblastoma targeting

Behzad Rezaeifar et al 2022 Phys. Med. Biol. 67 144003

Objective. Bioluminescence imaging (BLI) is a valuable tool for non-invasive monitoring of glioblastoma multiforme (GBM) tumor-bearing small animals without incurring an x-ray radiation burden. However, the use of this imaging modality is limited due to photon scattering and lack of spatial information. Attempts at reconstructing bioluminescence tomography (BLT) using mathematical models of light propagation have shown limited progress. Approach. This paper employed a different approach, using a deep convolutional neural network (CNN) to predict the tumor's center of mass (CoM). Transfer learning with a sizeable artificial database is employed to facilitate the training process for the much smaller target database, which includes Monte Carlo (MC) simulations of real orthotopic glioblastoma models. The predicted CoM was then used to estimate a BLI-based planning target volume (bPTV) by using the CoM as the center of a sphere encompassing the tumor. The volume of the encompassing target sphere was estimated based on the total number of photons reaching the skin surface. Main results. Results show sub-millimeter accuracy for CoM prediction, with a median error of 0.59 mm. The proposed method also provides promising performance for BLI-based tumor targeting, with on average 94% of the tumor inside the bPTV while keeping the average healthy tissue coverage below 10%. Significance. This work introduced a framework for developing and using a CNN for targeted radiation studies of GBM based on BLI. The framework will enable biologists to use BLI as their main image-guidance tool to target GBM tumors in rat models, avoiding the delivery of a high x-ray imaging dose to the animals.
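
The targeting step, turning a predicted CoM and a photon count into a spherical bPTV on a voxel grid, can be sketched as below. The radius-from-photons mapping is a placeholder assumption; the paper calibrates the encompassing volume from the photons reaching the skin surface.

```python
# Minimal sketch: build a spherical bPTV around a CNN-predicted CoM.
import numpy as np

def sphere_mask(shape, com_mm, radius_mm, voxel_mm=0.2):
    zz, yy, xx = np.indices(shape).astype(float) * voxel_mm
    d2 = (zz - com_mm[0])**2 + (yy - com_mm[1])**2 + (xx - com_mm[2])**2
    return d2 <= radius_mm**2

def radius_from_photons(total_photons, k=1e-2):
    # Placeholder monotone mapping: target volume ~ photon count.
    volume = k * total_photons                    # mm^3 (assumed)
    return (3.0 * volume / (4.0 * np.pi)) ** (1 / 3)

com = np.array([6.0, 5.0, 5.0])                   # predicted CoM (mm)
r = radius_from_photons(total_photons=5e4)
bptv = sphere_mask((64, 64, 64), com, r)
print(f"radius = {r:.2f} mm, bPTV voxels = {bptv.sum()}")
```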

Training low dose CT denoising network without high quality reference data

Jie Jing et al 2022 Phys. Med. Biol. 67 084002

Objective. Currently, the field of low-dose CT (LDCT) denoising is dominated by supervised learning based methods, which need perfectly registered pairs of LDCT images and their corresponding clean reference images (normal-dose CT). However, training without clean labels is more practically feasible and significant, since it is clinically impossible to acquire a large amount of these paired samples. In this paper, a self-supervised denoising method is proposed for LDCT imaging. Approach. The proposed method does not require any clean images. In addition, a perceptual loss is used to achieve data consistency in the feature domain during the denoising process. Attention blocks used in the decoding phase help further improve the image quality. Main results. In the experiments, we validate the effectiveness of our proposed self-supervised framework and compare our method with several state-of-the-art supervised and unsupervised methods. The results show that our proposed model achieves performance competitive with other methods in both qualitative and quantitative respects. Significance. Our framework can be directly applied to most denoising scenarios without collecting pairs of training data, which is more flexible for real clinical scenarios.
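
A perceptual loss of the kind used for feature-domain consistency can be sketched with a fixed pretrained feature extractor. The choice of VGG16 and of the layer cut-off is an assumption for illustration, and ImageNet input normalization is omitted for brevity.

```python
# Minimal sketch: perceptual (feature-space) loss with a frozen VGG16.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

extractor = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in extractor.parameters():
    p.requires_grad_(False)  # fixed feature extractor

def perceptual_loss(denoised, reference):
    # CT slices are single channel; VGG expects 3 channels.
    d = denoised.repeat(1, 3, 1, 1)
    r = reference.repeat(1, 3, 1, 1)
    return F.mse_loss(extractor(d), extractor(r))

ldct = torch.rand(2, 1, 128, 128)   # stand-in low-dose CT batch
denoised = ldct                     # would be the denoiser's output
loss = perceptual_loss(denoised, ldct.detach())
```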

A two-step method to improve image quality of CBCT with phantom-based supervised and patient-based unsupervised learning strategies

Yuxiang Liu et al 2022 Phys. Med. Biol. 67 084001

Objective. In this study, we aimed to develop a deep learning framework to improve cone-beam computed tomography (CBCT) image quality for adaptive radiation therapy (ART) applications. Approach. Paired CBCT and planning CT images of 2 pelvic phantoms and 91 patients (15 patients for testing) diagnosed with prostate cancer were included in this study. First, well-matched images of rigid phantoms were used to train a U-net, a supervised learning strategy used to reduce serious artifacts. Second, the phantom-trained U-net generated intermediate CT images from the patient CBCT images. Finally, a cycle-consistent generative adversarial network (CycleGAN) was trained with intermediate CT images and deformed planning CT images, an unsupervised learning strategy used to learn the style of the patient images for further improvement. When testing or applying the trained model on patient CBCT images, the intermediate CT images were generated from the original CBCT image by the U-net, and then the synthetic CT images were generated by the generator of the CycleGAN with the intermediate CT images as input. The performance was compared with conventional methods (U-net/CycleGAN alone trained with patient images) on the test set. Results. The proposed two-step method effectively improved the CBCT image quality to the level of CT scans. It outperformed conventional methods for region-of-interest contouring and HU calibration, which are important to ART applications. Compared with the U-net alone, it maintained the structure of CBCT. Compared with CycleGAN alone, our method improved the accuracy of the CT numbers and effectively reduced the artifacts, making it more helpful for identifying the clinical target volume. Significance. This novel two-step method improves CBCT image quality by combining phantom-based supervised and patient-based unsupervised learning strategies. It has immense potential to be integrated into the ART workflow to improve radiotherapy accuracy.

Improving mammography lesion classification by optimal fusion of handcrafted and deep transfer learning features

Meredith A Jones et al 2022 Phys. Med. Biol. 67 054001

Objective. Handcrafted radiomics features or deep learning model-generated automated features are commonly used to develop computer-aided diagnosis (CAD) schemes for medical images. The objective of this study is to test the hypothesis that handcrafted and automated features contain complementary classification information and that fusion of these two types of features can improve CAD performance. Approach. We retrospectively assembled a dataset involving 1535 lesions (740 malignant and 795 benign). Regions of interest (ROI) surrounding suspicious lesions are extracted and two types of features are computed from each ROI. The first includes 40 radiomic features and the second includes automated features computed from a VGG16 network using a transfer learning method. A single-channel ROI image is converted to three-channel pseudo-ROI images by stacking the original image, a bilateral filtered image, and a histogram-equalized image. Two VGG16 models, using pseudo-ROIs and 3 stacked original ROIs without pre-processing, are used to extract automated features. Five linear support vector machines (SVM) are built using the optimally selected feature vectors from the handcrafted features, the two sets of VGG16 model-generated automated features, and the fusion of the handcrafted features with each set of automated features, respectively. Main Results. Using 10-fold cross-validation, the fusion SVM using pseudo-ROIs yields the highest lesion classification performance with area under the ROC curve (AUC = 0.756 ± 0.042), which is significantly higher than those yielded by the other SVMs trained using handcrafted or automated features only (p < 0.05). Significance. This study demonstrates that both handcrafted and automated features contain useful information for classifying breast lesions. Fusion of these two types of features can further increase CAD performance.
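
The fusion step itself is simple: concatenate the two feature vectors per ROI and train a linear SVM. The sketch below stubs feature extraction with random arrays and omits the paper's optimal feature selection; the dimensions (40 handcrafted, 512 deep) follow the setup only loosely.

```python
# Minimal sketch: feature-level fusion of handcrafted and CNN features,
# classified with a linear SVM under 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n = 200
handcrafted = rng.normal(size=(n, 40))   # radiomic features per ROI (stub)
deep = rng.normal(size=(n, 512))         # VGG16-derived features (stub)
labels = rng.integers(0, 2, size=n)      # benign / malignant

fused = np.hstack([handcrafted, deep])   # feature-level fusion
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, dual=False))
scores = cross_val_score(clf, fused, labels, cv=10)
print(f"mean accuracy: {scores.mean():.3f}")
```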

Phase function estimation from a diffuse optical image via deep learning

Yuxuan Liang et al 2022 Phys. Med. Biol. 67 074001

Objective. The phase function is a key element of a light propagation model for Monte Carlo (MC) simulation, which is usually fitted with an analytic function with associated parameters. In recent years, machine learning methods were reported to estimate the parameters of the phase function of a particular form such as the Henyey–Greenstein phase function but, to our knowledge, no studies have been performed to determine the form of the phase function. Approach. Here we design a convolutional neural network (CNN) to estimate the phase function from a diffuse optical image without any explicit assumption on the form of the phase function. Specifically, we use a Gaussian mixture model (GMM) as an example to represent the phase function generally and learn the model parameters accurately. The GMM is selected because it provides the analytic expression of phase function to facilitate deflection angle sampling in MC simulation, and does not significantly increase the number of free parameters. Main Results. Our proposed method is validated on MC-simulated reflectance images of typical biological tissues using the Henyey–Greenstein phase function with different anisotropy factors. The mean squared error of the phase function is 0.01 and the relative error of the anisotropy factor is 3.28%. Significance. We propose the first data-driven CNN-based inverse MC model to estimate the form of scattering phase function. The effects of field of view and spatial resolution are analyzed and the findings provide guidelines for optimizing the experimental protocol in practical applications.
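
For intuition, the sketch below represents a phase function as a Gaussian mixture over μ = cos θ and samples deflection angles from it, as a Monte Carlo simulator would; the mixture parameters are illustrative, not fitted values from the paper.

```python
# Minimal sketch: sample deflection cosines from a Gaussian mixture
# phase-function model and estimate the anisotropy factor g.
import numpy as np

rng = np.random.default_rng(1)
weights = np.array([0.7, 0.3])   # mixture weights (sum to 1)
means = np.array([0.9, 0.2])     # forward-peaked + diffuse lobes
stds = np.array([0.05, 0.3])

def sample_mu(n):
    comp = rng.choice(len(weights), size=n, p=weights)
    mu = rng.normal(means[comp], stds[comp])
    return np.clip(mu, -1.0, 1.0)  # mu = cos(theta) must lie in [-1, 1]

mu = sample_mu(100_000)
g = mu.mean()                      # anisotropy factor g = <cos(theta)>
print(f"estimated anisotropy factor g = {g:.3f}")
```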

Open access
Effect of dataset size, image quality, and image type on deep learning-based automatic prostate segmentation in 3D ultrasound

Nathan Orlando et al 2022 Phys. Med. Biol. 67 074002

Three-dimensional (3D) transrectal ultrasound (TRUS) is utilized in prostate cancer diagnosis and treatment, necessitating time-consuming manual prostate segmentation. We have previously developed an automatic 3D prostate segmentation algorithm involving deep learning prediction on radially sampled 2D images followed by 3D reconstruction, trained on a large, clinically diverse dataset with variable image quality. As large clinical datasets are rare, widespread adoption of automatic segmentation could be facilitated with efficient 2D-based approaches and the development of an image quality grading method. The complete training dataset of 6761 2D images, resliced from 206 3D TRUS volumes acquired using end-fire and side-fire acquisition methods, was split to train two separate networks using either end-fire or side-fire images. Split datasets were reduced to 1000, 500, 250, and 100 2D images. For deep learning prediction, modified U-Net and U-Net++ architectures were implemented and compared using an unseen test dataset of 40 3D TRUS volumes. A 3D TRUS image quality grading scale with three factors (acquisition quality, artifact severity, and boundary visibility) was developed to assess the impact on segmentation performance. For the complete training dataset, U-Net and U-Net++ networks demonstrated equivalent performance, but when trained using split end-fire/side-fire datasets, U-Net++ significantly outperformed the U-Net. Compared to the complete training datasets, U-Net++ trained using reduced-size end-fire and side-fire datasets demonstrated equivalent performance down to 500 training images. For this dataset, image quality had no impact on segmentation performance for end-fire images but did have a significant effect for side-fire images, with boundary visibility having the largest impact. Our algorithm provided fast (<1.5 s) and accurate 3D segmentations across clinically diverse images, demonstrating generalizability and efficiency when employed on smaller datasets, supporting the potential for widespread use, even when data is scarce. The development of an image quality grading scale provides a quantitative tool for assessing segmentation performance.

Open access
Automatic contouring of normal tissues with deep learning for preclinical radiation studies

Georgios Lappas et al 2022 Phys. Med. Biol. 67 044001

Objective. Delineation of relevant normal tissues is a bottleneck in image-guided precision radiotherapy workflows for small animals. A deep learning (DL) model for automatic contouring using standardized 3D micro cone-beam CT (μCBCT) volumes as input is proposed, to provide a fully automatic, generalizable method for normal tissue contouring in preclinical studies. Approach. A 3D U-net was trained to contour organs in the head (whole brain, left/right brain hemisphere, left/right eye) and thorax (complete lungs, left/right lung, heart, spinal cord, thorax bone) regions. As an important preprocessing step, Hounsfield units (HUs) were converted to mass density (MD) values, to remove the energy dependency of the μCBCT scanner and improve generalizability of the DL model. Model performance was evaluated quantitatively by Dice similarity coefficient (DSC), mean surface distance (MSD), 95th percentile Hausdorff distance (HD95p), and center of mass displacement (ΔCoM). For qualitative assessment, DL-generated contours (for 40 and 80 kV images) were scored (0: unacceptable, manual re-contouring needed - 5: no adjustments needed). An uncertainty analysis using Monte Carlo dropout uncertainty was performed for delineation of the heart. Main results. The proposed DL model and accompanying preprocessing method provide high quality contours, with in general median DSC > 0.85, MSD < 0.25 mm, HD95p < 1 mm and ΔCoM < 0.5 mm. The qualitative assessment showed very few contours needed manual adaptations (40 kV: 20/155 contours, 80 kV: 3/155 contours). The uncertainty of the DL model is small (within 2%). Significance. A DL-based model dedicated to preclinical studies has been developed for multi-organ segmentation in two body sites. For the first time, a method independent of image acquisition parameters has been quantitatively evaluated, resulting in sub-millimeter performance, while qualitative assessment demonstrated the high quality of the DL-generated contours. The uncertainty analysis additionally showed that inherent model variability is low.
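
The HU-to-mass-density preprocessing can be implemented as a piecewise-linear calibration curve; the anchor points below are generic assumptions, not the calibration used in the study.

```python
# Minimal sketch: piecewise-linear HU -> mass density conversion,
# making the network input independent of scanner energy settings.
import numpy as np

# (HU, mass density g/cm^3) anchors: air, water, dense bone (assumed).
hu_points = np.array([-1000.0, 0.0, 1500.0])
md_points = np.array([0.001, 1.0, 1.85])

def hu_to_mass_density(hu_volume: np.ndarray) -> np.ndarray:
    return np.interp(hu_volume, hu_points, md_points)

ct = np.random.uniform(-1000, 1500, size=(32, 32, 32))  # stand-in µCBCT
md = hu_to_mass_density(ct)  # scanner-independent network input
```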

Prospectively-validated deep learning model for segmenting swallowing and chewing structures in CT

Aditi Iyer et al 2022 Phys. Med. Biol. 67 024001

Objective. Delineating swallowing and chewing structures aids in radiotherapy (RT) treatment planning to limit dysphagia, trismus, and speech dysfunction. We aim to develop an accurate and efficient method to automate this process. Approach. CT scans of 242 head and neck (H&N) cancer patients acquired from 2004 to 2009 at our institution were used to develop auto-segmentation models for the masseters, medial pterygoids, larynx, and pharyngeal constrictor muscle using DeepLabV3+. A cascaded framework was used, wherein models were trained sequentially to spatially constrain each structure group based on prior segmentations. Additionally, an ensemble of models, combining contextual information from axial, coronal, and sagittal views, was used to improve segmentation accuracy. Prospective evaluation was conducted by measuring the amount of manual editing required in 91 H&N CT scans acquired from February to May 2021. Main results. Medians and inter-quartile ranges of Dice similarity coefficients (DSC) computed on the retrospective testing set (N = 24) were 0.87 (0.85–0.89) for the masseters, 0.80 (0.79–0.81) for the medial pterygoids, 0.81 (0.79–0.84) for the larynx, and 0.69 (0.67–0.71) for the constrictor. Auto-segmentations, when compared to two sets of manual segmentations in 10 randomly selected scans, showed better agreement (DSC) with each observer than inter-observer DSC. Prospective analysis showed most manual modifications needed for clinical use were minor, suggesting auto-contouring could increase clinical efficiency. Trained segmentation models are available for research use upon request via https://github.com/cerr/CERR/wiki/Auto-Segmentation-models. Significance. We developed deep learning-based auto-segmentation models for swallowing and chewing structures in CT and demonstrated their potential for use in treatment planning to limit complications post-RT. To the best of our knowledge, this is the only prospectively-validated deep learning-based model for segmenting chewing and swallowing structures in CT. The segmentation models have been made open-source to facilitate reproducibility and multi-institutional research.
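
The three-view ensembling can be sketched as running a 2D model over axial, coronal, and sagittal slices and averaging the probability maps; `model` below is a toy stand-in for the trained DeepLabV3+ networks, and simple averaging is one plausible fusion rule rather than the paper's exact scheme.

```python
# Minimal sketch: average per-slice predictions from three orthogonal
# views of a CT volume back into a single 3D probability map.
import numpy as np

def predict_slices(model, vol):  # vol: (Z, Y, X), slices along axis 0
    return np.stack([model(s) for s in vol])

def ensemble_views(model, vol):
    ax = predict_slices(model, vol)                               # axial
    co = predict_slices(model, vol.transpose(1, 0, 2)).transpose(1, 0, 2)
    sa = predict_slices(model, vol.transpose(2, 0, 1)).transpose(1, 2, 0)
    return (ax + co + sa) / 3.0          # averaged probability map

model = lambda s: (s > s.mean()).astype(float)  # toy per-slice "model"
vol = np.random.rand(16, 16, 16)
mask = ensemble_views(model, vol) > 0.5
```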

An ensemble learning method based on ordinal regression for COVID-19 diagnosis from chest CT

Xiaodong Guo et al 2021 Phys. Med. Biol. 66 244001

Coronavirus disease 2019 (COVID-19) has brought huge losses to the world, and it remains a great threat to public health. X-ray computed tomography (CT) plays a central role in the management of COVID-19. Traditional diagnosis with pulmonary CT images is time-consuming and error-prone, and cannot meet the need for precise and rapid COVID-19 screening. Nowadays, deep learning (DL) has been successfully applied to CT image analysis, assisting radiologists in workflow scheduling and treatment planning for patients with COVID-19. Traditional methods use cross-entropy as the loss function with a Softmax classifier following a fully-connected layer. Most DL-based classification methods target intraclass relationships in a certain class (early, progressive, severe, or dissipative phases), ignoring the natural order of the phases of disease progression, i.e., from an early stage to a late stage. To learn both intraclass and interclass relationships among different stages and improve the accuracy of classification, this paper proposes an ensemble learning method based on ordinal regression, which leverages the ordinal information on COVID-19 phases. The proposed method uses multi-binary, neuron stick-breaking (NSB), and soft labels (SL) techniques, and ensembles the ordinal outputs through median selection. To evaluate our method, we collected 172 confirmed cases. In a 2-fold cross-validation experiment, accuracy is increased by 22% compared with traditional methods when we use a modified ResNet-18 as the backbone, and precision, recall, and F1-score are also improved. The experimental results show that our proposed method achieves better classification performance than the traditional methods, which helps establish guidelines for the classification of COVID-19 chest CT images.
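
The multi-binary ordinal encoding and median fusion can be illustrated as follows: a K-class ordinal label becomes K-1 cumulative binary targets, each head predicts the thresholds, and the heads' decoded stages are fused by the median. All numbers below are toy values, not outputs of the paper's networks.

```python
# Minimal sketch: multi-binary ordinal encoding/decoding with
# median fusion across several ordinal heads.
import numpy as np

K = 4  # early, progressive, severe, dissipative

def encode(label: int) -> np.ndarray:
    # label k -> [1]*k + [0]*(K-1-k), e.g. 2 -> [1, 1, 0]
    return (np.arange(K - 1) < label).astype(float)

def decode(probs: np.ndarray) -> int:
    # Count thresholds passed with probability > 0.5.
    return int((probs > 0.5).sum())

# Toy threshold probabilities from three ordinal heads for one scan.
heads = np.array([[0.90, 0.80, 0.20],
                  [0.95, 0.40, 0.10],
                  [0.85, 0.70, 0.60]])
votes = [decode(p) for p in heads]  # [2, 1, 3]
prediction = int(np.median(votes))  # median-fused stage: 2
print(encode(2), prediction)
```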

Open access
A deep-learning method for generating synthetic kV-CT and improving tumor segmentation for helical tomotherapy of nasopharyngeal carcinoma

Xinyuan Chen et al 2021 Phys. Med. Biol. 66 224001

Objective: Megavoltage computed tomography (MV-CT) is used for setup verification and adaptive radiotherapy in tomotherapy. However, its low contrast and high noise lead to poor image quality. This study aimed to develop a deep-learning-based method to generate synthetic kilovoltage CT (skV-CT) and then evaluate its ability to improve image quality and tumor segmentation. Approach: The planning kV-CT and MV-CT images of 270 patients with nasopharyngeal carcinoma (NPC) treated on an Accuray TomoHD system were used. An improved cycle-consistent adversarial network which used residual blocks as its generator was adopted to learn the mapping between MV-CT and kV-CT and then generate skV-CT from MV-CT. A Catphan 700 phantom and 30 patients with NPC were used to evaluate image quality. The quantitative indices included contrast-to-noise ratio (CNR), uniformity and signal-to-noise ratio (SNR) for the phantom and the structural similarity index measure (SSIM), mean absolute error (MAE), and peak signal-to-noise ratio (PSNR) for patients. Next, we trained three models for segmentation of the clinical target volume (CTV): MV-CT, skV-CT, and MV-CT combined with skV-CT. The segmentation accuracy was compared with indices of the dice similarity coefficient (DSC) and mean distance agreement (MDA). Main results: Compared with MV-CT, skV-CT showed significant improvement in CNR (184.0%), image uniformity (34.7%), and SNR (199.0%) in the phantom study and improved SSIM (1.7%), MAE (24.7%), and PSNR (7.5%) in the patient study. For CTV segmentation with only MV-CT, only skV-CT, and MV-CT combined with skV-CT, the DSCs were 0.75 ± 0.04, 0.78 ± 0.04, and 0.79 ± 0.03, respectively, and the MDAs (in mm) were 3.69 ± 0.81, 3.14 ± 0.80, and 2.90 ± 0.62, respectively. Significance: The proposed method improved the image quality of MV-CT and thus tumor segmentation in helical tomotherapy. The method potentially can benefit adaptive radiotherapy.

Abdominal synthetic CT reconstruction with intensity projection prior for MRI-only adaptive radiotherapy

Sven Olberg et al 2021 Phys. Med. Biol. 66 204001

Objective. Owing to the superior soft tissue contrast of MRI, MRI-guided adaptive radiotherapy (ART) is well-suited to managing interfractional changes in anatomy. An MRI-only workflow is desirable, but producing synthetic CT (sCT) data through paired data-driven deep learning (DL) for abdominal dose calculations remains a challenge due to the highly variable presence of intestinal gas. We present the preliminary dosimetric evaluation of our novel approach to sCT reconstruction that is well suited to handling intestinal gas in abdominal MRI-only ART. Approach. We utilize a paired data DL approach enabled by the intensity projection prior, in which well-matching training pairs are created by propagating air from MRI to corresponding CT scans. Evaluations focus on two classes: patients with (1) little involvement of intestinal gas, and (2) notable differences in intestinal gas presence between corresponding scans. Comparisons between sCT-based plans and CT-based clinical plans for both classes are made at the first treatment fraction to highlight the dosimetric impact of the variable presence of intestinal gas. Main results. Class 1 patients (n = 13) demonstrate differences in prescribed dose coverage of the PTV of 1.3 ± 2.1% between clinical plans and sCT-based plans. Mean DVH differences in all structures for Class 1 patients are found to be statistically insignificant. In Class 2 (n = 20), target coverage is 13.3 ± 11.0% higher in the clinical plans and mean DVH differences are found to be statistically significant. Significance. Significant deviations in calculated doses arising from the variable presence of intestinal gas in corresponding CT and MRI scans result in uncertainty in high-dose regions that may limit the effectiveness of adaptive dose escalation efforts. We have proposed a paired data-driven DL approach to sCT reconstruction for accurate dose calculations in abdominal ART enabled by the creation of a clinically unavailable training data set with well-matching representations of intestinal gas.

Machine Learning: Science and Technology

Open access
Extending the relative seriality formalism for interpretable deep learning of normal tissue complication probability models

Tahir I Yusufaly 2022 Mach. Learn.: Sci. Technol. 3 024001

We formally demonstrate that the relative seriality (RS) model of normal tissue complication probability (NTCP) can be recast as a simple neural network with one convolutional and one pooling layer. This approach enables us to systematically construct deep relative seriality networks (DRSNs), a new class of mechanistic generalizations of the RS model with radiobiologically interpretable parameters amenable to deep learning. To demonstrate the utility of this formulation, we analyze a simplified example of xerostomia due to irradiation of the parotid gland during alpha radiopharmaceutical therapy. Using a combination of analytical calculations and numerical simulations, we show for both the RS and DRSN cases that the ability of the neural network to generalize without overfitting is tied to ‘stiff’ and ‘sloppy’ directions in the parameter space of the mechanistic model. These results serve as proof-of-concept for radiobiologically interpretable deep learning of NTCP, while simultaneously yielding insight into how such techniques can robustly generalize beyond the training set despite uncertainty in individual parameters.