Generation of quantification maps and weighted images from synthetic magnetic resonance imaging using deep learning network

Objective. The generation of quantification maps and weighted images in synthetic MRI techniques is based on complex fitting equations, which lengthens image generation times. The objective of this study is to evaluate the feasibility of a deep learning method for fast reconstruction of synthetic MRI. Approach. A total of 44 healthy subjects were recruited and randomly divided into a training set (30 subjects) and a testing set (14 subjects). A multiple-dynamic, multiple-echo (MDME) sequence was used to acquire synthetic MRI images. Quantification maps (T1, T2, and proton density (PD) maps) and weighted (T1W, T2W, and T2W FLAIR) images were created with MAGiC software and used as the ground truth images for the deep learning (DL) model. An improved multichannel U-Net was trained to generate quantification maps and weighted images from raw synthetic MRI data (8 module images). The quantification maps were evaluated quantitatively; the weighted images were evaluated with both quantitative metrics and qualitative assessment. Nonparametric Wilcoxon signed-rank tests were performed in this study. Main results. Quantitative evaluation showed that the error between the generated quantification maps and the reference maps was small. For weighted images, no significant difference in overall image quality or signal-to-noise ratio was identified between DL images and synthetic images. Notably, the DL T2W images achieved improved image contrast, and fewer artifacts were present on DL T2W FLAIR images than on synthetic T2W FLAIR images. Significance. The DL algorithm provides a promising method for image generation in synthetic MRI techniques, in which every step of the calculation can be optimized and performed faster, thereby simplifying the workflow of synthetic MRI techniques.


Introduction
Magnetic resonance imaging (MRI) has been widely used as an in vivo imaging technique due to its safe, noninvasive nature and high resolution (Yousaf et al 2018). For an accurate diagnosis, conventional MRI scans require multiple sequences with multiple parameters, which restricts the development and application of MRI due to the long scan time. The emergence of synthetic MRI technology has greatly shortened the MRI scan time and effectively promoted the development of MRI (Warntjes et al 2007, Ma et al 2013). Synthetic MRI can simultaneously obtain three quantification maps (T1, T2, and PD) and generate multiple weighted images from a single acquisition scan. A multiple-dynamic, multiple-echo (MDME) pulse sequence is a commonly used acquisition method from which quantitative parameter maps can be estimated with specific equations. Different synthetic contrast-weighted images can then be obtained by adjusting the scanning parameters, such as repetition time (TR), echo time (TE), and inversion time (TI). The generated images can cover several basic sequences that are usually required in the clinic, such as T1 weighted, T2 weighted and FLAIR images (Warntjes et al 2007, 2008). Quantitative measurements provide important information for clinical diagnosis because tissues in the human body can be precisely distinguished by their inherent parameters (T1, T2 and PD), while weighted images remain the images most commonly used for clinical diagnosis. This technique is gradually being applied in the clinic due to its short scanning time and good image quality (Hagiwara et al 2017, Andica et al 2019, Ji et al 2020, Ryu et al 2020).
However, the calculation of the quantification maps (T1Map, T2Map, and PDMap) in synthetic MRI techniques is based on a complex fitting equation applied to the signal intensity of each pixel, resulting in a complicated image calculation process (Maitra and Riddles 2010). This calculation usually relies on iterative optimization, which is computationally demanding and complicated. Moreover, post-processing using existing software takes a long time (over 1 min per case), which is not conducive to the clinical adoption of synthetic magnetic resonance technology. Therefore, it is necessary to explore a method of generating quantification maps and weighted images in which every step of the calculation can be optimized and performed faster, thereby simplifying the workflow of the synthetic MRI technique.
Recent advances in deep learning (DL) networks have provided a new, efficient way to generate synthetic images. DL network-based approaches have been applied to medical images for various purposes, such as image reconstruction, segmentation, and denoising. Mainstream deep learning networks, including generative adversarial networks (GANs), U-Net, and ResNet, have been applied to synthetic MRI processing (Hagiwara et al 2019). A GAN is composed of an image generator that generates a new image similar to the input target image and a discriminator that distinguishes the target image from the generated image (Goodfellow et al 2014); the generator and the discriminator are trained simultaneously. U-Net is a conversion network that generates images through different mapping functions: it learns the intensity transformation between two images from the input images and uses the learned information to reconstruct synthetic images (Chen et al 2020).
Considering that the image generated by synthetic MRI contains complex conversion information and the current research results of the DL algorithm in synthetic MRI, we propose a multichannel U-Net-based deep learning architecture for quantification maps and weighted image generation. The objective of this study was to evaluate the feasibility of this method for fast reconstruction of synthetic MRI and to provide a promising method for clinical quantitative magnetic resonance image reconstruction.

Subjects
This study was approved by the Medical Research Ethics Committees of the Beijing Friendship Hospital, Capital Medical University. Written informed consent was obtained from all subjects prior to enrollment.
A total of 44 healthy subjects were recruited for this study. The inclusion criteria were as follows: (a) the subject was 18 years of age or older, (b) the subject showed no structural changes or signs of disease in brain MRI, and (c) the subject showed no contraindications or adverse reactions to MRI examination. The exclusion criteria were as follows: (1) subjects under 18 years old; (2) subjects in generally poor health; and (3) subjects with contraindications to MRI examination.

MRI acquisition
All MRI imaging was performed using a SIGNA Pioneer 3.0 T MRI scanner (GE Healthcare, Waukesha, WI). All subjects underwent synthetic MR imaging using the magnetic resonance image compilation (MAGiC) sequence, which is an MDME sequence. The acquisition parameters for quantitative synthetic MR imaging were as follows: TE, 18.3 and 91.4 ms; TR, 4 s; FOV, 220×220 mm; matrix, 320×256; echo-train length, 16; bandwidth, 31.25 kHz; thickness, 5 mm; gap, 1 mm; slices, 24; and acquisition time, 4 min 55 s. In total, 384 slices (192 real images and 192 imaginary images) were obtained from each subject. Four delay times (170, 670, 1840, and 3840 ms) and two echo times were used to generate 16 complex images per slice. Then, 8 module images were calculated using equation (1):

M = sqrt(M_real^2 + M_imaginary^2),    (1)
where M_real and M_imaginary represent the real image and imaginary image, respectively. Then, the 8 module images were used to quantify the T1 and T2 relaxation times and proton density:

S = A · PD · sin α · e^(−TE/T2) · [1 − (1 − cos θ) · e^(−TI/T1) − cos θ · e^(−TR/T1)],

where S is the signal intensity of each pixel of the 8 module images per slice, A is an overall intensity scaling factor, PD is proton density, TR is repetition time, TE is echo time, TI is inversion time, T1 is the longitudinal relaxation time, T2 is the transverse relaxation time, α is the excitation flip angle and θ is the saturation pulse angle.
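As a concrete illustration, the per-pixel magnitude combination of equation (1) reduces to a few lines of NumPy (the array shapes below are hypothetical, chosen to match the 320×256 matrix and 8 module images per slice; this is a sketch, not the vendor's implementation):

```python
import numpy as np

def module_image(m_real, m_imag):
    """Equation (1): combine one real/imaginary image pair into a
    module (magnitude) image, M = sqrt(M_real^2 + M_imaginary^2)."""
    return np.sqrt(np.square(m_real) + np.square(m_imag))

# 16 complex images per slice (4 delay times x 2 echo times), stored as
# 8 real/imaginary pairs, collapse into 8 module images per slice.
rng = np.random.default_rng(0)
real = rng.random((8, 320, 256))
imag = rng.random((8, 320, 256))
modules = module_image(real, imag)   # shape (8, 320, 256)
```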
MAGiC software was used to retrieve the quantification maps (T1Map, T2Map, and proton density map (PDMap)) on the basis of the acquired data and to create synthetic T1 weighted (T1W) images, T2 weighted (T2W) images and T2 weighted fluid attenuated inversion recovery (T2W FLAIR) images (Tanenbaum et al 2017). The average processing time was over 1 min per case.

DL framework
The proposed network architecture for generating quantification maps and weighted images from the multiple-dynamic, multiple-echo sequence is illustrated in figure 1. We utilized a multichannel double U-Net model to design a mapping function that converts the 8 module images into three quantification maps (T1Map, T2Map and PDMap) in the first step and then converts the three quantification maps into three weighted images (T1W, T2W and T2W FLAIR) in the second step. The quantitative maps and weighted images processed by MAGiC software were designated as the ground truth images for the corresponding channels in each step.

Network architecture
The network consists of two identical U-Net models. The U-Net architecture involves 5 downsampling operations, each compressing the input image by a factor of 2 while doubling the number of filters. Then, the same number of upsampling operations were implemented, increasing the image size by a factor of 2 and reducing the number of channels by a factor of 2 (Navab et al 2015). To output 3 different images at the same time, a convolution layer with 3 filters of 1×1 kernel was added to convert the 32-channel feature map into a 3-channel output image. Each convolution layer was followed by Leaky ReLU activation, except for the last layer, for which linear activation was applied.
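The downsampling/upsampling arithmetic described above can be traced with a short sketch. The 256×256 input size and the 32 base filters are assumptions for illustration (the 32-channel final feature map is consistent with the 1×1 convolution described in the text):

```python
def unet_shapes(h, w, base_filters=32, depth=5):
    """Trace feature-map shapes (H, W, channels) through the encoder/decoder:
    each of `depth` downsampling steps halves H and W and doubles the filter
    count; each upsampling step reverses that."""
    encoder = [(h, w, base_filters)]
    for _ in range(depth):
        h2, w2, f = encoder[-1]
        encoder.append((h2 // 2, w2 // 2, f * 2))
    decoder = [encoder[-1]]
    for _ in range(depth):
        h2, w2, f = decoder[-1]
        decoder.append((h2 * 2, w2 * 2, f // 2))
    return encoder, decoder

encoder, decoder = unet_shapes(256, 256)
# bottleneck: (8, 8, 1024); final feature map: (256, 256, 32), which the
# 1x1 convolution with 3 filters maps to the 3 output images.
```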

Implementation
The workflow of our method included two preprocessing steps, threshold selection and normalization, for the ground truth quantification maps and the input images (8 module images). First, for the three quantification maps, T1Map, T2Map and PDMap, the intensity thresholds were set to 2000, 500 and 150, respectively, according to the statistical results. For the input images, 4000 was set as the intensity threshold. Only pixels within the threshold were used in the deep learning calculations. Second, an image intensity normalization process was performed on both ground truth images and input images using min-max normalization:

I_norm = (I − I_min) / (I_max − I_min),

where I_norm is the normalized image intensity, I is the original image intensity, and I_min and I_max are the minimal and maximal intensity values in the image, respectively.
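A minimal sketch of these two preprocessing steps (the out-of-range pixels are handled here by clipping to the threshold; masking them out entirely is an equally valid reading of the text):

```python
import numpy as np

# Intensity thresholds from the text; "module" is the 8-module-image input.
THRESHOLDS = {"T1Map": 2000, "T2Map": 500, "PDMap": 150, "module": 4000}

def preprocess(img, threshold):
    """Clip intensities to the map-specific threshold, then apply
    min-max normalization: I_norm = (I - I_min) / (I_max - I_min)."""
    img = np.clip(np.asarray(img, dtype=np.float64), 0.0, threshold)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo)

# toy 2x2 "T1 map": 2500 is clipped to 2000, then all values scaled to [0, 1]
t1 = preprocess([[0.0, 1500.0], [2500.0, 800.0]], THRESHOLDS["T1Map"])
```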
Learning was achieved by minimizing the loss between the predicted images and the ground truth images. We used the mean squared error (MSE) as the loss function:

MSE = (1/N) · Σ_(x,y,z) (M_out(x, y, z) − M_GT(x, y, z))²,

where M_out(x, y, z) and M_GT(x, y, z) represent the pixels of the output images and ground truth images, respectively, and N is the number of pixels. A total of 30 cases were randomly selected from the dataset to train the DL model, and the remaining 14 cases were used for model validation. In the training stage, the two U-Nets had separate weights and were trained separately; their inputs were the raw MDME data and the MAGiC-generated T1/T2/PD maps, respectively. To improve the performance of the model, the sequence of consecutive slices was randomly shuffled, and each slice was treated as an independent image. The loss function was minimized using the adaptive moment estimation (ADAM) optimizer with a learning rate of 1e-4. The network was trained on an NVIDIA Tesla V100 for 3000 iterations with a batch size of 4 using the TensorFlow framework. The whole training process required approximately 10 h; once the network was trained, our proposed method required about 1.3 s to generate whole-brain images from raw MDME data.
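The MSE loss above reduces to a one-line NumPy expression; a framework-free sketch (in the actual training it would be TensorFlow's built-in mean-squared-error loss):

```python
import numpy as np

def mse_loss(pred, gt):
    """Mean squared error over all pixels:
    (1/N) * sum over (x, y, z) of (M_out - M_GT)^2."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean((pred - gt) ** 2))
```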

Quantitative evaluation
The quantification maps were quantitatively evaluated on the test dataset in terms of mean squared error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) between the DL-generated maps and the reference (MAGiC) maps. These three metrics measure the error between the predicted value and the reference value; lower values indicate better quality and good model performance.
The weighted images were quantitatively evaluated on the test dataset in terms of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) (Wang et al 2004):

PSNR = 10 · log10(L² / MSE),

SSIM(X, Y) = [(2 μ_X μ_Y + C1)(2 σ_XY + C2)] / [(μ_X² + μ_Y² + C1)(σ_X² + σ_Y² + C2)],

where μ_X, μ_Y, σ_X, σ_Y and σ_XY are the local means, standard deviations, and cross-covariance for the DL-weighted images and synthetic weighted images, respectively, and C1 = (K1·L)² and C2 = (K2·L)² are two quantities used to stabilize the division in the case of a weak denominator, with L = 255, K1 = 0.01, and K2 = 0.03.
Notably, images with higher PSNR and SSIM indicate better image quality.
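As a sketch of the two metrics (a single global SSIM statistic is computed here for brevity; standard implementations average it over local sliding windows):

```python
import numpy as np

def psnr(x, y, max_val=255.0):
    """PSNR = 10 * log10(L^2 / MSE) between two images."""
    mse = np.mean((np.asarray(x, float) - np.asarray(y, float)) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

def ssim_global(x, y, L=255.0, k1=0.01, k2=0.03):
    """Global (single-window) SSIM with C1 = (k1*L)^2 and C2 = (k2*L)^2."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

For identical images SSIM equals 1, and PSNR grows as the MSE between the two images shrinks.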

Qualitative evaluation
The image quality of the weighted images was visually assessed by two radiologists in terms of overall image quality, signal-to-noise ratio (SNR), image sharpness, image contrast and artifacts. DL-weighted and synthetic weighted images were assessed in random order. All metrics were evaluated using a five-point Likert scale and scored as follows: 1, very bad; 2, bad; 3, acceptable; 4, good; and 5, excellent. Evaluations were performed for each subject based on the complete set of reconstructed axial images.

Statistical analysis
Statistics were computed using R (version 3.5.1, http://www.r-project.org/). Nonparametric Wilcoxon signed-rank tests were performed to compare the qualitative assessments of the DL images and synthetic weighted images. Interrater reliability was also assessed using Wilcoxon tests based on the scores of each expert radiologist. The significance level was set to P < 0.05 (two-sided).

Results
Figure 4 shows the visual image assessments as percentages for DL and synthetic images when evaluated independently by two blinded readers in terms of overall image quality, SNR, image sharpness, image contrast, and artifacts. For T1 weighted images, no significant difference in overall image quality, SNR, image contrast or artifacts was identified between DL images and synthetic images. Figure 6 shows representative artifacts: part of the brain sulci showed abnormal hyperintensity on the synthetic T2W FLAIR images while appearing normal on the deep learning T2W FLAIR images. Table 5 displays the mean assessment scores (mean±SD) for each metric and reader for the T1W, T2W and T2W FLAIR images. For all images and metrics, there was no significant difference between the two readers.

Discussion
In this study, we presented a deep learning algorithm for generating quantitative maps and weighted images from synthetic MRI sequences. The algorithm required only approximately 2 s to generate whole-brain images. These images were created without any additional scanning and comprised T1, T2, and PD maps as well as T1 weighted, T2 weighted, and T2W FLAIR images with overall image quality comparable to synthetic MR images. These preliminary results show that it is feasible to use deep learning methods to calculate quantitative magnetic resonance maps and weighted images, and that the approach has potential value in clinical applications. In our study, the proposed quantitative maps had image quality comparable to the ground truth quantitative maps, and the MSE, MAE and MAPE results showed that the error in quantitative values between the DL quantitative maps and the ground truth quantitative maps was also small. These results show that the quantitative maps generated by our method can be used to generate weighted images and can be further applied in clinical research. Since only healthy subjects were used in this study, we calculated the evaluation metrics over the whole brain. If the whole brain were divided into different tissue types, such as gray matter, white matter and cerebrospinal fluid, and the results were evaluated per tissue, the quantitative results would be more intuitive, especially for patient studies (Akkus et al 2017).
Compared with synthetic weighted images, DL-weighted images had comparable overall image quality and SNR on visual assessment. However, the DL-weighted images had decreased image sharpness across all weighted contrasts, which was related to various factors, such as the size of the dataset and the quality of the collected images; this aspect will continue to be improved upon in the future. Nevertheless, in the qualitative assessment by the expert radiologists, there was no difference in the overall image quality of the weighted images, so clinical diagnosis is not affected. It is noteworthy that the image contrast of the DL-weighted images was better than that of the synthetic weighted images, with a significant difference for T2W images. For T1 weighted and T2W FLAIR images, although there was no significant difference, the mean contrast score of the DL-weighted images was higher than that of the synthetic weighted images. Better contrast reflects the differences between tissues more clearly, which is conducive to clinical diagnosis. For T2W FLAIR images, fewer artifacts were present in DL images than in synthetic images. The monoexponential decay model used in the current synthetic MRI technique may not appropriately produce a FLAIR signal at the boundary of different tissues (Tanenbaum et al 2017). In our study, this was mainly manifested as an incorrect display of the sulci. The sulci contain cerebrospinal fluid, so in the normal subjects used in this study, the sulci should show hypointensity on T2W FLAIR images. However, on the synthetic T2W FLAIR images, part of the sulci showed hyperintensity, which may cause misdiagnosis. After deep learning, the sulci appeared normal on the T2W FLAIR images. This finding may provide a solution to the artifact problem in synthetic T2W FLAIR, but it should be confirmed by future studies with more data and patient images.
In the current study, the U-Net network was chosen for its low complexity and modest data requirements. For medical image reconstruction, Liu et al (2021) proposed a deep learning model that jointly enforces data-driven and physics-driven training, but it was evaluated on simulated and coil-combined real MRI datasets; Jafari et al (2021) used a deep neural network with unsupervised training and a physical cost function to solve the optimization problem of water/fat separation, but their networks accept only real values. Our study was conducted on real data and used module data, a choice inspired by the synthetic MRI white paper technical documents, which note that module images make the most efficient use of the real and complex images. Given the nature of our data, it was most appropriate to try the U-Net network first in this research (Chen et al 2020). Based on the original U-Net architecture, we modified the network for a multiscale regression task. A multichannel double U-Net model was used to convert the 8 module images into three quantification maps (T1Map, T2Map and PDMap) in the first step and then convert the three quantification maps into three weighted images (T1W, T2W and T2W FLAIR) in the second step. In the first step, the input was changed from 1 channel to 8 channels and the output was increased to 3 channels to generate the quantitative maps, while in the second step, an identical U-Net model was used to generate the weighted images. The U-Net's convolution layers and upsampling and downsampling operations are fully utilized to capture multidimensional information, which helps improve the accuracy of the output images. However, the choice of activation function and the downsampling method may also affect the results, and further investigation is needed.
There are several limitations of this study. First, our DL algorithm needs improvement: increasing the number of datasets, changing the network parameters or trying more algorithms may improve the performance of the DL methods and yield higher-quality images; further research on these problems will be conducted in the future. Second, the synthetic MRI images generated by MAGiC software, instead of conventional MRI images, were used as the ground truth for model training; conventional images might be used for comparison to further improve the reliability and accuracy of our method. Additionally, only healthy subjects were used in this study, and the same imaging parameters were applied to both the training and test data, limiting the method's immediate clinical utility and generalization ability. Improving the generalization ability of the model, meeting the need to generate different weighted images, and conducting research with higher clinical application value are the major directions for our future research.

Conclusions
In summary, we have described and investigated a U-Net convolutional network algorithm to generate three quantification maps (T1Map, T2Map, and PDMap) and three weighted images (T1W, T2W, and T2W FLAIR) from synthetic MRI sequences. The DL approach may provide a promising method for clinical quantitative magnetic resonance image reconstruction.