Automated detection of vertebral body misalignments in orthogonal kV and MV guided radiotherapy: application to a comprehensive retrospective dataset

Objective. In image-guided radiotherapy (IGRT), off-by-one vertebral body misalignments are rare but potentially catastrophic. In this study, a novel method for detecting such misalignments in IGRT was investigated using densely connected convolutional networks (DenseNets) for applications in real-time error prevention and retrospective error auditing. Approach. A total of 4213 images acquired from 527 radiotherapy patients aligned with planar kV or MV radiographs were used to develop and test error-detection software modules. Digitally reconstructed radiographs (DRRs) and setup images were retrieved and co-registered according to the clinically applied alignment contained in the DICOM REG files. A semi-automated algorithm was developed to simulate patient positioning errors on the anterior-posterior (AP) and lateral (LAT) images shifted by one vertebral body. A DenseNet architecture was designed to classify either AP images individually or AP and LAT image pairs. Receiver operating characteristic (ROC) curves and areas under the curve (AUC) were computed to evaluate the classifiers on test subsets. Subsequently, the algorithm was applied to the entire dataset in order to retrospectively determine the absolute off-by-one vertebral body error rate for planar radiograph guided RT at our institution from 2011–2021. Main results. The AUCs for the kV models were 0.98 for unpaired AP and 0.99 for paired AP-LAT. The AUC for the MV AP model was 0.92. At a specificity of 95%, the paired kV model achieved a sensitivity of 99%. Application of the model to the entire dataset yielded a per-fraction off-by-one vertebral body error rate of 0.044% [0.0022%, 0.21%] for paired kV IGRT, including one previously unreported error. Significance. Our error detection algorithm classified vertebral body positioning errors with sufficient accuracy for retrospective quality control and real-time error prevention. The reported positioning error rate for planar radiograph IGRT is unique in being determined independently of an error reporting system.


Introduction
On-board image guidance has become the de facto standard for patient alignment during high-precision radiation therapy treatment [1]. Image-guided radiation therapy (IGRT) is most frequently performed with cone-beam computed tomography (CBCT) and/or planar kV or MV projection imaging [1]. Either three-dimensional CBCTs are registered to the planning CT, or bony anatomy in treatment radiographs is registered to digitally reconstructed radiographs (DRRs) derived from the planning CT [2].
The implementation of image guidance in radiotherapy has been observed to improve tumor control and reduce normal tissue toxicity, especially among high-risk patient populations [3]. Russo et al reported that gantry-mounted kV imagers for IGRT reduce the frequency of treatment errors compared to non-IGRT approaches [4].
Although patient localization errors are improbable, they are known to occur during image-guided intensity modulated radiotherapy (IMRT) and three-dimensional conformal radiotherapy (3D-CRT) procedures [5][6][7][8][9][10][11]. A 2015 report by the Radiation Oncology Incident Learning System (RO-ILS) reviewing 396 inter-institutional cases found that 40 cases had incorrect setup instructions given to radiation therapists, and 34 of those involved incorrect couch shifts [12,13]. We use the term 'IGRT error' here to refer to an error in the patient setup process caused by a human mistake when interpreting treatment setup images. In modern radiotherapy departments, extensive effort is directed towards reducing the probability of such errors through methods such as identification of workflow failure modes [7] and adherence to safety guidelines and checklists [14,15]. However, automation and interlocks are widely believed to be the most effective methods of error prevention [16]. In this work we seek to develop an automated algorithm for detecting IGRT errors in planar x-ray setup images, which could serve as the core of a treatment interlock that prevents radiotherapy delivery if the patient has been incorrectly aligned.
A recent report by the French Nuclear Safety Authority illustrated the significance of vertebral body setup errors in radiotherapy treatments. Among 40 incidents reviewed, 29 originated from planar kV imaging and 7 from planar MV portal imaging, as compared with 4 from three-dimensional CBCT [17]. The primary factor resulting in such IGRT errors was difficulty in differentiating between adjacent vertebral bodies, and contributing factors included poor image quality, longitudinal matching using non-discriminating landmarks, and excessively small collimation [17]. The potential impact of such setup errors is serious because if the treatment is aligned to the wrong vertebral body, misalignments of several centimeters will occur. Off-by-one vertebral body misalignments are relatively prevalent due to the approximate translational symmetry of the vertebral column. Similar errors have been demonstrated in other treatment modalities; for example, human mistakes combined with low-quality images have been known to cause spinal surgery errors [18,19].
Previous studies have shown that image similarity measures calculated between the IGRT image and the corresponding planning CT or DRR can be an effective means of separating correctly aligned patients from incorrectly aligned ones in a variety of treatment sites [20,21]. These early approaches developed classifiers using machine learning with hand-curated features. More recently, deep learning (DL) using convolutional neural networks (CNNs) has become a de facto standard for solving image recognition problems in medical imaging. Deep learning models are highly capable of performing a variety of medical image processing tasks, and they are becoming increasingly widespread in medical research [22][23][24][25].
Convolutional neural networks (CNNs) are a subset of deep learning architectures that have provided highly accurate models for medical image classification and segmentation [32,33]. Luximon et al, for example, developed this approach for detecting off-by-one vertebral body errors in CBCT-based IGRT [34]. Planar kV and MV x-ray imaging represent the second and third most frequently used IGRT technologies (after CBCT) according to a recent practice patterns survey, being used by 22%-67% of institutions depending on disease site [1]. Compared to CBCT, planar kV or MV images provide less soft-tissue information but are more rapidly acquired and thus may be more frequently used for palliative treatments. Petragallo et al developed a CNN-based approach using data from a BrainLAB ExacTrac stereoscopic x-ray system [35]. Such a stereoscopic system records oblique, non-orthogonal image pairs with a narrower field of view of 10 × 10 cm. By contrast, the IGRT dataset used in this study consists of orthogonally paired anterior-posterior (AP) and lateral (LAT) radiographs produced by gantry-mounted imagers with a larger field of view of 30 × 40 cm. The ExacTrac study [35] treated each x-ray from the stereoscopic image pair completely separately. Our study not only trains models on the AP and LAT orientations separately, but also trains a model that combines information from both projection directions. Furthermore, our pre-processing steps of applying intensity-based normalization and image gradients help the models cope with the significantly lower contrast of lateral images relative to oblique, stereoscopic images. Finally, a significant difference between this work and previous publications is that the algorithm was applied to the entire dataset in order to retrospectively determine the absolute off-by-one vertebral body error rate for planar radiograph guided RT at our institution from 2011-2021.
Deep densely connected CNN models, or DenseNets, were first developed by Huang et al [36] to accommodate increasingly deep CNNs that tended to dilute the original features. Unlike ResNets [37], where features from a previous layer are added to the current layer, DenseNets concatenate features from all previous layers. DenseNets have been used in medical imaging and radiotherapy error detection contexts [34,38,39]. In the present work, we propose a DenseNet model to differentiate between correctly aligned and incorrectly aligned patients imaged with kV or MV planar x-ray imaging. Due to relatively lower image contrast, larger treatment field of view (FOV), and lower alignment precision, detection of misalignments in such images presents a more challenging classification problem and addresses a significant gap in the development of this approach to the prevention of IGRT errors. We develop and test this error detection method using image data from thoracic and abdominal radiotherapy treatments performed at our institution between 2011 and 2021. Separate models are trained on unpaired kV, unpaired MV, and orthogonally paired kV image data. Our goal is to increase patient safety by deploying these error detection models as automated radiotherapy workflow interlocks. Our objective is to achieve at least a 95% true positive rate of error detection at a false positive rate of less than 5%. A low false positive rate would ensure that the clinical workflow is minimally disrupted, which is desirable since patient setup errors are infrequently encountered. Finally, we perform a retrospective error analysis of our institutional dataset by examining false positives in the training data.

DICOM query/retrieval
To efficiently acquire DICOM images from the clinical database at our institution, a DICOM query and retrieval application programming interface (API) was developed using the pynetdicom Python package. This custom API allowed the user to query patients based on study dates within a desired date range, filter selections by relevant plan names and image type, and download DICOM RT Plan, RT Image, and RT Registration (REG) objects for selected patients. The REG files provided the transformation matrix that registered setup DRRs to treatment radiographs.
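As an illustration of the query step, the following minimal Python sketch uses pynetdicom's C-FIND service. The host, port, AE title, and patient ID are placeholders rather than institutional values, and the actual API described above wraps additional filtering by plan name and image type around calls of this kind.

```python
# Hedged sketch of a date-range study query with pynetdicom (placeholders only).
from pydicom.dataset import Dataset
from pynetdicom import AE
from pynetdicom.sop_class import StudyRootQueryRetrieveInformationModelFind

ae = AE(ae_title="ERROR_AUDIT")  # placeholder calling AE title
ae.add_requested_context(StudyRootQueryRetrieveInformationModelFind)

query = Dataset()
query.QueryRetrieveLevel = "STUDY"
query.PatientID = "12345678"           # placeholder MRN
query.StudyDate = "20110101-20211231"  # DICOM date-range matching
query.StudyInstanceUID = ""            # request this attribute in responses

assoc = ae.associate("pacs.example.org", 104)  # placeholder host and port
if assoc.is_established:
    for status, identifier in assoc.send_c_find(
            query, StudyRootQueryRetrieveInformationModelFind):
        # 0xFF00/0xFF01 are the C-FIND 'pending' status codes
        if status and status.Status in (0xFF00, 0xFF01) and identifier:
            print(identifier.StudyInstanceUID)
    assoc.release()
```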
For this study, we restricted our database search to the years 2011 through 2021. We used search tags for radiotherapy plan names such that thoracic and lumbar spinal anatomy would be expected to appear centrally in the IGRT images. This approach was facilitated by our institution's rigorous adherence to an anatomy-based plan naming convention. Plans named for thoracic and lumbar vertebral bodies from T2 through L2 were selected from the database, as well as plans for lungs, ribs, abdomen, stomach, pancreas, and esophagus. For each patient medical record number (MRN), a list of all series directories across all studies was acquired. Within these directories, a collection of DRR, REG, and RT Image files was extracted. After obtaining such a list of file paths for all relevant DICOM data, the associated REG files were analyzed to identify radiograph pairings. A REG file contains two service object pair (SOP) instance unique identifiers (UIDs) for each registered DRR-radiograph pair. For paired radiographs with both anterior-posterior (AP) and lateral (LAT) orientations, there should be four SOP instance UIDs (two for the DRRs and two for the x-rays). For unpaired radiographs with only AP oriented images, there are correspondingly two SOP instance UIDs. In general, the order in which these UIDs are itemized in the REG file is not fixed, so the UIDs had to be matched against our lists of DRR and x-ray file paths. Furthermore, the order in which AP and LAT orientations are stored is subject to change, so a sorting operation over all DICOM header files was performed. For paired data, the DICOM RT Image labels were processed by searching for strings that distinguished AP and LAT orientations. Last, the file paths were grouped together and saved before moving on to the principal component of our algorithm, namely vertebral body misalignment analysis.
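The UID-matching logic can be sketched with pydicom as below. The exact location of the referenced SOP instance UIDs within a REG object can vary by vendor, so the ReferencedSeriesSequence layout, and the drr_index/xray_index helpers that map UIDs to file paths, are illustrative assumptions rather than the implementation used in this work.

```python
# Hedged sketch: collect referenced SOP instance UIDs from a REG object and
# match them against previously indexed DRR and x-ray files.
import pydicom

def referenced_uids(reg_path):
    """Return all SOP instance UIDs referenced by a spatial registration object."""
    ds = pydicom.dcmread(reg_path)
    uids = []
    # Common Instance Reference module; vendor layouts may differ.
    for series in getattr(ds, "ReferencedSeriesSequence", []):
        for inst in getattr(series, "ReferencedInstanceSequence", []):
            uids.append(inst.ReferencedSOPInstanceUID)
    return uids

def pair_images(reg_path, drr_index, xray_index):
    """drr_index / xray_index: assumed dicts mapping SOPInstanceUID -> file path."""
    drrs, xrays = [], []
    for uid in referenced_uids(reg_path):
        if uid in drr_index:
            drrs.append(drr_index[uid])
        elif uid in xray_index:
            xrays.append(xray_index[uid])
    return drrs, xrays  # expect 1+1 for unpaired AP, 2+2 for paired AP/LAT
```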

Data selection process
The DRR and x-ray images retrieved by our DICOM API were reviewed to ensure appropriate data selection. There were many instances where the radiographic FOV covered a combination of sites, such as both the cervical and upper thoracic vertebrae. To maintain a consistent data selection process, images were included if at least three thoracic or lumbar vertebrae were visible, which provided sufficient information for off-by-one vertebral body misalignment simulation. A summary of our data selection process, specifically the reasons for excluding datapoints and their frequencies, is found in table 1. Orthogonal dual-energy radiographs were also considered, although the total number of such kV-MV images processed was insufficient for deep learning model training, and these images were excluded from our final datasets. There were no paired orthogonal MV images to report.

Semi-automated misalignment training data generation
To train a model to classify off-by-one vertebral body setup errors, it was necessary to artificially misalign the orthogonal radiographs with respect to the DRRs. For each DRR-radiograph pair selected, our goal was to create a dataset to train a convolutional network by misaligning the DRRs in both the superior and inferior directions. These misalignments were intended to simulate patient positioning errors similar to those made by radiation therapists, so repositioning the DRRs using arbitrary shifts was insufficient. In a typical clinical workflow, radiation therapists view a blended overlay of the x-ray and DRR in a single window; therapists drag the DRR until the vertebral bodies are visually aligned in the overlaid images, thus determining a couch shift to be applied to bring the patient into alignment. An initial rough positioning at the off-by-one vertebral body location was performed by selecting pixel landmarks at off-by-one locations in the x-ray and DRR, and applying a shift to make the landmarks coincide. Subsequently, the translational shift was optimized by maximizing the normalized cross-correlation (NCC) between the gradient images of the x-ray and DRR. The NCC was computed over a manually selected ROI about the vertebral column in the AP x-ray images. The search space was limited to a 1 × 1 cm² region for the AP images. The superior-inferior translation computed on the AP images was applied to the LAT DRR, and then an anterior-posterior translation of the LAT DRR was optimized using gradient-based NCC along a 2 cm anterior-posterior line. The DRRs were not re-rendered because the angular divergence would be within 2 degrees, which was assumed to be small compared to the rotational alignment precision used for these palliatively treated patients. All misaligned radiographs were computed in the MATLAB environment (MathWorks, Natick, MA). Generally, landmark points were selected either on the spinous process or the vertebral foramen. If contrast was especially low, a rib was used in place of its attached vertebra. Grid and line searches were sampled in discrete steps of 1 mm. Note that this semi-automated process was only applied to generate simulated errors for training and testing data; it was not used in the error detection algorithm itself. Supplementary section S1 describes the process of resampling the DRR to a coordinate system consistent with the x-ray frame of reference. An example set of orthogonal kV treatment radiographs together with clinically aligned and artificially misaligned DRRs is depicted in figure 1.
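A minimal Python sketch of the NCC refinement step follows (the original was implemented in MATLAB). The ncc and refine_shift functions, the assumed 1 px = 1 mm sampling, and the use of np.roll for shifting are illustrative choices, not the authors' code.

```python
# Hedged sketch: refine a coarse off-by-one shift by maximizing the normalized
# cross-correlation (NCC) between gradient images, searched on a 1 mm grid.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two same-size arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def refine_shift(xray_grad, drr_grad, roi, coarse_shift, radius_px=5):
    """Search a (2*radius_px + 1)^2 window around the landmark-based shift.

    roi: (r0, c0, r1, c1) bounds of a manually chosen ROI about the spine.
    Assumes 1 px = 1 mm; np.roll is used for brevity (a production version
    would crop or pad rather than wrap around the image edges).
    """
    r0, c0, r1, c1 = roi
    best, best_shift = -np.inf, coarse_shift
    for dr in range(-radius_px, radius_px + 1):
        for dc in range(-radius_px, radius_px + 1):
            cand = (coarse_shift[0] + dr, coarse_shift[1] + dc)
            shifted = np.roll(drr_grad, cand, axis=(0, 1))
            score = ncc(xray_grad[r0:r1, c0:c1], shifted[r0:r1, c0:c1])
            if score > best:
                best, best_shift = score, cand
    return best_shift
```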
In clinical practice, a primary DRR is manually aligned to a secondary treatment radiograph by radiation therapists. Due to the large number of images used, manual misalignment could not be performed on all the images used for algorithm training. However, to assess the accuracy of our models in realistic situations where positioning was manually determined, a separate dataset was designed from our list of test patients, with DRRs shifted by hand relative to the properly registered radiographs to simulate setup errors. In the interest of time, only one fraction per patient was manually misaligned. Manual misalignment was performed using MIM (MIM Software Inc., Cleveland, OH). The starting point for manual misalignment was the clinically applied alignment, which was extracted from the DICOM REG files stored in the image database. This initial alignment was visually validated.
The REG file was opened in MIM to display the approved fusion. Image contrast was manually adjusted to optimize visibility of the vertebral bodies. In the fusion window, the primary DRR was then manually translated up and down by one vertebral body. Once the vertebral bodies and column were best matched to the secondary radiograph by the human eye, the original aligned DRR and the two manually misaligned DRRs were resampled onto the corresponding treatment radiograph grid and saved as two-channel arrays in the same manner as in the semi-automated training data generation.

DenseNet architecture
Dense convolutional networks, or DenseNets, were used for model training. For simplicity, the sequential combination of a convolutional layer, a rectified linear unit (ReLU) activation, and batch normalization is defined here as a convolution block. The first layer consisted of a 7 × 7 convolution block with 3 × 3 max pooling, and transition layers between dense blocks consisted of 1 × 1 convolution blocks with 2 × 2 max pooling. Dense blocks contained sequential 1 × 1 and 3 × 3 convolution blocks repeated six times. Recall that successive layers in each dense block are by definition connected to the outputs of all previous block layers. The growth rate of the network controls the number of features learned during convolutions. We set the growth rate to k = 32 for all models trained in this study. In total, we had three dense blocks, each six layers deep.
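The following PyTorch sketch illustrates one plausible realization of this architecture (the original was built in MATLAB). The class names are ours, and details not stated in the text, such as the stem stride and bottleneck width, are assumptions.

```python
# Hedged PyTorch sketch: growth rate k = 32, three dense blocks of six
# (1x1 conv -> 3x3 conv) layers each, 1x1 transitions with 2x2 max pooling.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, kernel, stride=1):
    """Convolution block as defined in the text: conv -> ReLU -> batch norm."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel, stride, kernel // 2),
        nn.ReLU(inplace=True),
        nn.BatchNorm2d(out_ch),
    )

class DenseLayer(nn.Module):
    def __init__(self, in_ch, k=32):
        super().__init__()
        # Bottleneck width of 4k follows the original DenseNet paper (assumed here).
        self.body = nn.Sequential(conv_block(in_ch, 4 * k, 1),
                                  conv_block(4 * k, k, 3))

    def forward(self, x):
        # Concatenate the new features with all previous features.
        return torch.cat([x, self.body(x)], dim=1)

class DenseNetBranch(nn.Module):
    def __init__(self, in_ch=4, k=32, layers_per_block=6, n_blocks=3):
        super().__init__()
        mods = [conv_block(in_ch, 2 * k, 7, stride=2), nn.MaxPool2d(3, 2, 1)]
        ch = 2 * k
        for b in range(n_blocks):
            for _ in range(layers_per_block):
                mods.append(DenseLayer(ch, k))
                ch += k
            if b < n_blocks - 1:  # transition: 1x1 conv block + 2x2 max pool
                mods.append(conv_block(ch, ch // 2, 1))
                ch //= 2
                mods.append(nn.MaxPool2d(2))
        self.features = nn.Sequential(*mods)
        # Global average pooling, then a fully connected layer with two outputs.
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(ch, 2))

    def forward(self, x):
        return self.head(self.features(x))  # two logits: error / no error
```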
Finally, a terminal block contained a 7 × 7 global average pooling layer, followed by a fully connected (FC) layer that outputs two terminal neurons. A softmax function converted these outputs into a probability density function (PDF), which was then used for classification with a weighted cross-entropy loss function

$$L(p, q) = -\sum_{i=1}^{2} w_i \, p_i \log q_i,$$

where $w_1$ and $w_2$ are the weights for the classes error and no error, respectively, $p$ denotes the true PDF, and $q$ denotes the PDF predicted by the model. See figure 2 for an illustration of our densely connected convolutional neural network architecture. Note that our dataset class labels were imbalanced, since for every approved aligned radiograph we were able to generate two off-by-one misaligned radiographs (one superior, one inferior). To rebalance the dataset, the class weights in our model were set to $w_1 = 0.33$ and $w_2 = 0.67$.
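In PyTorch, the softmax and the weighted cross-entropy are combined in a single call; a minimal sketch with the reported weights follows (the batch contents are dummy data for illustration).

```python
# Weighted cross-entropy rebalancing the 2:1 error / no-error class ratio.
import torch
import torch.nn as nn

class_weights = torch.tensor([0.33, 0.67])  # [w1: error, w2: no error]
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)           # dummy model outputs for a mini-batch of 8
labels = torch.randint(0, 2, (8,))   # 0 = error, 1 = no error
loss = loss_fn(logits, labels)       # applies log-softmax + weighted cross-entropy
```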

Network training
Network training was implemented on an NVIDIA RTX A5000 GPU. Our saved training, testing, and validation subsets were used to extract all NIfTI two-channel images. These images were then used to generate a deep learning image dataset in MATLAB with a custom datapoint reader function. Specifically, the NIfTI files were opened by our reader function and pre-processed with the following steps. First, the spatial dimensions of the images were resampled to a conventional size of 300 × 300 pixels with bicubic interpolation. By inspection, this size was sufficiently small for rapid deep learning computations without reducing the original number of rows and columns by more than a third. Second, partial derivatives were taken in the vertical and horizontal directions as a pre-processing step. Thus, the input size of the DenseNet model was 300 × 300 × 4. Each channel was separately normalized.
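A Python approximation of these pre-processing steps is sketched below. SciPy's cubic spline zoom stands in for MATLAB's bicubic interpolation, and the zero-mean, unit-variance normalization is our assumption, since the per-channel normalization scheme is not further specified.

```python
# Hedged sketch: resample to 300 x 300, take vertical/horizontal gradients,
# and normalize each channel separately (4 channels per DRR/x-ray pair).
import numpy as np
from scipy.ndimage import zoom

def preprocess(drr, xray):
    chans = []
    for img in (drr, xray):
        # order=3 cubic spline resampling, standing in for bicubic interpolation
        img = zoom(img.astype(np.float32),
                   (300 / img.shape[0], 300 / img.shape[1]), order=3)
        gy, gx = np.gradient(img)  # vertical and horizontal partial derivatives
        for g in (gy, gx):
            g = (g - g.mean()) / (g.std() + 1e-8)  # per-channel normalization
            chans.append(g)
    return np.stack(chans, axis=-1)  # shape (300, 300, 4)
```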
Training was conducted with the Adam algorithm for stochastic gradient descent [40]. The patient IDs in the dataset were carefully split into separate groups as summarized in table 2. A maximum of 30 epochs and a mini-batch size of 64 were fixed. The data were also shuffled after each epoch. Validation progress was monitored every 50 iterations, and training was terminated early if the validation loss did not improve after 25 consecutive checks. Our learning rate was set to $10^{-3}$ and the denominator offset for the Adam optimization was $10^{-8}$. Individual models were trained on the AP and LAT data for kV beam energies, as well as on the AP data for MV beam energies. A fourth model was trained on the paired AP and LAT data together for kV beam energies. Two copies of the DenseNet architecture were used as parallel branches for the different orientations, up to the final convolution layer where the tensors were combined prior to the global average pooling layer. Whereas the weights of the branches were initialized according to the separately trained models, the final fully connected layer was trained from scratch; its weight and bias learning rates were therefore increased by a factor of 10.
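The reported hyperparameters map onto a training loop like the following hedged sketch. DenseNetBranch comes from the architecture sketch above, while train_loader, val_loader, and evaluate are assumed helpers (a shuffled DataLoader and a validation-loss routine), not part of the original implementation.

```python
# Hedged sketch mirroring the reported settings: Adam with lr 1e-3 and
# eps 1e-8, batch size 64, up to 30 epochs, validation every 50 iterations,
# early stopping after 25 checks without improvement.
import torch

model = DenseNetBranch()  # from the architecture sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, eps=1e-8)
loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor([0.33, 0.67]))

best_val, stale, stop = float("inf"), 0, False
for epoch in range(30):                          # maximum of 30 epochs
    if stop:
        break
    for it, (x, y) in enumerate(train_loader):   # assumed shuffled DataLoader
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        if it % 50 == 0:                         # validation check every 50 iterations
            val_loss = evaluate(model, val_loader)  # assumed helper
            if val_loss < best_val:
                best_val, stale = val_loss, 0
            else:
                stale += 1
                if stale >= 25:                  # early-stopping patience
                    stop = True
                    break
```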

Network testing
Three DenseNet models were created on the training subsets for the dataset categories in table 1 (kV-AP, kV-LAT, MV-AP), and a fourth model was created on the paired orthogonal AP-LAT data (kV-paired). The testing subsets for each dataset category were used to evaluate the models. As mentioned, the testing subsets contained images with true positives produced both algorithmically and manually, in order to gain greater assurance that our models work on realistic images. Each network outputted prediction probabilities stored as a two-dimensional vector, with one element for each of the classes error and no error. Since the elements summed to unity, choosing the index of the maximum yielded the predicted class, which could be directly compared to the original testing subset labels.
A receiver operating characteristic (ROC) curve was created for each dataset evaluated. ROC curves for the testing cases with semi-automatically misaligned true positives are illustrated in figure 3(a), while those for manual true positives are illustrated in figure 3(b). It sufficed to consider the probability that a datapoint was in the first class, error. Various thresholds $T$ in the interval [0, 1] were chosen such that if the probability exceeded the threshold, the datapoint was assigned to the class error, i.e. if $p_j \geq T$ then case $j$ belonged to the error class. The area under the ROC curve (AUC) was used as our metric for the success of the classifier. We used trapezoidal numerical integration to calculate the AUC. Due to the rarity of setup errors in the clinic, it was desirable to select a high specificity on the ROC curves to minimize disruption in the clinical workflow. A fixed specificity of 95% was chosen for model testing, with a target sensitivity of at least 95%.
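The threshold sweep and trapezoidal AUC integration can be written compactly as below; the 501-point threshold grid is an arbitrary illustrative choice, not a value from the study.

```python
# Sketch: sweep a threshold T over predicted error-class probabilities and
# integrate the resulting ROC curve with the trapezoidal rule.
import numpy as np

def roc_auc(p_error, labels):
    """p_error: predicted P(error) per case; labels: 1 = error, 0 = no error."""
    thresholds = np.linspace(0.0, 1.0, 501)
    pos, neg = (labels == 1), (labels == 0)
    tpr, fpr = [], []
    for t in thresholds:
        pred = p_error >= t            # classify as 'error' when p_j >= T
        tpr.append((pred & pos).sum() / pos.sum())
        fpr.append((pred & neg).sum() / neg.sum())
    order = np.argsort(fpr)            # sort so the integral runs left to right
    f, s = np.array(fpr)[order], np.array(tpr)[order]
    return float(np.sum(np.diff(f) * (s[1:] + s[:-1]) / 2))  # trapezoidal rule
```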

Linear discriminant analysis
As a baseline for evaluating the performance of the CNN classifier architecture, a subset of N = 498 two-channel images (DRR and x-ray) from 69 patients in our paired kV dataset was examined using linear discriminant analysis (LDA). For each of the N images in our subset, we computed a feature vector as follows. The image channels were preprocessed by spatially downsampling to 300 × 300 pixels using bicubic interpolation. Vertical and horizontal image gradients were computed, and for each gradient direction the normalized cross-correlation coefficient between the DRR and x-ray was calculated. A linear discriminant between the aligned and misaligned datapoints was determined following the derivation in [41].
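As a sketch of this baseline, scikit-learn's LinearDiscriminantAnalysis can stand in for the closed-form derivation in [41]; the feature-file names below are placeholders, and the two-element feature vector construction follows the description above.

```python
# Hedged sketch of the LDA baseline: each image pair is reduced to a
# two-element feature vector of gradient-direction NCC coefficients, and a
# linear discriminant separates aligned from misaligned cases.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# features: (N, 2) array of [NCC_vertical, NCC_horizontal] per image pair;
# y: 1 = misaligned, 0 = aligned. Both assumed precomputed; placeholder paths.
features = np.load("lda_features.npy")
y = np.load("lda_labels.npy")

lda = LinearDiscriminantAnalysis()
lda.fit(features, y)
scores = lda.decision_function(features)  # signed distance to the discriminant
```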

Complete database error search
Subsequent to testing, all clinically aligned paired kV image sets (i.e., the union of the training, validation, and test sets) were processed by the paired kV model as a safety analysis tool for retrospective error hunting. Note that due to the large volume of data (2,486 image pairs) it was deemed impractical to visually verify every alignment; only the image pairs flagged by the model were reviewed visually.

Results
Figure 3 shows all ROC curves plotted on the same axes, together with their corresponding AUCs. The AUCs for the four trained models are shown in the legend of figure 3. Model sensitivities at nominal specificities of 90% and 95% are reported in table 3.
The baseline linear discriminant analysis resulted in an AUC of 0.92 for unpaired AP images (compared to 0.98 for the CNN method). A scatterplot of the feature set, shown in supplementary figure S, demonstrated that vertical-direction image gradient filters are powerful for discriminating between error and no-error cases. That insight led us to apply such filters as a pre-processing step prior to the CNN. These results also demonstrate that LDA with vertical gradient filters provides a reasonably high AUC, although it underperforms CNNs, as expected.
Out of 2486 clinically aligned image pairs in the database, 16 were flagged by the error detection algorithm. These cases were checked visually in our offline review software, and it was confirmed that only one radiograph pair was truly misaligned (see figure 4(a)). This case was reported to our institution's incident learning system. Common features of the remaining 15 incorrectly flagged cases included low contrast, spinal implants, and abnormal spinal curvature. The per-fraction rate of off-by-one vertebral body errors with paired kV images at our institution was determined to be 0.044%, with a 95% confidence interval of 0.0022% to 0.21% assuming Poisson statistics.
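For context, an exact (Garwood) Poisson interval for a single observed event can be computed from the chi-squared distribution, as in the sketch below. The precise denominator and interval method used for the reported figures are not specified here, so this serves only as an order-of-magnitude check, not a reproduction of the published interval.

```python
# Hedged sketch: exact Poisson (Garwood) 95% CI for k observed events in n trials.
from scipy.stats import chi2

k, n = 1, 2486                           # observed errors, clinically aligned pairs
lower = chi2.ppf(0.025, 2 * k) / 2       # exact lower bound on the expected count
upper = chi2.ppf(0.975, 2 * (k + 1)) / 2  # exact upper bound
rate, lo, hi = (v / n for v in (k, lower, upper))
print(f"rate {rate:.4%}, 95% CI [{lo:.4%}, {hi:.4%}]")
```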

Discussion
The model performance was similar when evaluated on the test data produced by the manual and semi-automated error simulation methods. The DenseNet classifier performed worst on the kV LAT images, although its performance was still significantly better than a baseline random classifier. This is likely due to the very low contrast in these laterally acquired images. For the same reason, in our clinic kV LAT images were almost always acquired with a paired AP image for these anatomical sites. Note that there were no instances of unpaired MV LAT images retrieved from our database. The second lowest performance was for the MV AP dataset, which was again expected on account of the low contrast of MV imaging. The kV AP dataset was classified with remarkable accuracy. Superior contrast of the vertebral bodies and surrounding anatomy compared to the MV and kV LAT images likely accounts for this improvement. Finally, as anticipated, the combined orthogonal kV data sent through separate branches of a parallel DenseNet architecture performed the best. The additional information provided by the lateral direction yielded a small but noticeable improvement in classifier accuracy for both the semi-automated and manual datasets.
For our most accurate model, with paired kV radiographs, it was instructive to analyze the cases that the model misclassified. A classification threshold was selected on the ROC curve of the model at the point corresponding to a specificity of 96.4% and a sensitivity of 98.2%. Out of the 84 cases in the test set, only 2 were misclassified: 1 false positive and 1 false negative (see figure 5). The false positive case had extremely low contrast, which invariably creates challenges for the neural network, as we noted for the lateral orientations and MV beam energies. The AP radiograph was also slightly rotated with respect to the DRR. The false negative case was characterized by a large AP radiographic field that included the pelvis. This patient also had a chest catheter, which may have confused the model. Fortunately, a misaligned catheter and iliac crest would have been easily noticed by radiation therapists. Interestingly, the linear discriminant derived earlier predicted that both of these radiographs were misaligned. Hence, the error in figure 5(b) would not have gone unnoticed with a combination of LDA and CNN automated interlocks.
Limitations of our study include the fact that the misalignments used for training were created algorithmically rather than manually. It is possible that a more realistic error detection model could be developed if every misaligned datapoint were translated manually. However, in the interest of time we only misaligned one fraction per patient in the testing subset. Also, more accurate models could be developed using data from multiple institutions rather than a single institution. The extent to which low image contrast decreases the predictive accuracy of the model was not quantified in this study. Future improvements could involve expert medical physicists providing a quantitative scoring of the images based on the visual contrast and anatomical detail available for alignment. Our work could also benefit from training the model with a third category in which the detection of an off-by-one error is undetermined. This would potentially provide more insight into the frequency of poor-contrast treatment x-rays and how our model performance is correlated with contrast.

Conclusion
A deep convolutional neural network was successfully trained to detect off-by-one vertebral body misalignments for radiotherapy patients positioned with orthogonally paired kV and anteroposterior MV radiographs. These models achieved high AUCs on both semi-automatically and manually generated datasets. The paired kV model achieved our objective of at least a 95% true positive rate at a false positive rate of less than 5%. This level of accuracy enables a workflow in which the model detects misaligned images in real time at the treatment machine and asserts an overridable interlock that alerts radiation therapists to double-check the alignment before the shifts are sent to the treatment couch. The model was used to retrospectively search image databases at the UCLA radiation oncology clinic, determining an absolute per-fraction off-by-one vertebral body alignment error rate of 0.044% [0.0022%, 0.21%] for planar x-ray image-guided radiotherapy over the period 2011-2021, confirming the overall safety of this radiotherapy delivery modality.

Figure 2. Our proposed DenseNet implementation for radiographic setup images. The input DRR and x-ray image pairs are pre-processed with vertical and horizontal image gradients, which are stored as separate channels. These images are fed into a multi-layered, densely connected CNN. The orientations are concatenated, and a sequence of global average pooling and fully connected layers reduces the information to two classifier neurons. Note that for the unpaired image models with only one orientation, there is no parallel branch.

Figure 1. Example images for model training. The planning DRR and treatment x-ray image pair was used as a true negative datapoint, indicating no setup error. Our misalignment algorithm was used to shift the DRR by one vertebral body in both the superior and inferior directions to generate true positive datapoints, indicating setup errors. (a) AP orientation. (b) LAT orientation.

Figure 3. ROC curves for the DenseNet models applied to our testing datasets left out during model training. (a) Simulated errors created semi-automatically, where DRRs were finely registered to adjacent vertebral bodies by maximizing image cross-correlations. (b) Simulated errors created manually, where DRRs were shifted by hand relative to the properly registered radiographs to simulate realistic setup errors. The paired kV datasets have the highest AUC, while the kV LAT datasets have the lowest.

Figure 4. Clinical cases that were flagged by our paired kV model and confirmed as clinically treated misalignments. (a) An off-by-one vertebral body error. (b) An approximately 1.5 cm misalignment along the vertebral column.

Figure 5. Cases that were misclassified by our model. (a) False positive. The x-ray had relatively poor contrast, so here the intensity windowing was adjusted to improve visibility. (b) False negative.

Table 1. Data selection summary among all treatment fractions in our radiotherapy patient population. The most common reason for exclusion was that the field of view (FOV) was too lateral from midline to discern vertebral bodies. Models were built on the paired kV and unpaired MV datasets.

Table 2. Number of patients in the training-validation-testing splits for our individual models.