A deep-learning assisted bioluminescence tomography method to enable radiation targeting in rat glioblastoma

Objective. A novel solution is required for accurate 3D bioluminescence tomography (BLT) based glioblastoma (GBM) targeting. The provided solution should be computationally efficient to support real-time treatment planning, thus reducing the x-ray imaging dose imposed by high-resolution micro cone-beam CT. Approach. A novel deep-learning approach is developed to enable BLT-based tumor targeting and treatment planning for orthotopic rat GBM models. The proposed framework is trained and validated on a set of realistic Monte Carlo simulations. Finally, the trained deep learning model is tested on a limited set of BLI measurements of real rat GBM models. Significance. Bioluminescence imaging (BLI) is a 2D non-invasive optical imaging modality geared toward preclinical cancer research. It can be used to monitor tumor growth in small animal tumor models effectively and without radiation burden. However, the current state-of-the-art does not allow accurate radiation treatment planning using BLI, hence limiting BLI’s value in preclinical radiobiology research. Results. The proposed solution can achieve sub-millimeter targeting accuracy on the simulated dataset, with a median dice similarity coefficient (DSC) of 61%. The provided BLT-based planning volume achieves a median encapsulation of more than 97% of the tumor while keeping the median geometrical brain coverage below 4.2%. For the real BLI measurements, the proposed solution provided median geometrical tumor coverage of 95% and a median DSC of 42%. Dose planning using a dedicated small animal treatment planning system indicated good BLT-based treatment planning accuracy compared to ground-truth CT-based planning, where dose-volume metrics for the tumor fall within the limit of agreement for more than 95% of cases. Conclusion. 
The combination of flexibility, accuracy, and speed of deep learning solutions makes them a viable option for the BLT reconstruction problem, and they can provide BLT-based tumor targeting for rat GBM models.


Introduction
In the past decades, image-guided small animal precision irradiation systems have found their way into preclinical cancer research (Brown et al 2022, Verhaegen et al 2023). These systems mainly use micro cone-beam computed tomography (μCBCT) as their primary image guidance and allow clinically relevant conformal irradiation for small animals. However, to visualize small tumors with high spatial resolution, it is often necessary to increase the x-ray imaging dose in these systems. In general, a voxel size of approximately 100 μm is required to visualize the anatomical structures of rats or mice. Achieving such a high resolution usually imposes high x-ray imaging doses in the range of 10-100 cGy on the animal (Verhaegen et al 2011, Vaniqui et al 2017). Native μCBCT images, acquired without contrast media, have poor imaging contrast, especially for preclinical glioblastoma (GBM). Hence, contrast-enhanced CBCT (CE-CBCT) is often employed to improve tumor visualization (Yahyanejad et al 2014, Mowday et al 2020, Stegen et al 2020). The accumulated x-ray imaging dose limits the number of imaging sessions within a longitudinal study and therefore hinders effective preclinical research. Bioluminescence imaging (BLI) has been introduced as an alternative to other functional imaging modalities, such as positron emission tomography (PET). BLI allows functional tumor imaging without any radiation burden for the animal. In addition, it often constitutes a cheaper functional imaging solution without any background noise. Hence, BLI has recently become a very attractive imaging modality for small animal preclinical cancer research.
However, at the time of this publication, most commercially available systems do not fully utilize BLI-based targeting and irradiation possibilities (Verhaegen et al 2018). This is mainly due to the lack of 3D information in 2D bioluminescence images. Many groups, including ours, have tried various solutions to tackle the bioluminescence tomography (BLT) reconstruction problem (Deng et al 2020, Rapic et al 2022, Rezaeifar et al 2022). In contrast to other mathematically-driven solutions (Dehghani et al 2018, Deng et al 2020, Rapic et al 2022), our efforts have mainly been focused on deep learning (DL) based solutions. Previously, we proposed a 3D convolutional neural network (CNN) to predict the tumor's center of mass (CoM) and to construct a spherical volume around the CoM as the targeting volume (Rezaeifar et al 2022). Although the CoM-based method provides an effective solution to enable DL-assisted BLI-based tumor targeting in preclinical practice, it has several shortcomings due to its simplified spherical targeting geometry. In this paper, a novel artificial intelligence (AI) based algorithm is developed to predict the 3D shape and location of the tumor for rat GBM models. The proposed solution relies on a 3D ResNet architecture adopted from the RatLesNet model, originally developed by Valverde et al (2020) for lesion detection in rodent magnetic resonance images (MRI). Furthermore, the proposed solution employs Monte Carlo simulations (MCS) to provide a realistic training database for the DL model as an alternative to a large set of acquired images. The performance of the trained model is then evaluated on the MCS database and a smaller set of real measured BLI using a variety of objective quality metrics, such as dice similarity coefficient, geometrical coverage metrics, and dose-volume metrics.

Problem formulation
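To solve the BLT reconstruction problem, an accurate model of optical light propagation in biological tissue is needed. The diffusion approximation (DA) of the radiative transport equation is the most commonly used forward model in the literature. Following the notation used by He et al (2010), the DA is expressed as
$$-\nabla \cdot \left( D(\mathbf{r}) \nabla \Phi(\mathbf{r}) \right) + \mu_a(\mathbf{r}) \Phi(\mathbf{r}) = S(\mathbf{r}), \quad \mathbf{r} \in \Omega, \tag{1a}$$
where $D$ is the optical diffusion coefficient depending on the 3D position $\mathbf{r} \in \mathbb{R}^3$ inside the region of interest $\Omega$, $\Phi$ represents the photon density (Watt mm$^{-2}$) and $S$ denotes the power density of the internally located light source (Watt mm$^{-3}$). Furthermore, the optical diffusion coefficient is defined as
$$D(\mathbf{r}) = \frac{1}{3 \left( \mu_a(\mathbf{r}) + \mu'_s(\mathbf{r}) \right)}, \tag{1b}$$
where $\mu_a$ and $\mu'_s$ are the absorption and reduced scattering coefficients (mm$^{-1}$). The DA equation is solved using the following Robin boundary condition:
$$\Phi(\mathbf{r}) + 2 A(\mathbf{r}; n, n') D(\mathbf{r}) \left( \upsilon(\mathbf{r}) \cdot \nabla \Phi(\mathbf{r}) \right) = 0, \quad \mathbf{r} \in \partial\Omega, \tag{2}$$
where $A(\mathbf{r}; n, n')$ represents the boundary mismatch resulting from the two different refractive indices at the boundary, and $\upsilon$ is the unit outer normal at the boundary $\partial\Omega$.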
Once this forward model is properly solved using the finite element method (FEM), the DA equation can be reduced to the following discretized linear equation:
$$M \Phi = F S, \tag{3}$$
where $M$ and $F$ are positive system matrices resulting from FEM. Equation (3) can be rewritten as:
$$\Phi = M^{-1} F S. \tag{4}$$
Most mathematically derived approaches then define a cost function with a specific regularization term and attempt to locate the optimal light source by minimizing this cost function. In this paper, however, following the same notation, the BLT reconstruction inverse problem is expressed as
$$S = \mathcal{A}^{-1}(\Phi). \tag{5}$$
In equation (5), $\mathcal{A}^{-1}$ is a nonlinear function that links the measured photon flux to the corresponding source, resulting from the solution to the inverse problem.
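For orientation, the regularized cost-function approach mentioned above is commonly written in a Tikhonov-style form. The expression below is a generic sketch and not the formulation of any specific cited work. It assumes that the discretized forward model takes the form $M \Phi = F S$, ignores the restriction of the measurement to surface nodes, and introduces the measured flux $\Phi^{m}$, the regularization weight $\lambda$ and the penalty $R$ as illustrative notation:
$$\hat{S} = \arg\min_{S \ge 0} \; \tfrac{1}{2} \left\lVert M^{-1} F S - \Phi^{m} \right\rVert_2^2 + \lambda R(S), \qquad R(S) = \lVert S \rVert_2^2 \ \text{or} \ R(S) = \lVert S \rVert_1 .$$
The choice of $R$ encodes the prior: the squared 2-norm favors smooth, low-energy sources, whereas the 1-norm favors sparse sources. The DL approach described in this paper replaces this explicit minimization with a learned mapping from measurement to source.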
It has been proven that the BLT reconstruction problem is highly ill-posed (Gao et al 2018). Often, various prior information or regularization methods are utilized to decrease the ill-posedness of the problem. In this paper, a DL model is used to directly learn a novel solution for $S$ based on the best fit to the observations. Following the notation of Weinan et al (2020), the output of a multi-layered DL model can be expressed as:
$$\mathcal{G}(x_i^0) = W_L \, \sigma\left( W_{L-1} \, \sigma\left( \cdots \sigma\left( W_0 x_i^0 \right) \right) \right), \tag{6}$$
where $W_0, \ldots, W_L$ are the weights of the network in different layers. Furthermore, $x_i^0$ indicates the input of the DL model, i.e. bioluminescence surface photon count. The activation function $\sigma$ is an arbitrary nonlinear function that gives the DL model further degrees of freedom in modeling nonlinear phenomena. In this notation, the bias term in each layer is generalized as a weight.
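The layered composition of weights and activations described above can be illustrated with a small fully connected example. This is only a sketch of the notation, not the RatLesNet architecture used in this study: the ReLU activation, the function names and the convention of appending a constant 1 to each layer input (so that the bias becomes the last column of each weight matrix) are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def relu(z):
    """Example activation function; any nonlinear function could be used."""
    return np.maximum(z, 0.0)

def forward(weights, x, activation=relu):
    """Evaluate W_L s(W_{L-1} s(... s(W_0 x))) for a list of weight matrices.

    Each matrix has shape (n_out, n_in + 1): the last column multiplies a
    constant 1 appended to the layer input, so the bias is stored as a weight.
    """
    h = np.asarray(x, dtype=float)
    for W in weights[:-1]:
        h = activation(W @ np.append(h, 1.0))
    # The last layer is linear (no activation), as in the nested expression.
    return weights[-1] @ np.append(h, 1.0)

# Two-layer toy network: 2 inputs -> 2 hidden units -> 1 output.
W0 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, -1.0]])
W1 = np.array([[1.0, 1.0, 0.25]])
output = forward([W0, W1], [2.0, 0.5])
```

Storing the bias as an extra weight column keeps every layer a single matrix product, which is what allows the whole model to be written as one nested expression.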
DL models can be considered universal function approximators; thus, if the DL model is designed and trained properly, it can learn a mathematical model, $\mathcal{G}$ in equation (6), that best fits the provided data and, in theory, can be an estimate of $\mathcal{A}^{-1}$ in equation (5). Figure 1 depicts an overview of the DL-based framework proposed in this paper to solve the BLT inverse problem.

Monte Carlo simulations
MCS is considered the gold standard for photon transport simulations and can provide more accurate ground-truth data for the AI model than analytical models. Therefore, because a sufficiently large set of ground-truth labeled BLI measurements is lacking (a common problem in biological experiments), a larger database of MCS is generated and used to train and validate the AI model.
To build the MC database representative of the real GBM BLI measurements, a database of CE-CBCT images of real GBM is employed. This database, hereinafter called the F98 database, consists of 57 cases with CE-CBCT images of an orthotopic F98 rat GBM animal model, imaged at several time points in our previous work (Mowday et al 2020). Each of the cases within the F98 database further includes hand-delineated contours for normal brain and tumor tissue by a trained biologist. In addition to these contours, two separate thresholds are applied to the mass density image, obtained from the original CE-CBCT, to generate bone and air masks. The resulting contours are combined to create the MCS geometry, as shown in figure 2 and explained in our previous publication (Rezaeifar et al 2022) in more detail. The hand-delineated tumor contour is then used to constitute a uniformly and isotropically-emitting light source with a similar light emission spectrum to the firefly luciferase light emerging from the tumor. In other words, in this study, substructures within the tumor, such as necrotic and hypoxic regions, are ignored. Therefore, the uniformly emitting tumor approximates the real emission of the bioluminescence-enabled tumor.
The simulation geometry and the light-emitting source are presented to the MCS engine, namely the Geant4 application for tomographic emission (GATE) (Cuplov et al 2014). In this study, wavelength-dependent optical properties are assigned to each tissue in the MCS geometry. These properties include tissue-dependent absorption and scattering coefficients, presented in figure 3 of our previous work (Rezaeifar et al 2022), which were obtained from the literature (Zhao et al 2005, Mesradi et al 2013, Soleimanzad et al 2017). Furthermore, two simplifications are included in the MCS: (a) tumor tissue has the same optical properties as the brain tissue, and (b) everything other than the brain, air, skull, and the tumor is considered water since its contribution to the simulation output is negligible. The water regions account for the small regions in the medial longitudinal fissure, the space between the brain and the skull, and the rest of the soft tissue in the head and neck region. As shown in figure 2, the aforementioned water-equivalent region is either filled with cerebrospinal fluid, which has optical properties similar to water, or located far from the relevant scoring region of interest, which makes its optical properties insignificant.
The MCS output is scored using the GATE fluence actor which tracks photons entering or exiting a specified geometry. In a voxelized geometry, such as the one used in this study, the fluence actor registers the photons passing through each individual voxel and saves them as a raw 3D image volume. Furthermore, the constant number of emitted photons per unit volume, i.e. voxels inside the tumor volume, is set to provide an average statistical simulation uncertainty below 0.2% for an average-sized tumor.

Deep learning solution
2.3.1. Pre-processing of the MCS output
Two main pre-processing steps are considered to create the training database from the MCS output: (a) converting the raw 3D MCS output to the 3D bioluminescence skin fluence (BSF) by applying the corresponding skin mask and (b) normalizing the BSF data.
As mentioned previously, the raw MCS output includes the resulting photon count at every voxel in the voxelized geometry. To consider only the MCS output for voxels visible to the camera, and thus create the subsequent BSF, a skin mask is constructed based on the original CT scan and the location of a hypothetical camera. This is done in a three-step process: (a) obtaining an air mask from the original CT image by using a constant HU threshold, (b) computing an approximate skin contour by applying morphological operators to the air mask and (c) removing any unwanted voxels that are not visible to the camera, such as voxels located in the inner part of the ears, using a simplified ray-tracing algorithm. The location of the hypothetical rotating camera used in this study corresponds to the commercially available small animal irradiation platform (X-RAD 225Cx, Precision X-Ray Inc., North Branford, CT, USA). Furthermore, a set of five camera viewing angles is considered, based on real animal experiments, to obtain the visible skin voxels. Details of the algorithm used for computing the skin mask can be found in supplementary materials, sub-section S1.
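The three-step skin mask construction can be sketched as follows for a single, top-down camera view. This is a simplified illustration and not the algorithm of supplementary sub-section S1: the HU threshold of -400, the 6-connected neighborhood, the (z, y, x) array layout with the camera looking along +z, and the function name are all assumptions, and the five viewing angles are not modeled. NumPy is assumed to be available.

```python
import numpy as np

def skin_mask_top_view(ct_hu, air_threshold_hu=-400.0):
    """Approximate camera-visible skin voxels for one top-down view.

    ct_hu is a 3D array of Hounsfield units indexed as (z, y, x), where
    z = 0 is the slice closest to the hypothetical camera.
    """
    # (a) Air mask from a constant HU threshold (threshold value assumed).
    air = np.asarray(ct_hu) < air_threshold_hu
    body = ~air

    # (b) Approximate skin contour: body voxels with at least one air
    # neighbour, i.e. a one-voxel dilation of the air mask restricted to
    # the body. Padding with air treats the volume border as outside.
    padded_air = np.pad(air, 1, mode="constant", constant_values=True)
    touches_air = np.zeros_like(body)
    for axis in range(3):
        for shift in (-1, 1):
            touches_air |= np.roll(padded_air, shift, axis=axis)[1:-1, 1:-1, 1:-1]
    skin = body & touches_air

    # (c) Simplified ray tracing: along each ray parallel to z, only the
    # first body voxel is visible; boundaries of internal cavities are not.
    first_hit = np.argmax(body, axis=0)
    yy, xx = np.nonzero(body.any(axis=0))
    visible = np.zeros_like(body)
    visible[first_hit[yy, xx], yy, xx] = True
    return skin & visible
```

Multiplying the raw fluence volume by such a mask would keep only the photon counts on camera-visible skin voxels; for oblique views the rays would have to be traced along the rotated camera direction instead of along a grid axis.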
Once the BSF is obtained for every case in the MCS database, the volumetric images are normalized in both intensity and size. Intensity normalization is performed by rescaling each volumetric image to have a median of zero and a standard deviation of one. In addition to the intensity normalization, all the volumetric images in the database are moved to a fixed input grid of 375 × 450 × 375 voxels, by padding or cropping the original input, so that they have the equal image dimensions required by the DL algorithm. Thereafter, all the samples in the database are down-scaled to a smaller 250 × 300 × 250 volume to minimize the GPU memory needed for training.
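A minimal sketch of the intensity and size normalization is given below. It assumes that the statistics are computed over the whole volume and that padding and cropping are centered; neither detail is stated in the text, and the function names are illustrative. The final down-scaling step is omitted because it requires an interpolation routine. NumPy is assumed to be available.

```python
import numpy as np

def normalize_intensity(volume):
    """Shift to zero median and scale to unit standard deviation."""
    volume = np.asarray(volume, dtype=np.float32)
    shifted = volume - np.median(volume)
    std = volume.std()
    # Guard against constant volumes, for which the scaling is undefined.
    return shifted if std == 0 else shifted / std

def fit_to_grid(volume, target_shape, fill=0.0):
    """Center-pad or center-crop each axis to reach target_shape."""
    out = np.full(target_shape, fill, dtype=volume.dtype)
    src, dst = [], []
    for n, t in zip(volume.shape, target_shape):
        if n >= t:
            # Crop: keep the central t voxels of the input axis.
            start = (n - t) // 2
            src.append(slice(start, start + t))
            dst.append(slice(0, t))
        else:
            # Pad: place the n input voxels in the middle of the output axis.
            start = (t - n) // 2
            src.append(slice(0, n))
            dst.append(slice(start, start + n))
    out[tuple(dst)] = volume[tuple(src)]
    return out

# Small stand-in volume; the study uses a 375 x 450 x 375 grid.
example = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
normalized = normalize_intensity(example)
gridded = fit_to_grid(example, (4, 3, 2))
```

Because the standard deviation is unaffected by a constant shift, subtracting the median and dividing by the standard deviation of the original volume yields exactly the stated properties. Note that a BSF volume is mostly zero outside the skin, so the median over the whole volume is often already zero.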

Training and validation of the AI model
In this study, a previously developed fully convolutional neural network architecture, namely RatLesNet (Valverde et al 2020), is employed to solve the BLT reconstruction problem. The RatLesNet model was originally developed to segment small brain lesions in rodent magnetic resonance images (MRI). Therefore, it is a suitable candidate for the BLT reconstruction problem since the final aim of the current study is also the segmentation of small tumors from 3D BSF images. Furthermore, an automatic hyperparameter optimization algorithm (Optuna, Akiba et al 2019) is used to obtain the best set of hyperparameters by solving an optimization problem that samples hyperparameters from a pre-defined search space using the tree-structured Parzen estimator algorithm (Bergstra et al 2011). The hyperparameters included in the search space consist of the number of filters in convolutional layers, loss function, and the optimization algorithm. Hence, the original architecture of the RatLesNet is kept intact. More details on the hyperparameter optimization and the search space for each hyperparameter are shown in supplementary materials, supplementary table 1.
An exclusion criterion based on the tumor volume is defined, removing tumors smaller than 10 mm³ and reducing the total number of MCS samples to 42 cases. This is because such a small tumor: (a) requires collimated beams smaller than 3 mm for targeting, which will increase the dose delivery uncertainty, (b) emits fewer bioluminescence photons, and (c) causes additional challenges for the DL model due to the high level of class imbalance in the prediction image.
Once the exclusion criteria are applied and the optimal set of hyperparameters is obtained, the remaining MCS database is shuffled randomly and divided into different subsets for training, validation, and testing. A 9-fold cross-validation algorithm is used to train, validate and test the model on all the cases in the database. In other words, as shown in figure 3, for each fold five cases (12% of total samples) are reserved for testing and five more for validation. The rest of the samples are used to train the model. During the training phase, one DL model is trained for each fold using the training and validation set, keeping the test set unobserved. This results in nine distinct trained models, one per fold, and allows the model to be tested on all 42 cases using the fold in which the specific case is in the test set.
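The fold construction can be sketched with the standard library alone. The exact assignment of cases to folds is shown in figure 3 and is not reproduced here; the sketch assumes that the shuffled cases are split into nine near-equal test subsets (of four or five cases, since 42 is not divisible by nine) and that the validation subset of each fold is simply the next test subset. The function name and the fixed seed are illustrative.

```python
import random

def make_folds(case_ids, n_folds=9, seed=0):
    """Return one (train, val, test) tuple of case-ID lists per fold."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    # Near-equal, non-overlapping chunks: chunk k holds every n_folds-th case.
    chunks = [ids[k::n_folds] for k in range(n_folds)]
    folds = []
    for k in range(n_folds):
        val_index = (k + 1) % n_folds
        test = chunks[k]
        val = chunks[val_index]
        train = [case for j, chunk in enumerate(chunks)
                 if j not in (k, val_index) for case in chunk]
        folds.append((train, val, test))
    return folds

folds = make_folds(range(42))
```

With this construction every case appears in exactly one test subset, so the nine models together provide one unseen-case prediction for each of the 42 cases.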

Robustness evaluation using synthetic cases
A set of 42 cases is artificially generated to evaluate the robustness of the proposed deep-learning solution for cases outside the initial training distribution and quantify the performance gain upon retraining the network with newly added samples. The synthetic case database includes randomly augmented tumor shapes inside randomly selected rat MCS geometry, placed in either (a) a random location in the brain or (b) near the center of mass of the original tumor with respect to the selected MCS geometry. These two categories of synthetic cases are further complemented with cases where the original tumor for the selected MCS geometry is either replaced by (c) the predicted tumor by the proposed deep-learning solution for the same case or (d) one of the two flat tumors in the F98 database. Therefore, each of the categories represents a true out-of-distribution (OOD) scenario. For example, category (a) represents cases where differently shaped tumors are implanted in anatomical locations far from the standard implantation site in the F98 database. In contrast, categories (b) and (c) represent cases where new variations of tumor shapes are located around the same implantation site.
Once the MCS geometry for the synthetic cases is obtained, a fast MCS is performed for each case with fewer photons per unit volume of tumors. This also enables the investigation of the model's sensitivity with respect to the statistical noise in the MCS output. Thereafter, the same pre-processing steps, introduced in section 2.3.1, are applied, and a new synthetic case database is generated. The synthetic case database is then utilized in two scenarios: (a) as the test data for the network trained with original F98 cases to establish robustness against new cases, and (b) added to the training data to obtain the performance gain when the model observes such OOD cases. For scenario (a), where the synthetic cases are used as test data without further training, all nine models obtained from the 9-fold cross-validation are utilized, and the final prediction is considered as the result of the majority voting of all models.
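The majority voting over the fold models can be sketched as a voxel-wise vote on the binarized predictions. The strict-majority rule (at least five of nine models) is an assumption, since the text does not state how the vote is counted, and the function name is illustrative. NumPy is assumed to be available.

```python
import numpy as np

def majority_vote(binary_predictions):
    """Voxel-wise strict majority over equally shaped binary masks."""
    stack = np.stack([np.asarray(p, dtype=bool) for p in binary_predictions])
    votes = stack.sum(axis=0)
    # A voxel is kept when more than half of the models mark it as tumor.
    return votes > stack.shape[0] / 2

# Nine toy 2 x 2 predictions: five models agree on one pattern, four on another.
predictions = [np.array([[1, 0], [1, 0]])] * 5 + [np.array([[0, 0], [1, 1]])] * 4
consensus = majority_vote(predictions)
```

With an odd number of models a strict majority never produces ties, which is one practical advantage of using all nine fold models.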

Geometrical evaluation
To quantify the model's absolute performance, the BLT problem is considered analogous to the auto-contouring problem (Lappas et al 2022). This is possible by converting the raw prediction output of the AI model for the location of the photon source to a binary mask using a pre-defined constant threshold. Hence, segmentation quality evaluation metrics, consisting of ΔCoM, dice similarity coefficient (DSC), and a set of geometrical coverage scores, are used to evaluate this aspect of the solution.
DSC is defined as the ratio of the overlapping region between the two contours, the ground-truth tumor and the predicted BLT source, and the overall volume covered by both contours:
$$\mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN},$$
where TP (True Positive) is the overlap region between the ground-truth tumor contour and the predicted BLT source, FP (False Positive) is the part of the predicted BLT source which is not in the ground truth, and FN (False Negative) is the missing part of the ground truth in the BLT predictions. ΔCoM, on the other hand, quantitatively measures the Euclidean distance between the centers of mass of the predicted BLT source and the ground-truth tumor contour. The output of the RatLesNet is the binary 3D BLT source prediction and can be considered as the BLI-based gross tumor volume (bGTV). In this paper, a 3D uniform margin is added to the bGTV to construct the BLI-based planning target volume (bPTV). The size of the added margin is optimized using the MCS database. More details are provided in supplementary material, section S3. Furthermore, healthy brain tissue is computed by subtracting the CT-based gross tumor volume (cGTV) from the brain contour used in the MCS.
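Both segmentation metrics are straightforward to compute from a pair of binary masks. The sketch below assumes NumPy arrays on a common grid and a known voxel size; the function names and the 0.2 mm voxel size used in the example are illustrative and not taken from the study.

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient 2TP / (2TP + FP + FN) of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else float("nan")

def delta_com(pred, truth, voxel_size_mm):
    """Euclidean distance (mm) between the centers of mass of two masks."""
    com_pred = np.argwhere(pred).mean(axis=0)
    com_true = np.argwhere(truth).mean(axis=0)
    return float(np.linalg.norm((com_pred - com_true) * np.asarray(voxel_size_mm)))

# Toy example: two 2 x 2 x 2 cubes shifted by one voxel along the last axis.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 2:4] = True
example_dsc = dice(pred, truth)
example_dcom = delta_com(pred, truth, (0.2, 0.2, 0.2))
```

The toy example illustrates why the two metrics are reported together: a shift of a single voxel halves the DSC of a small structure while the center-of-mass error stays at one voxel width.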
Thereafter, geometrical coverage scores for the corresponding tissues are computed as the percentage of the tissue that falls within the bPTV with respect to the total volume of the tissue:
$$C_{\text{tissue}} = \frac{\text{volume}(\text{bPTV} \cap \text{tissue})}{\text{volume}(\text{tissue})} \times 100.$$
Therefore, the ideal results will be $C_{\text{tumor}} = 100\%$ and $C_{\text{brain}} = 0\%$, meaning that the predicted BLT-based planning volume includes all the tumor tissue while not targeting any normal brain tissue. However, in practice, this is not feasible with external radiation beams traversing the brain, and often the added margin will impose normal tissue coverage intentionally to avoid tumor recurrence.
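The coverage score reduces to a voxel count on binary masks. The sketch below assumes NumPy arrays on a common grid with equal voxel volumes, so that volumes can be replaced by voxel counts; the function name and the toy masks are illustrative, and the uniform margin expansion from bGTV to bPTV is not shown.

```python
import numpy as np

def coverage_percent(bptv, tissue):
    """Percentage of the tissue volume that falls inside the bPTV."""
    bptv = np.asarray(bptv, dtype=bool)
    tissue = np.asarray(tissue, dtype=bool)
    n_tissue = np.count_nonzero(tissue)
    if n_tissue == 0:
        return float("nan")
    return 100.0 * np.count_nonzero(bptv & tissue) / n_tissue

# Toy masks: a 64-voxel brain, an 8-voxel tumor and a 12-voxel planning volume.
brain = np.ones((4, 4, 4), dtype=bool)
cgtv = np.zeros_like(brain)
cgtv[1:3, 1:3, 1:3] = True
bptv = np.zeros_like(brain)
bptv[1:3, 1:3, 1:4] = True
healthy_brain = brain & ~cgtv  # brain contour minus the CT-based tumor
c_tumor = coverage_percent(bptv, cgtv)
c_brain = coverage_percent(bptv, healthy_brain)
```

In the toy example the planning volume encloses the whole tumor at the price of four healthy voxels, which mirrors the trade-off introduced by the added margin.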

BLT-based irradiation planning evaluation
Another important aspect of the BLI-based tumor predictions is the evaluation of irradiation planning with photon beams. Therefore, a set of dose metrics is used to evaluate the BLI-based tumor irradiation, including dose-volume metrics (DVM) and dose-volume histograms (DVH) for each tissue. Here, to avoid uncertainties in margin selection in small animal radiotherapy, conformal radiation treatment delivery plans are made based on the cGTV or on the bGTV by two independent observers using the small animal radiotherapy treatment planning software (SmART-ATP version 2.0, SmART Scientific Solutions B.V., Maastricht, Netherlands). In other words, no additional margin is considered for the treatment plans, other than the margin imposed by choosing a circular collimator from a real set of available collimators with diameters of 1, 3, 5, 8, and 10 mm.
Figure 4. Treatment planning visualization using two anterior/posterior parallel opposed circular beams around a specific isocenter: in (a)-(c) the isocenter is located in the center of the BLI-based tumor prediction (bGTV), while (e)-(g) depict the CT-based tumor (cGTV) and its associated planning. In both cases, doses are scored at the brain, cGTV and bGTV.
In routine preclinical practice, the objectives of the study determine the configuration of the beams in a case-dependent manner. Therefore, biologists have to select proper beam configurations per case. However, to normalize the beam configuration in this study, treatment plans are limited to two anterior/posterior parallel opposed beams with the isocenter located at the center of the target volume. This beam configuration is chosen based on the work of Mowday et al (2020), which showed it to have the highest healthy tissue sparing effect. In each case, as shown in figure 4, the isocenter and width of the two parallel-opposed circular-collimated beams are selected based on the ground-truth cGTV or on the result of the proposed method, i.e. bGTV.
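In this study the collimators were chosen by the observers within the planning software. Purely as an illustration of how the discrete collimator set imposes an implicit margin, the sketch below picks the smallest available collimator whose diameter covers a given target diameter. This selection rule and the function name are assumptions and do not describe the observers' procedure.

```python
COLLIMATOR_DIAMETERS_MM = (1, 3, 5, 8, 10)

def smallest_covering_collimator(target_diameter_mm,
                                 available=COLLIMATOR_DIAMETERS_MM):
    """Return the smallest collimator diameter that covers the target.

    Returns None when the target is wider than every available collimator.
    """
    for diameter in sorted(available):
        if diameter >= target_diameter_mm:
            return diameter
    return None

# A 5.5 mm wide target needs the 8 mm collimator, i.e. 1.25 mm of implicit margin per side.
chosen = smallest_covering_collimator(5.5)
implicit_margin_mm = (chosen - 5.5) / 2
```

Because the available diameters are coarse, a small change in the predicted target width can switch the plan to the next collimator size, which is one way a bGTV and cGTV of similar shape can end up with different plans.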
Photon dose calculations were done using the DOSXYZnrc Monte Carlo transport code (National Research Council Canada) within SmART-ATP with a constant statistical dose uncertainty of 5% to the target volume. The plans used 225 kVp x-rays (0.3 mm Cu filter) and were made to deliver 20 Gy to the isocenter located at the center of the target volume in the brain.
Once the treatment planning is completed, a set of DVM is computed for each case. These metrics include: (a) mean dose ($D_{\text{mean}}$), and (b) dose to 95% ($D_{95}$) of the CT-delineated tumor and dose to 5% ($D_{5}$) of the brain tissue. DVM for all the cases in the MCS database are presented in scatter plots, allowing quantitative comparison between the reference CT-based plan and the resulting BLI-based plan. In addition to DVM, for a handful of representative cases, the DVH is also presented.
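The dose-volume metrics can be computed from a dose grid and a structure mask as sketched below. The sketch assumes equal voxel volumes and uses the common convention that $D_{x}$ is the minimum dose received by the hottest x% of the structure, so that $D_{95}$ corresponds to the 5th percentile of the voxel doses and $D_{5}$ to the 95th percentile. The function name is illustrative, and the planning software may use a different interpolation. NumPy is assumed to be available.

```python
import numpy as np

def dose_volume_metrics(dose_gy, mask):
    """Mean dose, D95 and D5 (Gy) of the voxels selected by a binary mask."""
    doses = np.asarray(dose_gy, dtype=float)[np.asarray(mask, dtype=bool)]
    return {
        "D_mean": float(doses.mean()),
        # Dose exceeded by 95% of the structure: 5th percentile of voxel doses.
        "D_95": float(np.percentile(doses, 5)),
        # Dose exceeded by only 5% of the structure: 95th percentile.
        "D_5": float(np.percentile(doses, 95)),
    }

# Toy structure with voxel doses 0, 1, ..., 100 Gy.
toy_dose = np.arange(101, dtype=float)
toy_mask = np.ones(101, dtype=bool)
metrics = dose_volume_metrics(toy_dose, toy_mask)
```

Reporting $D_{95}$ for the tumor and $D_{5}$ for the brain targets the clinically relevant tails: the coldest part of the target and the hottest part of the healthy tissue.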

Case study: real BLI measurements
To underline the performance of the novel method, developed using MC simulations, on real BLI measurements, a set of 5 real BLI measurements from two animals is used. The 2D BLI readings are obtained using the small animal radiotherapy unit equipped with a highly sensitive optical camera (iXon Ultra 897, Andor Technology Ltd, Belfast, United Kingdom). Although the optical system is fitted with a filter wheel enabling multispectral readings, the measurements used in this study are obtained using the open-filter option, capturing the full spectrum of bioluminescence emission. In addition to the tested cases, numerous BLI-CT pairs have been acquired previously. However, these data are not included in this study because the animals were taken out of the cabinet between the two scans, making the pairs prone to displacement errors.
Following the same implantation procedure explained previously (Mowday et al 2020), a total of 20 000 firefly luciferase-positive GBM tumor cells are slowly injected into the brain. At each time point within the study, the animals are injected with both a contrast agent for CT (60 mg kg⁻¹ Omnipaque, GE Healthcare, Eindhoven, Netherlands) and D-luciferin for BLI (150 mg kg⁻¹, Perkin Elmer, Rotterdam, Netherlands), according to the same protocol. Thereafter, animals are placed under isoflurane anesthesia and consecutive CBCT and BLI scans of the skull are obtained without moving or relocating the animal. 2D BLI projections are acquired at five angles (0°, ±30°, and ±60°) with 60 s exposure time and an electrical gain of 5. Thereafter, the 2D projections are processed using the provided software (Pilot, version 1.18.5.2, Precision X-Ray, Inc.) to obtain the 3D bioluminescence skin fluence (Weersink et al 2014). The output of the BLI is therefore saved as a 3D surface mesh where the BSF is expressed as an attribute for each node, which is then converted, by triangulation of the mesh, to a 3D volumetric image on the fixed grid used for DL model training. The resulting 3D BSF image is dilated by a 3 × 3 × 3 structure element to increase the thickness of the skin and further resemble the MC simulations.
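The final dilation step can be sketched with NumPy alone. The text does not state whether a binary or a grey-scale dilation is used; the sketch assumes a grey-scale dilation, in which each voxel takes the maximum value within its 3 × 3 × 3 neighborhood, so that the fluence values are spread to the thickened skin layer. The function name is illustrative.

```python
import numpy as np

def dilate_3x3x3(volume):
    """Grey-scale dilation: each voxel becomes the maximum of its 27-neighborhood."""
    v = np.asarray(volume, dtype=float)
    # Pad with -inf so that voxels outside the grid never win the maximum.
    padded = np.pad(v, 1, mode="constant", constant_values=-np.inf)
    out = np.full(v.shape, -np.inf)
    nz, ny, nx = v.shape
    for dz in range(3):
        for dy in range(3):
            for dx in range(3):
                shifted = padded[dz:dz + nz, dy:dy + ny, dx:dx + nx]
                np.maximum(out, shifted, out=out)
    return out

# A single bright surface voxel grows into a 3 x 3 x 3 block.
bsf = np.zeros((5, 5, 5))
bsf[2, 2, 2] = 7.0
thickened = dilate_3x3x3(bsf)
```

A one-voxel-thick rasterized mesh surface becomes roughly three voxels thick after this step, which brings the measured BSF closer to the skin layer thickness seen in the simulated training data.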
The 3D BSF image for each of the five real cases is fed into the DL algorithm and the output prediction is compared to the ground-truth tumor mask obtained by hand-delineating the 3D CE-CBCT for the corresponding case. The real BSF images are considered out-of-database samples for all the folds. Hence, all models trained as part of the k-fold cross-validation can be used, providing nine different predictions per case. The final output for the real BSF image results from majority voting over these predictions, enabling a more accurate result.
The prediction accuracy is then evaluated using DSC and ΔCoM, as explained in section 2.4. Furthermore, the BLI-based planning quality is scored using both the DVM and DVH from SmART-ATP.

Prediction evaluation on Monte Carlo simulations
Figure 5 includes the two segmentation quality metrics, namely ΔCoM and DSC. As shown in figures 5(a)-(b) and further visualized in figure 7, the network prediction provides a good agreement with the ground truth, with a median ΔCoM and DSC of 0.61 mm and 61%, respectively. Furthermore, the DL algorithm, on average, predicted a contour with a ΔCoM of 0.69 ± 0.47 mm and a DSC of 59 ± 17%. Therefore, the proposed DL framework can infer the light source, i.e. tumor segmentation, from the surface photon count with submillimeter accuracy.
Figures 6(a)-(b) show the effect of the added margin on the geometrical coverage scores. As can be seen, with only 0.8 mm of uniform margin, the median tumor coverage score increases to more than 97% while keeping the geometrical brain coverage below 5%. Therefore, 0.8 mm is considered the optimal uniform margin, and the resulting geometrical coverage scores are depicted in detail in figures 6(c)-(d). With this 0.8 mm uniform margin, the proposed solution achieves a median geometrical tumor coverage of 97.4% and a median geometrical brain coverage of 4.2%.
In this study, two of the samples within the database were extremely flat tumors seated near the edges of the brain along the ventral-dorsal axis, as depicted with blue squares in figure 5 and visualized in figures 7(d), (e). Such flat tumors were misclassified as deeper implanted tumors beneath the ground truth volume.
The resulting BLI-based treatment planning for representative MCS cases is presented in figure 7. As shown, the provided BLI-based treatment planning is identical to the CT-based treatment planning in cases with high DSC (figure 7(a)). For cases with a median DSC, two different scenarios were observed: figure 7(b) represents cases where the predicted bGTV is slightly bigger than the ground-truth cGTV, and figure 7(c) is a case with median DSC where the prediction is slightly smaller than the cGTV. As can be seen in the DVH plot for these cases, both result in good BLI-based planning. Figure 7(b) resulted in full dose coverage for the tumor but a slight increase in the healthy tissue dose, which is still acceptable. Figure 7(c), on the other hand, resulted in a reduced healthy tissue dose at the cost of slightly less tumor coverage. Finally, for the cases with the lowest DSC, i.e. the two flat tumors, the parallel opposed anterior-posterior treatment planning provides an acceptable plan compared to the CT-based planning (figures 7(d), (e)) since the predicted bGTV is placed directly beneath the actual tumor in the axial plane.
Figure 8 shows the resulting DVM for the tumor and brain tissues. Since there is a considerable variation in tumor sizes within the MCS database, different collimator sizes were needed to target the respective volumes in each plan, ranging from 3 mm up to 10 mm circular beams. In most cases (shown with circles in figure 8), the circular collimators used for the BLI-based and CT-based plans are the same size, which underlines the similarity in the volume of cGTV and bGTV. Nevertheless, 25% of the database (shown with triangles in figure 8) resulted in a larger BLI-based collimator than the CT-based collimator, because the bGTV was larger than the cGTV. In 22.5% of the total cases, the bGTV was smaller than the cGTV and resulted in a smaller collimator, shown in figure 8 with squares.
DSC, on the other hand, has a less descriptive role in the treatment planning outcome with regard to the DVM for the tumor and brain. As can be seen in figure 8, some cases with average to high DSC scores did not provide the prescribed mean dose to the tumor, either due to a smaller collimator or a larger ΔCoM error. On the contrary, a number of cases with low DSC provided the prescribed mean dose to the tumor but at the expense of a higher mean dose to the brain tissue.

Robustness evaluation
The robustness of the proposed solution is measured against artificially generated samples with additional variations outside the training database. As shown in figure 9, the proposed solution provides less accurate predictions for OOD samples, i.e. samples with variations beyond those inside the training database. Categories (d) and (a), namely flat and randomly located tumors, show the worst performance, with a median DSC of 24% and 30%, respectively. However, both categories also benefit from the highest performance gain upon retraining, with a median DSC of 37% and 42%. The category (b) cases, new tumor shapes in the proximity of the original CoM, yield a median DSC of 46% before retraining and 53% after retraining. Furthermore, the performance of category (c) remained almost constant, with a median DSC of 68% in both scenarios. Finally, the original MCS database, which provided a median DSC of 61% when the model was trained without the new synthetic cases, showed a reduction in performance upon retraining, with a median DSC of 55%.
Figure 10. Performance evaluation on 5 real BLI measurements of glioblastoma rat models using an X-RAD 225Cx irradiator: (a)-(b) segmentation quality metrics per sample and k-fold model, with the aggregated prediction by majority voting shown as a red '+' marker; (c)-(d) tumor and brain coverage of the aggregated predicted BLT source plus 0.7 mm margin.

Case study: real BLI measurement
The performance evaluation of the proposed method on five different real BLI measurements is summarized in figures 10(a)-(b). As shown in this figure, the overall performance of the proposed method is slightly reduced when applying it to the real BLI measurements, with a median DSC of 42.4 ± 14.8% and ΔCoM of 1.6 ± 0.4 mm. Furthermore, the coverage metrics for the real BLI measurements are visualized in figures 10(c)-(d), underlining the agreement between the predicted BLT source and the ground-truth tumor mask, with a median geometrical tumor coverage of 95.1 ± 11.2% and geometrical brain coverage of 7.5 ± 2.0%.
Figure 11. Visualization of the aggregate predicted BLT source using real BLI measurement without added margin: the red contour is the ground-truth cGTV, hand-delineated from CE-CBCT, and the yellow contour is the predicted bGTV from the BLI data. The DVH plot is the resulting planning for each contour.
The predictions for the real BLI acquisitions are visualized in figure 11, underlining the agreement between the predictions and the ground-truth tumor mask. As can be seen, in three of the five cases, BLI-based treatment planning provides identical results to CT-based planning. For the other two cases, however, the dose to the tumor is slightly decreased when using BLI images, which can be compensated by adding a margin around the BLI-based tumor prediction.
The DVM for the real BLI acquisitions are presented in figure 12. As shown, the proposed DL-based framework provides good planning accuracy compared to ground-truth CT-based planning. Four of the five cases resulted in nearly perfect agreement with the CT-based plans and only one of the cases (shown in figure 11(a)) predicted a bigger BLI-based volume which necessitated use of a larger collimator.
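The aggregated prediction reported for the real measurements is described in figure 10 as a majority vote over the k-fold models. A voxel-wise majority vote can be sketched as below. The number of folds (five), the strict-majority tie rule, and the toy masks are assumptions for illustration and are not taken from the study.

```python
import numpy as np


def majority_vote(masks):
    """Voxel-wise strict majority vote over binary masks of identical shape.

    A voxel is kept when more than half of the models mark it as tumor;
    ties (possible with an even number of models) fall to background.
    """
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    votes = stack.sum(axis=0)
    return votes * 2 > stack.shape[0]


# Toy example (hypothetical): five fold-wise predictions on four voxels.
fold_predictions = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]
aggregated = majority_vote(fold_predictions)
print(aggregated.astype(int))
```

The same function works unchanged on 3D volumes, since the vote is taken along the model axis only.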

Discussion
In this study, a novel deep-learning approach is developed to enable BLI-based irradiation planning for the GBM orthotopic rat models. The proposed framework is a good candidate to facilitate BLI-based planning for other kinds of tumor models, both in rats and mice, providing small-animal image-guided radiotherapy without excess x-ray imaging dose on animals. This can be further studied and developed using the same framework, i.e. by developing a suitable MC-based training database and training a similar deep-learning model.
The results of this study show the feasibility of BLI-based precision radiotherapy. The proposed deep-learning algorithm works well in a large variety of simulated cases, with tumors ranging from 10 to 270 mm³ in size. Tumors smaller than 10 mm³ were excluded from this study since they are too small to be targeted accurately using the BLI signal.
The performance of the proposed method can be quantified in two distinct tasks: (a) tumor position accuracy, and (b) tumor shape prediction accuracy. The proposed DL-based solution provided excellent sub-millimeter accuracy for the tumor position. Despite this, the proposed method cannot fully capture the detailed shape characteristics of a tumor and often provides a smoothed-out prediction compared to the ground-truth tumor mask. However, the overly smoothed prediction is not necessarily a drawback of the proposed method, since a margin often needs to be added to the ground-truth contours anyway. The effect of the added margin and the trade-off between the coverage and the excess treatment is presented in figures 6(a)-(b). As shown, a margin of 0.8 mm provides the best trade-off between good tumor coverage and limited healthy tissue exposure in the MCS database. Finally, most commercially available precision radiotherapy systems for small animals cannot irradiate such small detailed shape variations conformally.
Figure 12. DVM for real BLI acquisitions: (a, b) DVM for tumor, and (c, d) DVM for the brain. DSC is color-coded and differences in collimator size are shown as different markers: circles or triangles are cases where the BLI-based collimator was the same size or larger than the CT-based collimator.
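The margin trade-off discussed above can be explored with a short sketch: the predicted volume is expanded isotropically by a margin in millimeters, after which the fraction of tumor and of brain falling inside the expanded volume is computed. The coverage definitions, the brute-force expansion, the 0.4 mm voxel size, and the toy geometry are assumptions made for this example, not the study's implementation.

```python
import numpy as np


def expand_mask(mask, margin_mm, voxel_size_mm):
    """Add an isotropic margin (mm) to a binary mask by OR-ing shifted copies."""
    mask = np.asarray(mask, dtype=bool)
    spacing = [float(s) for s in voxel_size_mm]
    # Largest voxel offset per axis that can still lie within the margin.
    reach = [int(np.floor(margin_mm / s + 1e-9)) for s in spacing]
    padded = np.pad(mask, [(r, r) for r in reach])
    out = np.zeros_like(mask)
    for offset in np.ndindex(*[2 * r + 1 for r in reach]):
        shift = [o - r for o, r in zip(offset, reach)]
        dist = np.sqrt(sum((s * sp) ** 2 for s, sp in zip(shift, spacing)))
        if dist <= margin_mm + 1e-9:
            window = tuple(
                slice(r + s, r + s + n)
                for r, s, n in zip(reach, shift, mask.shape)
            )
            out |= padded[window]
    return out


def coverage(planning_volume, region):
    """Fraction of `region` voxels that lie inside `planning_volume`."""
    region = np.asarray(region, dtype=bool)
    return float(np.logical_and(planning_volume, region).sum() / region.sum())


# Toy geometry (hypothetical): cubic brain, cubic tumor, shifted prediction.
voxel = (0.4, 0.4, 0.4)  # assumed voxel size in mm
brain = np.zeros((30, 30, 30), dtype=bool)
brain[3:27, 3:27, 3:27] = True
tumor = np.zeros_like(brain)
tumor[10:16, 10:16, 10:16] = True
predicted = np.zeros_like(brain)
predicted[12:18, 10:16, 10:16] = True

margins_mm = [0.0, 0.4, 0.8]
tumor_cov = []
brain_cov = []
for margin in margins_mm:
    volume = expand_mask(predicted, margin, voxel)
    tumor_cov.append(coverage(volume, tumor))
    brain_cov.append(coverage(volume, brain))
    print(f"margin {margin:.1f} mm: tumor {tumor_cov[-1]:.2%}, brain {brain_cov[-1]:.2%}")
```

Both coverage values can only grow with the margin, which is the trade-off the margin selection has to balance.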
The proposed solution, with median DSC and ΔCoM of 61% and 0.61 mm, outperforms the method of our previous publication (Rezaeifar et al 2022), which used an AI solution to predict the tumor's CoM. Specifically, the CoM method provided median DSC and ΔCoM of 56% and 1.01 mm on the same database (excluding tumors below 10 mm³). The proposed solution is also superior to the mathematical solution in a similar GBM mouse model (Xu et al 2021). Xu et al reported an average DSC and ΔCoM of 55% and 0.62 mm, which is slightly lower than the performance of the proposed solution. Nonetheless, Xu et al reported their results using real GBM mouse experiments, while the proposed solution in this study is evaluated using the MCS database of rat experiments, which makes a direct comparison between the two methods challenging.
The DVMs for the MCS database, shown in figure 8, reveal that the proposed DL-based solution can provide acceptable tumor dose coverage for most cases while delivering a limited dose to the organ at risk. However, in this study, no margin scheme is considered for planning. Therefore, both the cGTV and bGTV are considered without added margin, and only the margins imposed by the fixed-size circular collimators are considered. Nevertheless, the effect of margins is investigated in the geometrical coverage evaluation, and it is reasonable to assume that adding a treatment margin would control the spread in the tumor dose coverage at the cost of an additional brain dose. In other words, the spread of points in figures 8(a), (b) below the identity line can be avoided by including a proper margin. It is shown that a margin of 0.8 mm can increase the median geometrical tumor coverage to 97%. Furthermore, the results suggest that using BLI-based collimators smaller than 8 mm will increase the probability of delivering less dose to the tumor. Additionally, it is important to note that the tested beam configuration might influence the dose coverage greatly. Although the parallel-opposed beam configuration is selected based on a previous study without any correlation to the proposed BLI-based solution, the beam configuration seems to compensate for the BLI-based targeting error. This is especially apparent for the real measurement cases where the displacement error in the anterior-posterior direction is mitigated by the proposed beam configuration.
It is worth mentioning that in some of the cases, both in the MCS database and real BLI acquisitions, the predicted bGTV slightly overlaps with the skull, as can be seen in figure 11(b). Such overlap causes a long flat tail in the DVH due to the high dose in bone when irradiating with 225 kV x-rays (when calculating dose-to-medium-in-medium in Monte Carlo dose calculations). This can be easily removed in post-processing by automatically removing the skull from the bGTV and only considering the regions overlapping with the brain.
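The post-processing step described above can be illustrated with a small sketch: the predicted bGTV is intersected with a brain mask, and a cumulative DVH is computed before and after to show the high-dose tail disappearing. The function names, the dose values (10 Gy in brain, 30 Gy in bone), and the toy masks are hypothetical and serve only to illustrate the effect.

```python
import numpy as np


def restrict_to_brain(bgtv, brain_mask):
    """Keep only the part of the predicted volume that lies inside the brain."""
    return np.logical_and(np.asarray(bgtv, dtype=bool), np.asarray(brain_mask, dtype=bool))


def cumulative_dvh(dose, mask, bin_width=1.0):
    """Cumulative DVH: fraction of mask voxels receiving at least each dose level."""
    values = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    if values.size == 0:
        raise ValueError("mask is empty")
    levels = np.arange(0.0, values.max() + bin_width, bin_width)
    fraction = np.array([(values >= level).mean() for level in levels])
    return levels, fraction


# Toy data (hypothetical): ten voxels, the last two belonging to the skull,
# where the assumed dose is three times higher than in brain tissue.
brain_mask = np.array([True] * 8 + [False] * 2)
dose = np.where(brain_mask, 10.0, 30.0)
bgtv = np.ones(10, dtype=bool)  # prediction that leaks into the skull

levels_raw, frac_raw = cumulative_dvh(dose, bgtv)
bgtv_clean = restrict_to_brain(bgtv, brain_mask)
levels_clean, frac_clean = cumulative_dvh(dose, bgtv_clean)

print("max dose before:", levels_raw[-1], "after:", levels_clean[-1])
```

In the toy case, the raw DVH keeps a constant volume fraction between the brain dose and the bone dose (the flat tail), whereas the cleaned DVH ends at the brain dose.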
The robustness analysis provided valuable insights into the proposed AI-based solution. The model trained on the initial samples performs only moderately on OOD cases, especially flat tumors and tumors located at random locations inside the brain. It is speculated that the poor performance for randomly located tumors is most likely related to the mismatch between the location of the implantation drill hole and the location of the synthetic tumor. In other words, the trained network has observed an indirect effect of the punctured skull, since most MCS observations included such an effect as a hotspot in the BSF directly above the drill hole. Therefore, some of the synthetic cases provided unrealistic and unfamiliar samples in which the tumors were not located directly below the punctured location in the skull. Another important observation from the robustness analysis is the reduced performance on the original cases once the network is trained with the additional synthetic cases, which again may be the result of unrealistic cases.
The performance of the proposed solution decreased slightly for a small set of real BLI measurements compared to the MC simulated data. This is believed to be a direct outcome of the limitations of artificial intelligence methods, such as the proposed algorithm, and their dependence on the quality of the training data. As a result, the model trained on the MC simulations struggles with the increased level of variation in the real measurements. The provided normalization and preprocessing steps restricted the adverse effect of the measurement noise on the predictions. In spite of this, a larger set of real BLI measurements is required to enhance the model's robustness and reduce uncertainties, especially for unseen samples.
The fully convolutional ResNet architecture enables more efficient use of limited hardware resources, especially the GPU, compared to its fully-connected multi-layer perceptron counterparts. Furthermore, the computational efficiency of the developed method is far superior to analytical models in the inference phase, making the proposed method a viable option for real-time BLI-based treatment planning for small animals. The proposed DL model takes a median of 532 ± 6 ms to predict the 3D tumor contour on an NVIDIA RTX A6000 GPU. This is of great importance for the small animal precision radiotherapy workflow since animals are required to remain under anesthesia during the whole imaging, planning, and treatment workflow, which makes rapid BLI reconstruction necessary.
The main drawback and challenge of the proposed deep-learning approach is the availability and quality of the training database. Despite our best efforts to address this problem with a high-quality MC-simulated database, the provided training database is considered small in the field of AI and can increase the probability of overfitting. This is frequently an issue for preclinical AI imaging applications. Although the proposed solution is optimized to decrease the effect of such a small training database on the predictions, the epistemic uncertainty remains high, which determines the performance of any machine-learning model on out-of-database samples.
The proposed solution does not fully exploit the current capabilities of BLI and relies only on a time- and spectrally integrated acquisition. However, we believe that adding spectral information (using the available light filters), either as additional input channels or as an ensemble of AI models per spectrum, can yield better performance. In addition to multispectral-enabled AI, physics-informed deep-learning models are another potential candidate for improving the outcomes of this study. These models can couple the flexibility of AI solutions with the explainability of physics models and provide a synergy between the two.

Conclusion
In this paper, a novel real-time DL solution is presented to accelerate the BLI-based treatment planning problem. The proposed method can achieve good quality planning for the majority of the cases presented here, and therefore demonstrates the proof of concept of using AI-based BLI volumetric reconstruction. However, this study is just a starting point for the use of fully convolutional deep-learning approaches in this field, and like many other deep-learning solutions, the quality of the proposed solution can be improved with more data generated from similar studies, and from using other information derived from the BLI images such as multispectral BLI.