A systematic approach for calibrating a Monte Carlo code to a treatment planning system for obtaining dose, LET, variable proton RBE and out-of-field dose

Objective. While integration of variable relative biological effectiveness (RBE) has not reached full clinical implementation, the importance of having the ability to recalculate proton treatment plans in a flexible, dedicated Monte Carlo (MC) code cannot be understated . Here we provide a step-wise method for calibrating dose from a MC code to a treatment planning system (TPS), to obtain required parameters for calculating linear energy transfer (LET), variable RBE and in general enabling clinical realistic research studies beyond the capabilities of a TPS. Approach. Initially, Pristine Bragg peaks (PBP) were calculated in both the Eclipse TPS and the FLUKA MC code. A rearranged Bortfeld energy-range relation was applied to the initial energy of the beam to fine-tune the range of the MC code at 80% dose level distal to the PBP. The energy spread was adapted by dividing the TPS range by the MC range for dose level 80%–20% distal to the PBP. Density and relative proton stopping power were adjusted by comparing the TPS and MC for different Hounsfield units. To find the relationship of dose per primary particle from the MC to dose per monitor unit in the TPS, integration was applied to the area of the Bragg curve. The calibration was validated for spread-out Bragg peaks (SOBP) in water and patient treatment plans. Following the validation, variable RBE were calculated using established models. Main results. The PBPs ranges were within ± 0.3 mm threshold, and a maximum of 5.5% difference for the SOBPs was observed. The patient validation showed excellent dose agreement between the TPS and MC, with the greatest differences for the lung tumor patient. Significance. A procedure for calibrating a MC code to a TPS was developed and validated. The procedure enables MC-based calculation of dose, LET, variable RBE, advanced (secondary) particle tracking and more from treatment plans.


Introduction
Treatment planning systems (TPS) for radiotherapy of cancer enable rapid and detailed optimization and dose calculation for treatment volumes and surrounding organs at risk (OAR).However, TPSs have limitations in their flexibility of calculating proton relative biological effectiveness (RBE) and out-of-field doses.The increasing use of proton therapy has brought on new clinical challenges, and in particular the issue of variable RBE has received considerable attention throughout the past years (Underwood et al 2022).Proton RBE is well known to vary with tissue type, fractionation, total dose, dose rate and linear energy transfer (LET) (Willers et al 2018, Paganetti 2022), but is currently substituted by a fixed RBE of 1.1 during treatment planning (Dalrymple et al 1966, ICRU 2007, Paganetti et al 2019).The factor 1.1 represents an increase in the biological effectiveness by 10% compared to conventional photon-based radiotherapy.The clinical implementation of variable RBE is

Materials and methods
The Eclipse TPS (Varian Medical Systems, Palo Alto, California, US/ Siemens Healthineers, artificial ProBeam360 data, pencil beam convolution superposition (PCS) algorithm) and the FLUKA general purpose MC code (Ferrari PRS et al 2005, Böhlen et al 2014, Battistoni et al 2016) were utilized throughout this calibration.An overview of the step-wise calibration procedure is presented in figure 1. Calibrating range and energy spread was initially conducted in water for pristine Bragg peaks (PBP) with 10 MeV energy increments by adjusting the mean energy and the distal dose fall off for each beam, respectively.Range and energy spread were implemented as separate functions in FLUKA.Corrections to density and relative proton stopping power (RPSP) were set for the borders of Hounsfield Unit (HU) intervals.These were both implemented as lookup tables in FLUKA.The calibration phantoms were set at a constant HU, this is done to minimize differences between TPS and MC simulations.An absolute dose calibration from the MC simulated data to files comparable to the TPS was calculated and implemented in our workflow.
The TPS planning files were imported into FLUKA for recalculation using an in-house framework (Fjaera et al 2017).To calibrate range, energy spread and RPSP, virtual phantoms of dimensions 100 × 10 × 10 cm 3 were generated using pydicom.For range and energy spread calibrations, the HU of the phantom was set to a RPSP = 1.0001 in the TPS, while in FLUKA, the phantom was defined as water.Water/water equivalent was used in the calibration by convention (Paganetti et al 2008), and all dose calculations were done as dose to water.RPSP calibrations were done for different densities (HUs [ ] 900, 3072 Î - ).For RPSP calibrations at very low densities (HU <−830), an extended phantom with a length of 400 cm was used.
The highest achievable dose scoring resolution from the TPS is 1 mm, and this was replicated for the MC calculations.
Each patient field was initialized in a dedicated simulation using a phase space compiled from the DICOM plan file (RT).The phase space contained each spot with its energy, position, spot size, spot direction and spot weight, where the spot weight was used to determine the fractional number of primaries each spot exhibits/ contains based on the total amount primaries.The orientation of the spots was calculated with the gantry angle exported from the RT file and transformed to be within the xy-plane in FLUKA.Spot positions were extracted in the same manner as direction, but in addition the distance to the source distance was also extracted (usually 1.5 m) and transformed to be within the same xy-plane.Spot sizes were extracted and given the same orientation as the direction in the phase space file; however, these were divided by a factor 2 2ln2 to obtain the standard deviation and multiplied with a normal distribution function in FLUKA.

Range
The ranges obtained from the TPS and the MC were calibrated to coincide by changing the initial beam energy for the MC.MC PBPs were individually adjusted to the TPS PBPs in water.At 80% of maximum dose distal to the PBP (R 80 ), the fluence of initial protons is ideally 50% (Gottschalk 2012), making it independent of the energy spread and therefore a suitable parameter to use for range calibration.PBP calibration ranges are illustrated in appendix A, figure A.1.Since the PBP data are discrete, a MatLab (Gajewski 2023) curve fitting code using the Bortfeld approximation (Bortfeld 1997) was applied, and the R 80 distal to the PBP was determined from the curve fit.The difference in R 80 between the MC and the TPS, denoted R , 80

D
was then used to derive the new required energy (E MC ) for the MC such that R 80 D would be less than 0.3 mm.This was derived through a rearranged energy-range relation (Bortfeld 1997, Fjaera et al 2020) as given by: The constants 0.0022 a = cm/MeV , and p = 1.77 were used throughout this calibration but are estimates given by the Bortfeld approximation (Bortfeld 1997).E TPS is the nominal beam energy from the TPS.The difference between PBP ranges calculated for the TPS and MC are denoted R , d D where d is the depth at a certain percentage of the max.E MC was calculated for each PBP in 10 MeV intervals.Once E MC was obtained for the energy intervals, a curve fit of the data was calculated, modeling 70 MeV, 218 MeV TPS Î ).A 2nd-degree polynomial was used to obtain the fit and was implemented in our in-house software for automatically converting a RT plan to files executable by FLUKA such that the nominal beam energy in FLUKA give equal range outcome to the TPS.

Energy spread
The energy spread was approximated with a Gaussian distribution in FLUKA, with a mean at approximately zero and a standard deviation as a percentage of the beam energy for each initial energy.For each simulated initial proton, a random number from the Gaussian distribution was generated and added to the initial proton energy closely resembling the energy spread of the TPS.The energy spread was calibrated for PBPs in the aforementioned phantom.Distal dose fall-off of 80%-20% in a PBP (R 8020 ) i.e. the distance where the dose decreases from 80% to 20% can be described by range straggling and energy spread as described by B. Gottschalk was used (Gottschalk 2012): Range straggling is fluctuations in path lengths; hence, the range has a statistical error, range s accounting for this.beam s is the additional contribution from the beam energy spread.The range straggling is assumed to be equivalent for the TPS and MC for equal energies ( 0.012 range s » ´range).For beams of equal energies, we can approximate .  3is the assumed energy spread implemented at the start of the calibration in the MC.The energy spread was initially projected to be linearly decreasing with increasing beam energy, as an initial energy spread were needed.The assumed energy spread started at 1.7% of the initial energy at 70 MeV, to 0.2% at 218 MeV, since a decreasing energy spread was similar to results presented by Grassberger et al (2015).The TPS s is the energy spread equal to the TPSs.Similarly, to the energy adjustments, a 2nd-degree polynomial was used to obtain the fit and implemented in FLUKA.

Density and stopping power corrections
The graphical user interface FLAIR (Vlachoudis 2009) was used with the FLUKA MC tool.FLAIR implements several features including a DICOM module which translates voxel-based computer tomography (CT) files i.e.DICOM, into voxel files readable by FLUKA based on HUs.The CT-based voxel file requires a conversion for each individual voxel with its HU value to a tissue equivalent material as well as specific density and RPSP.It would be computationally intensive to assign a material for each HU, hence common practice is to approximate with HU intervals with the Schneider et al (2000) parametrization, which segments the CT into 24 materials of defined elemental compositions.The density is set for the midpoint of the interval, using a density-to-HU curve extracted from the TPS (figure A.2). Densities for the outer edges of the interval were calculated by using the same density-to-HU curve extracted from the TPS.FLUKA then calculates a gradient within the interval, assigning densities for individual HU values.Additionally, the TPS has a HU to RPSP curve (figure A.2) that were used to calculate ranges for the TPS for different HU.With the range in water (R 80 ) obtained from the range calibrations, we calculated the TPS range for HUs for 150 MeV by subsequently divided R 80 in water by the RPSP from the HU to RPSP curve, resulting in projected range at a certain HU (PR TPS ), displayed in figure A.3.In FLUKA, we simulated a 150 MeV (corrected for energy) beam in phantoms with equal HU to the TPS calculations (figure A.3).The projected range for the MC (PR MC ) for the specific HU was then estimated with the Bortfeld approximation (Bortfeld 1997) in the same manner as with the range.This was done for both border HUs in each interval to match range between TPS and MC for most HUs.The RPSP-corrections i.e. relative range correction (RRC), which are implemented in the MC, as the relationship between the projected ranges (equation ( 4)), the MC and TPS are assumed to have equal HU

Absolute dose MU conversion
The method implemented in FLUKA calculates an average dose per primary particle, whereas the TPS calculates dose to water for each monitor unit (MU).Hence, to obtain absolute doses in FLUKA, we performed the conversion by integrating the PBP depth dose curves.The depth at 50% proximal (p in the subscript) to the PBP ( ) R P 50 to the depth at 20% distal to the PBP ( ) R 20 and the whole Bragg curve was numerically integrated for the PBPs calculated in FLUKA and the TPS using the trapezoidal rule for each energy.Dividing the TPS area (A TPS ) by the MC area (A MC ) divided by MUs for each energy interval yields the dose conversion factor (DCF).
( ) This relation was only energy-dependent and was approximated with a 2nd-degree polynomial and implemented in our internal framework for converting FLUKA output files to DICOM.Several segment integrations were tested and currently the best dose conformity in the clinical target volume (CTV) was achieved using the R P 50 to R 20 area.

Verification
Verification of the calibration in water was done for ranges at specific percentages of the PBPs, their range differences needs to be within the standard deviation of a uniform distribution with a scoring size of 1 mm as this is the best a calibration can be.A standard deviation of a uniform distribution is defined as MeV) energies.Dose verification was also performed in three patient cases: a pediatric brain tumor (ependymoma), a lung cancer patient and a prostate cancer patient.The lung plan includes a posterior-anterior field, which is not the most common practice, but are included for the purpose of stretching the calibration to its limits.For the patient treatment plans, dose-volume histograms (DVH), gamma test(s) (Low and Dempsey 2003) and visual dose inspection layer by layer were performed to confirm the equivalence between the MC and the TPS.The gamma test was performed with distance to agreement (DTA) equal to 1, 2 and 3 mm, a dose difference (DD) of 1%, 2% and 3% and a dose threshold of 10% of the prescribed dose for each patient.

Results
The difference in required energy adjustment systematically increased by 0.6-1.4MeV when compared to the nominal energy.The ranges at 160 MeV showed greatest deviation and had the largest differences, although all ranges were within the defined threshold (figure 2).The adjusted energy spread function showed an overall similar trend to the initially assumed trend but with less energy spread at low energies (figure 2).The energy spread ranged from approximately 0.2%-1.4% of the beam energy, depending on the proton energy.In general, the PBP depth difference across the TPS and MC increased with energy (figure 3).Only a single data point, 210 MeV, used in the calibration of range and energy spread failed to meet the  0.3 mm criteria ( R 20 D ).Distal to the BP 110,140,150,170,190 and 218 MeV failed to meet the  0.3 mm for R P 80 and 150 and 190 MeV for R .TPS The data points E diff were implemented as energy corrections datapoints with respect to the arrival of the polynomial used for range corrections.The lower panel shows the energy spread [percentage of the initial beam energy] as a function of the E .
TPS Assumed spread, corrected data, and a 2nd-degree polynomial fit is represented with blue data points, orange data points, and a red dashed line, respectively.under 4% and the average FWHM difference were 0.27 mm.The estimated number of primary particles per MU was dependent on the integrated area of the PBPs (figure 4).For higher energies, a lower number of primaries was estimated when calibrating with R R .
P 50 20 -As a measure of agreement between the MC and TPS, both models were tested with SOBPs and gamma test pass rates for the different patients.The integration R P 50 -R 20 showed the best agreement, hence, this method was utilized further throughout this work.
The SOBPs (figure 5), resulted in a moderate dose difference at the beam entry for low energy SOBPs.The largest dose difference was 5.5% located at the distal end of the low energy SOBP.The differences were 4.4% in dose at the depth of largest difference, for both the medium and high energy SOBPs.
The lateral dose profiles of the low, medium and high SOBPs had a maximum dose difference of 1.3%, −2.8% and −4.1% for TPS-MC at the center of the SOBPs.Furthermore, a deviance of 0.5 mm in the lateral spread was observed for the high energy SOBP at 10% of the field edge (figure 6).The lateral dose profiles were also examined (figure A.5) at the entrance of each SOBP, with a maximum dose difference of −0.4%, −1.5% and −1.9%.
The dose comparison for the pediatric brain tumor case showed a minimal dose difference in the CTV and low to negligible dose difference in OARs (figure 7).D 50% values were 53.9 Gy for the TPS and 53.8 Gy for the    While there were indications of hotspots for the neutron dose, the doses rarely exceeded 0.6 mSv Gy −1 (figure 8).Calculated median dose for volumes in the brain also utilized in dose verification showed a low 0.2-0.3m Sv Gy −1 .
The gamma test, performed on the TPS and MC proton dose calculations, for RBE = 1.1, showed solid 95.5% pass rates for the total brain, and an acceptable pass rate for the CTV, brainstem, and chiasm for the extreme requirements 1 mm distance to agreement and 1% dose difference (table 1).
The results from the pediatric patient can be utilized to demonstrate the difference when integrating the whole Bragg curve compared to R p 50 -R .20 When integrating over the whole Bragg curve, the MC mean dose increased by 0.7 Gy for the CTV (53.6 Gy for R R , p 50 20 compared to 54.3 Gy for the whole PBP integrated), where the mean dose calculated by the TPS was 53.8 Gy.The exception were volumes preceding the PBP such as the right temporal lobe (figure 9).The dose difference between the two conversion factors favored integrating the whole Bragg curve.The gamma test preformed for the two dose DCF models favored the integration from R P 50 -R 20 (table 2).In this comparison, the gamma test was only performed for the extreme scenario 1 mm distance to agreement, 1% dose difference since this difference is easier to determine.When the whole PBP was integrated over and utilized in the absolute dose conversion, a lower pass rate was observed for the target region (PBP area).In this example, a notable, more significant pass rate was found for regions proximal to the PBP such as the right temporal lobe for whole Bragg curve integration (table 2).
Compared to the brain tumor patient, the dose difference in the CTV was larger for the lung patient (D 50% 1.5 Gy), while the dose difference for the right lung (D 20% of 0.4 Gy) and heart was minimal.Dose difference calculated for the right lung was larger at low doses and lower at high doses when compared to the MC (figure 10).The beam entering posterior to the patient had a lower calculated dose in the TPS.The CTV received a higher dose (figure 10).A gamma test was also performed for the lung plan, where we found an overall lower pass rate compared to the brain tumor plan (table 3).
When excluding the posterior field for the lung patient, a significantly larger pass rate was observed for all structures except for the heart with the criteria 1 mm distance to agreement and 1% dose difference (table 4).
The dose difference between the TPS and MC was larger for the prostate patient compared to the pediatric case for high doses in the CTV (figure 11).The CTV had a D 50% difference of 0.3 Gy (69.9 Gy and 70.2 Gy for the TPS and MC, respectively.The 0.3 Gy difference is a rather small difference.The dose distribution (figure 11) showed good agreement between the TPS and MC except for distal to bony structures.The variable RBE models (Rørvik and McNamara) had a moderate increase in RBE-weighted dose, a consequence of the low / a b of the  tumor.The gamma test showed good agreement for the prostate case between the TPS and MC except for the 1% dose difference,1 mm distance to agreement for the CTV (table 5).

Discussion
The motivation for presenting this calibration procedure is to increase availability of MC tools in proton therapy research and for independent dose verification.Despite the introduction of MC particle transportation calculations in some TPS software, it is not as adaptable as a dedicated MC code, where implementation and calculation of state-of-the-art variable RBE is feasible with high flexibility and transparency to the user.The use of general purpose MC codes also enables deeper insights into mechanisms and interactions which is essential to support development work (Schneider et al 2016, Ytre-Hauge et al 2019, Winterhalter et al 2020).In this work, we present a novel systematic method of calibrating a MC to a TPS.In particular, we introduce a new mathematical method for estimating energy spread only using a few obtainable parameters and a first of its kind dose conversion method only requiring integral depth dose curves.Range differences for pristine Bragg curves was measured along with verifications of SOBPs for a broad selection of energies.Verification and variable RBE calculations were performed in patient cases and included a wide range of densities and cellular / a b to cover relevant clinical scenarios.
A range uncertainty of 0.5 mm has commonly been applied during previous range calibrations (Espana and Paganetti 2010, Schuemann et al 2014, Grassberger et al 2015).Grevillot et al (2011) used a percentage range difference, which at low energies could misrepresent the uncertainty.In our calibration, we assumed a uniform hit registration across scoring volumes and applied a standard deviation of a uniform distribution (Illowsky 2020).With this assumption, 0.3  mm range difference uncertainty is within what can be obtained from experimental measurements.A few measurements proximal to the BP itself failed to meet this uncertainty, however that is to be expected with differences in algorithms.In comparison, prior to the calibration of PBPs the  area of the Bragg curve is currently employed in FLUKA.A DTA of 1 mm and a DD of 1% was employed as this is the only criteria displaying the difference to a meaningful degree.values were almost completely outside the accepted uncertainty (figure A.6).If only range calibrations would be done, R 20 could shift significantly compared to the TPS, which could also lead to systematic LET d shift.
Nevertheless, there has been a lack of reporting proximal calculated ranges for PBPs (Grevillot et al 2011, Fracchiolla et al 2015, Grassberger et al 2015, Kozłowska et al 2019), and for our method for absolute dose MU conversion will be affected by a skewed range difference in the proximal PBP.We therefore suggest reporting ranges proximal to the peak, the peak itself and distal to the peak.Moreover, the American association for physics in medicine (AAPM) recommends the usage of 1  mm uncertainty for clinical measurements (Arjomandy et al 2019) which is well within the 0.3  mm utilized here.Another task group from AAPM, Farr et al (2021) recommended a 95% pass rate for a 3 mm/3% gamma-test for a 2 dimensional dose distribution,  where all of our cases passed with the exception for the CTV in the extreme lung case scenario with 3 fields.Based on the results from our MC calibration and verification in the different patient cases, we recommend reporting gamma scores for at least 3 mm/3% as a measure of general accuracy and 1 mm/1% as a measure of how precise the calibration is for volumes both proximal to the BP and the target volume.
For the absolute dose calibration, the integrated area of the Bragg curves will affect the required dose conversion factor (DCF) (figure 4).This, in turn, will determine whether an average over the whole curve or the BP area exclusively, is favored.Kelleter et al (2019) found that by excluding proton and electronic buildup, the dose could be reduced by 16% at the entrance and affect the dose accuracy up to 150 mm in depth.The deterministic TPS algorithms does to some degree account for this entrance buildup but have limitations in the transition across different densities.Also, PCS algorithm can moderately account for non-elastic scattering through its double Gaussian fluence model (Shen et al 2016), but will deviate in the plateau region of the Bragg curve.Furthermore, non-elastic scattering will be more prominent for higher energies (figure 4).As the beam energy increases, the difference between the full Bragg curve area and only the R R P 50 20 -DCF, will also increase.The PCS algorithm is less accurate in inhomogeneous regions (De Martino et al 2021), such as in our applied phantoms with an intersection of vacuum and water.Due to subsequently inconsistency between the TPS and MC in these regions, the entrance as well as parts of the plateau of the Bragg curve was omitted when calculating the DCF.A DCF derived from integrating whole Bragg curves may for the reasons mentioned above lead to an excessive high dose for the high dose regions, for example the CTV in the MC recalculations.Moreover, a DCF utilizing the area R R P 50 20 will result in a more similar dose in the CTV for the two calculation methods.Differences across dose calculation methods/algorithms are quantified by calculating dose differences and gamma tests.Paganetti et al (2008) calibrated the Geant4 MC code to measurements, they examined the difference in a MC algorithm and pencil beam algorithm, evaluating the dose differences by gamma tests for the criteria 2 mm, 2%.Whereas tabulated values were not provided, gamma pass scores cannot be directly compared to this work and for similar patients.Furthermore, Kozlowska et al (2019) compared FLUKA to the TPSs from two different institutions.The simulation preformed with the Trento commissioned beam line included beam characteristics only, i.e. no geometry involved in the MC simulation, as has been done for this calibration as well.Distinguishing the Trento and the calibration done in this work a PTV dose difference up to 1.4% (0.8 Gy) in a chordoma case (Trento-FLUKA), compared to 0.2% (0.1 Gy) in our calibration (Eclipse-FLUKA) was noted.The difference in PTV dose from the CNAO commissioning, was however, similar to our work (0.1 Gy).
The corrections done to RPSP and density within intervals were performed for HUs −900, as most HU values were above this limit (lung cancer patient, figure A.7). Dose calculation algorithms (pencil beam) tend to underestimate dose to normal lung tissue (Grassberger et al 2014), as shown in the dose distributions for the lung patient (figure 10).For the RPSP calibrations of lower HUs (<−830) a phantom with extended depth (4 m) was utilized.A lower HU phantom could result in a larger lateral deviation between the TPS and MC, though, this would not be of concern unless a beam traverses through a large HU region, as with the posterior anterior lung field.This tendency can also be seen for low dose regions for the lung DVH (figure 10).In another study by the same group (Grassberger et al 2015), TOPAS was commissioned for different patient scenarios and compared to a pencil beam algorithm.They found up to 30% dose difference between the two modalities for a lung cancer patient plan.In our lung patient case, we obtained a maximum dose difference of 15%.A pencil beam algorithm will only account for material along its central axis (Petti 1992) and thus multiple coulomb scattering is not modeled as well as in a MC, furthermore, in inhomogeneous material the pencil beam algorithm will be less accurate (De Martino et al 2021).To investigate the calibration limits, we used a posterior-anterior field in a lung cancer case with a pass rate of 36.5% (1 mm, 1%), where the beam traverses through a large low HU-value organ (Grassberger et al 2014).As expected, a higher pass rate (72.7%) for the CTV was achieved when the posterioranterior field was excluded from the treatment plan (table 4).As modern TPSs move towards an integrated MC, it might be reasonable to it in cases with low densities regions.
Grevillot et al (2012) adapted a TPS from Electa (XiO) to the MC, though without implementing nozzle components, similarly to our current work.This implementation allowed for the user to adjust the precision by tuning the number of sub-spots in each calculation (using 1, 49 and 121 sub-spots), 1 sub-spot would be comparable to the PCS algorithm.They further compared to a calibrated MC toolkit (GATE).With a medium degree of precision for XiO and dose to water for GATE, they achieved similar dose differences (for D 50% ) at 0.4%, whereas with our set-up we achieved 0.2% in the GTV of a prostate patient.
For recalculation in this work, we utilized an Intel Xeon Gold 6248 with 80 cores and 128 Gb of RAM and achieved sufficient statistical accuracy within 4-12 h per patient.Commercial treatment planning systems with integrated GPU (Schreuder et al 2019) are faster, but with a different scope of utility.With the flexibility and extra data opportunity, a dedicated MC is a valuable addition to the clinic as it can provide selectivity related to RBE models, out-of-field doses, and support experimental measurements.

Conclusion
In this study we calibrated a MC to a TPS for pencil beam scanning proton therapy and presented a novel, easyto-reproduce, stepwise method.We achieved excellent accuracy in both water and patient cases, with the exception for extreme fields in low density tissue.The method of calibrating range and energy spread will work for any modern TPS and general purpose MC code.Having a dedicated MC enables reproducing proton treatment plans calculated with clinical TPSs, paving the way for studies including LET and out-of-field doses.

Figure 1 .
Figure 1.Overview of the calibration steps.The PBP calibration are illustrated in box A, where the left PBPs are pre-calibration and right, after.HU correction (box B) are datapoints extracted from the TPS and implemented in FLUKA as tables.For the dose calibration we used the area from range at 50% of max proximal to the BP (R P 50 ) to 20% distal to the BP (R 20 ) illustrated in the PBP in the third box (left) as well as the finished dose conversion factor (DCS) (box C).And the lower box is a SOBP verification and a patient TPS-MC dose verification (box D).
By dividing R 8020 calculated with PBPs from the TPS (DDF TPS ) by the distal dose fall-off calculated in FLUKA (DDF MC ), the difference in energy spread can be calculated with equation (3): (Illowsky 2020) and are the limit for range implemented.The assumption was a uniform hit registration across the scoring voxel.The difference in range at 80% and 90% proximal to the PBP ( PBP ( R 100 D) and at 80% and 20% distal to the peak ( R 80 D and R 20 D ) were calculated (ranges illustrated in figure A.1). Spot sizes were also verified at the isocenter in air for equal energies as range and energy-spread, with the size measured as full width half maximum (FWHM).Differences in spot size was represented as a percentage of the original (TPSs) FWHM, for comparison a 10% spot difference is utilized in current clinical treatment planning(Schwarz et al 2016).The calibration was verified for spread-out Bragg peaks (SOBPs) in water for low (70-120 MeV), medium (120-160 MeV) and high (160-210 For the RBE-weighted dose models in this work all / a b values used were generic values./ a b = 10 Gy was used for the tumor in the pediatric (Paganetti 2017) and lung patient (Klement et al 2020) plan, while an / a b = 1.5 Gy was used for the tumor for the prostate cancer (Brenner and Hall 1999) plan./ a b = 2.1 Gy was used for all organs not mentioned (McNamara et al 2015).For the brainstem and right lung / a b = 3.0 Gy was used (Borst et al 2010, Kondziolka et al 2015), while an / a b = 2.0 Gy (Kehwar 2005) was used for the heart.Finally the neutron dose to the pediatric patient was calculated using ICRP103 (ICRP 2007) weighting factors P 90Spot sizes were verified for energies from 70 MeV to 218 MeV.Without any change in spot size between the TPS and MC a maximum difference of 5.3% was achieved (figure A.4). Additionally, most of the difference was

Figure 2 .
Figure 2. The upper panel shows the energy difference, (TPS-MC), denoted E , diff as a function of the nominal beam energy from the TPS (MeV), denoted E .TPS The data points E diff were implemented as energy corrections datapoints with respect to the arrival of the polynomial used for range corrections.The lower panel shows the energy spread [percentage of the initial beam energy] as a function of the E .TPS Assumed spread, corrected data, and a 2nd-degree polynomial fit is represented with blue data points, orange data points, and a red dashed line, respectively.

Figure 3 .
Figure 3. Range differences (D) for the MC minus the TPS PBPs in mm along the depth dose curves for increasing energy, indicated by the respective markers, range at 80% and 90% proximal to the PBP (R P 80 and R , P 90 respectively ) proximal to the PBP, R 100 as the PBP and R 80 and R 20 as 80% and 20% distal to the PBP.The red lines indicate the 0.  3 mm criteria.The only range shown with error bars are the most volatile range (R P 80 ).

Figure 4 .
Figure 4. Number of primary particles needed to achieve one MU plotted against specific energies.The two different data sets with fits represent the area of the integrated PBPs, where the blue data points and fit provide the integration of the whole track, while the red data points and fit contain integration from R P 50 to R , 20 only.

Figure 5 .
Figure 5. SOBP depth-dose curves for low energies (70-120 MeV) in the upper left panel, for intermediate energies (120-160 MeV) in the upper right panel, and for high energies (160-210 MeV) in the lower left panel.The lower right panel is the point-to-point dose difference between the TPS and MC for low, intermediate, and high energies.

Figure 6 .
Figure 6.Lateral dose profiles for the high (upper panel), medium (middle panel) and low (lower panel) energy SOBPs in water for the TPS (orange lines) and MC (blue lines).The depth profiles were sampled at a depth of 225 mm, 150 mm and 75 mm for the high, medium and low SOBPs, respectively.

Figure 7 .
Figure 7. DVHs showing the CTV (upper left), chiasm (upper right), right temporal lobe (lower left) and the brainstem (lower right).Each subplot contains the calculated dose in the TPS (TPS in legend) and the calibrated MC dose (MC in legend) for comparison, aditionally dose weighted using RBE models by McNamara et al (MCN in legend) and Rørvik et al (RORW in legend) are plotted for each structure.Dose color washes for the TPS (left), MC (middle) and difference in dose (right, TPS-MC).Delineated structures are the CTV (red), right temporal lobe (purple) and brainstem (green).

Figure 8 .
Figure 8. Weighted neutron dose for the pediatric patient.Deliniated in the dose distribution are the CTV (red), right temporal lobe (blue) and brainstem (green).The DVH is calculated for the CTV and OARs.

Figure 9 .
Figure 9.Comparison DVH for different absolute dose calibration methods for the pediatric brain tumor.

Figure 10 .
Figure 10.DVHs for the CTV (left panel), right lung (middle panel) and heart (right panel).Each panel contains the calculated dose in the TPS (TPS in legend) and the calibrated MC dose (MC in legend) for comparison.Additionally the weighted dose by McNamara et al (MCN) and Rørvik et al (RORW) doses are plotted for each structure.Dose color washes for the lung patient.From left to right, TPS doses, FLUKA simulated doses and difference (TPS-MC).Delineated in blue is the right lung, while the CTV is shown in orange.

Figure 11 .
Figure 11.DVHs for the CTV (upper panel), and the whole body (lower panel).Each panel contains the calculated dose in the TPS (TPS in legend) and the calibrated MC dose (MC in legend) for comparison.Additionally the weighted dose from McNamara et al (MCN) and Rørvik et al (ROR) are plotted for each structure.Prostate patient dose distribution.TPS doses (left), FLUKA doses (middle) and difference (right, TPS-MC).Yellow is the CTV.

Figure A. 3 .
Figure A.3.Projected range for Eclipse (RR TPS ) for a 150 MeV beam at different HU-values for the TPS.

Figure A. 4 .
FigureA.4.Spot size differences for a single spot, 4.5 cm depth in water.s is the standard deviation for a Gaussian approximation to the spot size, the D is the difference in standard deviation for the TPS and MC.

Figure A. 6 .
Figure A.6.Range differences (D) prior to the calibration of the PBPs for the MC minus the TPS PBPs in mm along the depth dose curves for increasing energy, indicated by the respective markers, range at 80% and 90% proximal to the PBP (R P 80 and R , P 90 respectively ) proximal to the PBP, R 100 as the PBP and R 80 and R 20 as 80% and 20% distal to the PBP.The red lines indicate the 0.  3 mm criteria.The only range shown with error bars are the most volatile range (R P 80 ).

Figure A. 5 .
Figure A.5. Lateral dose profiles for the high (upper panel), medium (middle panel) and low (lower panel) energy SOBPs in water for the TPS (orange lines) and MC (blue lines).The depth profiles were sampled at a depth of 175 mm, 105 mm and 45 mm for the high, medium and low SOBPs, respectively.
(Fedorov et al 2012)erformed with Slicer(Fedorov et al 2012)and a sub package, Slicer RT (Csaba Pinter and Slicer 2021).Variable RBE, LET and weighted neutron dose calculations were performed to demonstrate the utility of the MC implementation.
Fjaera et al (2017)o water as described byFjaera et al (2020)with only the primary and secondary protons as perFjaera et al (2017).

Table 1 .
Gamma test pass rate in percentage, performed on the TPS and MC calculations, for different volumes of interest and different criteria.

Table 2 .
Gamma test performed for the pediatric brain tumor plan comparing the R R

Table 3 .
Gamma test results performed on the TPS and MC dose for the CTV, right lung, and heart of the lung tumor patient.

Table 4 .
Gamma test comparison for the original lung patient treatment plan and a plan where the posterior-anterior field is excluded.The comparison was done for the extreme scenario with a DTA of 1 mm, and a DD of 1% as these criteria highlights the difference in dose.

Table 5 .
Gamma test results for TPS and MC doses for different thresholds for CTV and body for the prostate patient, using the criteria DTA = 3, 2 and 1 mm and DD = 3, 2 and 1%.