Hypoxia adapted relative biological effectiveness models for proton therapy: a simulation study

In proton therapy, a constant relative biological effectiveness (RBE) factor of 1.1 is applied although the RBE has been shown to depend on factors including the Linear Energy Transfer (LET). The biological effectiveness of radiotherapy has also been shown to depend on the level of oxygenation, quantified by the oxygen enhancement ratio (OER). To estimate the biological effectiveness across different levels of oxygenation the RBE-OER-weighted dose (ROWD) can be used. To investigate the consistency between different approaches to estimate ROWD, we implemented and compared OER models in a Monte Carlo (MC) simulation tool. Five OER models were explored: Wenzl and Wilkens 2011 (WEN), Tinganelli et al 2015 (TIN), Strigari et al 2018 (STR), Dahle et al 2020 (DAH) and Mein et al 2021 (MEI). OER calculations were combined with a proton RBE model and the microdosimetric kinetic model for ROWD calculations. ROWD and OER were studied for a water phantom scenario and a head and neck cancer case using hypoxia PET data for the OER calculation. The OER and ROWD estimates from the WEN, MEI and DAH showed good agreement while STR and TIN gave higher OER values and lower ROWD. The WEN, STR and DAH showed some degree of OER-LET dependency while this was negligible for the MEI and TIN models. The ROWD for all implemented models is reduced in hypoxic regions with an OER of 1.0–2.1 in the target volume. While some variations between the models were observed, all models display a large difference in the estimated dose from hypoxic and normoxic regions. This shows the potential to increase the dose or LET in hypoxic regions or reduce the dose to normoxic regions which again could lead to normal tissue sparing. With reliable hypoxia imaging, RBE-OER weighting could become a useful tool for proton therapy plan optimization.


Introduction
Tumor hypoxia (low oxygenation) is associated with radioresistance, potentially leading to treatment failure and poor disease survival [1,2]. The increased radioresistance of hypoxic tumors is estimated by means of the oxygen enhancement ratio (OER), defined as the ratio of radiation doses in hypoxic (D hyp ) and fully oxygenated conditions (D norm ) that result in the same cell surviving fraction. The effects of hypoxia on photon radiation have been known for a long time, where the OER commonly is between 2 and effectiveness (RBE) [5,6]. The proton-RBE may be used to estimate the biological dose (D bio ) of a proton therapy plan. D bio accounts for the different biological effect of protons compared to conventional photon radiation and is found by weighting the physical dose (D phys ) with the RBE, i.e., D RBE D bio phys = · . Many RBE models have been proposed up to date [7], but in daily clinical practice the consensus is to adopt a 1.1 constant RBE. However, several studies have shown that the RBE is not constant and that it depends on factors such as the deposited physical dose, the radiation quality, and the irradiated tissue type [6][7][8][9][10][11]. Proton therapy plans may benefit from using a variable RBE and from accounting for the OER [11,12].
Proton therapy plans can be adapted using the microdosimetric kinetic model (MKM), a model that can predict the RBE for heavy ions [10,13,14], and others are based on phenomenological models [7,15]. These RBE models rely on experimental data such as the radiosensitivity parameters a and b from the linear quadratic (LQ) model. And with both, it is possible to perform a hypoxia adaptation to estimate and study the RBE-OER weighted dose (ROWD). Several models to adapt the RBE for hypoxia have been published for proton therapy [16][17][18][19][20][21][22][23][24], but how these models agree or differ in their predictions has not yet been studied. Therefore, before a potential hypoxia adaptation of clinical proton therapy plans, there is a need to investigate and compare the performance of the different models.
In this study, we aimed to implement and analyze different hypoxia adapting RBE models in order to compare the predicted OER and the resulting ROWD for proton therapy. Five published models that weight the RBE using the OER were implemented in the FLUKA Monte Carlo (MC) simulation framework. For each model, we simulated proton therapy plans both on a virtual water phantom and on a head and neck cancer (HNC) patient case where hypoxia was quantified by positron emission tomography (PET) imaging using a hypoxia tracer.

Calculating the RBE for proton therapy
With the formalism presented by the LQ model [25], the RBE can be expressed as showing that the RBE depends on the proton physical dose (D p ) and the radiosensitivity parameters of the tissue, a, b, x a and x b [26,27]. The a and b corresponds to the radiation sensitivity for protons and x a and x b to the sensitivity for photons, the reference radiation. Note that some of these models define the extreme values, achieving the parameters RBE max and RBE min and rewrite equation (1) Furthermore, ROR includes a biological weighting function (BWF) based on phenomenological data from in vitro cell experiments to include a dependency in RBE max on the full dose weighted LET spectrum. Finally, it assumes a constant [15]. Using these RBE max and RBE min in equation (2) the ROR-RBE is obtained.

Microdosimetric kinetic model (MKM)
The MKM can predict the RBE of heavy-ion (high-LET) radiation [10,13,28,29]. It defines the number of lethal lesions produced in the cell nucleus averaged over a cell population to describe the surviving fraction S after cell irradiation as S z D D ln [14,30,31]. Making an analogy with LQ model, the MKM defines its radiosensitivity parameters z .
a and b are often taken as the reference photon x a and x b parameters [31][32][33] and z D 1 * is the saturation-corrected dose-mean specific energy from a single event [30,33]. Substituting MKM a and MKM b for a and b in equation (1) the MKM-RBE is estimated.

OER models and RBE adaptation
Hypoxia can be quantified by the partial oxygen pressure (pO 2 ), expressed in % or in mmHg. Different approaches to calculate the OER from pO 2 have been proposed [16][17][18][19][20][21][22][23][24]. Generally, these models use experimental in vitro data from different cell lines under normoxic and hypoxic conditions to fit a reversesigmoid shaped curve describing the OER and pO 2 relationship through the Alper-Howard-Flanders formalism [3]. In general, the experimental data defined their normoxic data to correspond to 160 mmHg or 21% oxygenation. From how the fitted experimental data is used on the modeling and estimation of the OER, we see that the models follow two different methods.
The first methods is used by the adaptations proposed by Mein et al (MEI) [19] and Tinganelli et al (TIN) [18], who use published and in-house phenomenological data, respectively, to parameterize the OER for photons (OER ph ) as a function of the pO 2 and then correct for high LET-particles using a radiation quality parameter, L. For these models, the resulting OER follows OER L pO a OER pO L a L , 3 where a and g are parameters obtained from the numerical fit to the experimental data and OER ph is described as where the m and K are constants obtained via a numerical fit to the experimental data [34,35]. The second method is used on the OER models by Wenzl and Wilkens (WEN) [16], Dahle et al (DAH) [20] and Strigari et al (STR) [22], and it consists of adapting the LQ radiosensitivity parameters, L a( ) and L b ( ), to account for hypoxia. The modeled L pO , 2 a( ) and L pO , 2 b ( ) are then used to calculate the OER using equation (5) [16,36], obtained after expanding iso-effective D hyp and D norm , and where p norm is the partial oxygen pressure in normoxic conditions and S is the surviving fraction.
The calculated OER, regardless of whether is directly modeled (first method) or calculated from the hypoxia adapted LQ parameters (second method), is then used together with the RBE to define the ROWD-factor, described in this study as RO RBE OER.
= / This makes the ROWD-factor an RBE relative to photon radiation at normoxic conditions. The ROWD is then calculated as ROWD D RO phys = · where D phys is the total physical dose.
Adaptations that use the modeled OER using methods resulting in heavy, time-consuming numerical calculations [21,23] and earlier versions of the studied OER models [17] where left out in this work. A summary of all the studied models can be found in table 1. The table shows that some models account for the oxygenation level by measuring the pO 2 in % and other models by using mmHg. However, the conversion from one unit to the other is trivial [37].
To account for oxygenation effects, OER models are used in combination with an RBE model. Particularly, the studied OER models could have been used together with a low/intermediate-LET RBE model or with an RBE model that considers high-LET radiation as well. At the time of coupling the OER models with an RBE model we followed the approach detailed in the models' original publications. Consequently, in the present work, both WEN and DAH were used to adapt for hypoxia together with the variable RBE model by Rørvik et al (ROR) [15] and a constant RBE of 1.1 (RBE 1.1 ) (low/intermediate-LET RBE). The TIN, MEI, and STR models were combined with the RBE in the MKM framework (high-LET RBE).

Wenzl and Wilkens OER model (WEN)
WEN modeled the LQ parameters accounting for the LET and the pO 2 measured in mmHg, before adjusting the OER [16]. In this framework, the whole LET dependence is carried by a and b is LET independent. Following the Alper-Howard-Flanders relation [3], the Wenzl and Wilkens a and b parametrization follows: where K 3.0 = mmHg, LET d is the dose-averaged linear energy transfer, a 0.22 Gy −1 and b 0.015 2 = Gy −1 [16]. The WEN model used results from OER experiments for several cell lines (V79, HSG, HeLa, p388, FSa-II, R1, T1, L2, EMT6, SCC VII, NFSa, RAT1, and 9L), to find 10% cell survival for multiple radiation types (protons, deuterons, He-, C-, Ne-and Ar-ions) under normoxic and hypoxic conditions.

Dahle OER model (DAH)
The model by Dahle et al is based on the WEN model. The particularity of this model is that it only uses experimental proton data from several cell lines (V79, HSG, T1, HeLa, p388, and H 4 ) to parametrize the LQparameters following equations (6) and (7), resulting in a 0.

Mein OER model (MEI)
MEI modeled a parameter called the hypoxia reduction factor (HRF) that acts as the OER to account for hypoxia [19]. The HRF depends on the pO 2 in % and a dimensionless particle and energy dependent parameter, the radiation quality energy, RQE = Z .
Z eff is the effective charge of the ionizing particle and v c, where v is the velocity of the particle and c the speed of light.
MEI is based on in vitro experimental data from the V79 cell line irradiated with proton and higher-LET radiation data from heavier ions, and it was used to adapt the MKM, an RBE model for high-LET radiation.

Tinganelli OER model (TIN)
Similar to MEI, TIN parametrizes the OER using a two-step process to include the pO 2 in % and the LET d [18], enabling adaptation also to heavy-ion RBE models, i.e., TIN can also be used to adapt the MKM.
First, the LET d dependence is introduced in anoxic conditions, i.e., pO 0%, 2 = and the OER is given by This fit was performed on V79 cell line data from an extensive dataset measured at NIRS and then adapted to the Chinese hamster ovary (CHO) cell line data set that was subject of their study [18].
In a second step, the pO 2 dependence is included resulting in where b 0.25% = according to an x-ray data parametrization [38] and m OER LET , 0% d = ( ) as described in equation (10) [19].

Strigari OER model (STR)
STR is intended to use with the MKM, and it considers the specificity of different ions, LET and tissue type Ne-ions a others: HeLa, p388, FSa-II, R1, T1, L2, EMT6, SCC VII, NFSa, RAT1, and 9L. b others: T1, HeLa, p388, and H 4 . c Fitted data displays details on the irradiated cell lines, the type of radiation used to acquire the experimental data and the survival fraction of the irradiated cells. d Modeled parameters are obtained from the experimental data and are specific for each model. e Dependencies of the models are listed, as these may affect the OER adaptation ( v c ion b = / ). Note that some of them imply a tissue type, particle type and/or particle energy dependence. [22]. STR models the a and b parameters from the MKM and defining pO 2 a( ) and pO 2 b ( )as The min a and min b correspond to the LQ radiosensitivity parameters under extreme hypoxic conditions, with pO 2 close to 0 mmHg while max a and max b correspond to normoxic conditions. These parameters are evaluated accounting for the non-Poisson lethal event distributions for high-LET particles by multiplying MKM a with a correction factor [13,31], resulting in where z Dn 1 is the dose-averaged specific energy in the nucleus and z , is the dose-averaged specific energy in the domain and 0 a and 0 b the LQ radiosensitivity parameters used in the MKM. These parameters were fitted to experimental in vitro data for the HSG, V79 and CHO cell lines under normoxic and extreme hypoxic conditions to obtain the max 0 a and min 0 a parameters needed to solve equation (12).
To account for non-Poisson lethal event distributions for high-LET particles for MKM b parameter they multiplied it by where a is obtained from equation (14) and . needed to solve equation (13). Note that equation (15) the same scaling between LQ-parameters in the MKM.
Details on the dose-averaged specific energies z Dn 1 and z D 1 are beyond the scope of this study, these can be found elsewhere [39,40]. Their calculation was carried out as described by Kase et al 2008 [31], and for their implementation in FLUKA we followed the approach from Magro et al 2017 [33].

Monte Carlo implementation
We implemented these five OER models in the FLUKA MC tool [41] following three steps as indicated in figure 1: 1. Input data is created and organized into input tables that are read by FLUKA. These include the pO 2 -position table needed for hypoxia adaptations and the radiation quality parameter tables required for each model. Single-track dose-averaged specific energies or the RQE were calculated and given to FLUKA as z Dn 1 -energy, z D 1 -energy and RQEenergy tables. With an internal subroutine, FLUKA uses the proton energy recorded at each step of the Monte Carlo simulation to interpolate and use the corresponding radiation quality parameter on the corresponding voxel.
2. Based on the input data and the corresponding radiotherapy plan input file, FLUKA calculates the model parameters with a dedicated subroutine for each model. Depending on the model, FLUKA will calculate and score the OER(L, pO 2 ). Note that for WEN, DAH and STR, the OER(L, pO 2 ) is scored after estimating the L pO , 2 a( ) and L pO , 2 b ( ) parameters and using them into equation (5).
3. FLUKA output is studied with in-house Python scripts. The calculated and scored OER(L, pO 2 ) is used together with the scored physical dose and the RBE specific a and b parameters to estimate the ROWD, create OER-maps, dose-volume histograms (DVHs) and asses the effects of hypoxia on the original proton plan.

Model simulations on a virtual water phantom and in a patient case
To study and evaluate the five OER models, MC simulations were performed using the in-house FLUKA based system for treatment plan recalculation [42] with standard FLUKA physics settings (PRECI-SIOn defaults, a delta-ray threshold of 10 keV and a logarithmic width of dp dx / momentum loss tabulated interval of 1.03, as recommended for particle therapy) and activating the modified RQMD (Relativistic Quantum Molecular Dynamics) for nucleus-nucleus interactions above 125 MeV per nucleon. Detailed explanations of the FLUKA recalculation tool are found in [20,42]. Results on the LET d , D phys , D phys a and D phys b were scored using the FLUKA subroutine fluscw (FLUence Scoring Weight) and, the number of primaries simulated was chosen to achieve a voxel mean statistical uncertainty below 2% and a voxel maximum statistical uncertainty below 3% in the target.
To verify the model implementation in FLUKA and evaluate the OER, a first set of MC simulations recalculated a proton field irradiating a water phantom. The proton field implemented in FLUKA was optimized in an analytical TPS to deliver, in a virtual water phantom, a homogeneous D bio of 2.0 Gy(RBE) in a 4×4×4 cm 3 volume that ranges from 8 cm to 12 cm depth in water using spread-out Bragg (SOBP) peaks. During the optimization process, the TPS used the constant RBE 1.1 for D bio calculations. We first set a homogeneous pO 2 value for the whole water phantom and changed it in repeated simulations. The oxygenation (pO 2 ) levels ranged from fully hypoxic (pO 0.0075 2 = mmHg or pO 0.001% 2 = ) to fully normoxic (pO 160.0 2 = mmHg or pO 21.13% 2 = ) to score the corresponding OER(L, pO 2 ) for different oxygenation levels. The results from these simulations were compared to the expected theoretical results from the implemented OER models.
Secondly, five 0.5 cm wide hypoxic layers were defined inside the virtual water phantom, perpendicular to the proton field incidence direction. The central layer corresponded to pO 2. Finally, all implemented models were used to simulate a hypoxia-adapted proton therapy plan for a HNC case. Hypoxia imaging of the patient using [ 18 F]-EF5 PET enabled quantification of oxygenation levels of the target volumes. The conversion from PET uptake to pO 2 values was calculated as previously published [20] where a threshold on the PET uptake data is implemented to keep pO 2 values from being higher than 60 mmHg, this pO 2 value is generally taken as normoxic in the clinic [37,43].
Since all studied OER models used data from the V79 cell line, all simulations were based on published data from the V79 cell line with normoxic LQ parameters of 0.147 x a = Gy −1 and 0.02 [21]. This was done to account for possible uncertainty effects in the results when simulating on a cell line not used when fitting the OER model parameters, and to minimize model differences caused by uncertainties arising from the use of different cell line data for different models.

Treatment plan
Treatment plans for both the virtual water phantom and patient case were created in the Eclipse (Varian Medical Systems, Palo Alto, CA, USA) treatment planning system (TPS), without accounting for hypoxia and were optimized for a constant RBE of 1.1.
For the HNC case, three proton fields were used to target the planning target volume (PTV). None of the three fields used a range shifter and were optimized with Eclipse TPS multifield optimization function to deliver a homogeneous D RBE1.1 of 70 Gy(RBE) in 35 fractions to the PTV. Table 1 shows properties of the implemented OER models. Model differences arise from the mathematical modeling dependencies, the unit used for the pO , 2 the parameter used for radiation quality and the conditions under which the phenomenological data used by the OER models was obtained, i.e., the radiation type, the irradiated tissue type and the biological effect achieved, being this a survival fraction of 0.1 for all models. Regarding radiation type, WEN, TIN, and MEI were fit using data points from low and high LET radiation, DAH was fit using proton radiation data points, i.e., low LET, and STR was fit by only using data from high-LET radiation.

OER study
All models show a clear increase in OER with decreasing pO 2 (figure 2). WEN, DAH and MEI agree well for clinically relevant pO 2 values on the studied patient case (∼2.5 mmHg and above). For the same pO 2 values the maximum OER ranges between 1.4 and 2.1 for all models and, as expected, drops towards 1.0 as the pO 2 increases. The shape of the curve is different for STR, where the resulting OER diverges for low pO 2 values. In general, TIN and STR give higher OER than the other models, but while STR differs more at low pO , 2 the results are more similar to WEN, DAH and MEI at pO 2 above 8-9 mmHg than for TIN. Figure 2 also shows higher OER with lower LET for all implemented OER models except TIN, that exhibits no changes for the shown LET values of 1 keV μm −1 and 11 k eV μm −1 . However, WEN, DAH and STR have a stronger LET dependence than MEI.
How OER estimations from FLUKA fit to the theoretical results is shown in figure S1.

Oxygenation effect on the biological dose
The resulting ROWD from the different models for the water phantom simulations can be seen in figure 3. All hypoxia adapted RBE models show a dip in ROWD in the hypoxic layers placed between 8 cm and 12 cm depth. The dip in the 2.5 mmHg layer corresponds to a reduction in dose between 25% and 40% when comparing all ROWD results to the dose from normoxic RBE models (D , Differences in the ROWD within the same hypoxic regions for the same model are the result of inhomogeneities in the SOBP, which can be seen in the plotted D , phys and the effects of averaging the ROWD values from voxels found on the boundary between hypoxic layers. As the voxel grid used in our simulations and the limits of the pO 2 layers do not match, the averaged value for the data points near these boundaries will be affected by the overlap of different pO 2 values within the same voxel. At the same time, D bio results for the MKM model are higher than for the ROR model in the distal edge of the SOBP and slightly  lower in the plateau region. These differences are still noticeable after ROWD calculations. Figure 4 shows that for WEN, DAH, and MEI the volume percentage corresponding to the same OER is similar for the patient case, resulting in three similar differential OER-volume histograms. For TIN and STR the shapes are alike, with the differential OERvolume distribution slightly shifted towards higher OER values for STR, see zoom in plot in figure 4. The peaks at low OER correspond to the respective OER model predictions at pO 60 2 = mmHg, the upper pO 2 limit set during the hypoxia PET data conversion.

Simulations on patient case
The OER distributions for the patient case are also shown in figure 5, together with the LET and pO 2 distribution in the target. Similar OER maps are seen for WEN, DAH and MEI, while higher OER are observed for TIN and STR models. However, regions of OER variation were found to be consistent between all models.
The MC simulations show that the PTV received LET d in the range of 1.1-5.3 keV μm −1 with a mean value of 2.7 keV μm −1 . As for the pO , 2 it ranges between 2 mmHg and 60 mmHg in the PTV with a mean pO 2 of 9.75 mmHg. Figure 6 shows ROWD volume histograms for each model. All models predict a ROWD lower than

Discussion
In this study we implemented and compared previously published models that adapt the RBE in proton therapy for hypoxia through modelling of the OER. Using the FLUKA MC tool, we compared the OERs and resulting ROWDs predicted by each model. The evaluation was performed on a one field proton plan for a virtual water phantom with different hypoxia (pO 2 ) levels as well as on a three-field HNC patient case. When studying phenomenological models, we must be aware that uncertainties in the experimental data affect the model parameters. Information about the precision of model parameters and their relationship with phenomenological uncertainties is not available in the literature. However, all studied models provide a methodology of estimating the OER and the OER-weighted dose to account for hypoxia. In the present work, we compare these models as they would be used in the clinic, where it is not a common practice to apply model uncertainties in the biological dose optimization process. We studied the effect of these models on the parameters from the V79 cell-line as this cell-line was included in all the model databases. This, together with enough primaries on the MC simulations to calculate physical quantities (voxel mean statistical uncertainty <2% and voxel maximum statistical uncertainty <3%), allows us to compare the outcome from the OER models as the impact of the assumptions made by the authors on the results instead of as the impact of the different uncertainties from the data on the results. A sensitivity study of the impact of uncertainties on dose estimates, as presented by Dahle et al [32], would be of high interest also for ROWD estimations.
As expected, OER-adapted RBE-models predict a lower biological dose absorption in hypoxic regions compared to RBE-models not adapted for hypoxia.  The observed sigmoid shape of the OER as a function of the pO 2 (figure 2) agrees with previous results [16][17][18][19][20][21][22][23][24]. All models show the same trend with increasing OER with reduced pO , 2 but while MEI, WEN and DAH agree very well across the range of clinically relevant hypoxia levels, TIN and STR predict higher OER over the whole pO 2 range. In particular, a higher OER for STR at low pO 2 is observed, which may be of clinical relevance.
OER variations for different LETs (figure 2) are also consistent with the expectations: we find higher OERs for lower LET values. This supports that high-LET radiation therapy is more efficient to treat aggressive hypoxic tumors. However, this LET difference is rather small for all models because the effect of the LET on the model equations is low for the studied pO 2 values and the limited range of LET present in proton therapy. Figures S2 and S3 show the extent of LET effects on extremely low, low and intermediate pO 2 concentrations on the OER and ROWD respectively. Yet, significative larger LET effects on the OER for WEN and DAH are captured when compared with MEI, STR and TIN. This might be explained by the phenomenological data used by the models: WEN and, in particular, DAH rely more on low-and intermediate-LET data compared to MEI, STR and TIN. DAH only used data from proton radiation, whereas STR used data from intermediate-(He-ions) and high-LET (Cand Ne-ions) radiation exclusively. WEN and MEI used data from low and high LET radiation throughout the OER modelling, whereas TIN modeled the photon OER and included the LET dependence considering only heavy-ion/high-LET data, which might explain why this dependence fades for TIN resulting in the same OER for different LETs. Even so, for the same LET, proton and heavy-ions do not result in the same radiation quality [44], making it not ideal to use an OER modeled with high-LET data for protons. Whether these models can be used to adapt for hypoxia high-LET radiation RBE models (like the MKM) has been explored or suggested in the literature can be seen in table 1. From figure 2 we suggest a two-group differentiation of the OER models, with WEN, DAH, and MEI in the first group and TIN and STR in the second group. This differentiation is independent of the two different methods in which the studied models estimate the OER. However, this two-group differentiation vanishes for extremely low pO 2 values ( Figure S2) since the OER and its dependence with LET becomes specific for each model: WEN, DAH, MEI and TIN are seen to behave differently with increasing LET and STR results diverged and are one order of magnitude larger than for the other models for low LET. The latter could happen because the correction that STR makes to account for non-Poisson lethal event distributions for high-LET particles does not hold for proton LET, resulting in a divergence in the modeled OER for extremely low pO 2 values.
All adapted RBE models are LET dependent, which results in an increase of the biological dose towards the end of the SOBP when comparing to results with the RBE 1.1 model ( figure 3). After adapting these RBE models for hypoxia, all ROWDs reveal an increased biological dose at the end of the SOBP and a dose reduction in the hypoxic regions. The lowest pO 2 in the phantom simulation was 2.5 mmHg, which roughly corresponds to the lowest pO 2 in the patient case. Therefore, the approximated drop of 25% to 40% in the ROWD when compared to the D bio should be expected in patient simulations when the LET of the incident protons and the pO 2 level selected as normoxic are the same in the phantom and the patient. These results support the same differentiation in two groups of models as the data from figure 2. This twogroup model classification still holds when changing the reference x x a b / -ratio used on the adapted RBE models, ( Figure S4). The WEN, DAH and MEI group shows a lower but still severe change of the absorbed dose estimated by the TPS and more homogeneous dose distributions than for the TIN and STR group. Still, all models exhibit a substantial ROWD decrease in specific tumor regions when accounting for the pO . 2 This would suggest increasing the dose to hypoxic regions during treatment plan optimization. Locating then the most radioresistant regions may allow to reduce the dose in the rest of the PTV, this would increase the complexity of the proton therapy plans in the clinic but overall reduce the dose to organs at risk (OARs). Ignoring the increased radioresistance of hypoxic tumor regions could potentially result in an ineffective treatment and poor patient outcome. Another solution would be to exploit the increased effectiveness of high-LET radiation on hypoxic tissue [45]. In this case, instead of boosting the dose on hypoxic tumor regions the LET could be increased. Our study indicates that, in the case of proton radiation, only a slight reduction in OER is obtained through LET increase ( Figure S2). However, with the RBE increase achieved when increasing the LET, the ROWD could still be significantly augmented through the elevated LET ( Figure S3).
As a final remark on the WEN, DAH and MEI models and calculations of the ROWD, we identified very similar results in spite of having adapted different RBE models (the MKM for MEI and ROR for WEN and DAH). Although unexpected, this is explained when looking at the differences between the normoxic D bio MKM , and D bio ROR , from 9 cm to 11.5 cm depth in figure 3. In this region of the SOBP we see how the normoxic ROR and MKM result in very similar D .
bio Therefore, any differences in the corresponding ROWD calculations would solely appear due to the OER adaptation. As seen in figure 2, there are only minor differences in the OER between WEN, DAH and MEI models for extremely low, low and intermediate oxygen levels. These two features put together result in very similar ROWD calculations despite adapting different RBE models.
Since we have exclusively recalculated proton plans, and TIN and STR do not use proton data in their modelling, their results could be less reliable when compared with the WEN, DAH and MEI results. Among WEN, DAH, and MEI, we might select DAH as the most reliable model for proton plan recalculations since it is only based on proton data, but at the same time, this model is based on a smaller data set. However, the differences between these three models are so small that one could use other criteria to select which one to use, e.g., implementation complexity. However, we must consider that all these models carry an uncertainty that arises from the incertitude of the experimental data used for the modelling. Consequently, one might select the model after evaluating aspects such as the type of radiation, the energy and LET range, pO 2 range, pO 2 value set as normoxic condition, or the different cell-lines accounted for. For clinical use we must also acknowledge that the uncertainties will increase due to the lack of in vivo data and the uncertainties in the PET tracer uptake to pO 2 conversion method.
OER-maps (figure 5) go along with the estimated pO 2 distribution and the data from figures 2 and 3. As for the volume percentage peaks ( figure 4) and their corresponding OER min value, these appear due to the maximum pO 2 estimated from the hypoxia PET data. This value was set to 60 mmHg and was defined as the default normoxic pO 2 for PET uptake below a certain threshold [20] using as reference the normoxic pO 2 values in brain, muscle and fat (29.2 mmHg, 33.8 mmHg and 60.0 mmHg respectively) found in the literature [37,43]. The model differences result in slightly different minimum OERs, between 1.01 and 1.05 depending on the model. This occurs because the cell data used in the models take 160 mmHg as the normoxic value. If the data used in the OER models took 60 mmHg as the normoxic value, matching the normoxic pO 2 in the clinical case, these peaks would appear at OER equal to 1. Consequently, when using any of these OER adaptations on clinical cases it should be considered to include a normalization step to ensure that we find an OER of 1 on the normoxic patient tissue, avoiding the calculation of an overdosage of these regions during a later optimization process. Only WEN and DAH comment on a normalization step for clinical recalculations. The effects of setting different pO 2 values as normoxic on the OER can be seen in figure S5. These normalization steps reduce the OER as we lower the reference pO 2 value for normoxic tissue. Therefore, this would reduce the depth of the dip of the ROWD in hypoxic regions shown on figure 3.
The differences in height of the volume percentage peaks that correspond to the OER min are affected by the influence of the LET on each model: larger OER variations with the LET result in a lower volume with OER min . This argument is not valid for TIN, where the dispersion of low OER is a direct consequence of the rapid variation of the OER when increasing the pO . 2 The OER curve from TIN approaches 1.0 slower than the OER curves from the other models (figure 2), resulting in a larger variety of OER for high pO 2 values. Therefore, after a hypothetical normalization step to adapt the models to clinical cases, the volume percentage differences at OER equal to 1 would still be noticeable since it is affected by the dependence of the OER model with LET. Consequently, the proton-LET influence, although reduced, is noticeable in most of the studied OER models.

Conclusion
To summarize, we implemented a ROWD calculation in FLUKA MC using five different hypoxia adapted RBE models. Models were studied in terms of the consistency between their ROWD estimates to evaluate the impact of model assumptions, data selection and the potential for performing hypoxia adaptation of proton therapy plans. Results from simulations in both a virtual water phantom and a HNC case suggest a two-group classification of the models. The WEN, DAH and MEI models showed good agreement and more uniform dose distributions with lower OER values in low pO 2 regions when compared with the TIN and STR models. Application and comparison of the models indicated that the two latter models, primarily based on heavy ion data, potentially could overestimate the OER of protons at low pO . 2 Overall, all models display a large difference in the estimated dose from hypoxic and normoxic regions. This shows the potential to increase the dose or LET in hypoxic regions or reduce the dose to normoxic regions which again could lead to normal tissue sparing. The uncertainties in the OER models are however still largely unknown, but with quantification of uncertainties in the model parameters and reliable hypoxia imaging, RBE-OER weighting could become a useful tool for proton therapy plan optimization.