Efficient uncertainty quantification for Monte Carlo dose calculations using importance (re-)weighting

The high precision and conformity of intensity-modulated particle therapy (IMPT) comes at the cost of susceptibility to treatment uncertainties in particle range and patient set-up. Dose uncertainty quantification and mitigation, which is usually based on sampled error scenarios, however becomes challenging when computing the dose with computationally expensive but accurate Monte Carlo (MC) simulations. This paper introduces an importance (re-)weighting method in MC history scoring to concurrently construct estimates for error scenarios, the expected dose and its variance from a single set of MC simulated particle histories. The approach relies on a multivariate Gaussian input and uncertainty model, which assigns probabilities to the initial phase space sample, enabling the use of different correlation models. Exploring and adapting bivariate emittance parametrizations for the beam shape, accuracy can be traded between that of the uncertainty or the nominal dose estimate. The method was implemented using the MC code TOPAS and tested for proton IMPT plan delivery in comparison to a reference scenario estimate. We achieve accurate results for set-up uncertainties ($\gamma_{3mm/3\%} \geq 99.99\%$) and expectedly lower but still sufficient agreement for range uncertainties, which are approximated with uncertainty over the energy distribution ($\gamma_{3 mm/3\%} \geq 99.50\%$ ($E[\boldsymbol{d}]$), $\gamma_{3mm/3\%} \geq 91.69\%$ ($\sigma(\boldsymbol{d})$) ). Initial experiments on a water phantom, a prostate and a liver case show that the re-weighting approach lowers the CPU time by more than an order of magnitude. Further, we show that uncertainty induced by interplay and other dynamic influences may be approximated using suitable error correlation models.


Introduction
Monte Carlo methods are considered the gold standard for dose calculation in radiotherapy treatment planning due to their accuracy (Paganetti, 2012;Weng et al., 2003). However, the accuracy of a simulated compared to a delivered dose is not only determined by the chosen dose engine, but also compromised by treatment uncertainties in water-equivalent path length (WEPL), patient setup and anatomy. Especially in proton and carbon-ion therapy, the high dose localization in the Bragg-peak usually does not allow for uncertainty quantification and mitigation using approximations known for photon therapy, such as the static dose cloud (Lomax, 2008a,b).
The additional computational effort of robustness analyses and robust optimization techniques, however, clashes with the long computation times of Monte Carlo dose calculation. The use of faster, less accurate deterministic pencil-beam dose calculation algorithms instead is not always feasible, because their accuracy is low in particularly heterogeneous anatomies like lung (Taylor et al., 2017), which at the same time show high sensitivity to uncertainties in range and set-up.
More efficient uncertainty quantification approaches for Monte Carlo methods, developed for example by the radiative transport community (e.g. Poëtte, 2018;Hu and Jin, 2016), often do not demonstrate an application to realistic patient data and it is not clear how well the results transfer. Also in many cases, more sophisticated methods are intrusive, which limits the applicability when using proprietary MC simulation engines.
In this paper, we introduce a simple, minimally-intrusive method for uncertainty quantification in Monte Carlo dose computations. It is based on (re-)weighting a single set of MC simulated particle histories. In contrast to the conventional approach of simulating different scenarios separately, our method significantly reduces the required computational effort. The weighting step, which can be represented as multiplications of a weight vector with a history dose matrix, replaces simulations of different dose scenarios. The method enables uncertainty propagation during the simulation, making it possible to estimate the dose uncertainty induced by range and setup errors from nominal dose calculations. We demonstrate the application of this method to specifically approximate expected value and variance of dose, given a respective uncertainty model for set-up and range errors, which includes the choice of different beam and pencil beam correlation scenarios.
The remainder of this paper is organized as follows: In section 2, we introduce basic definitions and notation, derive a direct computation of the expected value before introducing the concept of importance (re-)weighting for set-up and range uncertainty models. Section 3 then compares estimated expected doses and corresponding standard deviations to reference computations based on scenario sampling. Discussion and Conclusion follow in sections 5 and 6, respectively.

The Monte Carlo method for dose computation
First, we briefly recapitulate the basic principles of the Monte Carlo method for radiotherapy. This serves the purpose of establishing notation and parameters used to introduce our method and simplifying the illustration of later adaptations. For a more detailed description we refer to other sources, such as Paganetti (2012); Fippel and Soukup (2004); Ma et al. (2002); Bielajew (1994); Mackie (1990), among many others.
The Monte Carlo method is a numerical integration technique, based on random sampling. When used for dose calculations, a set of particles is created with properties including position, momentum and energy, which evolve dynamically over the course of a simulation. The initial values of these properties constitute the random input parameters of the MC simulation and are sampled from a known probability distribution function. On this basis, the trajectories of each primary particle and its secondaries are simulated and the deposited dose is aggregated, by sampling interactions such as scattering and energy loss according to physical laws and material properties. While this appears to be an intuitive simulation of the actual physical process, it is essentially a statistical method to solve the linear Boltzmann transport equation and therefore compute the expected value of a model with random input.
Let ξ be the vector of random input parameters of the dose simulation. Φ 0 (ξ) is the joint density of these parameters and is assumed to be known. For our purposes, which will not interfere with the simulation itself, we assume that the trajectory of a primary particle is given by the "black box" simulation engine, yielding the dose deposited in voxel i within an individual particle's history h i (ξ).
The nominal dose d i in voxel i can now be estimated with the expected value E Φ 0 of all histories via the sample mean where H is the sample size (number of computed primary particle histories) and Ξ p , p = 1, ..., H are realizations of primary particle properties.
Here we omit the dependence on random factors within the simulation, such as particle scattering, as well as their probability distribution. Particle histories h i (Ξ p ), for input realizations Ξ p , implicitly also include realizations of these random parameters. For a large number of histories, their effect on the dose estimates can however be assumed to be constant.

Beam model
The initial state of each particle is represented by a point in the seven-dimensional phase space, which encompasses the particle position r = (r x , r y , r z ), momentum p = (p x , p y , p z ) and energy E. We assume a Gaussian emittance model, i.e. the parameters within each pencil beam are multivariate normal distributed Here, ϕ = (ϕ x , ϕ y ) = ( dpx dpz , dpy dpz ) describes the transverse divergence of the momentum direction from the axial beam direction.
The joint density over all pencil beams is then defined by a Gaussian mixture model where w b are the pencil beam weights.
To introduce our method, we will initially assume a simplified phase space where ϕ x = ϕ y = 0. Results including a distribution in the momentum direction can be found in the Appendix.
Set-up errors directly affect the primary particle positions in an additive way, such that the actual position r δ of a primary particle under uncertainty is given by its position r according to the emmitance model, plus the error δ r .
Uncertainties in the particle range are caused by a variety of factors, ranging from the conversion of Hounsfield units to stopping powers and imaging artifacts, over changes in the patient geometry to biological effects and inaccuracies in physics models (Unkelbach et al., 2007;Paganetti, 2012;Lomax, 2008a;McGowan et al., 2013). Here, we focus on calculational uncertainties, such as conversion errors, and model these by scaling the complete tissue density with the random factor δ ρ (comp. Lomax, 2008a;Souris et al., 2019;Malyapa et al., 2016). Since the density is assumed to be deterministic, the error is not directly linked to a random input parameter. In section 2.7, we however present an approximation which models range errors using the initial energy distribution.
Sampling-based uncertainty quantification approaches, similar to Park et al. (2013) or Kraan et al. (2013), rely on repeated dose calculations for different realizations ∆ k , k = 1, ..., K of the error vector. For an individual error scenario ∆ k , the dose is computed as In the case of set-up uncertainties Φ(ξ, ∆ k ) for example corresponds to the nominal parameter density Φ 0 (ξ), where all particle positions are shifted by ∆ r;k . Due to its accuracy, this procedure is later used to obtain reference values to validate our results. It is however extremely computationally expensive, since it requires numerous runs of the complete Monte Carlo dose simulation.

Direct computation of the expected value
When the distribution Ψ(ξ δ ) of the initial parameters under uncertainty can be explicitly defined, it is possible to compute the expected dose directly by replacing the nominal parameter distribution Φ 0 with Ψ in the Monte Carlo dose simulation as follows: For example when the error is additive, i.e.

Importance (re-)weighting
We now consider the dose deposited by histories h(ξ, δ), which are a function of the random input parameters ξ ∼ Φ 0 (ξ) and random error vector δ ∼ p δ . In the following we focus on computing estimates for the dose expected value and standard deviation, the method can however be analogously applied to the computation of several worst case scenarios.
We propose a replacement of the dose calculations for different error scenarios by a more efficient weighting of particle histories h. For this, we adopt the concept of importance sampling (Kahn, 1950;Hastings, 1970). Instead of sampling primary particles from Φ(ξ, ∆ k ) for different error scenarios, we sample from a different density function -e.g. the nominal parameter distribution Φ 0 (ξ). Then, the dose for all scenarios can be estimated using histories from the nominal dose calculation: Thus, scenario computation reduces to a scoring problem. Dose expectation and variance can now be computed through sample mean and variance over the respectively obtained scenarios.
When it is possible to compute the expected value directly as discussed in 2.4, only one (re-)weighting step is necessary: 2.6. Modeling set-up uncertainties Set-up uncertainties correspond to a shift of the patient position or equivalently the positions of primary particles relative to the patient. While errors occur in three dimensional space, shifts along the beam axis do not affect the dose distribution. In the Gaussian model, set-up errors can hence be assumed to follow a bivariate normal distribution for each pencil beam b = 1, .., B: with µ b δr ∈ R 2 and Σ b δr ∈ R 2×2 . Particles are initialized in a 2D plane, thus the primary particle positions follow a bivariate Gaussian mixture (2.2) Here µ b r is the mean lateral position of initial particles in pencil beam b in beam's eye view i.e., in the 2D plane perpendicular to the central beam axis. Then, according to 2.3 the initial position r δ of a particle under uncertainty is determined by and r δ is distributed with the convolution function An individual error realization ∆ r;k ∼ p δr then formally just corresponds to a shift of the original primary positions, which now follow the distribution corresponding to the nominal distribution shifted by ∆ r;k . The above distributions can be directly used with (6), (7) and (9) to obtain the expected dose and variance for set-up uncertainties.

Modeling range uncertainties
The proposed approach could be analogously applied to any type of uncertainty directly affecting input parameters of the simulation, which have an a-priori probability distribution. Range uncertainties, however, modify the density values, which are deterministic and can thus not be directly modeled within the proposed framework.
To still approximate our quantities of interest, we exploit that the largest dose uncertainty is induced near the range of a beam (Bortfeld, 1997), although the uncertain density variation affects the whole trajectory. Range can be expressed in terms of the initial energy of particles, using the Bragg-Kleemann rule where R is the range, E 0 is the initial energy and α and p are applicationspecific parameters. For the case of the slow-down of therapeutic protons in water, values of α = 0.022 mm/MeV p and p = 1.77 can be chosen (Ulmer and Matsinos, 2011).
The initial energy spectrum of a scanned pencil beam at the exit of the nozzle can be approximately represented by a Gaussian (Bortfeld, 1997;Kimstrand et al., 2007;Tourovsky et al., 2005;Soukup et al., 2005). We can use this to model range uncertainties through random variations of the initial energy (compare treatment of range straggling in Pedroni et al., 2005).
Let's assume range uncertainties are normally distributed, i.e. δ ρ ∼ N (0, σ 2 R ) (comp. Lomax, 2008a;Yang et al., 2012). With a Taylor approximation (order 1 for the mean and 2 for the variance) around X = E[X], we can determine the parameters µ E 0 , σ 2 E 0 of the energy distribution due to range uncertainties.
Thus the randomness in range is approximated through an energy distribution E 0 ∼ N (µ E 0 , σ 2 E 0 ) and the expected dose and variance can be computed by (re-)weighting histories analogously to 2.6. This can again be extended to multiple pencil beams using Gaussian mixtures.
Note that, again, the expected value can be directly computed from simulations with an energy spectrum convolved with the Gaussian uncertainty kernel.
Alternatively to the nominal energy distribution, this convolved distribution can also be used to obtain the required histories. In this case, the nominal distribution is replaced by the convolved distribution in (6)-(9).

Correlation Models
In the previous sections, the distributions of different types of phase space parameters were considered independently. Note that the derived distributions are however all marginals of the joint multivariate Gaussian mixture spanning the complete phase space and all pencil beams (see also 2.2).
Similarly, the univariate normal distributions of errors of different types and in different pencil beams can be connected using a joint multivariate Gaussian distribution. This framework in principle allows for the definition of arbitrary correlation models for uncertainties between pencil beams. For the dose variance computation, these correlations can be easily implemented using the covariance matrix of this joint distribution, since the samples for the weighted scenarios are directly drawn from the respective multivariate normal distribution. The expected dose is constant under varying correlation assumptions.
Using the example of set-up uncertainties, if we define the errors in each beam by δ r = (δ 1 r , ..., δ B r ) T , the multivariate Gaussian N (µ, C) would be parametrized with where ρ ab xy is the covariance between set-up errors in the x-direction in pencil beam a and errors in the y-direction in pencil beam b.
A few simple examples for correlation models are shown in figure 1, more can be found in literature (Bangert et al., 2013;Pflugfelder et al., 2008;Unkelbach et al., 2009).
In case the correlation matrix is singular (perfect correlation between some pencil beams), the dimension of the uncertain vector can be reduced and one joint error can be sampled for the respective perfectly correlated pencil beams.
More complex correlation models are possible.

Implementation
For the proof-of-concept in this work, the weighting method was implemented as a post-processing routine in Matlab. Radiation plans were generated with matRad  and exported to the Monte Carlo simulation engine TOPAS To reduce the number of required realizations, both for the reference computation and the (re-)weighting steps, a quasi-Monte Carlo approach was used to sample the random parameters (see e.g. Caflisch, 1998).

Investigated patients and uncertainties
In the following, we consider range and set-up uncertainties, as well as a combination of both. For the set-up uncertainties, we assume a symmetric, bivariate normal distribution with zero mean (no systematic errors) and a standard deviation of 3 mm (comp. Wahl, 2018;Perkó et al., 2016). For range uncertainties, in the reference computations we scale the density with a normally distributed factor, where the mean is equal to the nominal density and the standard deviation is 3 % (as recommended in Lomax, 2008b;Yang et al., 2012). The corresponding parameters of the energy distribution, used to approximate range errors in the importance (re-)weighting estimate, are determined based on this distribution as detailed in 2.7. Table 1 provides an overview of which uncertainty models were computed for which patient, as well as the irradiation angles used for different patients. The number of histories and pencil beams for each considered patient, as well as the number of error scenarios computed for the importance (re-)weighting estimates can be found in table 2. Note, that the number of histories per pencil beam were determined based on (non-normalized) weights from the optimized radiation plan (see 2.9), where around 10 5 histories are computed for the pencil beams with the highest weights.

Evaluation criteria
To compare our results to the respective reference computations, we plot twodimensional slices of the dose cubes as well as a difference map and employ a global three-dimensional γ-analysis. For the difference maps, we compute for each voxel i in the reference result d ref and (re-)weighting estimate d est . For the γ-analysis, we use the matRad implementation based on Low et al. (1998), with a distance to agreement of 3 mm and a dose difference criterion of 3 %.

Results
In the following we present results for the cases given in table 1. Unless specified otherwise, results were computed on the basis of histories from nominal dose calculations, i.e. with phase space parameters sampled from Φ 0 (see 2.2). The references computed for nominal and expected dose stem from Monte Carlo dose calculations with the respective phase space distributions Φ 0 and Ψ (see 2.4), the reference standard deviation is derived using numerous such Monte Carlo simulations for different error scenarios sampled from the joint error distribution. Therefore, the importance (re-)weighting estimate for the nominal dose only differs from the reference by round-off errors introduced in post-processing, as can be seen in figure 2. It is omitted thereafter, as is the reference. Figure 2 displays the nominal dose, expected value and standard deviation estimates for a water phantom, computed using the (re-)weighting approach in comparison to the respective references. While we see some minor deviations in difference maps for the expected dose and standard deviations, they do not appear systematic.

Set-up errors
The distance-to-agreement analysis using the γ-criterion supports this quantitatively (table 3), with γ 3 mm 3 % -pass rate of 100 % for the nominal and expected dose and 99.97 % for the standard deviation. Figure 3 demonstrates that this is transferable to the more complex patient cases. With overall γ 3 mm 3 %pass rates of 99.99 % (prostate patient) and 100 % (liver patient), the standard deviation agrees as well with the reference computations as the expected value, with 100 % and 99.99 %, respectively (table 4, 5).

Range errors
In contrast to the set-up errors, for which dose estimates can also be shown to be mathematically accurate, range errors can only be modeled through an approximation introduced in 2.7. Figure 4 displays results for range errors as well as the combination of range and set-up errors in the water phantom. The difference maps for both expected value and standard deviation show that the deviations when including range errors are expectedly higher. We observe a systematic bias primarily at the distal edge, where our method seems to consistently underestimate the variance. The standard deviation estimate using our importance weighting method also expresses strong local artifacts, as evident in the difference maps (compare figure 4). This is an indicator of too little statistical mass, i.e., computed particle trajectories, in the original simulation. For more extreme error realizations, relatively high weights are assigned to a small number of particles, thereby amplifying single realizations or errors. Especially in case of a relatively small beam energy spread in the original simulation (here 1 %), compared to the range error of 3 %, such artifacts are likely to appear. In order to prevent this, one could either compute a larger number of particle histories in the simulation or sample the particles from a different distribution which has more density mass in its outer regions or tails (compare 2.4).
To underline the explanation for the appearance of the artifacts above, we recomputed the estimates using the (re-)weighting method based on a direct computation of the expected value, which can be calculated using the convolution Ψ of the Gaussian error kernel with the nominal phase space parameter distribution 2.4. Figure 4 shows that this alleviates the discrepancy from the references, causing artifacts to disappear and also reducing the overall amount of deviation displayed in the difference maps.  Thereby we can conclude that the irregularities in the solution can be attributed to the lack of statistical support in certain areas. Contrary to this, parts of the systematic differences remain and are thus most likely a result of the model approximations. Figure 5 validates these observations for a liver patient. The difference maps for estimates computed based on the expected distribution Ψ, have less severe artifacts and systematic deviations. For both set-up and range errors, the γ 3 mm 3 %pass rate also increases from 93.12 % to 98.19 % (table 5). However, we do not observe such an increase in the case of only range uncertainties.
Also, it has to be noted, that using Ψ to sample the initial particles leads to an expected dose estimate which is exact up to machine precision, but a nominal dose estimate which now shows deviations from a nominal standard Monte Carlo reference computation in the order of magnitude that we could previously observe for the expected dose.

Correlation models
So far, we have only shown results for the case of fully correlated pencil beams, meaning one global shift of the patient position or scaling factor for the beam range. One of the advantages of the proposed method is, however, the high flexibility in changing the uncertainty model. In figure 6 we therefore present the standard deviation estimate for four examples of different error correlation models discussed in section 2.8.
(a) (b) (c) (d) Figure 6: Standard deviation of dose in a prostate patient for (a) no correlation (b) correlation between pencil beams in the same energy level (c) ray-wise correlation between pencil beams and (d) correlation between pencil beams with the same irradiation angle, w.r.t set-up errors and in case (c) also range errors.
The results indicate, that different correlation assumptions have a crucial impact on the standard deviation of dose distributions. While it is in principle possible to define arbitrary correlations within the proposed framework, estimates can be prone to artifacts due to a lack of statistical information, especially for the ray-wise correlation model. When sampling error realizations independently for smaller beam components, the reconstruction depends solely on the particle histories associated with these components. For rays with small weights, only very few histories are computed, therefore we observe similar artifacts as encountered in above range uncertainty computations (3.2).

Accuracy and Convergence
Mathematically, it can be shown, that the expected and nominal dose estimates are unbiased. This also holds for the doses corresponding to each individual error realization. While this does not generally apply for the variance, our results indicate that the bias does not have a significant impact on the quality of estimates. For quicker convergence we used quasi-random numbers throughout the whole comparisons, both for the reference computation and the importance (re-)weighting approach. Note, that the combination of importance sampling with quasi-Monte Carlo methods has been shown to be not only possible, but advantageous and preserves the convergence properties of quasi-Monte Carlo (Hörmann and Leydold, 2005;Ökten, 1999;Caflisch, 1998;Schürer, 2004). Since the procedure mimics a (quasi-) Monte Carlo method for uncertainty quantification, where the repeated simulation runs are replaced by (re-)weighting steps, the convergence of the variance per computed error realization is identical. However, due to the lower cost of the (re-)weighting steps, the convergence per time is much faster (see figure 7).
For run-time comparisons, the reference computations using TOPAS and the (re-)weighting approach, implemented as post-processing in Matlab, were run on the same virtual machine ‡. We observe reduced CPU times by a factor of 80, 32 and 23 for the water phantom, liver and prostate patient, respectively (see table  6).

Discussion
In this paper, we introduce an efficient approach for uncertainty quantification in Monte Carlo dose calculations using history (re-)weighting. We demonstrate how particle histories from one simulation can be scored to construct estimates for error scenarios, the expected dose and standard deviation, for set-up and range errors in intensity modulated proton therapy. As demonstration example, Gaussian range and set-up uncertainties, with 3 % and 3 mm standard deviation respectively, were considered for a water phantom, a liver patient and a prostate patient.
For set-up uncertainties, we observed good agreement of at least 99.99 % in the γ 3 mm/3 % -criterion for all quantities of interest. Range error propagation could be approximated by transforming the assumed range uncertainty into energy uncertainty via the range-energy relationship. The error caused by this model approximation appears to be relatively minor for the expected dose. The standard deviation estimates are, however, sensitive to the number of histories and usage of the nominal Gaussian pencil beam width or the convolved distribution. Differences and visible artifacts in the standard deviation estimate can be partly eliminated by simulating the initial phase-space parameters using the convolved beam parameterization. While some systematic deviations remain, the order of magnitude as well as shape and extent of the dose standard deviation is sufficiently well-represented. However, this causes a reduction in accuracy for the nominal dose. Thus, improving the accuracy of the estimate for one quantity of interest comes at a cost for the accuracy of another and it remains up to the user and usecase to put the focus on either retaining accuracy in the nominal dose computation or trading it against better accuracy of the uncertainty estimate.
We also demonstrate the use of different pencil beam correlation models within the framework. It is clear that the choice of correlation model has a significant impact on the standard deviation estimate. Therefore, it is particularly convenient that the (re-)weighting method allows for the definition of principally arbitrary correlation matrices to put into the underlying multivariate Gaussian error model. These could possibly be extended to simulate interplay effects or other dynamic influences in the context of 4D treatment planning. Since the applied correlation models are not only experimental but also difficult and timeconsuming to evaluate in scenario sampling, we did not quantitatively compare them to reference computations. Further studies could explore whether they agree with other methods computing such correlations based on an analytical probabilistic dose engine (Bangert et al., 2013;Wahl et al., 2017;Wieser et al., 2020).
Compared to the reference scenario estimates, which rely on performing full MC dose calculations repeatedly, the CPU-time for standard deviation estimates could be reduced by more than an order of magnitude using our method in combination with a quasi-MC approach. This is achieved by reducing the costs of repeated expensive simulations to those of scoring based on matrix-vector multiplications. Consequently, it has to be noted, that the time reduction depends largely on the proportion of computational overhead of the initialization and simulation steps in the MC engine. Therefore the factor of speed increase varies strongly between different test-cases and, most likely, implementations. But even then, our method holds two performance advantages: First, it can directly compute the expected dose by using the convolved phase space parameterization (2.4) in one standard simulation. Second, multiple uncertainty models with different correlation patterns and magnitude can be reconstructed from the same set of histories. This could for example be used to investigate the impact of fractionation effects, using the framework proposed by Wahl et al. (2018) or to consider a number of (worst case) scenarios besides the expected dose and standard deviation.
Further, we argue that computational performance can be further improved through a more efficient implementation and using better parallelization. Also, a combination of our approach with other efficient uncertainty quantification approaches, which rely on scenario computations could lead to run-time improvements. For instance a polynomial chaos expansion as introduced in Perkó et al. (2016) could be adjusted such that the evaluations are computed by (re-)weighting histories instead of the usual dose calculations.
When computing the estimates in post-processing, the regular Monte Carlo dose calculation is not slowed down perceptibly by storing particle histories for later reconstructions. The possible additional run-time is smaller than the variation between two runs of the same simulation and thereby barely detectable. An implementation as on-the-fly scoring is however also possible and might outperform post-processing in terms of overall run time for a single uncertainty model.
Last but not least, the method is not inherently limited to the discussed application in proton therapy; a calculation of uncertainty estimates using the (re-)weighting approach would also be feasible for other IMPT modalities like carbon ions but also photons. In its current description, it is however limited to uncertainties which can be modeled in terms of variations of phase space parameters with a prior probability distribution. Application to, for example, pre-simulated phase spaces might also be feasible using numerical convolution techniques. Also, a disproportionately high magnitude of uncertainties in relation to this probability distribution can compromise the accuracy of results. Furthermore, it needs to be mentioned, that the current computational speed, especially for the standard deviation, might still not be sufficient for optimization purposes, where a full dose influence matrix needs to be computed. Due to the simplicity of the process and the high flexibility in post-processing at virtually no cost for the original simulation, we are confident that the approach has the potential for further development and use.

Conclusion
Dose distributions in intensity modulated proton therapy are known to be sensitive to uncertainties. The computational efforts in estimating such uncertainties become particularly evident when Monte Carlo dose calculation is used. We showed how the concept of importance sampling can be adapted to estimate the expected dose and its variance using histories from only a single Monte Carlo simulation. Set-up uncertainties can be efficiently modeled and exhibit almost exact agreement with reference computations. The inclusion of range uncertainties, by modeling them as energy uncertainty via the range-energy relationship, yields less but sufficient accuracy for most application purposes. Further, the physical simulation of particles is completely decoupled from uncertainty quantification, thereby allowing for the incorporation of arbitrary correlation assumptions and the comparison of different scenarios, at no additional cost to the nominal dose calculation. Therefore, the presented approach has several benefits over classic non-intrusive methods and is a step towards reconciling efficient uncertainty quantification and, in the future, robust optimization based on Monte Carlo dose calculations. σ Figure B1: Nominal dose, expected dose and standard deviation w.r.t. set-up uncertainties with 3 mm standard deviation for one beam (couch angle 0°, gantry angle 315°), computed using the full phase space parameterizations. Figure B1 presents results for the nominal dose, expected dose and standard deviation in a liver patient, for set-up uncertainties with 3 mm standard deviation, 0.2 standard deviation in the momentum direction and 0.3 correlation between ϕ v and r v , v ∈ {x, y}. Estimates were computed based on the convolution function Ψ of the error and beam parameter densities, as well as the nominal parameter density Φ 0 . The corresponding global γ-analysis pass rates can be found in table B1.