Improved calibration of mass stopping power in low density tissue for a proton pencil beam algorithm

Dose distributions for proton therapy treatments are almost exclusively calculated using pencil beam algorithms. An essential input to these algorithms is the patient model, derived from x-ray computed tomography (CT), which is used to estimate proton stopping power along the pencil beam paths. This study highlights a potential inaccuracy in the mapping between mass density and proton stopping power used by a clinical pencil beam algorithm in materials less dense than water. It proposes an alternative physically-motivated function (the mass average, or MA, formula) for use in this region. Comparisons are made between dose-depth curves calculated by the pencil beam method and those calculated by the Monte Carlo particle transport code MCNPX in a one-dimensional lung model. Proton range differences of up to 3% are observed between the methods, reduced to  <1% when using the MA function. The impact of these range errors on clinical dose distributions is demonstrated using treatment plans for a non-small cell lung cancer patient. The change in stopping power calculation methodology results in relatively minor differences in dose when plans use three fields, but differences are observed at the 2%–2 mm level when a single field uniform dose technique is adopted. It is therefore suggested that the MA formula is adopted by users of the pencil beam algorithm for optimal dose calculation in lung, and that a similar approach is considered when beams traverse other low density regions such as the paranasal sinuses and mastoid process.

in lung, and that a similar approach is considered when beams traverse other low density regions such as the paranasal sinuses and mastoid process.
Keywords: proton therapy, pencil beam algorithms, CT calibration, radiation therapy treatment planning (Some figures may appear in colour only in the online journal)

Introduction
Pencil beam dose calculation algorithms are widely-used in clinical proton therapy, offering a computationally simple semi-empirical alternative to Monte Carlo with fewer parameters needed to describe the treatment machine and patient sufficiently well. Assuming high quality calibration data, these analytical algorithms are generally agreed to calculate dose distributions with an accuracy of a few millimetres (Paganetti 2012); this accuracy is one of many factors that govern the size of the margin between the clinical target volume (CTV) and the planning target volume (PTV) or scanning target volume (STV). Treatment plans with smaller PTV/STV margins are advantageous to patients in terms of decreased toxicity, irradiating less tissue at high dose for technical (as opposed to medical) reasons, but are only possible by reducing spatial uncertainties in dose calculation and delivery.
This paper examines the mapping between computed tomography (CT) scan pixel values via mass density to proton stopping power for a clinical pencil beam algorithm, and compares the dose-depth curves produced by such a method to those directly calculated by Monte Carlo particle transport software. A potential souce of systematic error arising from the mapping process is identified, and a solution is demonstrated to resolve these errors. The magnitude of the resulting dose differences are shown for a typical clinical treatment planning scenario.

Pencil beam dose calculation
The pencil beam concept was introduced in radiotherapy dose calculation because it better accounted for patient heterogeneity in electron beam therapy compared to earlier broad beam methods (Hogstrom et al 1981). As with electrons, the range of a proton beam is strongly dependent on properties of the material traversed. A number of proton pencil beam dose calculation algorithms have therefore been developed (e.g. Petti (1992), Hong et al (1996), Soukup et al (2005), Schaffner (2008) and Westerly et al (2013) and references therein) and, as a result of abundant affordable computing power, are available clinically for interactive radiotherapy treatment planning.
In brief, proton pencil beam algorithms calculate the dose delivered to points within a patient by decomposing the initial fluence distribution into a number of adjacent pencil beams. These calculation beams are distinct from the pencil beams which may be used to deliver treatments, and typically have smaller dimensions. The dose contribution of each pencil beam is the product of a longitudinal depth-dose function and a lateral scattering function. Approaches to lateral scattering vary, but the longitudinal depth-dose curve is invariably derived from depth-dose data measured in water during commissioning, which is scaled and shifted as steps are taken along the pencil beam's axis, according to the 'water equivalent' thickness 3 of the material intersected.
Some commercial treatment planning systems (TPSs) currently implement the method described in Soukup (2005). This extends the pencil beam concept by separating each pencil 3 true thickness multiplied by proton stopping power ratio (SPR) relative to water. beam into two beamlets, the first due to multiple Coulomb scattering (MCS) and the second arising from nuclear scattering. The relative weight of the nuclear-scattered beamlet increases with 'water equivalent' depth following a fit to Monte Carlo simulations. Both beamlets' lateral spread functions are Gaussian, with widths that are a sum of components arising from angular divergence and the relevant scattering processes. The MCS width is calculated using multiple scattering theory (in particular, a formula by Rossi and Greisen (1941)) and varies depending on the radiation length of material traversed, whilst the nuclear scattering width is again informed by Monte Carlo simulations.

The patient model
Dose calculations using pencil beam algorithms require a reliable model of the patient's anatomy, providing a map from which the proton stopping power of material along the beam's path may be sampled. This patient model is built from the planning CT, and informed by a scanner-specific calibration curve.
1.2.1. CT calibration. The stoichiometric method, as proposed by Schneider et al (1996) and built on by others (Schaffner andPedroni 1998, Schneider et al 2000), is currently the predominant approach for CT calibration in proton therapy. It can be summarised as follows: (a) CT scan a selection of tissue equivalent materials, with known density and composition. (b) Fit the CT numbers (in Hounsfield Units) to a function of density and elemental composition adapted from Rutherford et al (1976), determining coefficients that reflect the strength of each major x-ray attenuation process for the energy spectrum used in the scan. (c) Use the coefficients from step (b) to predict CT number (the independent x-axis variable in the CT calibration curve) for a range of real biological tissues, using reference data (e.g. in ICRU/ICRP Reports) for their density and composition. For each tissue, also calculate the physical property required by the dose algorithm, which constitutes the dependent y-axis variable. The choice of physical property will be discussed in section 1.2.2. (d) Use the (x,y) points from step (c) to specify a one-to-one function linking CT number to the calibration variable.  Hong et al 1996) expect the calibration curve to link CT number and proton stopping power ratio relative to water (SPR). In constrast, the method in Soukup et al (2005) uses mass density as the dependent variable; consequently, a TPS using this algorithm for scanned proton treatments expects the calibration curve to link CT number and mass density. There are clear advantages to this approach: • Mass density can be measured without using proton beam time.
• Mass density is independent of proton energy, whilst the SPR of a material displays an energy dependence, as illustrated in figure 1. Ignoring the variation of SPR with energy leads to range errors of approximately 1 mm water equivalent for 200 MeV protons travelling entirely in mineral bone, with smaller errors in most other biological materials. • It has been demonstrated that an accurate formula can be found for proton stopping power as a function of energy and mass density, and radiation length as a function of mass density, using tabulated composition data for human tissues (Fippel and Soukup 2004).
1.2.3. Interpolation of stopping power at low density. The region ρ < 0.9 g·cm −3 is poorly populated by human tissues in most sources of composition data, such as ICRU Report 46 (1992) or ICRP Publication 89 (Valentin 2002). The exception is lung, which is typically associated with a mass density approximately one-third that of water. This low density is a bulk average of both tissue and air; most non-osseous tissues, including compressed lung, have an intrinsic density close to that of water. Voxel dimensions in radiotherapy planning CT scans are of order 1 mm, whilst the lung contains structures on a variety of scales: from alveoli at ∼100 µm to centimetre-sized bronchi. Each lung voxel in the planning CT may therefore be occupied by a different fraction of tissue, and the range of CT numbers observed within the organ can be expected to extend from 100% air at −1000 HU to compressed lung at approximately 0 HU. In order to implement pencil beam dose calculation, it is necessary to define an approach to SPR calculation over the whole range of CT numbers spanned by low density tissue, based on the very limited set of calibration data available. The description of the algorithm by Soukup et al recommends following an earlier work (Fippel and Soukup 2004) to calculate mass stopping power ratio (denoted f S ) by linear interpolation at low densities.
In this work, an in-silico study is carried out to determine the extent to which linear interpolation of mass stopping power in low density (ρ < 0.9 g·cm −3 ) lung tissue affects the accuracy of proton range calculation by a pencil beam algorithm, by comparison to a proposed alternative (the mass-average method) and direct Monte Carlo calculations.

Pencil beam algorithm
All longitudinal aspects of the pencil beam dose calculation algorithm described in the Soukup et al paper (Soukup et al 2005) were implemented in a function for the MATLAB Figure 1. Mass stopping power ratios of selected tissues with ICRU Report 46 (1992) compositions as a function of energy and proton range in water, calculated using SRIM 2013 (Ziegler et al 2010). Water stopping power in the SPR denominator does not include the effects of molecular bonds; the blue line shows the difference if, instead, a low-energy compound correction (c.c.) of 0.94 is considered in water (the value specified in SRIM's tables, precalculated according to Ziegler and Manoyan 1988 programming language (MathWorks, Natick, MA). This implementation evaluates SPR, residual proton energy and deposited energy over 10 equally-sized steps in the depth direction per dose calulation voxel. In addition to calculating SPR from mass density according to the original linear interpolation (LI) method, the mass-average (MA) approach was also used, and stopping power was calculated directly from elemental composition using the SRIM software (Ziegler et al 2010).
SRIM is able to account for the effect of molecular bonding on stopping power using compound corrections calculated from the molecular structure (Ziegler and Manoyan 1988); however, these compound corrections are only relevant at proton energies that are considered low in the therapeutic regime, and which are difficult to define for composite materials like lung tissue. Therefore, no molecular corrections were included in SRIM calculations.
2.1.1. Linear interpolation (LI) of SPR at low density. The stopping power calculation method used by the implemented pencil beam algorithm is described in an earlier work (Fippel and Soukup 2004), in which the authors derived a formula to calculate stopping power as a function of energy and mass density in human tissues. In brief, the method involved evaluating stopping power ratios at a range of energies using the PSTAR tool (Berger et al 2005) for tissues with compositions specified in ICRU Report (1992), followed by a multivariate fitting procedure. The resulting equations are reproduced below as (1) and (2), and have a quoted accuracy better than 1% in all ICRU 46 tissues 4 . (2) where ρ is mass density in units of g·cm −3 and E is the proton kinetic energy in units of MeV. Equation (2) accounts for the small number of calibration materials with low densities by linearly interpolating f s between the points for air and bulk lung. Linear interpolation is also performed between the point for bulk lung and the bottom of the continuously-defined range. This approach will be referred to as the LI method.

Mass-average (MA) of SPR at low density.
An alternative approach to interpolation is proposed here. Since f S in equation (1) is the ratio of the mass stopping power of a material relative to that of water, its value in a mixture can be calculated as a mass-average of the mass stopping power of its constituent materials, under the assumption that each contributes independently to proton stopping. For a mixture of density ρ containing materials of density ρ 1 and ρ 2 , the mass stopping power ratio can therefore be given by equation (3).
The new mass-average formula is given in equation (4). It uses equation (3) to calculate f S between the points defined for air and lung. Additionally the single f s value for lung is extended to the entire region 0.26 g·cm −3 ⩽ ρ < 0.8 g·cm −3 , motivated by the fact that ≳95% of internal CT voxels with densities below 0.8 g·cm −3 appear within the outlined lung contour for lung radiotherapy patients; however, this modification should be examined in detail before being applied to other treatment sites. Equation (3) is used again between 0.8 g·cm −3 and the bottom of the continuously-defined range. The equation matches the LI method (equation (2)) at densities above 0.9 g·cm −3 .  (4) where . A comparison between SPR calculated by the mass-average method (hereafter denoted MA) and the LI method is shown in figure 2.

Reference data.
Three sets of reference pencil beam data (mean energies of 60 MeV, 90 MeV and 170 MeV) were generated using the MCNPX v2.7.0 Monte Carlo particle transport code (Pelowitz et al 2011) and the default LA150 cross section dataset. In each case, a collimated beam of protons with Gaussian lateral fluence profile (σ = 1 mm) and Gaussian initial energy distibution (σ = 1 MeV) was generated to be incident perpendicular to the centre of a face of a 40 cm × 40 cm × 40 cm water cube. The energy deposited was recorded using a cylindrical mesh tally (type CMESH3) with radial bins of 1 mm and depth bins of 0.1 mm within a region of diameter 12 cm and depth 30 cm. Only protons were transported (including secondary protons arising from elastic recoil, by using a recl parameter of 1 on the PHYS:H card) to a minimum energy of 100 keV, after which the remainder of their energy was deposited locally. The energy of all other particles was deposited at the point of generation.

Monte Carlo calculations
The MCNPX Monte Carlo code was also used as a benchmark against which the pencil beam method as a whole could be tested, alongside the various approaches to SPR determination. It allows dose calculation at higher spatial resolution than a clinical TPS and determines stopping power based on the full elemental composition of a material and residual proton energy on a particle-by-particle basis. The simulation setup mirrored that used to generate the reference data in section 2.1.3, with the water cube replaced by a one-dimensional longitudinal patient model with a lateral cross section of 40 cm × 40 cm. Since only dose-to-material may be found in the standard release of MCNPX, dose-to-water in arbitrary materials was obtained by dividing the tallied energy deposition per unit volume by the mass density and the SRIM-calculated SPR, assuming the residual proton energy given by the pencil beam method at the same depth. Each direct MCNPX calculation ran until the relative uncertainties in each laterally-integrated longitudinal bin were below 0.25%, with the vast majority of bins displaying considerably smaller variance. Total number of histories ranged between 10 7 and 3 × 10 8 .

Patient models
2.3.1. Benchmarking geometry. A multi-material virtual phantom was used to benchmark the pencil beam implementation, containing homogenous tissues with compositions as specified in ICRU Report 46 (also used in deriving equation (2)). The geometry comprised five neighbouring 3 cm × 40 cm × 40 cm slabs, each composed of a different material, followed by a final 25 cm × 40 cm × 40 cm slab. Variations were produced containing the following six materials, in this order and its cyclic permutations: adipose tissue, muscle, lung, cartilage, cortical bone and water. One such arrangement is illustrated in figure 4.

Lung models.
One-dimensional lung models were created by extracting the CT numbers of all voxels within the demarcated lung contours of 25 non-small cell lung cancer (NSCLC) radiotherapy patients who participated in the ROCOCO in-silico planning trial (Roelofs et al 2012). Representative mass densities were calculated assuming images were acquired with an x-ray energy of 80 keV 5 and tissue had ICRU lung composition. Voxels below −990 HU and above 100 HU were excluded (2.3% of the total number extracted). The resulting mass density distribution is shown in figure 3.
Virtual phantoms were created by sampling the CT number distribution 300 times at intervals corresponding to one-third-percentiles and assigning the resulting densities to neighbouring 1 mm × 40 cm × 40 cm slabs of ICRU lung in four arrangements: with density ramping up, ramping down, ramping up and then down (peaked), and alternating high and low densities which converge on the median.  Fippel and Soukup (2004) with the low density region calculated using linear interpolation (LI) and mass-average by equation (4) (MA). Points show SPR calculated using the software tool SRIM Ziegler et al (2010) for tissues with elemental composition and density specified in ICRU Report 44 (1989), without compound corrections.

Assessment of clinical impact
A comparison was made between treatment plans that were calculated assuming the linear interpolation and mass average methods for mass stopping power determination. The XiO v4.70 treatment planning system (Elekta Ltd., Stockholm) was used to design spot-scanned proton plans using three techniques (single-field uniform dose, three uniform dose fields and three-field IMPT) for one NSCLC patient who participated in the ROCOCO planning trial. All plans had a prescription dose of 70 Gy (RBE) and were designed following the dose constraints detailed in the ROCOCO protocol. Whilst XiO calculates dose using the linear interpolation method of mass stopping power determination, a good approximation of the mass averaged method was obtained by modifying the CT calibration curve using the transformation derived in appendix A: comparisons using the MATLAB pencil beam implementation indicate the accuracy of this approach is better than 0.1 mm.
Dose distributions for each of the treatment plans and SPR methods were exported from the TPS in DICOM format and imported into CERR (Computational Environment for Radiotherapy Research) (Deasy et al 2003) for comparison, using the dose subtraction and gamma analysis functions.

Benchmarking
The pencil beam implementation was tested using a 170 MeV proton source in the six variations of the benchmarking geometry previously described. The calculated dose-depth curves were compared with those found using MCNPX: results are summarised in table 1, and shown graphically for one arrangement in figure 4.
It can be seen that in all geometries and with all SPR-calculation methods the pencil beam implementation predicts 80-20% isodose distal fall-off distance within 0.4 mm of the Monte    In the 170 MeV case, the 30 cm of modelled lung was followed by water into which the Bragg peak penetrated approximately 11 cm.
ΔR is given as a physical distance (in mm) and as a water equivalent distance relative to the MCNPX-predicted range (RPL). LI = Linear Interpolation. Carlo result. The range is also consistently predicted within 0.4 mm of the Monte Carlo value, except when the final material is lung. However, considering the lung material's density of 0.26 g·cm −3 , even these differences are of comparable water equivalent size to those in water and other soft tissues. These results suggest that the spatial error of this pencil beam implementation is smaller than the dose calculation voxels typically used for proton therapy treatment planning, and suitable for comparing SPR calculation methods.
The predictions of the pencil beam method in this scenario are identical for the LI and MA approaches to low density SPR calculation. This is because the benchmark model has homogenous tissues with mass densities as defined by the ICRU, so no interpolation or mass averaging is required.

Lung model
A summary of the results of comparative simulations using MCNPX and pencil beam methods in the lung model can be seen in table 2, for proton sources with energy 60 MeV, 90 MeV and 170 MeV. Dose-depth curves at 60 MeV and 90 MeV are shown in figure 5. Since the water equivalent thickness of the 30 cm lung model is approximately 8 cm, results for the 170 MeV beam (19.5 cm range in water) were obtained by placing a water phantom directly downstream of the model.
In this one-dimensional lung model with a single material at multiple mass densities, the range prediction errors introduced by linear interpolation of SPR are approaching a clinically significant scale, depending on mass density distribution and proton energy, ranging from 0.4 mm in water at high energy to 3.6 mm in low density lung at lower energies. When considered in terms of water-equivalent path length, these errors correspond to 0.2-3.0% of the proton range. With one exception, Bragg peak positions for curves calculated with SRIMderived SPRs differ from Monte Carlo data by less than 0.25% in water equivalent terms and vary little with density arrangement. This implies that a large proportion of the differences Figure 4. Illustration of the slab-based geometry used for benchmarking. Each slab is homogenous and has a cross-section measuring 40 cm × 40 cm. Superimposed is exemplar dose-depth data for the arrangement that ends with lung, using ICRU Report 46 tissue definitions. for the LI and MA cases can be attributed to SPR calculation rather than the pencil beam implementation. As in section 3.1, dose-depth curves do not change appreciably as a result of allowing SPR to vary with energy.  The tabulated results are best considered with reference to figure 2. It can be seen that the mass stopping power predictions of the LI model quickly depart from the MA model (which takes a similar approach to MCNPX) both above and below the ICRU lung density of 0.26 g·cm −3 . Approximately 40% of the total mass in the lung model appears at densities below this value, where the LI approach will underestimate the SPR by up to 12%, whilst the SPR may be overpredicted by up to 5% in the remaining 60% of mass. The net result on Bragg peak position depends on the proportion of the mass traversed by protons (∼40% at 60 MeV,∼80% at 90 MeV and 100% at 170 MeV) and the density at which that mass appears.
In the case of calculations for 170 MeV protons, the whole density spectrum is traversed and the opposing directions of range errors on each side of the distribution result in the LI method being a relatively good predictor of range. This statement is also approximately true for the alternating and triangular distibutions at 90 MeV. At 60 MeV and 90 MeV, the LI method can be seen to over-predict range in the ramp-up arrangement, and under-predict range in the ramp-down arrangement, as may be expected after examination of figure 2.
In summary, the proton range errors resulting from use of the LI model with an unmodulated Bragg peak are variable and dependent on the mass distribution traversed, but may be greater than the dose calculation voxel sizes typically used in proton therapy treatment planning (1-3 mm). As each of the individual spots that comprise a scanned proton treatment comprises protons of a different initial energy, traversing a variety of different paths and therefore a variety of mass distributions, the impact on clinical dose distributions is best understood through consideration of a treatment plan.

Clinical comparison
Five spot-scanned NSCLC treatment plans were created to examine the effect of SPR interpolation in realistic and more complex irradiation scenarios. A numerical summary of the difference in dose distributions calculated using the LI method and a good approximation of the MA method is given in table 3. It can be seen that the range of dose differences is plan-dependent, being greater for the single field uniform dose (SFUD) plans than for the three-field plans. The volume of large-scale differences in the dose distributions is also quantified by the number of voxels which fail a 3D gamma test (Low et al 1998) with parameters 0.02D max and 2 mm: one-field plans have a significant number of failures, whilst essentially all voxels in three-field plans pass the test. The better performance of three-field plans can by explained by the fact that range errors in one field will to some degree be compensated by dose from other fields. For the three SFUD plans considered, an increase in gantry angle from 90° results in further increased deviations of the LI results from MA, due to an increase in the thickness of lung tissue intersected by the beam.
A dose distribution and a LI-MA dose difference map are shown for one slice of the SFUD 120° plan in figure 6. Two pronounced regions of range error are apparent: an over-prediction of range by the LI method in the anterior left lung and an under-prediction of range by the LI method close to the right wall of the mediastinum. In the former position protons have passed through a large proportion of low-density tissue (similar to the 1D ramp-up case), whilst the traversed lung tissue is at higher densities in the latter position (similar to the 1D ramp-down case).

Discussion and conclusions
For the model above, it has been shown that errors of up to 4 mm (or 3% of water equivalent range) may result in the range predicted by a pencil beam algorithm using the LI method compared to Monte Carlo. In the clinical treatment plans considered, switching from MA method to LI introduced small regions of dose difference equating to 7% of the prescription dose.
It should be noted that these potential algorithmic errors are only relevant in regions containing significant amount of low density tissue and, even then, are very much dependent on the distribution of mass along the path of the protons. They should be viewed in the context of other proton range uncertainties in lung, for example: • Pencil beam methods can suboptimally model protons which have undergone scatter in complex heterogeneities, particularly because a beamlet's evolution is typically calculated Table 3. Minimum, maximum and standard deviation of the difference in dose distibutions calculated using linear interpolation (equation (2)) and mass-averaging (equation (4)) for proton mass stopping power in the low density region.  (Low et al 1998) with maximum allowable dose difference of 2% of the prescription dose or maximum allowable distance-to-agreement of 2 mm, also expressed as a percentage of the number of voxels with non-zero dose. Results are given for five treatment plans on a non-small cell lung cancer patient: three single-field uniform dose (SFUD) plans at different gantry angles, an equally-weighted sum of the SFUD plans (3FUD) and three-field IMPT (3F-IMPT).

Figure 6.
(a) Single slice of the dose distribution for a spot-scanned single field uniform dose (SFUD) proton therapy treatment at gantry angle 120°, calculated using linear interpolation of proton mass stopping power in the low density region (equation (2)). Dose calculated on a 2 mm cubic grid. (b) Difference between the dose distribution shown in slice (a) and the equivalent dose distribution calculated using a good approximation of the mass average method in the low density region (equation (4)). based only on the material intersected by its axis. This has recently been investigated on a site-by-site basis, with range margins of 6.3% + 1.2 mm having been recommended in lung, potentially reduced by patient-specific optimisation (Schuemann et al 2014). • Motion, including interplay with the scanning dynamics, can lead to mean differences between planned and delivered doses in the target of approximately 5%, and localised differences in excess of 20%, during a fractionated treatment (Grassberger et al 2013). • Errors in CT calibration and the effect of image artefacts, such as beam hardening and stochastic noise, may introduce water equivalent range prediction errors of approximately 0.5% along typical paths through lung (Warren 2014).

Dose [Gy
A literature search did not reveal any dose calculation comparisons between a pencil beam algorithm using the LI method and one that uses an alternative SPR calculation method, or Monte Carlo simulation, in low density tissue. However, the developers of the Monte Carlo platform GATE reported a large discrepancy between XiO's pencil beam algorithm and their own method for stopping power determination in the density interval 0.5-0.9 g·cm −3 (Grevillot et al 2012), which would be an expected result of the TPS using the LI method.
Outside of lung, similar proton range errors may exist in doses calculated by the LI method in the paranasal sinuses, which contain fine structures composed of soft tissue and air. Related range errors may also occur in beams travelling through the mastoid process, part of the skull. The mastoid is composed chiefly of bone and air pockets, many of which are smaller than the CT voxel size, yielding pixel values in the approximate range −950 HU to +1700 HU, as seen in figure 7(a). Whilst equation (3) can be used to create a calibration curve relevant to boneair mixtures, this will not be applicable to all tissue in the relevant CT slices because the mass density (and Hounsfield Unit) ranges in soft tissue and brain completely overlap those of mastoid. The issue is illustrated by the idealised CT calibration curves shown in figure 7(b), which show departures of up to 40% in SPR prediction if mid-density tissue is considered to be composed of a bone-air mix instead of soft tissue. Range calculation errors of multiple millimetres may result, and are likely to be common to all proton dose calculation methods which rely on a single CT calibration curve. One possible resolution to the problem is for TPS developers to allow users to apply different stopping power models within different regions of interest.
The discussion in this paper has been devoted to biological materials because equations (2) and (4) were derived for human tissues, but it should be noted that non-biological materials may be present in the TPS calculation volume; for example, range shifters used in the case of shallow treatment volumes are sometimes separate from the gantry or treatment machine, or the patient may have prostheses. Careful consideration should be made as to whether corrections to the calibration curve or patient model are necessary if significant thicknesses of non-tissue material will be traversed by protons, such as by assigning the relevant regions a 'tissue equivalent mass density' calculated from the inverse of equation (1). Silicone breast implants, for instance, may need a modification of up to 7% in the equivalent mass density of the patient model in order for a TPS using the LI method to assume the measured SPR of 0.935 (Moyers et al 2014).
The modification presented here is suggested on the grounds that it is advisable for all aspects of radiotherapy dose calculation algorithm to be physically-motivated. It will provide optimal dose calculations in lung for users of this pencil beam method 6 , but the differences compared to using the previous mass density to SPR method should be within current safety margins, and entirely negligible for most non-lung treatments. Introducing a change to the MA method should be trivial for TPS developers, but they may instead wish to consider the benefit of accepting calibration curves that link CT numbers directly to linear stopping power 6 and for users of the VMCPro Monte Carlo code, for which equation (2) was originally derived. ratio rather than mass density. At its most basic, this would result in a very small additional inaccuracy since it would not account for energy dependence of stopping power, but it seems likely that correlations exist between SPR calculated at a defined energy and that at lower energies. Such a move would allow the physicists commissioning a proton therapy system to be fully aware of, and to take full responsibility for, the methods by which a very important physical property is determined. Figure 7. (a) Histogram of pixel values within the left and right mastoids for one patient, with calculated mass densities assuming that voxels are composed of a boneair mix and imaged with monoenergetic 80 keV x-rays and (inset) part of a CT slice through the mastoid displayed with a 1000 HU window around 0 HU. (b) Idealised proton stopping power ratio (SPR) versus CT number calibration curves for three different monenergetic x-ray energies, with models that assume air-soft tissue mix at low density and soft tissue-bone mix at high density (A/S/B) or a bone-air mix across the entire mass density range (A/B). The method for calculating these curves is described in appendix B. The MA method calculates stopping power ratio using a similar expression: where f S,MA is the piecewise function corresponding to the MA method, given in equation (4). A transformed calibration curve C m is sought, in which CT number is instead mapped to a modified mass density ρ m . For a particular CT number, ρ m is the mass density which causes the LI formula to return the same value for S as the MA formula returns for original mass density ρ o . This situation may be expressed by equating the right-hand sides of (A.2) and (A.3): It is assumed that that neither f S function changes significantly with E, which is true over the majority of a proton's path length, especially in the low density region-see figures 1 and 2. The dependence on E is therefore dropped, and (A.4) may be rearranged to give: The LI and MA methods only differ in the region ρ < 0.9, where f S,LI is defined by two linear segments. If f S,LI (ρ) = a·ρ + b, then the expression for ρ m becomes a quadratic equation for which the positive solution is: In the first linear segment of equation (2) In the second linear segment of equation (2),  Figure A1 shows the relative size of modifications made to the calibration curve if the above equation is evaluated using the relevant coefficients. The modification is equivalent to an absolute change in linear SPR of 3-4% at maximum deviation (near −200 HU) over the energy range 10-250 MeV. Simulations suggest that ignoring the energy dependency in ρ m /ρ o introduces water equivalent range errors of less than 0.1 mm over the path lengths utilised in this work.
where i labels chemical elements. Proton stopping powers were calculated in SRIM (Ziegler et al 2010) for mixtures corresponding to each CT number, and divided by the stopping power of water to arrive at SPRs.