Applying the column generation method to the intensity modulated high dose rate brachytherapy inverse planning problem

Objective. Intensity modulated high dose rate brachytherapy (IMBT) is a rapidly developing application of brachytherapy where anisotropic dose distributions can be produced at each source dwell position. This technique is made possible by placing rotating metallic shields inside brachytherapy needles or catheters. By dynamically directing the radiation towards the tumours and away from the healthy tissues, a more conformal dose distribution can be obtained. The resulting treatment planning involves optimizing dwell position and shield angle (DPSA). The aim of this study was to investigate the column generation method for IMBT treatment plan optimization. Approach. A column generation optimization algorithm was developed to optimize the dwell times and shield angles. A retrospective study was performed on 10 prostate cases using RapidBrachyMCTPS. At every iteration, the plan was optimized with the chosen DPSA which would best improve the cost function that was added to the plan. The optimization process was stopped when the remaining DPSAs would not add value to the plan to limit the plan complexity. Main results. The average number of DPSAs and voxels were 2270 and 7997, respectively. The column generation approach yielded near-optimal treatment plans by using only 11% of available DPSAs on average in ten prostate cases. The coverage and organs at risk constraints passed in all ten cases. Significance. The column generation method produced high-quality deliverable prostate IMBT plans. The treatment plan quality reached a plateau, where adding more DPSAs had a minimal effect on dose volume histogram parameters. The iterative nature of the column generation method allows early termination of the treatment plan creation process as soon as the dosimetric indices from dose volume histogram satisfy the clinical requirements or if their values stabilize.


Introduction
Amongst the options available for cancer treatment, external beam radiotherapy (EBRT) and high dose rate (HDR) brachytherapy are two commonly used techniques. For selected tumour sites, HDR brachytherapy offers improved therapeutic ratio compared to EBRT (Goy andSoper 2016, Lecavalier-Barsoum et al 2021). However, radiation sources used in brachytherapy provide isotropic dose distributions, which result in poor dose conformity due to the asymmetrical shape of the tumours. For optimal treatment outcomes, high dose of radiation should be delivered to the tumour while limiting the radiation to the healthy organs at risk (OAR) surrounding the tumour. Thus, the optimal dose distribution in the tumour is limited by the proximity to the OAR. The ability to produce anisotropic dose distributions from individual dwell positions would allow for protection of the OAR from excessive dose without compromising the tumour coverage.
Intensity modulated brachytherapy (IMBT) is a new brachytherapy treatment modality enabling the generation of anisotropic dose distributions by incorporating rotating metallic shields inside the brachytherapy needles or catheters (Webster et al 2013, Han et al 2014, Famulari et al 2020a, Morcos and Enger 2020, Morcos et al 2021. IMBT can open the possibility to increase the absorbed dose to the tumours while more effectively shielding the OAR by dynamically directing the radiation towards the tumours and away from the OAR; i.e. the dose distributions will better conform to the shape of the tumour. This added degree of freedom, however, presents a challenge in treatment planning, since optimal source emission angles have to be determined in addition to the source dwell positions and dwell times. There have been various studies on the theoretical feasibility of IMBT dose delivery. Ebert Ebert (2002 first introduced the concept of IMBT by investigating the possibility of achieving radial intensity modulation through rotations of a collimated radiation source about its axis. Significant improvements in dose conformity were observed with IMBT compared to conventional HDR brachytherapy. Collimation angles of 22.5 to 45 degrees and a transmission on the shielded-side of <10% or less were found optimal to observe the benefits of IMBT over conventional HDR brachytherapy (Ebert 2002).
In recent years several treatment planning studies for IMBT have been published. Chaswal et al (2012) developed an adjoint sensitivity field-based treatment-planning technique for the directional low dose rate (LDR) brachytherapy sources investigated by Lin et al (2008). This 3D greedy heuristic optimization algorithm successfully spared the OAR from overdose and produced a feasible LDR brachytherapy treatment planning solution. In 2013, Webster et al (2013) published a study regarding dynamic modulated brachytherapy for rectal cancer by translating and rotating a tungsten alloy shield during the treatment. Following this study, two separate studies by Adams et al (2014) and Famulari et al (2020b) for prostate cancer and additional studies by Dadkhah et al (2015) and Morcos and Enger (2020), Morcos et al (2021) for cervical cancer investigated rotating shield brachytherapy. All these studies showed improved dose distributions when IMBT was compared to conventional brachytherapy.
In order to find the optimal configuration of dwell position and shield angle (DPSA) combinations for a given tumour site, an inverse planning optimization algorithm is required. In conventional brachytherapy, inverse planning simulated annealing (IPSA) Lessard and Pouliot (2001) is among the most commonly used dose optimization techniques. However, IPSA is susceptible to local minimum trapping (Deufel and Furutani 2014), and its linear penalty optimization approach results in generating treatment plans with larger hotspots (Chajon et al 2007, Holm et (Alterovitz et al 2006). Aside from using simulated annealing, to reach convergence faster, a lot of effort has been invested into GPU-accelerated optimization to improve inverse planning efficiency and runtime. GPU-based Multi-criteria optimization has been successful in exploring the Pareto surface to generate one (Breedveld et al 2019) or several (Bélanger et al 2019) clinically approved plans.
The column generation method was initially applied to the direct aperture optimization problem in intensity modulated radiotherapy (IMRT) (Romeijn et al 2005). In comparison with IMRT, the number of degrees of freedom involved in IMBT is much smaller, meaning that an optimization model which considers every single DPSA simultaneously remains computationally feasible.
However, in contrast to a full linear or quadratic model which optimizes the dwell times of all possible DPSAs at once , the column generation method iteratively adds DPSAs to a treatment plan by solving a pricing problem. The pricing problem aims to 'price' all DPSAs not included so far based on the value they can add to the current plan. At every iteration, the dwell with the lowest price (most effective) is added to the plan. This process continues until clinical constraints are met or until the algorithm determines that no further DPSAs can improve the cost function. The result is a clinically deliverable plan with a built-in mechanism to control plan complexity.
The aim of this study was to investigate the column generation method for IMBT treatment plan optimization. With the added complexity of IMBT, and since this work was the first to describe an algorithm specifically designed for IMBT, a column generation method that allows more explicit control over the number of dwell positions was chosen. This work was intended to provide an algorithm that solves dwell time optimization for 2000+ DPSAs and to serve as a benchmark for future IMBT dose optimization algorithms.

Optimization and column generation method
In the column generation method, the treatment plan creation process is split into two parts. At every iteration of the algorithm, a 'master problem' is solved to find the optimal dwell times for the current set of DPSAs included in the treatment plan, starting with an empty set. This problem can be solved by choosing from well known optimization techniques depending on the features of the cost function. Ideally, the cost function should be convex to ensure that the obtained minimum is a global minimum. The optimal solution is then used to assign a 'price' to all DPSAs, which are not currently included in the treatment plan. Mathematically, a DPSA's price indicates the expected improvement to the cost function if its dwell time is increased from 0. The DPSA with the best price, i.e. the most negative price is then added to the treatment plan and the process is repeated. The algorithm terminates once none of the remaining DPSAs have a negative price, indicating that adding them to the treatment plan will not improve the cost function. Alternatively, the process can be stopped as soon as the optimization criteria are achieved or a pre-specified number of DPSAs are included in the treatment plan, allowing fine control over plan complexity (number of DPSAs).

Notation and choice of cost function
In the optimization model considered for this work, the goal is to minimize an objective function of the form: where f s  are the cost functions for individual structures. A combination of convex one-sided weighted quadratic objective functions of the form given in equations (2a) and (2b) were selected. The symbols and their definition are given in table 1.
For a given structure, the dose inside each voxel is the sum of the dose contributions to that voxel from each DPSA, which can be written as where J is the set of all DPSAs included in the problem. If D sij is normalized as the absorbed dose (in Gy) from DPSA j per second, then t j is the dwell times in seconds for the specific DPSA. Ultimately, the dwell times t j are the decision variables to be optimized to achieve the goal of minimizing the cost function.

The master problem
The 'master' optimization problem is formally stated as . 4 j " Î  * * During the optimization the DPSAs that have dwell times smaller than 0.01 s were set to 0 s and the rest of the dwell times were scaled to compensate for this treatment time reduction if needed. User-specified weights associated with a given penalty and a structure s V s Set of voxels inside structure s z i Dose value within voxel i Θ The Heaviside step function T min , T max Dose thresholds before which a penalty is counted in the cost function 2.1.3. The pricing problem The relevant material for this section were presented formally in Bazaraa et al (2006) and has been re-derived multiple times in the context of EBRT ( , therefore only a brief overview will be provided. As a starting point for the derivation, the Karush-Kuhn-Tucker conditions for optimality are applied to the master problem. These conditions define the necessary conditions for an optimal solution to the master problem: where π si and ρ j are the Lagrange multipliers corresponding to constraints (4b) and (4c), respectively. Optimizing only a subset of all DPSAs is equivalent to setting the dwell times t j to 0 for all DPSAs not included in the treatment plan. For those unincluded DPSAs, all optimality conditions are automatically satisfied except for which defines the pricing problem. After obtaining an optimal set of dwell times and corresponding dose values from the master problem, (t ,ẑ), the ρ j can be computed for all DPSAs not included in the treatment plan, and any DPSA with ρ j < 0 can potentially improve the cost function. The heuristic used in this work involves finding the largest negative ρ j and adding the corresponding DPSA to the treatment plan or the cost functions defined in equations (2a) and (2b), the Lagrange multipliers π si are given as followŝ , defines the rate of improvement of the cost function that would result from a small increase (from 0) in the dwell time of DPSA j Î J. In the case where the lowest price is greater or equal to 0, the column generation process is stopped as no further dwell positions can be added to improve the cost function. The termination criterion used in this work is stopping when all the remaining DPSAs have a positive price or when 20% of available DPSAs were added to the plan.

Patient data, simulation details and plan optimization
IMBT treatment plans for ten prostate cancer patients were optimized using the column generation-based optimizer described above. Patients' postimplant CT images were used with approval from our institutional review board. Contours were imported from DICOM RT Structure Set files. Since IMBT is not yet clinically used and the optimal needle placement of dwell positions has not been determined, efforts were made to ensure that all dwell positions assigned to a catheter can realistically be delivered. For interstitial IMBT, the thickness of the shield must be in the sub-millimeter range to fit inside existing interstitial brachytherapy needles and yet modify the intensity of the source by several half-value layers. The needle used in this study has an outer diameter of 2 mm, which can only fit a platinum shield of 0.8 mm. The shield is not thick enough to attenuate the intensity of the conventional 192 Ir source, hence this radionuclide cannot be used for prostate IMBT. A radionuclide with lower energy than 192 Ir, namely 169 Yb was chosen (Famulari et al 2020b). The dynamic shield system combined with the source model presented in Famulari et al (2020a) were used in the simulations performed in this study with 169 Yb as the active source. The optimizer was coupled to RapidBrachyMCTPS. A total of 5 × 10 8 decay events were simulated to obtain type A uncertainties less than 0.2% in the voxels within the planning target volume (PTV). Details regarding the MC simulations are summarized in table 2 according to the recommendations from the AAPM task group 268 (TG-268) (Sechopoulos et al 2018).

Results
In this study, a column generation-based optimization method was used for treatment plan optimization and shield angle selection in IMBT and tested on ten prostate cancer patients. As a part of the column generation method, the current set of active DPSAs is always optimized to convergence before a new DPSA is added. Figure 1 shows an example of the evolution of the optimal cost function value as more DPSAs are added to the treatment plan. The cost function is compared to the asymptotic cost obtained which is the optimal plan when all possible DPSAs are included at once. In this case, 2080 possible DPSAs were considered, and the final column generation-optimal treatment plan included 168 DPSAs. The column generation took 143 s to converge while optimizing all available DPSAs at once took 406 s. Figure 2 shows the dose volume histogram (DVH) for an IMBT treatment plan with 169 Yb as a source isotope and optimized shield-angle for a prostate cancer patient simulated with RapidBrachyMCTPS. Figure 2 also shows the difference between applying the column generation technique and applying dwell time optimization directly on all available DPSAs.
The column generation method converges to the asymptotic cost as more DPSAs are included. The diminishing returns of additional DPSAs are clearly shown by the negligible decrease in the cost function value after roughly 150 DPSAs. However, the absolute value of the cost function provides little concrete information on the quality of the treatment plan. Figure 3(a) shows the relevant DVH parameters for the target volume and figures 3(b)-(d) for the urethra, rectum and bladder, respectively. The DVH parameters are stable after 100 DPSAs, demonstrating the potential of the column generation method for limiting treatment plan complexity. This case is fairly representative of what was seen in all the other cases. All ten cases were optimized using the same optimization constraints and weights. After the convergence of the column generation algorithm, the dwell times were normalized so that D 90 of the tumor equals the   prescribed dose. The resulting DVH is summarized in table 3. We followed the protocol from UCSF (Yamada et al 2012). All of the ten cases passed all the clinical constraints from the protocol by a good margin. The bladder, rectum and urethra were on average 11%, 17% and 28% lower than the given dose limit, and 2% higher, 3% and 8% lower than the clinical non IMBT plan, respectively. The homogeneity indices were better than the conventional plan where were 27% and 9%, 3% and 2% lower than the clinical plan, respectively .
A summary of the performance figures of all ten cases is shown in table 4. The number of voxels used for optimization were 8000 on average and the number of DPSAs that were included in the final plan were 11% of the total number of DPSAs on average. Number of DPSAs of the ten cases varied from 1648 to 3200 averaging 2270. The optimisation was ran through Windows Subsystem for Linux using a core i7 10 700 with 32 GB of RAM running the commercial solver Gurobi v8.1 (Gurobi Optimization, Inc.).

Discussion
IMBT adds a new degree of freedom compared to conventional brachytherapy, namely the shield angle at every dwell, to the treatment plan optimization process to allow for better dose delivery. The added treatment plan complexity makes the optimization problem one order of magnitude larger. This problem is similar to using IMRT in EBRT, where column generation was used to perform inverse planning optimization (Romeijn et al 2005). This study aimed to investigate the column generation method for IMBT treatment plan optimization. A column generation optimization algorithm was developed and tested to optimize the dwell times and shield angles retrospectively for ten prostate cancer cases. The final optimized plans containing 11% of total DPSAs on average, satisfied all the clinical DVH metrics goals as presented in table 3. The D 1cc of the rectum and the bladder are both lower than 75% of the prescribed dose (R x ) while being on par with the clinical plan. Urethra's D 1cc is also lower than R x and is 8% than the clinical plan without reducing the tumor coverage. With respect to hotspots inside the tumor volume, are well within what is acceptable clinically and lower than the clinical conventional plan. Figure 3(a) illustrates the gradual increase of the target coverage when more DPSAs are added, with less relevant increments after including 150 DPSAs. This proves that the optimization step performed in the algorithm can produce high-quality plans with good target coverage. Table 3. Average and standard deviation of the dosimetric indices derived from the DVHs of the ten prostate optimized plans. V x = fractional volume covered by x% of the prescription dose; D y = highest dose that covers volume y of the organ.

Structure
Dosimetric index Clinical non IMBT Column generation + IMBT Goal Mean ± Standard Deviation Mean ± Standard Deviation a Following the UCSF protocol Yamada et al (2012). Figures 3(b)-(d) shows that all the dosimetric indices of interest stabilize quickly after the first 100 iterations. This shows that a large number of dwell positions in a typical IMBT problem do not add value to the plan. This is also reflected in figure 1 where the cost of the objective function at every step decreases with a slower pace the more DPSAs are added. The pattern at which the cost function decreases shows that the DPSAs with better influence are added first. Such a pattern proves that the pricing method can successfully quantify the value added when including a given DPSA. In addition, the cost is decreasing at every step meaning that adding all the DPSAs added at every iteration improved the quality of the plan.
The horizontal convergence line in figures 1 and 3 is calculated based on the value of the corresponding parameter when the algorithm is forced to use all the DPSAs at once. The cost function value and the PTV's and the urethra's DVH parameters in figures 1, 3(a) and (b), respectively, are converging, as expected, to the horizontal convergence line. However, the rectum and bladder in 3(c) and (d) do not seem to converge as they are exhibiting a change of priority while adding DPSAs to the plan. This is also can be expected to happen throughout the addition of DPSAs especially that the bladder and rectum had equal weights which can cause the algorithm to swap priorities depending on the newly added DPSAs.
Column generation's greediness makes it suitable for large optimization problems. The gradual increase of the number of DPSAs included at every step allows to perform inverse planning optimization on problems that would not be feasible otherwise. By using column generation, in theory, one can optimize an IMBT plan with smaller shield degree variation, or smaller step size between the dwell positions. To better benefit from advantages of IMBT, in addition to DPSAs and dwell times, needle positions need also to be optimized. Column generation can be extended in the future to include this feature. The ability to potentially optimize any plan regardless of its size enables IMBT and removes one of its few limitations.
Although the optimization time on average is higher than other optimization techniques implemented for conventional HDR brachytherapy, it is still faster, on average, than manually changing the dwell times, dragging the isodose lines, or sometimes reoptimizing using different weights which can take more than 15 min (Jacob et al 2008). It is also worth noting that all the calculations were done on CPU and big acceleration gains can be achieved by performing those calculations on GPU.
Aside from GPU-acceleration, there exist possible derivations to the version of column generation discussed in this work that can result in faster convergence. The time needed to obtain the final plan is a function of the number of iterations and DPSAs as well as the stopping criteria. For instance, by using precalculated dose maps for each DPSA that consider the shield including its geometry, material composition and mass density, but ignores the patient specific information such as tissue composition and mass density (i.e. treat the patient geometry as water with unit density), unwanted DPSAs can be filtered out. In that case, a full MC simulation per remaining DPSA taking into account all the heterogeneities introduced by the source, shield and the patient would be performed. This should result in a reduction of at least 80% of MC simulations. On average for our 10 simulated patient cases it would result in 89% reduction in the number of performed MC simulations. This was the rationale behind choosing to add one dwell position at a time. Based on such a heuristic, we should end up with fewer DPSAs at the end compared to when adding n dwells or x% of the most negative ones. It is worth mentioning that adding more dwells will result in a quicker column generation solution but at the expense of running more MC simulations. In other words, although this algorithm can be standalone post MC simulations (which is the case for this study), it can also be used to more efficiently perform MC simulations. This added efficiency, combined with state-of-the-art deep learning-based dose prediction models (Mao et al 2020), column generation can play an important role to help the adoption of IMBT in a clinical setting.
Other ways of converging faster can include stopping the optimization process after adding a specific number of DPSAs or after spending some constant time, possibly allowing for better control of the number of iterations done before terminating. A needle placement optimized for IMBT can also accelerate the solution convergence.
This work focused on the application of column generation in IMBT without defining where it sits vis-à-vis similar inverse planning approaches for conventional HDR brachytherapy. Comparing column generation with well-established optimization algorithms used in conventional HDR brachytherapy is outside the scope of this study and will be investigated in future work.

Conclusions
The column generation method produced high-quality deliverable prostate IMBT plans. The treatment plan quality reached a plateau fast, where adding more DPSAs had a minimal effect on the DVH parameters. The iterative nature of the column generation method allows early termination of the treatment plan creation process as soon as the DVH parameters satisfy the clinical requirements or have stabilized.