Dosimetric performance of the Elekta Unity MR-linac system: 2D and 3D dosimetry in anthropomorphic inhomogeneous geometry

Following the clinical introduction of the Elekta Unity MR-linac, there is an urgent need for development of dosimetry protocols and tools, not affected by the presence of a magnetic field. This work presents a benchmarking methodology comprising 2D/3D passive dosimetry and involving on-couch adaptive treatment planning, a unique step in MR-linac workflows. Two identical commercially available 3D-printed head phantoms (featuring realistic bone anatomy and MR/CT contrast) were employed. One phantom incorporated a film dosimetry insert, while the second was filled with polymer gel. Gel dose-response characteristics were evaluated under the Unity irradiation and read-out conditions, using vials and a cubic container filled with gel from the same batch. Treatment plan for the head phantoms involved a hypothetical large C-shape brain lesion, partly surrounding the brainstem. An IMRT step-and-shoot 7-beam plan was employed. Pre-treatment on-couch MR-images were acquired in order for the treatment planning system to calculate the virtual couch shifts and perform adaptive planning. Absolute 2D and relative 3D measurements were compared against calculations related to both adapted and original plans. Real-time dose accumulation monitoring in the gel-filled phantom was also performed. Results from the vials and cubic container suggest that gel dose-response is linear in the dose range investigated and signal integrity is mature at the read-out timings considered. Head phantom 2D and 3D measurements agreed well with calculations with 3D gamma index passing rates above 90% in all cases, even with the most stringent criteria used (2 mm/2%). By exploiting the 3D information provided by the gel, comparison also involved DVHs, dose-volume and plan quality metrics, which also reflected the agreement between adapted and delivered plans within  ±4%. No considerable discrepancies were detected between adapted and original plans. 
A novel methodology was developed and implemented, suitable for QA procedures in Unity. TPS calculations were validated within the experimental uncertainties involved.

tumor or normal tissue changes. A direct visualization of the structures of interest at the treatment position, with superior soft tissue contrast, also allows for robust patient re-positioning or, equivalently, plan adjustment to patient position. By exploiting the capability of real-time monitoring of structures during treatment, other attractive features include intra-fractional motion detection and delivery gating (McPartlin et al 2016, Menten et al 2016, Pathmanathan et al 2018). The Elekta Unity MR-linac (Elekta AB, Stockholm, Sweden), referred to as Unity in the following for brevity, comprises an Achieva 1.5T MR scanner (Philips Healthcare, Best, The Netherlands) and a 7 MV flattening-filter-free linear accelerator with an Agility MLC configuration (Elekta AB, Stockholm, Sweden). Beam directions are always perpendicular to the static magnetic field, which is parallel to the treatment couch (Woodings et al 2018a). The main drawback of hybrid MR-linac systems is the inability to translate or rotate the treatment couch within the MR bore. Therefore, patient set-up errors are accounted for by virtual couch shifts, i.e. the plan is adapted to the MR-identified patient position. Virtual couch shifts (or, equivalently, isocenter shifts) are calculated following a rigid or deformable spatial registration between the pre-treatment on-couch MR scan and the original planning CT scan (Winkel et al 2019a).
In the Monaco v. 5.40.00 (Elekta AB, Stockholm, Sweden) treatment planning system (TPS), following calculation of the virtual couch shifts, adaptive planning can be performed by various methods, divided into two main categories (Winkel et al 2019a): 'adapt-to-position' (ATP) and 'adapt-to-shape' (ATS). The former strategy does not allow for contour editing and the daily MR scan is only rigidly registered to the original planning CT scan. ATS is more robust as it involves structure re-contouring and deformable registration. Depending on the strategy followed, plan optimization can involve segment shifts only, segment aperture morphing, weights optimization from segments, weights optimization from fluence, etc (Lim-Reinders et al 2017, Winkel et al 2019a). Plan optimization methods were recently reviewed and compared (Winkel et al 2019a).
In any case, the original plan created is rarely delivered and instead an adapted plan is prepared online with the patient at the treatment position (Winkel et al 2019a). However, the adapted plan should not be considered a priori the same as the original in terms of target conformality and normal tissue involvement. For cranial intensity-modulated radiation therapy (IMRT) cases, a recent study has detected increased maximum doses at critical organs for the adapted plan with respect to the original one. Depending on the plan optimization method used, the adapted plans often failed to meet the clinical dose constraint criteria (Winkel et al 2019a, 2019b). The main challenge arises from the fact that moving an MLC aperture across the non-flattened irradiation field might result in a very different fluence pattern (Lim-Reinders et al 2017).
With respect to dose delivery quality assurance (QA) and dosimetry in a Unity system, several challenges exist which are associated with the presence of the strong magnetic field. The Lorentz force acts on secondary electron paths, resulting in a reduction of the build-up distance while increasing the dose near the proximal side of an air cavity, the so-called electron return effect (Costa et al 2018). With respect to reference dosimetry, correction factors (up to 8%) have been introduced to take into account the effect of the magnetic field on ionization chamber responses, which are also orientation dependent (Pojtinger et al 2018). Furthermore, regarding dosimetry within plastic phantoms, the presence of air cavities around ionization chambers also affects detector response (Hackett et al 2016, Agnew et al 2017, O'Brien and Sawakuchi 2017). Even in a water phantom, a diamond detector was shown to exhibit increased angular dependence in the presence of a 1.5T magnetic field (Woodings et al 2018b). Film dosimetry seems to be the least affected. Minor (just above uncertainty levels) differences (Delfs et al 2018) in radiation induced optical density (OD) change have been reported. In terms of dose from the red channel, zero effect from the presence of the 1.5T field was measured in a recent study (Billas et al 2019). In addition, film response was not affected by the orientation of monomer crystals (parallel to the short film side) with respect to the magnetic field direction (Delfs et al 2018).
The potential of using 3D dosimetry in conjunction with MR dose read-out has been repeatedly demonstrated. The attractive feature of dose read-out at the treatment delivery position using the 1.5T MR scanner enables 3D dosimetry without the uncertainty related to the additional spatial registration step between measured and calculated dose distributions. Regarding dosimetry, the dose-response characteristics for a set of three 3D dosimeters (including polymer gels) were not significantly affected by the presence of a 1.5T magnetic field, suggesting that no correction factors need to be applied for dose measurements in Unity. More importantly, polymer gels act as both the detector and the tissue equivalent material and, consequently, do not perturb the photon fluence of the beam (Baldock et al 2010). In contrast to plastic phantoms, this characteristic also avoids the introduction of unintentional small air cavities, which could be significant in the presence of a 1.5T magnetic field. Furthermore, real-time dose accumulation monitoring in 3D can only be performed in MR-linacs using such dosimetric systems (Lee et al 2018). More specifically, Lee et al performed real-time relative 3D dosimetry using Fricke-type gels and T1-weighted pulse sequences, within homogeneous geometries. The implementation of the well-established polymer gel dosimetry formulations in combination with T2 relaxometry (Pappas et al 1999, Pappas 2009, Baldock et al 2010) has not yet been evaluated. This work focuses on the development and implementation of a methodology for the overall dosimetric evaluation of the Unity system, specifically addressing related challenges. The novelty of this study lies in both the phantoms employed and the dosimetric protocols followed. In particular, phantom designs feature anthropomorphic geometry, as well as tissue and bone equivalent materials which exhibit realistic contrast in both MR and CT imaging modalities.
Because of the latter characteristic, the phantoms used are suitable for performing clinically realistic virtual couch shifting and plan adaptation prior to treatment delivery, simulating the actual clinical procedure. Furthermore, each phantom either incorporates a radiochromic film or is filled with 3D polymer gel, i.e. 2D and 3D dosimeters, respectively. Specific methodology provisions ensured (i) presence of realistic (in size and shape) bone inhomogeneities, (ii) fiducials of adequate MR/CT contrast for facilitating registration of film measurements to the MR planning coordinate system, (iii) absolute 2D dose measurements, (iv) relative 3D dose measurements and (v) real-time dose accumulation monitoring based on T2-weighted (T2w) images. The methodology is implemented for treatment plan verification of an intracranial IMRT case, following virtual couch shifting. Dose measurements are compared against both the original and the adapted plans to highlight any potential discrepancies.

Phantoms characteristics
The PseudoPatient™ head phantoms (RTsafe P.C., Athens, Greece) were employed for the majority of irradiations described in this study. Specifically, two identical head phantoms were used, each one encompassing a different dosimetric system (figure 1). These phantoms were produced by the vendor using the methodology described in Makris et al (2019). Briefly, a patient's planning CT scan is used to 3D-print the patient's phantom in terms of external contour and cranial bone anatomy. The finalized 3D-printed material is bone equivalent in the high energy photon fields used herein (Makris et al 2019). The resulting hollow phantom can accommodate any kind of dosimetric system (e.g. ionization chamber, film or even 3D polymer gel) and has been previously used for dosimetry in several studies (Pantelis et al 2018, Saenz et al 2018, Hillbrand et al 2019). Regarding 2D dosimetry, the rest of the phantom is filled with water, serving as soft tissue equivalent. In the 3D dosimetry case, the gel acts as both the detector and the soft tissue material.
These specific head phantoms were selected because they offer: (i) detailed anthropomorphic geometry in terms of size and shape, (ii) bone inhomogeneities, (iii) multi-detector dosimetry capabilities, including the option for 3D measurements and, most importantly, (iv) clinically realistic and sufficient contrast in both MR and CT imaging modalities. The latter characteristic enables the option to perform virtual couch shifting and adaptive treatment planning at the set-up position (following MR imaging and spatial co-registration to the planning CT scan coordinate system), simulating the actual clinical case (Raaymakers et al 2017, Pathmanathan et al 2018). It is particularly important for a QA protocol to involve this step of the workflow, which is a unique characteristic related to the Unity design.

Treatment planning, adaptive planning and dose delivery
The anonymized actual patient's planning CT scan (corresponding to this particular set of phantoms) was used to prepare a challenging plan for the treatment of a hypothetical large C-shape brain lesion in Monaco v5.40.00. This TPS version features dose calculations in 1.5T magnetic fields for Unity applications. The planning target volume (PTV) was 27.506 cm³ in a C-shape, partly surrounding the brainstem (contoured volume 11.805 cm³), which was regarded as the organ-at-risk (OAR), as shown in figure 2. A step-and-shoot IMRT technique was employed consisting of seven flattening-filter-free 7 MV beams at gantry angles 51°, 102°, 180°, 153°, 204°, 255° and 306°, using one isocenter. Collimator angle was always at the Unity's fixed configuration of 270°. Dose delivery levels were selected to match the dosimeters' calibration or optimal performance ranges (see sections below). Therefore, a dose of 8 Gy was prescribed to cover 95% of the PTV. A 3 × 3 × 3 mm³ dose calculation grid spacing was used with 1% statistical uncertainty per calculation of dose to medium. The specific calculation grid size, although it provides relatively limited spatial resolution, was a compromise between accuracy and feasible calculation times in clinical practice, aiming to simulate a realistic treatment workflow. The plan is illustrated in figure 2.
The created structures were exported in dicom format and shared with the phantoms' manufacturer (RTsafe P.C., Athens, Greece) in order to install one dosimetry insert per phantom and more specifically: (i) create a socket for film dosimetry at a coronal plane intersecting the PTV (figures 1(a) and (c)) and (ii) fill the entire brain area of the second phantom with a 3D polymer gel dosimeter (figures 1(b) and (d)). Specific details for each detector and dosimetry protocol are given in the following sections.
Each phantom was positioned on the Unity treatment couch in order to perform the necessary virtual couch shifts and adaptive planning. All on-couch pre-treatment MR images of the two head phantoms were acquired by implementing a 3D T2w turbo spin echo (TSE) pulse sequence with echo time (TE), repetition time (TR) and flip angle (FA) of 500 ms, 2000 ms and 90°, respectively. This is the routinely used MR protocol for patient positioning. Acquisition duration did not exceed 2 min. Subsequently, images were spatially registered to the original planning CT image stack (reconstructed voxel size of 0.74 × 0.74 × 1.25 mm³) using the mutual information-based rigid body registration algorithm incorporated in Monaco. The registration procedure was based on the clinically realistic MR contrast of the bone and soft tissue structures of the entire phantom geometry. Specifically for the PseudoPatient™ phantoms employed in this study, dose calculations in the patient CT image stack were recently shown to be equivalent to patient-derived TPS calculations and in-phantom measurements, at least in the brain area (Makris et al 2019). Necessary virtual couch shifts were calculated by the TPS and an adapted treatment plan was prepared with the same dose calculation grid resolution of 3 × 3 × 3 mm³, in order to keep calculation times at a practical minimum, from a clinical point of view. Since contour editing was not of particular interest, the ATP strategy was followed, using the segment weight optimization mode. This is the suggested plan adaptation method, if minimal or no anatomical changes are expected (Winkel et al 2019a). No changes were manually made to the created adapted plan. Dose delivery was subsequently performed. The same procedure was followed for both head phantoms.

2D dosimetry
The phantom incorporating a film dosimetry insert has been described in detail in a previous publication (Makris et al 2019). Briefly, an axial area of 7.5 × 2 cm² from the inferior side of the phantom is cleared of bone structures, allowing entrance of a film dosimetry holder in the brainstem and central brain area, as shown in figure 1(a). In the created socket, a film cassette of 6.5 × 15 cm², made of large RW3 slabs (provided by PTW, Freiburg, Germany), is installed. The dosimetry film is stacked in-between RW3 slabs.
In the original phantom design (Makris et al 2019), four metal pins pierced the film, both stabilizing it among the slabs and facilitating registration of measured and calculated dose distributions. Although this technique has been proven successful, resulting in accurately registered distributions (Pappas et al 2017, Saenz et al 2018, Makris et al 2019), it has been adapted to involve MR compatible materials only. Novel fiducial markers, suitable for Unity, were designed and constructed. Three small hollow cylindrical inserts (inner diameter and height of 4 mm and 7 mm, respectively, wall thickness of 1 mm) were made of acrylic (also known as polymethyl methacrylate or PMMA), which does not yield signal in MR imaging (Pappas et al 2016). The created cavities were filled with copper sulfate solution and sealed, providing adequate MR contrast, as shown in figure 1(c). Both materials are MR (and CT) compatible and often used in MRI-dedicated phantoms (Pappas et al 2016).
Holes were drilled in the RW3 slabs in order to incorporate the cylinder-shaped fiducials, while matching holes were cut in the film dosimetry piece. Similar to the approach previously followed using metal pins, the geometric center of the fiducials is placed on the film plane. Matching MR-identified centers of the fiducials with corresponding positions of the holes in the scanned film image offers the necessary set of reference points, defining the rigid transformation matrix in order to spatially register film measurements to TPS calculations. Appropriate routines have been developed in MATLAB R2015b (The MathWorks, Inc., Natick, MA) and validated in previous studies (Pappas et al 2017, Saenz et al 2018, Makris et al 2019). Regarding film dosimetry, GAFchromic EBT3 (Ashland Inc., Wayne, NJ) films were used implementing the single channel protocol (Devic et al 2005). The utilized film batch was calibrated at the Secondary Standard Dosimetry Laboratory of the Greek Atomic Energy Commission using a ⁶⁰Co beam in a dose range of 0.10-15 Gy. Calibration film pieces of dimensions 3 × 3 cm² were irradiated at a depth of 5 cm in a solid water slab phantom. Dose read-out was performed 24 h after irradiation. The same methodology regarding film scanning parameters and image processing was followed, as described in Makris et al (2019), using the EPSON Perfection V850 Pro flat-bed color scanner and in-house developed MATLAB routines. A polynomial calibration curve was obtained for the red color channel and was used for conversion of the net OD values into dose values (Makris et al 2019).
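The fiducial-based registration step described above can be illustrated with a minimal Python sketch. It assumes a least-squares rigid fit (the Kabsch method) between paired points; the in-house MATLAB routines are not described in the text, so this is not necessarily the algorithm used. The function name `rigid_transform` and all coordinates below are hypothetical.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) with dst ≈ R @ src + t (Kabsch method)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection being returned instead of a rotation
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Hypothetical centres of the three fiducial holes on the scanned film (mm)
film_pts = np.array([[0.0, 0.0], [40.0, 0.0], [20.0, 30.0]])

# Hypothetical MR-identified fiducial centres in the film plane (mm),
# simulated here by a known 5 degree rotation plus a translation
theta = np.deg2rad(5.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([12.0, -3.5])
mr_pts = film_pts @ R_true.T + t_true

R, t = rigid_transform(film_pts, mr_pts)
mapped = film_pts @ R.T + t
# Root-mean-square residual at the fiducials (fiducial registration error)
fre = np.sqrt(np.mean(np.sum((mapped - mr_pts) ** 2, axis=1)))
print(f"fiducial registration error: {fre:.3e} mm")
```

With only three fiducials the fit has little redundancy, so the residual at the fiducials is a weak indicator of the registration accuracy elsewhere on the film.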

3D dosimetry
The polymer gel formulation characterized in Papoutsaki et al (2013) was used, also produced by RTsafe P.C. (Athens, Greece). Prior to implementing the gel-filled head phantom, the dosimetric characteristics of the specific gel batch were assessed under Unity irradiation conditions. In order to verify dose-response linearity, a set of four small vials (shown in figure 1(b)) were filled with gel samples from the same batch. The vials were placed in a custom-made water tank, using an in-house developed PMMA holder, so that the central sagittal plane of each vial lay at a depth of 5 cm. Each vial was irradiated by Unity using a simple static beam of 10 × 10 cm² field size and gantry angle of 90°, so that the central sagittal plane of the respective irradiated vial received a uniform dose of either 0, 4, 8 or 12 Gy. The delivered MUs were calculated by the TPS. After applying the MR pulse sequence (described below) used for the dose read-out, the measured R2 relaxation rates of each vial were extracted by averaging the values of a central circular ROI at the central sagittal plane of each irradiated vial. Although this check can also provide a dose-response curve, obtained results were not applied to head phantom measurements for dose determination, in order to avoid any potential dependencies, temporal variations of the signal and volume effects (Baldock et al 2010).
In a second independent check, a cubic container, made of PMMA (internal dimensions: 10 × 10 × 12 cm³, wall thickness: 1 cm, shown in figure 1(b)), was also filled with gel from the same batch and irradiated by a single 3 × 3 cm² static field, centered at the cube and delivering 12 Gy at the depth of maximum dose. The endpoint of this check was to verify spatial integrity and maturity of the MR signal related to the radiation induced polymerization, immediately after irradiation (i.e. the dose read-out timing considered for the head phantom). In parallel, this check allowed for validation of the MR pulse sequence parameters (described below) to be used for dose read-out (quantitative T2 maps) of the head phantom. For this test, reconstructed voxel size was 1.94 × 1.94 × 2 mm³.
The second head phantom was filled with gel from the same batch, covering the entire brain volume while the rest of the phantom was filled with non-dosimetric gel. After placing it on the Unity treatment couch, virtual couch shifts and adaptive planning were performed by Monaco, using again the segment weights optimization method, under the ATP strategy (Winkel et al 2019a).
Regarding MR dose read-out, the head phantom was also scanned immediately after plan delivery. A 3D multi-echo turbo spin echo (TSE) pulse sequence was used, which consisted of 20 echoes with the first one at 25 ms and echo time intervals of 40 ms, with (TR, FA) = (2000 ms, 90°). In order to minimize MR-related geometric distortion, the selected bandwidth was 1184 Hz/pixel (Hillbrand et al 2019). To result in acceptable MRI acquisition duration, the scan length in the superior-inferior direction (i.e. z-axis) was limited to 9.2 cm, which did not extend to cover the entire phantom (16 cm), disregarding areas of very low dose. As a result, MR scanning time did not exceed 35 min for a reconstructed voxel size of 1.53 × 1.53 × 2 mm³. Reconstruction of the measured R2 relaxation rate maps in 3D was performed using in-house MATLAB routines which have been verified in previous works (Saenz et al 2018, Hillbrand et al 2019). An axial plane of the obtained T2 (=1/R2) distribution, intersecting the PTV, is shown in figure 1(d).
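As an illustration of how R2 maps can be derived from such a multi-echo acquisition, the Python sketch below fits a mono-exponential decay, S(TE) = S0·exp(−R2·TE), to each voxel by linear least squares on the logarithm of the signal. The mono-exponential model, the log-linear fitting and the function name `fit_r2` are assumptions made for illustration; the in-house MATLAB routines are not detailed in the text. Only the echo timing (20 echoes, first at 25 ms, 40 ms spacing) is taken from the protocol above; the R2 values are synthetic.

```python
import numpy as np

def fit_r2(signals, echo_times_s):
    """Voxel-wise log-linear fit of S(TE) = S0 * exp(-R2 * TE).

    signals: array of shape (..., n_echoes); echo_times_s: echo times in seconds.
    Returns (R2 map in s^-1, S0 map), each of shape signals.shape[:-1].
    """
    te = np.asarray(echo_times_s, dtype=float)
    log_s = np.log(np.asarray(signals, dtype=float))
    # Linear model: log S = -R2 * TE + log S0
    A = np.column_stack([-te, np.ones_like(te)])
    flat = log_s.reshape(-1, te.size).T  # shape (n_echoes, n_voxels)
    coef, *_ = np.linalg.lstsq(A, flat, rcond=None)
    r2 = coef[0].reshape(log_s.shape[:-1])
    s0 = np.exp(coef[1]).reshape(log_s.shape[:-1])
    return r2, s0

# Echo times from the described protocol: 20 echoes, first at 25 ms, 40 ms apart
echo_times_s = (25.0 + 40.0 * np.arange(20)) * 1e-3

# Synthetic, noise-free 2 x 2 "image" with hypothetical R2 values (s^-1)
true_r2 = np.array([[1.5, 2.0], [3.0, 4.5]])
signals = 1000.0 * np.exp(-true_r2[..., None] * echo_times_s)

r2_map, s0_map = fit_r2(signals, echo_times_s)
t2_map = 1.0 / r2_map  # T2 = 1/R2, in seconds
```

On real, noisy data a log-linear fit over-weights late, low-signal echoes; a weighted or non-linear fit is a common alternative.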
In addition, the gel-filled head phantom was continuously MR-scanned prior to and during treatment delivery. Specifically, a 2D T2w dynamic pulse sequence was employed to capture sagittal MR images intersecting the PTV, with a temporal resolution of 6 s and specific parameters (TE, TR, FA, Bandwidth) = (500 ms, 2000 ms, 90°, 100 Hz/px). The in-plane reconstructed pixel size was 1.72 × 1.72 mm². This side test served as a feasibility study demonstrating the potential of real-time 3D dose accumulation monitoring based on T2w images, for the first time.

Analysis and comparison
Throughout this study, all created plans and calculated dose distributions (adapted and original ones) were exported in dicom format, spatially registered to the original planning CT scan dicom coordinate system by Monaco, following the procedure described above. This also applies to the acquired MR images.
All 2D and 3D measurements, images and dose calculations were analyzed in MATLAB using in-house routines. A 2D Wiener noise removal filter was applied to smooth film measurements. Noise reduction in the gel-related distributions was performed by using a 3D Gaussian filter.
Film measurements were not normalized and comparison with TPS predictions was performed in absolute doses. On the other hand, polymer gel 3D results were normalized to the mean net R2 value in a central homogeneous 180 mm² area inside the PTV. The same area was used for normalizing the TPS-calculated dose distributions, prior to comparing with 3D measurements. This normalization approach was used in all subsequent calculations such as plan quality indices, dose volume histograms (DVHs), etc (see below).
Quantitative comparison between measured and calculated datasets involved the 3D gamma index (GI) test (Low and Dempsey 2003), following the interpolation recommendations in Hussein et al (2017). Measurements always served as the reference dose distribution against which the TPS calculations were evaluated. Selection of passing criteria was partly associated with estimated experimental uncertainties of each dosimetric system and protocol (see section 4). Therefore, distance-to-agreement (DTA) and dose difference (DD) were always ⩾ 2 mm and ⩾ 2%, respectively.
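For readers unfamiliar with the gamma index test, the following Python sketch shows a brute-force 1D version, with the measurement as reference, a global dose-difference criterion and a low-dose cut-off. Several choices are assumptions made for illustration: normalizing both the DD criterion and the cut-off to the maximum reference dose, the function name `gamma_1d`, and the synthetic Gaussian profiles. The evaluated profile is assumed to be sampled finely enough relative to the DTA criterion; no interpolation is performed here.

```python
import numpy as np

def gamma_1d(x_ref, d_ref, x_eval, d_eval,
             dd_percent=2.0, dta_mm=2.0, cutoff_percent=20.0):
    """Brute-force 1D global gamma index.

    Returns the gamma value at every reference point and the passing rate (%)
    over reference points at or above the low-dose cut-off.
    """
    x_ref, d_ref = np.asarray(x_ref, float), np.asarray(d_ref, float)
    x_eval, d_eval = np.asarray(x_eval, float), np.asarray(d_eval, float)
    norm = d_ref.max()                 # global normalisation (assumption)
    dd = dd_percent / 100.0 * norm     # dose criterion in dose units
    # Pairwise squared distance and dose terms, shape (n_ref, n_eval)
    dist2 = ((x_eval[None, :] - x_ref[:, None]) / dta_mm) ** 2
    dose2 = ((d_eval[None, :] - d_ref[:, None]) / dd) ** 2
    gamma = np.sqrt((dist2 + dose2).min(axis=1))
    mask = d_ref >= cutoff_percent / 100.0 * norm
    passing_rate = 100.0 * np.mean(gamma[mask] <= 1.0)
    return gamma, passing_rate

# Synthetic profiles (hypothetical): an 8 Gy Gaussian on a 0.5 mm grid
x = np.arange(-30.0, 30.5, 0.5)
measured = 8.0 * np.exp(-(x / 10.0) ** 2)
calc_shift_1mm = 8.0 * np.exp(-((x - 1.0) / 10.0) ** 2)
calc_shift_5mm = 8.0 * np.exp(-((x - 5.0) / 10.0) ** 2)

_, rate_same = gamma_1d(x, measured, x, measured)
_, rate_1mm = gamma_1d(x, measured, x, calc_shift_1mm)
_, rate_5mm = gamma_1d(x, measured, x, calc_shift_5mm)
print(rate_same, rate_1mm, rate_5mm)
```

The same search generalizes to 2D and 3D by replacing the scalar distance with the Euclidean distance between voxel positions, usually restricted to a local search window to keep the computation tractable.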
With respect to the original and adapted plans and corresponding 3D relative dose measurements, datasets were further processed to derive and compare plan quality and plan conformity metrics, such as RTOG conformity index (RCI), Paddick's conformity index (PCI), homogeneity index (HI) and quality of coverage (Q) (Paddick 2000, Feuvret et al 2006). DVHs and relevant dose-volume metrics (such as the mean dose (Dmean), the minimum dose delivered to 95% of the structure volume (D95%), the maximum (Dmax) and minimum (Dmin) doses) were also determined for the PTV.
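The metrics listed above can be computed from a 3D dose array and a binary PTV mask, as in the Python sketch below. The index definitions used here are commonly quoted forms and are assumptions, since the text does not spell them out: RCI = PIV/TV, PCI = TV_PIV²/(TV × PIV), HI = Dmax/Dref and Q = Dmin/Dref, where PIV is the volume enclosed by the reference isodose, TV the target volume and TV_PIV their overlap. The function names and the toy dose grid are hypothetical, and equal voxel volumes are assumed.

```python
import numpy as np

def plan_metrics(dose, ptv_mask, ref_dose):
    """Conformity, homogeneity and dose-volume metrics for the PTV."""
    dose = np.asarray(dose, dtype=float)
    ptv = np.asarray(ptv_mask, dtype=bool)
    piv = dose >= ref_dose              # voxels inside the reference isodose
    tv = ptv.sum()                      # target volume (voxel count)
    piv_n = piv.sum()                   # reference isodose volume
    tv_piv = (ptv & piv).sum()          # target volume covered by the isodose
    d_ptv = dose[ptv]
    return {
        "RCI": piv_n / tv,
        "PCI": tv_piv ** 2 / (tv * piv_n),
        "HI": d_ptv.max() / ref_dose,
        "Q": d_ptv.min() / ref_dose,
        "Dmean": d_ptv.mean(),
        "D95": np.percentile(d_ptv, 5),  # dose received by at least 95% of the volume
        "Dmax": d_ptv.max(),
        "Dmin": d_ptv.min(),
    }

def cumulative_dvh(dose_in_structure, dose_bins):
    """Fraction of the structure volume receiving at least each dose level."""
    d = np.asarray(dose_in_structure, dtype=float)
    return np.array([(d >= b).mean() for b in dose_bins])

# Toy relative dose grid (hypothetical): the 100% region is slightly larger than the PTV
dose = np.full((10, 10, 10), 0.3)
dose[3:7, 3:7, 3:8] = 1.0
ptv_mask = np.zeros((10, 10, 10), dtype=bool)
ptv_mask[3:7, 3:7, 3:7] = True

metrics = plan_metrics(dose, ptv_mask, ref_dose=0.7)
dvh = cumulative_dvh(dose[ptv_mask], np.linspace(0.0, 1.2, 13))
```

In this toy case the reference isodose encloses 80 voxels and the PTV 64, so the RCI exceeds unity while the PCI penalizes the spill outside the target.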

2D dosimetry
Following the positioning of the film head phantom and pre-treatment MR scanning, the calculated virtual couch shifts (i.e. isocenter shifts) were (x, y, z) = (−0.1 mm, −33.8 mm, 32.6 mm) in the dicom coordinate system. It is worth noting that if the necessary isocenter shift exceeds 50 mm, the system does not allow for adaptive planning and patient/phantom repositioning is required. Therefore, this case was close to the limits of the allowed magnitude of virtual couch shifting, and thus a rather extreme case. Figure 3 presents the dosimetry results related to film measurements. Selected measured and calculated isolines are superimposed on the obtained 3D GI maps, using passing criteria of 3%/3 mm and 3%/2 mm. A dose cut-off threshold of 20% of the maximum calculated dose was applied prior to GI calculations in order to exclude points of relatively low dose and, consequently, increased uncertainty (Aldelaijan et al 2011, Pappas et al 2017). The in-plane spatial agreement between all distributions is evident. From a dosimetric point of view, although the overall results imply a very good agreement between measurements and calculations for both the adapted and the original plans, a small high-dose area inside the PTV can be noticed where failing points occur systematically. GI passing rates are given in table 1, including the more stringent 2%/2 mm criteria. In all cases, passing rates lie well above 90%.
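How close this shift is to the 50 mm limit can be checked with a short calculation. The Python sketch below assumes that the limit applies to the magnitude of the 3D shift vector, which is an interpretation rather than something the text states explicitly; the variable names are hypothetical.

```python
import math

# Reported virtual couch shift for the film phantom (mm, dicom coordinates)
shift_mm = (-0.1, -33.8, 32.6)
limit_mm = 50.0  # maximum isocenter shift allowing adaptive planning

# Magnitude of the shift vector (assumed to be the quantity that is limited)
radial_shift_mm = math.sqrt(sum(c * c for c in shift_mm))
margin_mm = limit_mm - radial_shift_mm

print(f"radial shift: {radial_shift_mm:.1f} mm, margin to limit: {margin_mm:.1f} mm")
```

Under this assumption the shift amounts to about 47 mm, leaving a margin of roughly 3 mm.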

3D dosimetry
In figure 4(a), the measured R2 relaxation rates are presented against the dose delivered to the four vials filled with gel from the same batch. A first order polynomial is fitted to check the gel's dose-response linearity, under the same time and irradiation conditions and in the presence of 1.5T magnetic field. Gel response can be regarded as linear at least up to 12 Gy, even immediately after irradiation.
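A linearity check of this kind can be sketched in Python as a first order polynomial fit with a coefficient of determination. Only the four dose levels are taken from the text; the R2 readings below are hypothetical placeholders, not the measured values of this study, and the function name `linearity_check` is an assumption.

```python
import numpy as np

def linearity_check(dose_gy, r2_values):
    """First order polynomial fit of R2 against dose, with R-squared."""
    dose_gy = np.asarray(dose_gy, dtype=float)
    r2_values = np.asarray(r2_values, dtype=float)
    slope, intercept = np.polyfit(dose_gy, r2_values, 1)
    predicted = slope * dose_gy + intercept
    ss_res = np.sum((r2_values - predicted) ** 2)
    ss_tot = np.sum((r2_values - r2_values.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Dose levels delivered to the four vials (Gy)
dose_gy = [0.0, 4.0, 8.0, 12.0]
# Hypothetical mean R2 readings per vial (s^-1), for illustration only
r2_values = [1.20, 1.92, 2.61, 3.33]

slope, intercept, r_squared = linearity_check(dose_gy, r2_values)
print(f"sensitivity: {slope:.3f} s^-1 Gy^-1, background R2: {intercept:.3f} s^-1, "
      f"R-squared: {r_squared:.4f}")
```

The slope corresponds to the gel sensitivity and the intercept to the R2 of unirradiated gel; with only four points, inspecting the residuals is as informative as the R-squared value.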
Results for the signal integrity check are given in figures 4(b) and (c). Central crossline and inline profiles are depicted and compared with TPS calculations for a static 3 × 3 cm² field. Each profile is normalized to its own maximum value. 1D GI tests resulted in 100% passing rates using criteria of 3%/3 mm, for both profiles, without applying any dose cut-off threshold.
After positioning the second PseudoPatient™ head phantom on the Unity treatment couch, the calculated virtual couch shifts were almost identical for the x and z axes (dicom coordinate system) but on the y axis were even larger by 2.7 mm, almost reaching the limits of the allowed radial isocenter shifting.
Head phantom 3D measurements are shown in figure 5 and are compared with both the adapted and the original calculated dose distributions. Relative isodoses are superimposed on the reconstructed T2 maps for an indicative axial and a sagittal slice intersecting the PTV. MR contrast also qualitatively represents relative dose measurements.
Agreement between measurements and calculations is quantified in table 2, where the 3D GI passing rates are given for a bounding box surrounding and including the entire PTV. A dose cut-off threshold of 20% was used in this analysis.
The 3D relative dose information, provided by the gel, is further exploited in figure 6 where the measured DVHs for the PTV and the OAR are plotted alongside the corresponding TPS-calculated DVHs. It should be noted that the normalization method described in section 2.5, was used. Although DVHs are in excellent agreement for the PTV, minor discrepancies can be noticed for the OAR, which could be attributed to the increased uncertainties in the low dose area found in the OAR.
Furthermore, several relative dose-volume metrics for the PTV were calculated and are presented in table 3. Although almost all values agree within 2.5%, the maximum dose in the PTV as calculated for the original plan is approximately 3% higher than the one in the adapted plan and ~4% higher than measured. This is the only considerable difference detected for the PTV. However, maximum measured dose is determined by a single voxel and is, therefore, subject to statistical uncertainties. Table 3 also presents plan quality indices commonly employed in routine clinical practice. Excellent agreement is also verified by these values. The detected discrepancy in maximum dose in the PTV is also reflected in the HI. It should be noted, however, that the reference isoline considered for calculating the plan quality indices matches the 70% isodose (with respect to the normalization dose), as it better conforms to the PTV shape (see figure 6 and D95% values in table 3).
As a side study, real-time dose accumulation monitoring was performed based on T2w dynamic imaging. A video showing dose deposition in a central sagittal slice is provided as supplementary material (stacks.iop.org/PMB/64/225009/mmedia). Gradually darkening areas correspond to decreasing T2 values (i.e. higher R2 relaxation rates) and, consequently, increasing deposited dose levels. To keep its duration reasonable, the video is played in fast-forward mode.

Discussion
Benchmarking and periodic QA testing in Unity systems pose dosimetric challenges related to the presence of strong magnetic fields. In response to that, correction factors for ionization chamber readings (Pojtinger et al 2018), specially designed scanning water phantoms (Smit et al 2014) and MR compatible versions of commercially available diode arrays (Houweling et al 2016, De Vries et al 2018) have been proposed. All these studies utilize active detectors in homogeneous phantoms of generic size and shape (cubic or cylindrical).
Furthermore, adaptive planning (if included) cannot follow the clinical workflow due to the lack of realistic MR signal and contrast.
In the present study, passive dosimetry has been employed for plan verification in an anthropomorphic phantom geometry, including the presence of high density inhomogeneities. Conveniently enough, the MR compatibility of the materials employed also comes with adequate MR signal and contrast, allowing for clinically realistic virtual couch shift calculations and adaptive planning, important steps in the Unity treatment workflow. The presented methodology combines a well-established 2D film dosimetry protocol for absolute dose measurements with polymer gel, which allows for on-couch 3D relative dose read-out. All end-to-end tests performed were completed within approximately 30 min (additional 30 min of MR scanning were required for the dose read-out in 3D dosimetry), allowing the implementation of the methodology for routinely performed QA tests, since it is not time consuming.
Figure 3. GI maps comparing film measurements with the adapted (a) and (b) and the original (c) and (d) plans. Passing criteria used were 3%/3 mm (a) and (c) or 3%/2 mm (b) and (d). Measured (dashed black lines) and calculated (solid black lines) isolines (in Gy) are superimposed to assist comparison. White corresponds to areas where film measurements are either not available or dose levels lie below the low dose cut-off threshold of 20% of the maximum calculated dose. An arbitrary on-film coordinate system is adopted.
Corresponding relative dose measurements are also superimposed (dashed isolines) to assist comparison with relative dose TPS calculations (solid isolines) related to the adapted (a) and (b) and the original (c) and (d) treatment plans. Legend: blue, turquoise, green and yellow isolines correspond to the 40%, 60%, 80% and 100% isodose, respectively. The PTV, OAR and external surface are depicted by the red, green and brown contours, respectively.
In addition, real-time dose accumulation monitoring based on T2w images was presented for the first time, as a feasibility proof-of-concept study. After performing the necessary virtual couch shift (i.e. isocenter shift) of several centimeters, the adapted plans comprised off-axis IMRT non-flattened beams in order to irradiate a relatively large PTV. This poses challenges for TPS calculations, which are compounded by the presence of the 1.5T magnetic field, resulting in the electron return effect and crossline profile offsets (Woodings et al 2018a), and by material inhomogeneities. For these reasons, we consider the benchmarking end-to-end test performed in this study particularly rigorous.
Overall results suggest that measurements agree well with both the adapted and original plans. A limited area with GI failing points was detected on the film plane (figure 3) but was not verified by the gel measurements. Nevertheless, the high passing rates achieved using both dosimeters, in combination with DVHs comparison and plan quality metrics results, clearly indicate that TPS predictions were verified within the dosimetric uncertainties involved, passing criteria considered and limitations of the methodology followed.
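To make the GI passing rate quoted above concrete, the following is a minimal sketch of a global gamma index evaluation on a 1D dose profile. It is not the software used in this study: the function name `gamma_pass_rate`, the brute-force search without interpolation, the shared measurement/calculation grid and the application of the 20% cut-off to the calculated dose are all assumptions made for illustration only.

```python
import math


def gamma_pass_rate(positions_mm, measured, calculated,
                    dose_tol=0.03, dist_mm=3.0, cutoff=0.20):
    """Global gamma index passing rate (%) for two 1D dose profiles.

    Assumptions (illustrative only): both profiles share the same grid,
    the dose criterion is a fraction of the maximum calculated dose, and
    points whose calculated dose is below `cutoff` times that maximum
    are excluded from the evaluation.
    """
    d_max = max(calculated)
    dose_crit = dose_tol * d_max  # global dose-difference criterion
    evaluated = 0
    passed = 0
    for i, (x_m, d_m) in enumerate(zip(positions_mm, measured)):
        if calculated[i] < cutoff * d_max:
            continue  # below the low dose cut-off threshold
        evaluated += 1
        # Brute-force search for the minimum combined distance over all
        # calculated points (no interpolation between grid points).
        gamma = min(
            math.sqrt(((x_c - x_m) / dist_mm) ** 2
                      + ((d_c - d_m) / dose_crit) ** 2)
            for x_c, d_c in zip(positions_mm, calculated)
        )
        if gamma <= 1.0:
            passed += 1
    return 100.0 * passed / evaluated if evaluated else float("nan")


# Hypothetical profile: 1 mm grid, triangular dose distribution (Gy).
x = [float(i) for i in range(11)]
calc = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
meas = [1.02 * d for d in calc]  # measurement 2% higher everywhere
rate = gamma_pass_rate(x, meas, calc)  # 3%/3 mm criteria by default
```

In practice the search is performed in 2D or 3D on interpolated dose grids, which this sketch deliberately omits for brevity.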
As part of the daily dose fraction delivery, an adapted plan should result in a dose distribution as close as possible to the original one. However, it was shown that this is not always the case, with adapted plans often being clinically unacceptable (Winkel et al 2019a, 2019b). For extracranial cases, the effect was strongly associated with the adaptation method used and treatment site considered (Winkel et al 2019a). Specifically for cranial radiotherapy, smaller discrepancies have been reported, mainly related to increased maximum doses in OARs (up to 2.0%). In accordance with the latter study, results of the present work also indirectly suggest that the original and adapted cranial radiotherapy plans did not exhibit any considerable dosimetric differences. This is clearly illustrated in figure 6 and tables 1-3. However, this work involved only one treatment plan and geometry without investigating the effect of different plan adaptation methods and, therefore, results are not sufficient for any general conclusions.
Effort was made to employ dosimetry protocols and equipment with minimal levels of experimental uncertainties. Spatial accuracy in the 2D dosimetry methodology followed is governed by the spatial registration uncertainty between scanned film images and the planning CT scan image stack. This procedure was facilitated by the specially designed MR compatible fiducials, which defined the film plane and position within the DICOM coordinate system. Induced uncertainty levels can only be estimated based on a previous work which employed the same concept, although metal pins were used (Pappas et al 2017). According to that study, registration uncertainty was 1.5 mm. Given the relatively larger fiducials (and, consequently, wider film holes) used herein, it can be assumed that the spatial uncertainty is even larger.
Dosimetric uncertainty is dose dependent and mainly stems from the fitting procedure, necessary to obtain the parameters of the dose-response calibration curve (Aldelaijan et al 2011). At the 8 Gy dose level, the corresponding dosimetric uncertainty reaches 1.2% and increases with decreasing dose. Other scanner-related uncertainties do not exceed 0.4% (Pappas et al 2017).
Regarding gel dosimetry, spatial uncertainty is related to the rigid registration step between on-couch MR scans and planning CT scan. In a multi-institutional study, MR/CT registration uncertainty was estimated at 1.8 mm for a cranial case (Ulin et al 2010). Since relative dosimetry was performed using gels, global systematic uncertainties are not relevant. However, type A (statistical) uncertainty is still relatively high as it is related to MR read-out. More specifically, based on the gel vials irradiation, T2 reproducibility was estimated at 2.3%.
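As a rough illustration of how the film uncertainty components quoted above could be combined, the sketch below adds them in quadrature. The assumption that the components are independent, and the helper name `combined_uncertainty`, are ours and are not stated in the cited works; since the scanner-related term is quoted as an upper bound, the result should be read as an upper estimate.

```python
import math


def combined_uncertainty(*components_percent):
    """Combine independent relative uncertainties (%) in quadrature.

    Assumes uncorrelated components; this is an illustrative
    simplification, not the uncertainty budget of the study.
    """
    return math.sqrt(sum(u ** 2 for u in components_percent))


# Film dosimetry at the 8 Gy level: 1.2% (calibration curve fit)
# and at most 0.4% (other scanner-related contributions).
film_8gy = combined_uncertainty(1.2, 0.4)
print(f"Combined film uncertainty at 8 Gy: {film_8gy:.2f}%")  # 1.26%
```

Under these assumptions the fitting term dominates, so the scanner-related contribution changes the combined value only marginally.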
A number of limitations are noteworthy and should be considered when relying on the presented results to evaluate the dosimetric performance of Unity. Regarding the PseudoPatient™ phantoms, despite being produced based on a real patient CT scan, air cavities (e.g. sinuses) are flooded with water and, therefore, not modeled (Makris et al 2019). Thus, dose verification around air cavities (where the electron return effect would be of particular interest) could not be performed. In addition, uncertainty reduction in film dosimetry could be further achieved by employing the triple channel dosimetry protocol (Micke et al 2011). Dose calculations by Monaco were characterized by limited spatial resolution. The 3 × 3 × 3 mm³ dose calculation grid resolution might not be considered appropriate for a benchmarking test. Although it could be improved in a phantom study, this would come at the expense of calculation time and, consequently, burden the overall duration of each treatment session in clinical practice. On the other hand, one could argue that an end-to-end benchmarking test should involve all parameters and conditions employed in routine clinical practice and not those of the ideal case. Lastly, the real-time dose accumulation monitoring capabilities were not explored in-depth and were limited to a single MRI acquisition, as a proof-of-concept study.
Although calibration vials were available, gel measurements were normalized to avoid any potential dependencies, temporal variations and volume effects (Baldock et al 2010), yielding 3D relative dosimetry only. Although absolute dose determination in 3D is attractive, an in-depth assessment of the gel dose-response characteristics and dependencies was beyond the scope of this study. However, it should be mentioned that this work demonstrated that polymer gel is an excellent candidate for end-to-end QA in Unity, combining a series of attractive features: (i) magnetic field independency of dose-response, (ii) no induced field perturbations, (iii) MR compatibility, (iv) realistic MR and CT signal allowing for realistic adaptive treatment planning, (v) on-couch dose read-out at the treatment delivery position immediately after irradiation and (vi) real-time dose accumulation monitoring capabilities.
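The normalization that yields relative dose can be sketched as follows. This is only an illustration under stated assumptions: that the relaxation rate R2 = 1/T2 is the gel response quantity and is linear with dose, and that one unirradiated and one reference (100%) T2 value are available. The function name `relative_dose` and the numerical T2 values are hypothetical and are not taken from the measurements of this work.

```python
def relative_dose(t2_ms, t2_unirradiated_ms, t2_reference_ms):
    """Relative dose from a T2 value, assuming R2 = 1/T2 is linear in dose.

    t2_unirradiated_ms: T2 of gel that received no dose (0% level).
    t2_reference_ms:    T2 at the chosen normalization point (100% level).
    Both reference values are hypothetical inputs for this sketch.
    """
    r2 = 1.0 / t2_ms
    r2_zero = 1.0 / t2_unirradiated_ms
    r2_ref = 1.0 / t2_reference_ms
    return (r2 - r2_zero) / (r2_ref - r2_zero)


# Hypothetical values: T2 shortens with increasing dose.
t2_zero, t2_ref = 800.0, 400.0              # ms
t2_voxel = 2.0 / (1.0 / 800.0 + 1.0 / 400.0)  # R2 midway between the two
print(relative_dose(t2_voxel, t2_zero, t2_ref))  # approximately 0.5
```

Because only ratios of R2 differences enter, any global scaling of the dose response cancels out, which is why global systematic uncertainties do not affect the relative measurement.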
In response to the increasing demand for methodologies and tools suitable for dosimetry in the Unity workflow (De Prez et al 2019), a PseudoPatient™ phantom in conjunction with any dosimetric system enables introduction of rigorous end-to-end QA protocols suitable for commissioning and benchmarking, periodic QA, confidence building and training purposes when introducing new techniques, as well as in the case of suspected or reported problems in the existing practices. The phantom employed features realistic MR/CT contrast and human-like bone structures in anthropomorphic geometry (Makris et al 2019), allowing the implementation of patient-specific pre-treatment QA protocols in the actual clinical workflow. Other potential fields of application involve dosimetry in challenging single isocenter multiple metastases cases (especially if combined with 3D dosimetry) and QA procedures in extracranial radiotherapy such as stereotactic body radiotherapy cases. Furthermore, with appropriate inserts, the phantom could also be used for QA in MR/CT imaging and spatial registration procedures. Specially designed tests focusing on the performance of virtual couch shifting and the evaluation of adaptive treatment planning methods are already being considered.

Conclusion
In this work, the PseudoPatient™ head phantoms were employed and combined with 2D and 3D dosimetry protocols to evaluate the overall dosimetric performance of the Elekta Unity MR-linac system for a cranial IMRT case.
The suitability of the employed phantoms for QA procedures in 1.5T Elekta Unity MR-linacs was demonstrated. Based on the realistic MR/CT contrast of the phantoms, the presented benchmarking test also included the virtual couch shifting and adaptive treatment planning, steps routinely performed in clinical practice.
Regarding dosimetry, several novel tools and methodologies were presented. MR compatible fiducial markers facilitating registration of the film measurements were constructed. MR scanning protocols for gel dose read-out were developed, while demonstrating that relative 3D dosimetry can provide acceptable results even immediately after irradiation.
Overall results suggest that TPS calculations (for both the adapted and original plans) were validated within the experimental uncertainties involved. Comparison did not solely rely on GI calculations. By exploiting the 3D relative dose information, agreement of measurements with TPS predictions was further verified by comparing DVHs, dose-volume metrics and plan quality indices. For this specific case and adaptation method used, no considerable discrepancies were detected between the adapted and original plans.
As a side study, real-time dose accumulation monitoring based on T2w images was presented for the first time.

Acknowledgments
Elekta is acknowledged for allowing access to one of the pre-clinical Unity systems where the irradiations of this study were performed.
RTsafe PC is acknowledged for providing the PseudoPatient™ phantoms and gel batch used throughout this work.
The Greek Atomic Energy Commission is gratefully acknowledged for the film calibration irradiations performed in the secondary standard dosimetry laboratory.