Introduction

Radiation therapy (RT) is an essential treatment for children and adults with brain tumours, but it can lead to important side effects including neurocognitive change, hearing loss and endocrinopathies. Designing RT treatments that maximize the likelihood of cure while minimizing side effects is crucial [1]. Although RT planning software has improved significantly in recent decades, the creation of RT plans for most tumour types still depends on a semi-manual, iterative process of optimizing parameters to achieve an acceptable, inverse-planned RT dose distribution. This manual trial-and-error process is operator-dependent and labor-intensive, and while the resulting radiation dose distributions may meet specified clinical goals, they are not necessarily the optimal radiation plan for an individual patient. Automated planning offers a way to overcome these limitations and has previously been studied in patients with cervical cancer [2], prostate cancer [3], breast cancer [4], and lung cancer [5]. To our knowledge, no prior publication has described the successful use of automated planning to optimize radiation treatment of primary brain tumours.

In this study, we developed and evaluated an automated machine-learning (ML) RT planning method for children and adults with brain tumours. Deliverable ML-generated treatment plans were dosimetrically compared with human-generated plans that were delivered clinically.

Materials and methods

We performed an in silico dosimetry study to evaluate feasibility of ML planning for brain tumours, and the quality of the resulting RT plans. The study was approved by the relevant institutional Research Ethics Boards.

Details of ML model development have been described previously [6,7,8]. In brief, an atlas of clinically treated photon plans was first created. Within the ML pipeline, contoured structures and computed tomography (CT) imaging features were extracted by the software. Imaging features describe the appearance and texture of the imaging dataset on a per-voxel basis and account for differences in patient anatomical geometry (see Additional file 1: supplementary materials). The first ML component used atlas regression forests (ARFs) to associate image features with observed radiation dose. This process was repeated for each voxel across the entire CT dataset, for every case in the training dataset. The second ML component was designed to improve the accuracy of dose prediction by adding contextual information to the per-voxel dose. Because each voxel’s dose is not independent of the dose to adjacent voxels, the contextual dose links a voxel’s dose to that of nearby voxels. A conditional random field (CRF) model was used to combine these individual voxel doses and generate predicted dose distributions that were spatially accurate and realistic over anatomic regions of interest. For a novel patient case, the trained ML model automatically identified anatomically similar training cases and predicted the dose to targets and normal tissues from the learned relationships between imaging features and per-voxel dose. The predicted dose distributions were then converted into clinically deliverable single-arc volumetric modulated arc therapy (VMAT) plans using an inverse-planning optimization algorithm that minimizes the difference between the predicted and final dose while ensuring that technical beam delivery constraints are met.
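The final conversion step, in which a deliverable plan is fitted to the predicted dose, can be illustrated with a toy example. The sketch below is not the optimization algorithm used in this study; it assumes a small, hypothetical dose-influence matrix (dose per voxel per unit beamlet weight) and uses projected gradient descent with a simple non-negativity constraint as a stand-in for the technical beam delivery constraints. All names and numeric values are illustrative assumptions.

```python
# Illustrative sketch only: a toy dose-mimicking optimization.
# The influence matrix, predicted dose, step size and iteration count
# are hypothetical values, not taken from the study.

def dose_from(influence, weights):
    """Dose per voxel delivered by the given beamlet weights."""
    return [sum(a * w for a, w in zip(row, weights)) for row in influence]

def sum_squared_error(influence, weights, predicted):
    """Sum of squared differences between delivered and predicted dose."""
    return sum((d - p) ** 2 for d, p in zip(dose_from(influence, weights), predicted))

def mimic_dose(influence, predicted, steps=2000, lr=0.01):
    """Find non-negative beamlet weights whose dose approximates `predicted`.

    influence[i][j] is the dose to voxel i per unit weight of beamlet j.
    Uses projected gradient descent on the sum of squared dose differences.
    """
    n_vox = len(influence)
    n_beam = len(influence[0])
    weights = [0.0] * n_beam
    for _ in range(steps):
        # Residual between the current dose and the predicted dose.
        dose = dose_from(influence, weights)
        resid = [dose[i] - predicted[i] for i in range(n_vox)]
        # Gradient of the sum of squared residuals for each beamlet weight.
        grad = [2.0 * sum(influence[i][j] * resid[i] for i in range(n_vox))
                for j in range(n_beam)]
        # Gradient step, then projection onto weights >= 0
        # (a simplified stand-in for delivery constraints).
        weights = [max(0.0, weights[j] - lr * grad[j]) for j in range(n_beam)]
    return weights

# Hypothetical problem: three voxels, two beamlets.
influence = [[1.0, 0.2], [0.3, 1.0], [0.5, 0.5]]
predicted = [54.0, 54.0, 30.0]  # Gy, illustrative predicted dose per voxel

weights = mimic_dose(influence, predicted)
final_dose = dose_from(influence, weights)
initial_error = sum_squared_error(influence, [0.0, 0.0], predicted)
final_error = sum_squared_error(influence, weights, predicted)
```

In this toy problem the predicted dose cannot be reproduced exactly with two beamlets, so the optimizer settles on the closest achievable dose; this mirrors the distinction drawn above between the predicted dose and the final deliverable dose.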

We applied this approach to a training set of 95 consecutive brain tumour patients treated from July 2016 to August 2020 at a single institution (Fig. 1). Patients receiving focal treatment (no craniospinal radiotherapy component) to 54 Gy using VMAT for an intracranial brain tumour were eligible for inclusion. RT plans met evaluation criteria listed in Table 2.

Fig. 1

Flow diagram of study data and planning workflow

Fifteen novel brain tumour patients clinically treated with 54 Gy in 30 fractions from July 2018 to November 2020 at two institutions were then re-planned with this ML model as a testing set (Fig. 1). These patients’ planning CT images, with target and organ-at-risk (OAR) contours, were input into the ML model for ML-plan generation. Dosimetry to both target volumes and OARs was reviewed and compared with the manual, human-generated plans that were delivered clinically. Target coverage, maximum doses to brainstem, optic chiasm, optic nerves and spinal cord, and mean doses to brain, hypothalamus, pituitary, cochlea, hippocampi, temporal lobes and parotids were evaluated and compared between ML and manual plans using paired t-tests.
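As a minimal illustration of the paired comparison described above, the following sketch computes the mean difference and the paired t statistic for one dose metric across patients. The function name and the dose values are hypothetical and are not study data; the sketch stops at the t statistic and degrees of freedom, since the p value requires the t distribution (a statistics library such as SciPy, via `scipy.stats.ttest_rel`, would return both).

```python
import math
import statistics

def paired_t_statistic(ml_doses, manual_doses):
    """Return (mean difference, t statistic, degrees of freedom).

    Each position in the two lists must refer to the same patient,
    so that the test is performed on within-patient differences.
    """
    if len(ml_doses) != len(manual_doses):
        raise ValueError("Both plans must be available for every patient")
    diffs = [ml - manual for ml, manual in zip(ml_doses, manual_doses)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1

# Hypothetical mean doses (Gy) to one OAR for five patients.
ml_doses = [30.0, 28.0, 35.0, 31.0, 27.0]
manual_doses = [32.0, 29.0, 38.0, 32.0, 30.0]

mean_diff, t_stat, dof = paired_t_statistic(ml_doses, manual_doses)
```

A negative mean difference here corresponds to a lower dose in the ML plan than in the manual plan for the same patient.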

Results

Details of our patient cohort are shown in Table 1. ML plans were successfully created for all 15 patients in the testing set. An example case is shown in Fig. 2, with the manually created clinical plan and the clinically deliverable ML plan. All ML plans were generated within 30 min of initiating planning.

Table 1 Patient characteristics in training and testing set
Fig. 2

The manually created clinical plan is shown in the top row and the final ML plan in the bottom row. Axial, sagittal and coronal views are shown from left to right. Red, green and blue lines represent the gross tumour, clinical target and planning target volumes, respectively.

To evaluate ML plans in the testing set and compare with the manual plans, we first applied pre-specified plan evaluation criteria to both. The results of this comparison are shown in Table 2. Similar target coverage was observed in both ML and manual plans; at least 95% of PTV received > 51.3 Gy (95% of prescription) in all ML and manual plans. Maximum chiasm dose was < 54 Gy in 14 ML vs 15 manual plans; maximum brainstem dose was < 54 Gy in all 15 ML vs 13 manual plans.
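The criteria quoted above can be expressed as simple checks on per-voxel dose. The sketch below is illustrative only: the thresholds (at least 95% of the PTV receiving more than 95% of the 54 Gy prescription, and maximum chiasm and brainstem doses below 54 Gy) are taken from the text, while the function names and voxel dose lists are hypothetical. In practice these quantities would be read from the dose-volume histogram of the treatment planning system.

```python
PRESCRIPTION_GY = 54.0

def fraction_above(doses, threshold):
    """Fraction of voxels receiving more than `threshold` Gy."""
    return sum(1 for d in doses if d > threshold) / len(doses)

def evaluate_plan(ptv_doses, chiasm_doses, brainstem_doses):
    """Apply the three criteria quoted in the text to one plan."""
    coverage_threshold = 0.95 * PRESCRIPTION_GY  # 95% of prescription, about 51.3 Gy
    return {
        # At least 95% of PTV voxels must exceed the coverage threshold.
        "ptv_coverage": fraction_above(ptv_doses, coverage_threshold) >= 0.95,
        # Maximum doses to chiasm and brainstem must stay below 54 Gy.
        "chiasm_max": max(chiasm_doses) < PRESCRIPTION_GY,
        "brainstem_max": max(brainstem_doses) < PRESCRIPTION_GY,
    }

# Hypothetical per-voxel doses (Gy) for a single plan.
ptv_doses = [53.5] * 96 + [50.0] * 4   # 96% of voxels above 51.3 Gy
chiasm_doses = [40.0, 52.1, 54.2]      # one voxel exceeds 54 Gy
brainstem_doses = [30.0, 53.9]

result = evaluate_plan(ptv_doses, chiasm_doses, brainstem_doses)
```

In this hypothetical plan the target coverage and brainstem criteria are met, while the chiasm criterion is not.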

Table 2 Evaluation criteria applied to manual and ML plans

We subsequently compared quantitative dose metrics to OARs, shown in Table 3. Maximum doses to brainstem, chiasm, each eye and optic nerve, and spinal cord, and mean doses to right temporal lobe, left cochlea, each hippocampus, hypothalamus, parotid and pituitary were not statistically different between ML and manual plans (p > 0.05 for each). The maximum in-patient dose was also not statistically different between ML and manual plans. Mean doses to brain and left temporal lobe were lower in ML plans than in manual plans (mean difference to left temporal lobe, −2.3 Gy, p = 0.006; mean difference to brain, −1.3 Gy, p = 0.017), whereas mean doses to right cochlea and lenses were higher in ML plans (+1.6 to +2.2 Gy, p < 0.05 for each).

Table 3 Summary of dose differences to OARs between ML and manual plans

Discussion

To our knowledge, this is the first study to demonstrate the feasibility of using ML planning to create high-quality, clinically deliverable RT plans for patients with primary brain tumours. ML plans were comparable with manual plans with respect to their ability to meet a priori plan evaluation criteria, including target coverage. Quantitative dosimetry to OARs was similar in both approaches, suggesting that ML plans would be suitable for clinical use.

Previous studies have demonstrated promising results using fully automated RT planning for sites with limited inter-patient variation in anatomy, such as prostate, breast and lung cancer. McIntosh et al. demonstrated the feasibility of the voxel-based approach used here to create deliverable prostate cancer RT plans [7, 9], and Duren-Koopman et al. developed personalized, scripted tangential and arc-based RT planning for patients requiring RT to the breast plus locoregional lymph nodes [4]. Similarly, Creemers et al. demonstrated excellent dosimetric characteristics of automated VMAT plans in non-small cell lung cancer, as compared with manual plans [10]. Among primary brain tumours, although the intracranial contents are similar between patients, the variation in brain tumour configuration and the variable impact of tumour and surgery on normal CNS anatomy pose unique challenges that the ML method was able to overcome. This contrasts with prior studies of automated planning, which have primarily been applied to anatomically homogeneous targets.

When creating ML models, using high-quality RT plans in the training set is critical so that ML output is similarly high-quality [11]. In the present study, we applied strict dosimetric criteria for inclusion in the training set to ensure that only high-quality plans were included in the ML model. Our study is limited to a single, homogeneous dose prescription (54 Gy); different training sets and models are likely needed for use with two-phase plans or other prescriptions because of differing dose constraints on OARs. Clinical implementation is required to confirm continued feasibility; this process is ongoing at our institution.

The potential of the ML model lies in its ability to create high-quality, reliable treatment plans rapidly and without dependence on the training or skill of the medical dosimetrist. This has important potential to improve access to high-quality RT in small practices or middle-income countries where planning expertise may be limited [12]. Further, rapid RT planning is especially important for patients requiring urgent commencement of RT, such as children with symptomatic brainstem glioma.

Conclusions

In conclusion, we developed and evaluated an automated machine-learning RT planning method for pediatric and adult brain tumour patients, and demonstrated the feasibility of rapidly generating clinically deliverable ML plans that display consistent plan quality, with target coverage and OAR sparing similar to those of human-generated plans used clinically. Clinical implementation of this ML treatment planning system is ongoing.