Background

Radiation therapy treatment planning for nasopharyngeal carcinoma (NPC) is often challenged by the convoluted target volume and the many adjacent organs at risk (OARs) [1]. Intensity-modulated radiation therapy (IMRT) is a common treatment for NPC because it delivers highly conformal doses to the targets and effectively spares the OARs, potentially improving the local control rate and reducing radiation-related toxicities [2]. However, generating an IMRT plan manually is time-consuming because of its intrinsic trial-and-error process. In addition, IMRT plan quality may be inconsistent because planners differ in knowledge and experience [3]. Hence, there is a clear need for highly efficient automated planning techniques that consistently generate high-quality plans.

In general, automated planning techniques are either algorithm-based, relying on optimization methods [4,5,6,7,8,9,10], or knowledge-based, relying on prior plan data [11,12,13,14,15,16,17,18,19]. Knowledge-based techniques usually involve machine learning methods, which have demonstrated their utility in improving treatment planning quality and efficiency. Some commercial modules can generate a dose volume histogram (DVH) estimation model, from which treatment plans can be generated semi- or fully automatically [11,12,13]. An in-house knowledge-based treatment planning technique using overlap volume histogram (OVH) information [21] has also been developed and proved effective in fully automating IMRT plans [20]. That study recruited 138 head-and-neck patients, but whether NPC patients were included is unknown [20]. Furthermore, none of these studies was exclusively applied to the treatment planning of locally advanced NPC patients [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. The efficacy of the knowledge-based autoplan technique for locally advanced NPC treatment planning still needs further investigation because of the particular challenges posed by the tumor and OAR anatomy in this disease.

In our institution, we developed a knowledge-based IMRT treatment planning technique for locally advanced NPC based on a neural network (NN) machine learning model. The NN model correlated an individual patient’s OVH with the corresponding plan optimization dose objectives by learning from a cohort of similar locally advanced NPC patients. A set of Perl scripts was developed to pass the patient-specific dose objectives predicted by the NN model to the treatment planning system for plan optimization and dose calculation.

Methods

Patient libraries

A total of 140 consecutive locally advanced NPC patients treated with definitive IMRT at Fujian Cancer Hospital between July 2016 and September 2018 were retrospectively selected and chronologically separated into a knowledge library (n = 115) and a test library (n = 25). Only NPC patients with bilateral cervical lymph node metastases were included. All patients were diagnosed and staged by pretreatment enhanced magnetic resonance imaging (MRI) according to the Chinese 2008 staging system for NPC [22, 23]. Each patient was immobilised in a supine position with a thermoplastic mask and underwent contrast-enhanced computed tomography (CT) (Brilliance CT Big Bore; Philips Medical Systems Inc., Cleveland, OH, USA) at a 3-mm slice spacing from the skull vertex to 2 cm below the clavicles. Volume delineation was performed on the CT images in the Pinnacle3 treatment planning system (TPS) (Philips Radiation Oncology Systems, Madison, WI) after CT-MRI fusion.

The target volumes were delineated according to an institutional treatment protocol, defined as follows. The gross tumor volumes comprised the primary nasopharyngeal tumor (GTV_T) and the definitive bilateral lymph nodes (GTV_NL and GTV_NR), as determined by clinical information, endoscopic examinations and radiography including CT and MRI. The clinical target volumes (CTVs) included high-risk regions (CTV1), low-risk regions (CTV2), and bilateral low-risk nodal regions (CTV_NL and CTV_NR). The CTV1 included the GTV plus a 5- to 10-mm margin. The CTV2 was designed for potentially involved regions and encompassed the entire CTV1. Each target volume was expanded by 3 mm to generate the planning target volume (PTV), accounting for setup error, geometric uncertainties and patient movement. In total, for each patient, seven target volumes (GTV_T_P, CTV1_P, CTV2_P, GTV_NL_P, GTV_NR_P, CTV_NL_P and CTV_NR_P) and ten OARs (left/right parotid, brainstem, spinal cord, left/right optic lens, left/right optic nerves, pituitary and optic chiasm) were delineated. The prescription, delivered in 33 fractions, was 69.96 Gy at 2.12 Gy/fraction to the GTV_T_P, 66 Gy at 2 Gy/fraction to the GTV_NL_P/GTV_NR_P, 61.05 Gy at 1.85 Gy/fraction to the CTV1_P, and 56.1 Gy at 1.7 Gy/fraction to the CTV2_P/CTV_NL_P/CTV_NR_P.

Manual planning

All patient treatment plans in the knowledge and test libraries were manually generated by a single experienced physicist. All plans were optimised in Pinnacle3 9.2 for treatment delivery by an Elekta Synergy accelerator using seven equally spaced coplanar 6 MV photon beams (210°, 260°, 310°, 0°, 52°, 104°, and 156°). During treatment planning, auxiliary structures were generated for use in the objective parameters (Table 1).

Table 1 Auxiliary structures for treatment planning

Direct machine parameter optimization (DMPO) was set for all beams with a minimum segment area of 9 cm², a minimum of 9 monitor units (MU) per segment and a maximum of 60 segments. The manual plans (MPs) in both libraries followed the institutional locally advanced NPC planning criteria shown in Table 2. To achieve these criteria, the objectives shown in Table 3 were used as a starting point for planning. The type, volume and weight for regions of interest (ROIs) were preset and not allowed to change. Only the target dose objectives were tunable to improve plan quality. All the MPs required the planner’s best effort to lower the OAR doses by adjusting only the target dose values while maintaining the PTVs’ dose coverage. This iterative process was repeated until no further improvement could be made.

Table 2 The criteria of regions of interest for manual IMRT planning
Table 3 Objective parameters

Neural network model

The patients in the knowledge library were divided chronologically into 5 equal groups of 23 patients each. A 5-fold cross validation scheme was adopted to generate 5 NN machine learning models: each model was trained on 4 groups and validated on the remaining group. The output dose objectives for patients in the test library were obtained by taking the mean of the 5 sets of dose objectives generated by the 5 models.
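The grouping and averaging scheme can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the patient identifiers are simply 1 to 115 in chronological order, and the dose values passed to the averaging step are made-up placeholders rather than outputs of the trained models.

```python
from statistics import mean


def chronological_folds(patient_ids, n_folds=5):
    """Split an already chronologically ordered list into equal consecutive groups."""
    if len(patient_ids) % n_folds != 0:
        raise ValueError("library size must be divisible by the number of folds")
    size = len(patient_ids) // n_folds
    return [patient_ids[i * size:(i + 1) * size] for i in range(n_folds)]


def cross_validation_splits(folds):
    """Return (training_ids, validation_ids) pairs, holding out one group at a time."""
    splits = []
    for k, held_out in enumerate(folds):
        training = [pid for j, fold in enumerate(folds) if j != k for pid in fold]
        splits.append((training, held_out))
    return splits


def ensemble_mean(predictions):
    """Average, objective by objective, the vectors predicted by the individual models."""
    return [mean(values) for values in zip(*predictions)]


knowledge_library = list(range(1, 116))      # 115 patients, chronological order
folds = chronological_folds(knowledge_library)
splits = cross_validation_splits(folds)      # 5 models, each trained on 92 patients

# Placeholder outputs of the 5 models for one test patient
# (only 3 of the 21 dose objectives shown, in Gy; values are illustrative).
model_outputs = [
    [70.0, 61.0, 56.0],
    [70.5, 61.5, 56.5],
    [69.5, 60.5, 55.5],
    [70.0, 61.0, 56.0],
    [70.0, 61.0, 56.0],
]
final_objectives = ensemble_mean(model_outputs)
```

Keeping the groups consecutive preserves the chronological ordering described above, so each model is validated on a contiguous block of patients.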

The NN model was built as follows. For all patients in the knowledge library, the OVH, target volume histogram (TVH) and dose objective values were extracted and normalised. The OVH defines the fraction of the OAR volume that overlaps a uniformly contracted or expanded PTV (see Fig. 1). It serves as a visualisable descriptor that maps the three-dimensional anatomical relationship between an OAR and the tumor volumes onto a two-dimensional Cartesian coordinate system, which can be conveniently used as input to an NN model. The TVH characterises the uniformly contracted or expanded PTV. Each NPC patient in the knowledge library had 20 OVHs, 5 TVHs, and one set of 21 dose objectives. Each OVH and TVH had 11 values, running from a zero or negative (contraction) distance to a positive (expansion) distance with a fixed step size (see Table 4), which gave 25 × 11 = 275 input values per patient. Our 3-layer NN model consisted of 275, 184 and 21 nodes in its input, hidden and output layers, respectively, taking OVH and TVH values as inputs and returning dose objectives as outputs. The model learned by refining the node-to-node link weights between neighboring layers to minimize a cost function defined as the mean squared error between the predicted and known values on each output node.

Fig. 1

The overlap between the left parotid (sky blue) and: (a) the CTV-ALL contracted by a distance of 5 mm (red); (b) the initial CTV-ALL (purple); (c) the CTV-ALL expanded by a distance of 5 mm (tan). The overlap volume fraction is defined as the overlapping volume divided by the volume of the left parotid
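The overlap volume fraction illustrated in Fig. 1 can be sketched on a toy two-dimensional voxel grid. The sketch below is an assumption-laden illustration, not the implementation used in this study: it uses a brute-force signed distance (negative inside the target, positive outside), unit voxel spacing, and hypothetical function names. An OAR voxel is counted as overlapping the target contracted or expanded by r when its signed distance is at most r.

```python
import math
from itertools import product


def signed_distance(voxel, target, background):
    """Distance to the target boundary in voxel units: negative inside, positive outside."""
    if voxel in target:
        return -min(math.dist(voxel, b) for b in background)
    return min(math.dist(voxel, t) for t in target)


def overlap_volume_histogram(oar, target, distances):
    """Fraction of OAR voxels lying inside the target contracted/expanded by each distance."""
    xs = [x for x, _ in target]
    ys = [y for _, y in target]
    # Non-target voxels in the target bounding box padded by one voxel; the nearest
    # non-target voxel of any target voxel always lies inside this padded box.
    background = [
        p
        for p in product(range(min(xs) - 1, max(xs) + 2), range(min(ys) - 1, max(ys) + 2))
        if p not in target
    ]
    sd = [signed_distance(v, target, background) for v in oar]
    return [sum(d <= r for d in sd) / len(sd) for r in distances]


# Toy geometry (assumed): a 10 x 10 voxel target and a 6 x 10 voxel OAR
# that shares two voxel columns (x = 8, 9) with the target.
target = set(product(range(10), range(10)))
oar = list(product(range(8, 14), range(10)))

# Contraction by 2 voxels, no change, expansion by 2 voxels.
ovh = overlap_volume_histogram(oar, target, distances=[-2, 0, 2])
```

In this toy geometry the fraction grows from 8/60 (contracted target) through 20/60 (initial target) to 40/60 (expanded target), which mirrors panels (a) to (c) of the figure. A clinical implementation would work on three-dimensional structure masks with the true voxel spacing.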

Table 4 Overlap volume histogram (OVH) and target volume histogram (TVH) used as inputs to build neural network model

The NN modeling was run in Spyder (a Python integrated development environment) on a personal computer with an Intel i7-2630QM CPU (2 GHz main frequency). The learning rate controls how large a step is taken when the model weights are updated towards the minimum output error. The learning rate was set to 0.02 and the number of training iterations to 2500. These parameters yielded satisfactory results in this feasibility study with a relatively short training time. During testing, the trained model simply calculated a set of patient-specific dose objectives from the OVH and TVH values.
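A 3-layer network trained by gradient descent on a mean squared error cost can be sketched in plain Python as follows. Several choices here are assumptions that the text does not specify: the sigmoid hidden activation, the linear output layer, the per-sample weight updates, the weight initialisation, and the class and method names. The toy data are random and serve only to show that the cost decreases; the learning rate (0.02) and iteration count (2500) are the values quoted above.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ThreeLayerNet:
    """Input -> sigmoid hidden layer -> linear output layer (activations are assumed)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def mse(self, inputs, targets):
        """Mean squared error over all samples and output nodes."""
        total = 0.0
        for x, t in zip(inputs, targets):
            y = self.predict(x)
            total += sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / len(t)
        return total / len(inputs)

    def train(self, inputs, targets, learning_rate=0.02, iterations=2500):
        for _ in range(iterations):
            for x, t in zip(inputs, targets):
                h, y = self.forward(x)
                # Gradient of the per-sample MSE with respect to each output.
                dy = [2.0 * (yi - ti) / len(t) for yi, ti in zip(y, t)]
                # Back-propagate through the sigmoid hidden layer (uses the old w2).
                dh = [sum(self.w2[k][j] * dy[k] for k in range(len(dy))) * h[j] * (1.0 - h[j])
                      for j in range(len(h))]
                for k, row in enumerate(self.w2):
                    for j in range(len(row)):
                        row[j] -= learning_rate * dy[k] * h[j]
                    self.b2[k] -= learning_rate * dy[k]
                for j, row in enumerate(self.w1):
                    for i in range(len(row)):
                        row[i] -= learning_rate * dh[j] * x[i]
                    self.b1[j] -= learning_rate * dh[j]


# Toy demonstration with small layers; inputs and targets are random placeholders.
rng = random.Random(1)
inputs = [[rng.random() for _ in range(4)] for _ in range(8)]
targets = [[sum(x) / 4.0, x[0]] for x in inputs]

net = ThreeLayerNet(4, 3, 2)
loss_before = net.mse(inputs, targets)
net.train(inputs, targets, learning_rate=0.02, iterations=2500)
loss_after = net.mse(inputs, targets)

# The layer sizes quoted in the text: 275 inputs, 184 hidden nodes, 21 outputs.
full_size_net = ThreeLayerNet(275, 184, 21)
```

Training the full-size network this way would be slow in pure Python; a numerical library would be the practical choice, and the sketch is meant only to make the weight-update step explicit.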

Automated planning

Automated plans (APs) were all generated by in-house developed Perl and HotScripts planning scripts in Pinnacle3 9.2. The scripts automated the entire planning process, including additional structure generation, beam and optimization parameter setup, and the final inverse optimization. The scripts also received the planning parameters: gantry angle, beam energy, beam modality, treatment isocenter placement, prescription, number of fractions, isodose lines for visualization, IMRT optimization type, maximum number of segments, minimum segment area, minimum segment MU, maximum number of iterations (100) and the convolution dose iteration (40th). Finally, the scripts incorporated the derived dose objectives, and the APs were then generated automatically with a single iteration of the planning process. An overview of the proposed process is presented in Fig. 2.

Fig. 2

The flow chart of knowledge-based IMRT treatment planning technique for locally advanced nasopharyngeal carcinoma

Plan comparison and statistical analysis

The AP and the MP of each patient from the test library were blindly reviewed and rated by one attending radiation oncologist in our institute, who evaluated both the DVH and the dose distribution. Grade C indicated inferior plan quality considered clinically unacceptable. A grade B plan was deemed just acceptable, and grade A indicated a superior plan in which the DVH and dose distribution were more desirable. Plans of similar quality could be deemed comparable and given the same rating. The ratings of the APs and MPs in the test library were compared by the McNemar-Bowker test using Statistical Package for the Social Sciences (SPSS 21.0; SPSS Inc., Chicago, IL, USA) software. The reviewer also recorded the numbers of ROIs achieving the given criteria for comparison between the APs and MPs.

SPSS 21.0 was also used for the remaining statistical analysis. The dose parameters in Table 2 were included in the analysis, and the Mann-Whitney U test was performed to compare the dose parameters of the APs and MPs. Dx was the dose received by x% of the volume; D5 was used to evaluate the high dose in the PTV. V30 was the percentage volume receiving 30 Gy. The conformity index (CI = (V_PTV,ref / V_PTV) × (V_PTV,ref / V_ref), where V_PTV,ref is the PTV volume receiving the prescription dose, V_PTV is the PTV volume and V_ref is the total volume receiving the prescription dose) and the homogeneity index (HI = D5/D95) were calculated for PTV evaluation. Planning duration and MU per fraction were also analysed for both the APs and MPs. The alpha level was set at 0.05, and the Bonferroni correction was applied to control the type I error probability. Since 32 tests were carried out in this analysis, a result was considered statistically significant when P < 0.0015.
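The indices and the corrected significance threshold defined above reduce to simple arithmetic, sketched here in Python. The function names and the example volumes and doses are hypothetical and chosen only to illustrate the formulas; the alpha level (0.05) and the number of tests (32) are those stated in the text.

```python
def conformity_index(v_ptv_ref, v_ptv, v_ref):
    """CI = (V_PTV,ref / V_PTV) * (V_PTV,ref / V_ref).

    v_ptv_ref: PTV volume receiving the prescription dose
    v_ptv:     total PTV volume
    v_ref:     total volume receiving the prescription dose
    """
    return (v_ptv_ref / v_ptv) * (v_ptv_ref / v_ref)


def homogeneity_index(d5, d95):
    """HI = D5 / D95; values closer to 1 indicate a more homogeneous PTV dose."""
    return d5 / d95


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Illustrative (made-up) values: volumes in cm^3, doses in Gy.
ci = conformity_index(v_ptv_ref=95.0, v_ptv=100.0, v_ref=110.0)
hi = homogeneity_index(d5=74.0, d95=70.0)

# 0.05 / 32 = 0.0015625, quoted in the text as P < 0.0015.
threshold = bonferroni_threshold(0.05, 32)
```

A CI of 1 corresponds to the prescription isodose volume coinciding exactly with the PTV; both under-coverage and dose spill outside the PTV lower the index.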

Results

Plan quality comparison

In the blind review, 11 APs were rated A, 10 were rated B, and 4 were rated C, while 12 MPs were rated A, 10 were rated B, and 3 were rated C (see Fig. 3). The APs and MPs had the same rating in 19 out of 25 patients. APs were rated better for two patients and worse for four patients. The McNemar-Bowker test showed no significant difference between the rating distributions of the APs and MPs (P = 0.549).
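As an illustrative check, the McNemar-Bowker symmetry test can be sketched in Python. Two things here are assumptions of the sketch rather than reported facts. First, the 3 × 3 cross-tabulation is inferred from the counts given in this paper (the grade totals, the 19 concordant pairs, the two better and four worse APs, and the Discussion statement that three of the four grade-C APs had grade-C MPs and one had a grade-B MP). Second, the degrees of freedom are taken as the number of off-diagonal pairs with a non-zero total. The function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for integer df >= 1 (closed forms)."""
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * math.sqrt(2.0 * x / math.pi) * total


def bowker_test(table):
    """Bowker's test of symmetry for a square table of paired ratings."""
    statistic, df = 0.0, 0
    for i in range(len(table)):
        for j in range(i + 1, len(table)):
            pair_total = table[i][j] + table[j][i]
            if pair_total > 0:  # pairs with no discordant ratings contribute nothing
                statistic += (table[i][j] - table[j][i]) ** 2 / pair_total
                df += 1
    p_value = chi2_sf(statistic, df) if df > 0 else 1.0
    return statistic, df, p_value


# Rows: AP grade (A, B, C); columns: MP grade (A, B, C). Inferred, not tabulated in the paper.
ratings = [
    [9, 2, 0],
    [3, 7, 0],
    [0, 1, 3],
]
statistic, df, p_value = bowker_test(ratings)
```

Under these assumptions the statistic is 1.2 with 2 degrees of freedom, giving a P value of about 0.549, consistent with the value reported above.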

Fig. 3

Blind review on plan quality between automated plans (APs) and manual plans (MPs) for the 25 NPC patients in the test library

ROI meeting criteria

The numbers of PTVs and OARs achieving the given criteria are listed in Table 5. For at least 80% of the patients in the test library, the PTV coverage met the criteria in both the APs and MPs. In particular, CTV_NL_P and CTV_NR_P met their criteria in all the APs and MPs. GTV_T_P remained the most challenging PTV: its D95 met the given criterion in 20 APs and 22 MPs.

Table 5 The comparison between automated and manual IMRT plans for 25 patients with locally advanced nasopharyngeal carcinoma

All the left and right lenses in both the APs and MPs met the dose constraint of Dmax < 8 Gy. The pituitary appeared to be the most challenging OAR to manage, as only 17 APs and 15 MPs were able to meet Dmax < 66 Gy. Notably, the number of APs achieving each OAR criterion was close to that of the MPs. The largest difference between the APs and MPs in the number of plans meeting an OAR criterion was 2, for both the pituitary and the optic chiasm.

Data comparison and analysis

The dose parameters of the PTVs and the OARs, compared using the Mann-Whitney U test with Bonferroni correction, are also shown in Table 5. PTVs (including GTV_T_P, CTV1_P, CTV2_P, CTV_NL_P, and CTV_NR_P) in the MPs had significantly higher D95 than those in the APs (P < 0.0015). No significant difference was observed in the D5, CI, and HI of PTVs between APs and MPs (P > 0.0015). Moreover, the dose parameters of all OARs were comparable between APs and MPs (P > 0.0015), although the APs showed lower mean values for all OAR dose parameters (except brainstem D1cc) compared with the MPs. The D1cc of the brainstem was 56.19 ± 6.87 cGy and 54.95 ± 7.8 cGy in the APs and the MPs, respectively (P = 0.449). The MU for the APs was comparable to that for the MPs (685.04 ± 59.63 vs. 721.36 ± 63.36, P = 0.051). The planning duration for the APs was also greatly shortened compared with that for the MPs (9.85 ± 1.13 min vs. 57.10 ± 6.35 min, P < 0.001).

Figure 4 shows the DVH for patient #12, one of the best cases, in which both the AP (solid line) and the MP (dashed line) were rated grade A. It shows clinically acceptable PTV coverage for both the AP and the MP, and it also shows that the AP considerably increased dose sparing to both the right optic lens and the pituitary. For patient #12, the PTV coverage in the AP was approximately equal to that in the MP; the relative percentage differences at D95 for GTV_T_P, CTV1_P, CTV2_P, GTV_NL_P, GTV_NR_P, CTV_NL_P and CTV_NR_P were −0.4%, −1.2%, −2.4%, 0.2%, 0.2%, 1.9% and 1.8%, respectively. Compared with the MP, the AP greatly reduced the OAR dose for left parotid V30, right optic lens Dmax, left optic nerve Dmax, right optic nerve Dmax and pituitary Dmax, with relative percentage difference values of −6.9%, −15.8%, 7.3%, 10.2% and 21.2%, respectively.

Fig. 4

A comparison of dose volume histograms for the automated plan (solid line) and the manual plan (dashed line) of patient #12. In this case, one of the best, both plans were rated grade A; the automated plan showed acceptable PTV coverage and considerably greater dose sparing to both the right optic lens and the pituitary

Discussion

We developed a feasible knowledge-based IMRT treatment planning technique for locally advanced NPC using a trained 3-layer NN model. The knowledge library consisted of a comparatively large sample of 115 locally advanced NPC patients [12, 14, 20], and each patient had a high-quality manual IMRT plan. A 5-fold cross validation method was also applied in our study. In addition, a wide range of OVH and TVH information, which would have a great effect on the resulting dose distribution, was selected as the input of the NN model [24]. Patient-specific dose objectives predicted by the model were subsequently used for single-iteration automated planning, which generated high-quality, clinically acceptable or superior APs for 21 of the 25 test patients. Of the 4 patients whose APs were rated C, three (#6, #13, and #18) had MPs that were rated C as well, and one (#25) had an MP rated B. These four patients were examined further. For patient #6, GTV_T_P completely overlapped the left optic nerve. For patient #13, GTV_T_P, which was given the highest prescription dose, partially overlapped the bilateral parotids, and thus the parotid V30 was greatly increased. In patient #18, a large portion of the target volume invaded superficial cerebral tissue, which made it difficult to cover the superficial target with the prescription dose. For these three patients, the APs mimicked the manual choices of optimization priorities. However, the AP for patient #25 prioritised the pituitary and brainstem and sacrificed the dose coverage of GTV_T_P, in contrast to the MP, which covered the tumor volume well. This suggests that our automated technique could not always make the choices expected by the oncologist, particularly for challenging cases.

Our automated method greatly reduced the planning duration compared with manual planning (9.85 ± 1.13 min vs 57.10 ± 6.35 min). Moreover, it involved no human intervention while the embedded Pinnacle scripts were running. Currently, the dose objectives derived from the NN model on our personal computer have to be manually transferred to the TPS computer, so our knowledge-based automated planning technique is not fully automated in this sense. Nevertheless, the model could be transferred to the TPS to complete the automation workflow in the future. Note that although the training time for each NN model was 27 min, generating a set of objectives for one patient took less than 0.1 s.

Wu and colleagues [20] applied a k-nearest neighbour method to predict the best DVH of each single OAR based on its OVH, which might compromise the dose distribution when every OAR is driven towards its best DVH. In contrast, our study took all target volumes and OARs into consideration at the same time and employed an NN model to derive a patient-specific set of dose objectives.

Our study did not include some OARs, such as the oral cavity, temporal lobes, and thyroid glands, because these OARs could easily meet their dose constraints when dose constraints were set on the additional rings (R5200, R4500, R3600, and R3100). Our study has not fully addressed dose inhomogeneity with single-iteration optimization. However, one study suggested that automatic generation of regions and objectives for hot and cold spots would further improve dose uniformity without manual interference [25]. That study also utilised embedded Pinnacle scripts and offers a possible route to better CI and HI in our workflow.

Our study proposed a promising automated IMRT planning technique for locally advanced NPC. Although our current study restricted the settings of machine parameters such as gantry angles and segment sizes, the same technique can be applied to more complicated IMRT delivery techniques. We anticipate that volumetric modulated arc therapy treatment planning can also take advantage of the described technique to achieve individually tailored optimal radiotherapy plans. In addition, as the volume, position and dose of targets and OARs change during the treatment course for NPC patients [26,27,28], the introduction of adaptive radiation therapy (ART) could potentially improve the treatment outcome [29]. Our knowledge-based automated planning approach would be of great value for generating high-quality ART plans for NPC patients efficiently.

Conclusions

A robust and effective knowledge-based IMRT treatment planning technique for locally advanced NPC was developed using an NN model and HotScripts planning scripts in the Pinnacle3 9.2 TPS. This automated technique substantially shortened planning time without compromising plan quality.