Background

Nasopharyngeal carcinoma (NPC) is one of the most common malignant head and neck tumors in East and Southeast Asia, and radiation therapy is the primary treatment modality for non-metastatic NPC because of its high sensitivity to ionizing radiation [1]. By modulating the intensity of high-energy photon beams, intensity-modulated radiation therapy (IMRT) can accurately deliver the radiation dose to targets while sparing adjacent normal organs, and it therefore achieves favorable treatment outcomes for NPC [2, 3]. Nevertheless, serious complications frequently occur during or after IMRT, such as xerostomia [4], radiation caries [5], dysphagia [6], taste impairment [7], and radiation-induced brain injury [8]. Thus, balancing high-dose coverage of the targets against minimal dose exposure of the organs at risk (OARs) is crucial. However, IMRT treatment planning for geometrically complex NPC, which involves multiple OARs and non-convex planning target volumes (PTVs), is extremely challenging [9]. In clinical practice, IMRT treatment planning is a time-consuming inverse planning process carried out in a treatment planning system (TPS) in a manual trial-and-error fashion [10]. As a result, plan quality depends largely on the planner’s experience and skill, which implies that patients may receive treatment of varying quality. Therefore, many studies on automatic treatment planning have been conducted to enhance plan quality consistency and improve planning efficiency for IMRT [11,12,13,14].

Knowledge-based planning (KBP) is an automatic planning method that has been integrated into commercial TPS: a built-in KBP model estimates the dose volume histogram (DVH), and dose objectives are generated from the estimate to guide the subsequent optimization process [14,15,16,17]. However, DVH prediction provides only the relative volume of each structure that receives a given dose, without spatial dose information, which can result in inferior dose distribution and dose conformity [18, 19]. This issue was addressed by predicting the 3D dose distribution from the anatomical information of structures using deep convolutional neural networks (CNNs), which yielded dosimetric quality fairly similar to that of deliverable plans [13, 20,21,22,23]. However, the predicted dose distribution cannot be easily converted into voxel-level optimization objectives in current commercial TPS to generate the corresponding deliverable plan. Recent advances bypassed inverse optimization and directly predicted fluence maps, from which multi-leaf collimator (MLC) leaf sequences are generated to obtain the final plan [24,25,26,27,28].

Although the KBP method based on fluence prediction can directly generate plans in TPS without inverse optimization, there is no guarantee that the resulting plan is optimal, because fluence prediction errors, fluence loss during leaf motion calculation, and patient heterogeneity can all degrade plan quality. In this study, we combined CNN-based fluence map prediction with script-based plan fine-tuning to automatically generate IMRT treatment plans for 38 patients with NPC. The plans were first generated from the predicted fluence maps and then fine-tuned with dose objectives derived from the predicted fluence generated dose. Finally, we evaluated both the plan quality and the planning efficiency of the proposed automatic planning method.

Methods

Patient collection

The ethics committee of Sun Yat-sen University Cancer Center approved the retrospective use of clinical treatment plans for patients in this study. A cohort of 38 patients with NPC treated with IMRT at Sun Yat-sen University Cancer Center between March 2015 and February 2016 was collected. Among these 38 patients, 30 (79%) were male and 8 (21%) were female, with an age range of 22–79 years (median age of 49 years). All IMRT plans were generated for the same treatment machine, a Varian Trilogy system (Varian Medical Systems, Palo Alto, CA, USA) with a Millennium 120 MLC, using nine equally spaced beams (beam angles at 0°, 40°, 80°, 120°, 160°, 200°, 240°, 280°, and 320°) and a 6 MV photon beam energy in flattening filter mode.

All patients with NPC had multiple radiation targets, and five PTVs named “PTV-GTV,” “PTV-1,” “PTV-2,” “PTV-LN(L)” (PTV of the left lymph nodes), and “PTV-LN(R)” (PTV of the right lymph nodes) were considered. The prescription doses for PTV-GTV, PTV-1, PTV-2, PTV-LN(L), and PTV-LN(R) were 70, 60 or 64, 54 or 58, 60–70, and 60–70 Gy, respectively, in 30–33 fractions. The 17 OARs used in this study were the body, brainstem, spinal cord, chiasm, tongue, left and right optic nerves, left and right lenses, left and right temporal lobes, left and right mandibles, left and right temporomandibular joints, and left and right parotid glands.

Fluence prediction

A customized CNN model named “shared encoder network,” proposed in our previous study, was used for fluence prediction [29]. The shared encoder network, which consists of one encoding path and two decoding paths, simultaneously generates the dose distribution and fluence maps from structure contours and CT images. The PTV contours were converted to a single 3D mask according to prescription dose: each voxel inside a PTV was assigned the maximum prescription dose among the PTVs containing it, and every non-PTV voxel was assigned zero. Each OAR was expressed as a binary mask, with one inside the contour and zero outside. We extracted the CT image, PTV mask, and 17 OAR masks from each patient as input data and used the trained model to generate fluence maps with a resolution of 2.5 mm × 2.5 mm and a size of 160 × 160 at nine beam directions. The predicted fluence maps were saved in a file format containing header information and pixel values before being imported into the TPS.
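The PTV mask encoding described above can be sketched as follows. This is a minimal illustration on a toy 2D slice in plain Python, not the preprocessing code used in the study; the function name `build_ptv_dose_mask`, the nested-list mask layout, and the toy contours are assumptions made for illustration.

```python
from typing import Dict, List

Mask = List[List[int]]  # one toy 2D slice; a real volume adds a slice axis


def build_ptv_dose_mask(ptv_masks: Dict[str, Mask],
                        prescriptions_gy: Dict[str, float]) -> List[List[float]]:
    """Merge binary PTV masks into one dose-level mask.

    Each voxel receives the highest prescription dose among the PTVs that
    contain it; voxels outside every PTV stay at zero.
    """
    first = next(iter(ptv_masks.values()))
    rows, cols = len(first), len(first[0])
    dose_mask = [[0.0] * cols for _ in range(rows)]
    for name, mask in ptv_masks.items():
        dose = prescriptions_gy[name]
        for r in range(rows):
            for c in range(cols):
                if mask[r][c]:
                    dose_mask[r][c] = max(dose_mask[r][c], dose)
    return dose_mask


# Toy nested targets (hypothetical contours): PTV-GTV inside PTV-1 inside PTV-2
ptv_masks = {
    "PTV-GTV": [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    "PTV-1":   [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    "PTV-2":   [[0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]],
}
prescriptions_gy = {"PTV-GTV": 70.0, "PTV-1": 60.0, "PTV-2": 54.0}

dose_mask = build_ptv_dose_mask(ptv_masks, prescriptions_gy)
for row in dose_mask:
    print(row)
```

In this toy case the voxel shared by all three targets carries 70 Gy, the voxel shared only by PTV-1 and PTV-2 carries 60 Gy, and the remaining PTV-2 voxels carry 54 Gy, so the overlapping targets collapse into a single dose-level input.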

Automatic plan generation

The automatic planning process was carried out in a research-only Eclipse TPS (version 15.6). Eclipse Scripting Application Programming Interface scripts have been used to assist radiotherapy planning and plan quality assessment [30,31,32]; we integrated all manual planning operations into a compiled C#-based script to achieve a fully automated planning process. With this customized script, the predicted fluence maps were imported into Eclipse to generate an initial plan; the auxiliary target structures were created; dose objectives and priorities were set according to the prescription and the predicted fluence generated dose; and the optimization, leaf motion calculation, and final dose calculation were completed automatically. The approved binary plugin can be executed with one click to automatically generate a plan in the Eclipse system. Figure 1 shows the procedure of automatic IMRT plan generation.

Fig. 1 The flowchart of automatic plan generation

Step 1: Importing predicted fluence and calculating dose distribution

After a new course and a new plan were created for a selected patient, the predicted fluence maps for each beam were imported into Eclipse and then converted to MLC sequences with the MLC leaf motion calculation (Varian LMC 15.6.03). The predicted fluence generated plan was obtained after calculating the resulting dose distribution with the Anisotropic Analytical Algorithm (AAA 15.6.03).

Step 2: Adding auxiliary structures and cropping targets

To improve target dose conformity and reduce the radiation dose to normal tissues, we added four auxiliary structures to the optimization: “PTV-1-Crop,” “PTV-2-Crop,” “Ring 2 cm,” and “40 Gy-PTV2”. “PTV-1-Crop” was defined as PTV-1 minus the 3 mm outward expansion of PTV-GTV. “PTV-2-Crop” was generated by subtracting the 3 mm outward expansion of PTV-1, PTV-LN(L), and PTV-LN(R) from the whole region of PTV-2. “Ring 2 cm” was defined as a 2 cm-wide ring between PTV-2 expanded by 0.2 cm and PTV-2 expanded by 2.2 cm, and “40 Gy-PTV2” referred to the region between the 40 Gy isodose line and the 0.3 cm expansion of PTV-2. Additional file 1 illustrates the definitions of the four auxiliary structures.
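The set operations behind these four structures can be sketched on a toy 2D voxel grid. This is only an illustration under stated assumptions: the study builds the structures inside Eclipse through its script, whereas the helpers `disc` and `expand`, the 1 mm grid, the disc-shaped targets, and the 40 Gy region below are all hypothetical. In particular, reading “40 Gy-PTV2” as the 40 Gy region minus the 0.3 cm expansion of PTV-2 is our interpretation of the definition above.

```python
from itertools import product
from typing import Set, Tuple

Voxel = Tuple[int, int]   # (row, col) index on a toy 2D grid
SHAPE = (80, 80)          # hypothetical grid size
SPACING_MM = 1.0          # hypothetical isotropic voxel spacing


def disc(center: Voxel, radius_mm: float) -> Set[Voxel]:
    """Toy stand-in for a contoured structure: a filled disc of voxels."""
    cr, cc = center
    return {(r, c) for r, c in product(range(SHAPE[0]), range(SHAPE[1]))
            if ((r - cr) ** 2 + (c - cc) ** 2) * SPACING_MM ** 2 <= radius_mm ** 2}


def expand(region: Set[Voxel], margin_mm: float) -> Set[Voxel]:
    """Isotropic outward expansion: every voxel within margin_mm of the region."""
    reach = int(margin_mm // SPACING_MM)
    offsets = [(dr, dc) for dr, dc in product(range(-reach, reach + 1), repeat=2)
               if (dr * dr + dc * dc) * SPACING_MM ** 2 <= margin_mm ** 2]
    grown = set()
    for r, c in region:
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if 0 <= rr < SHAPE[0] and 0 <= cc < SHAPE[1]:
                grown.add((rr, cc))
    return grown


center = (40, 40)
ptv_gtv = disc(center, 5)
ptv_1 = disc(center, 10)
ptv_2 = disc(center, 15)
ptv_ln_l: Set[Voxel] = set()      # nodal PTVs left empty in this toy case
ptv_ln_r: Set[Voxel] = set()
isodose_40gy = disc(center, 30)   # hypothetical region receiving >= 40 Gy

# "PTV-1-Crop": PTV-1 minus the 3 mm expansion of PTV-GTV
ptv_1_crop = ptv_1 - expand(ptv_gtv, 3.0)
# "PTV-2-Crop": PTV-2 minus the 3 mm expansion of PTV-1 and both nodal PTVs
ptv_2_crop = ptv_2 - expand(ptv_1 | ptv_ln_l | ptv_ln_r, 3.0)
# "Ring 2 cm": between PTV-2 + 0.2 cm and PTV-2 + 2.2 cm
ring_2cm = expand(ptv_2, 22.0) - expand(ptv_2, 2.0)
# "40 Gy-PTV2": 40 Gy region outside PTV-2 + 0.3 cm (assumed interpretation)
dose40_minus_ptv2 = isodose_40gy - expand(ptv_2, 3.0)
```

Representing structures as voxel sets keeps each definition a one-line subtraction; a production implementation would use the margin and Boolean tools of the TPS and true 3D geometry.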

Step 3: Setting optimization objectives and priorities

The plan generated from predicted fluence maps already provided the achieved dose information, but the plan quality may need to be further improved. To ensure a plan quality improvement after plan fine-tuning, we set stringent optimization objectives (Table 1). The dosimetric values for key OARs were set 5%–25% lower than the achieved values from the predicted fluence generated plan.
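The rule of setting objectives below the achieved values can be illustrated with a short sketch. The achieved doses, the per-structure reduction fractions, and the function name `tightened_objective` are hypothetical; only the 5%–25% range comes from the text above.

```python
from typing import Dict


def tightened_objective(achieved_gy: float, reduction: float) -> float:
    """Return an objective set `reduction` (a fraction) below the achieved dose."""
    if not 0.05 <= reduction <= 0.25:
        raise ValueError("reduction is expected to lie between 5% and 25%")
    return achieved_gy * (1.0 - reduction)


# Hypothetical achieved values (Gy) from a predicted fluence generated plan
achieved: Dict[str, float] = {
    "Brainstem Dmax": 50.0,
    "Spinal cord Dmax": 40.0,
    "Parotid (L) Dmean": 32.0,
}
# Hypothetical reduction chosen for each structure
reductions: Dict[str, float] = {
    "Brainstem Dmax": 0.10,
    "Spinal cord Dmax": 0.10,
    "Parotid (L) Dmean": 0.25,
}

objectives = {name: tightened_objective(dose, reductions[name])
              for name, dose in achieved.items()}
for name, value in objectives.items():
    print(f"{name}: achieved {achieved[name]:.1f} Gy -> objective {value:.1f} Gy")
```

Because every objective is stricter than a value the initial plan has already reached, the optimizer is pushed to improve on that plan rather than merely reproduce it.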

Table 1 Optimization objectives were set according to prescription dose and predicted fluence generated dose information

Step 4: Further optimization and calculating final dose distribution

Plan optimization was performed with the Photon Optimizer algorithm (PO, version 15.6.03) with continued optimization and a maximum of 300 iterations, and the dose distribution calculated from the predicted fluence was set as the intermediate dose to reduce the optimization convergence time. After optimization, the optimal fluence maps were converted to MLC leaf sequences with the MLC leaf motion calculation, and the final dose distribution was calculated to generate the final deliverable plan.

Evaluation

The plan quality was quantitatively compared among clinical plans, automatic plans with warm start (using the predicted fluence as the initial value for further optimization), and automatic plans with cold start (optimization with no initial state) for all 38 patients. Dosimetric metrics, including D2%, D98%, conformity index (CI) [33], and homogeneity index (HI) [34], were reported for the five PTVs. The CI is expressed as CI = \(\frac{{TV}_{RI}}{TV}\), where \({TV}_{RI}\) is the target volume covered by the prescription dose and TV is the target volume. CI values range from 0 to 1, and high CI values indicate good target conformity. The HI is defined as HI = \(\frac{{D}_{5\%} - {D}_{95\%}}{{D}_{px}}\), where \({D}_{5\%}\) and \({D}_{95\%}\) are the doses received by 5% and 95% of the PTV volume, respectively, and \({D}_{px}\) is the prescription dose. In general, low HI values represent a homogeneous dose distribution inside the PTV. Maximum dose (Dmax) and mean dose (Dmean) were used as the quantitative metrics for the 17 OARs. All dosimetric comparisons were tested for statistical differences using the Wilcoxon signed-rank test with a significance level of 0.05.
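As a worked illustration of the CI and HI definitions above, the sketch below evaluates both on a toy list of PTV voxel doses. It assumes equally sized voxels, so volume fractions reduce to voxel counts, and it takes \({D}_{x\%}\) as the minimum dose received by the hottest x% of the volume; the function names and the dose values are hypothetical.

```python
import math
from typing import Sequence


def dose_at_volume(doses: Sequence[float], volume_percent: float) -> float:
    """D_x%: minimum dose received by the hottest x% of equally sized voxels."""
    ordered = sorted(doses, reverse=True)
    n = max(1, math.ceil(len(ordered) * volume_percent / 100))
    return ordered[n - 1]


def conformity_index(doses: Sequence[float], prescription_gy: float) -> float:
    """CI = TV_RI / TV: fraction of the target covered by the prescription dose."""
    covered = sum(1 for d in doses if d >= prescription_gy)
    return covered / len(doses)


def homogeneity_index(doses: Sequence[float], prescription_gy: float) -> float:
    """HI = (D5% - D95%) / D_px; lower values mean a more homogeneous dose."""
    return (dose_at_volume(doses, 5) - dose_at_volume(doses, 95)) / prescription_gy


# Toy PTV with 100 voxels: ten voxels at each integer dose from 66 to 75 Gy
ptv_doses = [float(d) for d in range(66, 76)] * 10
prescription = 70.0

ci = conformity_index(ptv_doses, prescription)
hi = homogeneity_index(ptv_doses, prescription)
print(f"CI = {ci:.2f}, HI = {hi:.3f}")
```

For this toy target, 60 of the 100 voxels reach 70 Gy, giving CI = 0.60, while D5% = 75 Gy and D95% = 66 Gy give HI = 9/70 ≈ 0.129.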

Results

Predicting the nine-field fluence maps with the trained model took approximately 12 s per patient. On average, the whole script-driven automatic planning process in Eclipse was completed in 199.8 s per patient. Compared with cold start, the plan fine-tuning step with warm start did not significantly reduce the number of iterations or improve the optimization efficiency. The time cost of automatic planning for the 38 patients ranged from 155.9 to 239.3 s, and the median time was 206.1 s. Figure 2 shows the time spent in each step of the automatic planning process for a randomly selected patient, for whom the total planning time was 185.7 s.

Fig. 2 The time breakdown of automatic planning process for a randomly selected patient

The dose distributions of the clinical plan, predicted fluence generated plan, and automatic plan on three axial sections are compared for two representative patients (patient A and patient B) in Figs. 3 and 4, respectively. In general, all three plans achieved comparable dose coverage of both PTV-GTV (red segments) and PTV-1 (orange segments), but the automatic plan further improved the target dose homogeneity and conformity compared with the clinical plan and the predicted fluence generated plan, as indicated by the arrows.

Fig. 3 The comparison of dose distributions between clinical plan, predicted fluence generated plan and automatic plan for patient A. The first column is clinical result, the second column is predicted fluence generated result and the third column is automatic fine-tuning result

Fig. 4 The comparison of dose distributions between clinical plan, predicted fluence generated plan and automatic plan for patient B. The first column is clinical result, the second column is predicted fluence generated result and the third column is automatic fine-tuning result

Figures 5 and 6 show the DVH comparison of the five PTVs and seventeen OARs for the two patients, respectively. No significant difference was found in the target curves between the clinical plan (solid line) and the automatic plan (dashed line). The predicted fluence generated plan (dash-dotted line) showed clearly inadequate dose coverage for PTV-2, PTV-LN(L), and PTV-LN(R), whereas the automatic plan successfully recovered the target dose coverage after plan fine-tuning. For OARs, both the predicted fluence generated plan and the automatic plan showed better dose sparing than the clinical plan.

Fig. 5 The comparison of DVH curves between clinical plan (solid line), predicted fluence generated plan (dash-dotted line) and automatic plan (dashed line) for patient A

Fig. 6 The comparison of DVH curves between clinical plan (solid line), predicted fluence generated plan (dash-dotted line) and automatic plan (dashed line) for patient B

Figures 7 and 8 show box plots comparing the major dosimetric results between clinical plans and automatic plans for the 38 patients. Compared with the automatic plans, the dosimetric parameters for the five targets in the clinical plans generated using conventional planning methods were more widely dispersed, indicating worse plan quality consistency. In addition, automatic plans produced better target dose with lower D2%, higher D98%, higher CI, and lower HI, except for D98% and CI of PTV-1. For most OARs, automatic plans also showed lower dosimetric values than clinical plans, especially Dmax of the brainstem, spinal cord, left and right optic nerves, and chiasm and Dmean of the left and right parotid glands.

Fig. 7 The box plot comparisons of D98%, CI and HI between clinical and automatic plans for five targets

Fig. 8 The box plot comparisons of dosimetric results between clinical and automatic plans for fifteen OARs

Table 2 summarizes the dosimetric metrics and corresponding p-values for the comparison among clinical plans, automatic plans with warm start, and automatic plans with cold start. The automatic plans with cold start also improved the dosimetric results for most structures compared with clinical plans and showed only a slight plan quality difference compared with automatic plans with warm start. However, automatic plans with warm start required more monitor units (MUs) than automatic plans with cold start.

Table 2 The comparison of dosimetric metrics for thirty-eight patients in the unit of Gy (mean ± standard deviation) between clinical plans, automatic plans with warm start and cold start

Discussion

Achieving the ideal trade-off between target coverage and OAR sparing for NPC is challenging and often requires a well-experienced planner to iteratively adjust optimization parameters during manual IMRT planning. Such a conventional method is time- and resource-consuming and leads to uneven plan quality. In this study, we developed an automated IMRT plan-generating framework based on fluence prediction and further plan fine-tuning, and we integrated it into a commercial TPS via scripts to achieve automatic plan generation with one click. The proposed method was validated on 38 patients with NPC, showing high planning efficiency, with plans completed in less than 4 min, and plan quality comparable with that of clinical plans.

Several previous studies have proposed to automatically generate plans based on direct fluence prediction [24,25,26,27,28], which may lead to unstable plan quality because of inaccurate fluence prediction or quality loss when converting fluence into MLC sequences. The proposed plan fine-tuning step may help to further improve the plan quality. The DVH results in Figures 5 and 6 illustrate that some of the targets had low dose coverage in the predicted fluence generated plan, whereas the dose coverage improved significantly after the automatic plan fine-tuning step. Compared with the DVH prediction-based KBP method, the proposed method generates an initial deliverable plan first, which provides dosimetric information that has already been achieved, although it may not be optimal, whereas the predicted DVH is not guaranteed to be achievable or optimal because of uncertainties in machine learning models.

For NPC patients, volumetric modulated arc therapy (VMAT) is increasingly used in current clinical practice. Although the proposed method was only validated on IMRT plans in this study, it can potentially be used for VMAT plan optimization. Specifically, fluence can first be predicted at discrete beam angles (for example, 60 beams at 6° spacing); a VMAT arc sequencing step can then generate an initial plan; and the plan fine-tuning step can finally generate the final plan, using the predicted dose as objectives and the initial plan as a warm start. An improvement in planning efficiency can be expected and would be more meaningful than for IMRT. In future work, we plan to extend the proposed method to automatic VMAT planning for NPC patients.

Conclusions

In conclusion, we proposed an automated IMRT plan-generating method for patients with NPC through fluence prediction and further plan fine-tuning. This method remarkably reduced the dose for most OARs without compromising target conformity and homogeneity. Compared with clinical plans, the automatic plans showed high planning efficiency and achieved comparable or superior plan quality.