Background

Modern radiation therapy (RT) approaches in Hodgkin lymphoma (HL), with lower prescribed doses (20–30 Gy) and smaller irradiated volumes (involved site or involved node), reduce the exposure of organs at risk (OARs) [1]. Accordingly, the rates of radiation-induced late toxicity are expected to be lower [2, 3] than in older series of successfully treated, long-term surviving HL patients [4, 5]. In parallel, considerable effort has been made to identify HL radiation delivery modalities that increase control over target and OAR dose distributions [6,7,8,9,10].

Several widely available intensity-modulated radiation therapy (IMRT) planning solutions have been proposed in the literature. Among the different IMRT techniques, the dosimetric advantages of the “butterfly” (BF) technique for female patients with mediastinal HL have been reported [11, 12]. In particular, BF volumetric modulated arc therapy (VMAT) achieved the most balanced compromise between high conformity around the target and OAR sparing [11].

Dose-volume histogram (DVH) predictors and Normal Tissue Complication Probability (NTCP) models developed for the HL patient population have supported planning optimization procedures intended to limit OAR complications. NTCP models have been reported for late side effects such as radiation-induced lung damage [13,14,15], hypothyroidism [16, 17], and cardiovascular diseases [2, 18,19,20].

However, plan optimization remains a very time-consuming and operator-dependent task. This issue has been addressed by the recent introduction of automated engines in treatment planning (TP) systems, which create an optimized plan with minimal user interaction. These engines have proved able to generate IMRT plans of non-inferior or even higher clinical quality compared with human-driven plans for many different tumour sites, such as head and neck [21], prostate [22, 23] and lung [24]. To the best of our knowledge, however, no study has investigated the Auto-Planning (AP) algorithm applied to VMAT for supradiaphragmatic HL (SHL) patients.

Given this background, the current study was designed to devise a fully automated pipeline, based on the Pinnacle3 (Philips Radiation Oncology Systems, Fitchburg, WI, USA) AP algorithm, for treating female SHL patients. For 10 female patients AP plans were compared with treatment plans generated by experienced human planners. The different TP solutions were evaluated by quantitative risk estimates based on published models for different toxicity endpoints.

Methods

Patient data

Planning CT scans of 10 consecutive female patients with SHL (Table 1), acquired in standard supine position, were extracted from our clinical database. The involved-site clinical target volume (CTV) was defined according to ILROG guidelines [1] for early stage HL. The Planning Target Volume (PTV) was obtained by a uniform 10-mm expansion of the CTV. Target and OAR structures were contoured on free-breathing CT images (voxel size = 0.94 × 0.94 × 5 mm³). The following OARs were contoured: lungs, heart, left ventricle, left anterior descending (LAD) artery, esophagus, spinal cord, breasts and thyroid. The heart and its substructures were contoured according to heart contouring guidelines [25]. All contours were reviewed and approved by one of the authors (M.C.).

Table 1 Nodal disease localization and Planning Target Volume (PTV) size for each patient

A total dose of 30 Gy was prescribed in 1.5 Gy daily fractions for all patients.

Treatment plan optimization

Each patient was planned with an antero-posterior/postero-anterior weighted BF VMAT technique using 6 MV photon beams in Pinnacle3 v. 9.10. The SmartArc module and the Collapsed Cone Convolution Superposition dose calculation algorithm (grid resolution 3 mm) were used. The BF VMAT technique consists of 3 arcs: 2 coplanar arcs, one anterior and one posterior (width ranging from 60° to 100°), and one anterior non-coplanar arc (couch angle 90°; width ranging from 45° to 60°). Arc width was customized to provide tumour coverage according to patient anatomy. All plans were optimized for a Varian TrueBeam STx linac (Varian Medical Systems, Palo Alto, CA) equipped with a High Definition 120 multileaf collimator (HD120MLC).

For each patient, two different optimization approaches were used: the human-driven optimization (Manual-BF) and the AP optimization (AP-BF), both generated using the same required clinical constraints (Table 2). No constraints were used on the left ventricle and LAD artery.

Table 2 Planning Target Volume (PTV) and Organ-At-Risk dose-volume constraints for plan optimization and patients violating the required constraints when the Auto Plan best optimization objective list was applied

The Manual-BF plan was generated using planner-dependent definitions of additional guidance contours (inner and outer ring structures for the PTV), avoidance structures and associated optimization objectives. The plan was validated by 2 experienced clinical physicists (S.C., C.O.) in consensus.

The AP-BF plan was optimized using the Pinnacle3 AP algorithm. In summary, AP is a module fully integrated in the TP system that uses a progressive optimization algorithm to continually adjust the optimization objective list set by the user, so as to meet or further decrease OAR doses and related DVH parameters with minimal compromise to PTV coverage, thus simulating the decision-making process of an experienced human planner [21]. The AP algorithm iteratively fine-tunes target coverage and OAR sparing by creating multiple additional structures based both on the relative geometry of the originally segmented regions of interest (ROIs) and on transient dose distributions. It automatically assigns dose-volume objectives to these additional ROIs, which are added to the standard optimization list [26].

In addition to the BF technique, the AP engine was applied to a technique with 2 disjointed coplanar arcs (AP-ARC), which consists of 2 full coplanar arcs, rotating clockwise and counterclockwise respectively, that avoid the arms.

In the present study, each AP plan was obtained running a single optimization cycle.

AP optimization objective list

The starting point of the Pinnacle3 AP optimization procedure is a user-dependent optimization list of PTV/OAR clinical goals. In order to set a single AP-BF list for SHL with a high level of generalizability, we selected 5 out of 10 patients to be used as a training set. The selection criterion was heterogeneity of nodal disease localization and target size (Table 1). The remaining 5 patients were used as a validation set to test the obtained optimization list. PTV characteristics for the training and validation patient sets are reported in Table 1.

For all training set patients, the list was iteratively refined using the algorithm described in Fig. 1 (learning phase). The algorithm was designed to satisfy, first, the tumour-coverage criterion (at least 95% of the PTV receiving at least 95% of the prescription dose) and, second, the OAR constraints of Table 2. To this end, we introduced the concept of “admitted violations”, defined as the maximum number of required objectives allowed to remain unsatisfied at the end of an optimization cycle. The admitted violations were PTV V107% > 1% and at most one OAR not fulfilling the required dose-volume constraints reported in Table 2.
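The acceptance rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the flowchart of Fig. 1 itself: the `PlanEvaluation` container, the function names and the OAR labels are hypothetical, and the per-OAR pass/fail flags are assumed to have been computed beforehand against the constraints of Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class PlanEvaluation:
    """Outcome of one optimization cycle for one patient (hypothetical container)."""
    ptv_v95_percent: float   # % of PTV receiving at least 95% of the prescription dose
    ptv_v107_percent: float  # % of PTV receiving at least 107% of the prescription dose
    oar_constraints_met: dict = field(default_factory=dict)  # OAR name -> True/False


def failed_oars(plan):
    """Names of the OARs whose required dose-volume constraints are not met."""
    return [name for name, met in plan.oar_constraints_met.items() if not met]


def is_acceptable(plan, max_oar_violations=1):
    """Apply the 'admitted violations' rule as read from the text.

    Tumour coverage is mandatory. PTV V107% > 1% is an admitted violation,
    so it never rejects a plan here. At most `max_oar_violations` OARs may
    fail their constraints.
    """
    if plan.ptv_v95_percent < 95.0:
        return False
    return len(failed_oars(plan)) <= max_oar_violations
```

For example, a plan with adequate coverage, a PTV V107% of 12% and only the heart constraint unmet would be accepted under this rule, whereas a plan failing both the heart and thyroid constraints would send the objective list back for refinement.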

Fig. 1

The flow of the algorithm used for setting the Auto Planning optimization objective list (learning phase)

The list thus obtained was then tested on the validation set for both AP-BF and AP-ARC configurations.

Plan analysis

For plan comparison, DVHs of PTVs and OARs were extracted. For each patient, relevant PTV/OAR DVH metrics were analyzed: the percentage volume receiving at least X dose (Vx), near maximum dose (D2%), near minimum dose (D98%), mean (Dmean) and median dose (D50%).
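As a minimal sketch of how these metrics are read from a dose distribution, the Python functions below compute them from a list of per-voxel doses for one structure. Equal voxel volumes are assumed, and the function names are hypothetical; clinical systems typically interpolate the cumulative DVH rather than pick a single voxel.

```python
import math


def volume_at_least(doses_gy, threshold_gy):
    """Vx: percentage of the structure volume receiving at least threshold_gy."""
    return 100.0 * sum(d >= threshold_gy for d in doses_gy) / len(doses_gy)


def dose_to_hottest(doses_gy, volume_percent):
    """Dx%: minimum dose received by the hottest volume_percent of the structure."""
    ordered = sorted(doses_gy, reverse=True)
    n_voxels = max(1, math.ceil(len(ordered) * volume_percent / 100.0))
    return ordered[n_voxels - 1]


def mean_dose(doses_gy):
    """Dmean: arithmetic mean of the voxel doses."""
    return sum(doses_gy) / len(doses_gy)


def dvh_summary(doses_gy):
    """Near-maximum, near-minimum, median and mean dose for one structure."""
    return {
        "D2%": dose_to_hottest(doses_gy, 2.0),
        "D98%": dose_to_hottest(doses_gy, 98.0),
        "D50%": dose_to_hottest(doses_gy, 50.0),
        "Dmean": mean_dose(doses_gy),
    }
```

With this convention, D2% is the dose exceeded in only the hottest 2% of the volume and D98% is the dose covering 98% of it, which is why they serve as near-maximum and near-minimum doses respectively.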

The target coverage was assessed via the conformity index (CI = V95%/PTVvol) and the homogeneity index (HI = [D2% - D98%]/D50%).
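The two indices reduce to simple ratios, sketched below in Python. The function and argument names are hypothetical, and one interpretive assumption is made: V95% is taken as the absolute volume (in cc) enclosed by the 95% isodose, so that CI is dimensionless. The text does not state this explicitly.

```python
def conformity_index(v95_isodose_cc, ptv_volume_cc):
    """CI = V95% / PTVvol, with both volumes in cc (assumed interpretation)."""
    if ptv_volume_cc <= 0:
        raise ValueError("PTV volume must be positive")
    return v95_isodose_cc / ptv_volume_cc


def homogeneity_index(d2_gy, d98_gy, d50_gy):
    """HI = (D2% - D98%) / D50%; values closer to 0 indicate a more uniform target dose."""
    if d50_gy <= 0:
        raise ValueError("D50% must be positive")
    return (d2_gy - d98_gy) / d50_gy
```

As a purely illustrative calculation, a target with D2% = 31.5 Gy, D98% = 28.5 Gy and D50% = 30 Gy gives HI = 3/30 = 0.1.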

Toxicity risks were calculated according to several NTCP models available in the literature [2, 3, 13, 16, 18,19,20, 27, 28]. NTCP models specifically extrapolated from HL patients’ cohorts were used.
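The cited models are not reproduced here. Purely to illustrate how a DVH metric is mapped to a risk estimate, the sketch below uses a generic logistic dose-response curve, which is one common functional form and not necessarily the form of any cited model. The parameters `d50_gy` (dose giving a 50% complication probability) and `gamma50` (normalized slope at that dose) are hypothetical placeholders.

```python
import math


def logistic_ntcp(dose_gy, d50_gy, gamma50):
    """Generic logistic dose-response curve.

    NTCP = 1 / (1 + exp(4 * gamma50 * (1 - D / D50)))
    Both d50_gy and gamma50 are illustrative inputs, not fitted values.
    """
    if d50_gy <= 0:
        raise ValueError("d50_gy must be positive")
    return 1.0 / (1.0 + math.exp(4.0 * gamma50 * (1.0 - dose_gy / d50_gy)))
```

Under such a model, a reduction in the dose metric entering the model always lowers the estimated risk, which is why dosimetric sparing of an OAR can be compared across plans on a common risk scale.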

In addition, the number of planned monitor units (MU) and the hands-on planning time were recorded. The hands-on planning time was defined as the time of human interaction with the TP system.

The median and the range were used to describe all continuous variables, and the non-parametric ANOVA (Friedman matched-pairs signed-rank test) was used to determine statistically significant differences (p < 0.05). A post hoc procedure (Dunn’s test) was performed to identify significant differences between groups.
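For three matched plans per patient, the Friedman test can be sketched with the Python standard library alone. The sketch makes several assumptions: the p-value uses the chi-square approximation, which has 2 degrees of freedom for three groups and therefore the closed-form survival function exp(-Q/2); tied values receive average ranks without a tie correction; and the function names are hypothetical. The post hoc Dunn's test is not covered, and the software actually used for the analysis is not stated in the text.

```python
import math


def average_ranks(values):
    """Rank values from 1 to k, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def friedman_three_groups(rows):
    """Friedman statistic and approximate p-value for exactly three matched groups.

    rows: one (manual_bf, ap_bf, ap_arc) triple per patient.
    """
    k = 3
    n = len(rows)
    if n == 0 or any(len(row) != k for row in rows):
        raise ValueError("each row must contain exactly three values")
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(list(row))):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    p_value = math.exp(-q / 2.0)  # chi-square survival function for 2 degrees of freedom
    return q, p_value
```

If, hypothetically, all 10 patients showed the same ordering of a metric across the three plans, the rank sums would be 10, 20 and 30, giving Q = 20 and a p-value well below 0.05. With only 10 patients the chi-square approximation is rough, so exact tables or a dedicated statistics package would be preferable in practice.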

Results

AP optimization objective list

At the end of the learning phase, we succeeded in implementing a single AP optimization objective list for SHL patients (details in Table 3). This list was subsequently applied, with no further refinement, to the validation set. Each AP plan of the validation set (patients 6–10) fulfilled all PTV/OAR constraints within the admitted violations of the required constraints, i.e. all PTV constraints (except V107%) and constraints on all but one OAR were fulfilled (Table 2 and Fig. 2).

Table 3 Auto Planning setting list
Fig. 2

Comparison of Planning Target Volume (PTV) percentage volume receiving at least 107% of the prescribed dose (V107%), of Thyroid-PTV mean dose (Dmean) and Heart-PTV mean dose (Dmean) values for Manual-BF, AP-BF, AP-ARC

Target volume

Median target size was 386.9 cc (199.3–559.5 cc). Figure 3 illustrates dose distributions in one representative patient for the three treatment techniques.

Fig. 3

Dose distributions in one representative patient for the three treatment plans: a) Manual-BF, b) AP-BF, c) AP-ARC

The median PTV DVHs from the 3 plans largely overlapped (Fig. 4A). AP offered PTV coverage comparable to that of the manual plan. CI values for the AP plans were comparable to that of the Manual-BF plan, while AP-BF showed a higher HI than both Manual-BF and AP-ARC (Table 4).

Fig. 4

Median cumulative patient dose-volume histograms (DVHs) for the Planning Target Volume-PTV (A) and the organs-at-risk (B–F) for the three treatment plans: Manual-BF, AP-BF, AP-ARC

Table 4 Dosimetric indices and comparative analysis for Planning Target Volume (PTV) and different organs at risk for manual and automated plans

Organs at risk

All AP plans fulfilled the clinical dose criteria set for OARs within the admitted violations. Data in Table 4 show that the AP solutions were never outperformed by the manual plans and that the AP engine also led to a general reduction of OAR dose metrics. In particular, AP-ARC was never outperformed by AP-BF, except for lungs V5Gy and breast Dmean, for which AP-BF provided slightly greater sparing.

In terms of NTCP, the AP engine was always at least as safe as manual planning (Table 5), with the exception of radiation-induced lung fibrosis, for which AP-BF involved a higher risk than the manual plan. Comparing the two automated techniques, AP-ARC resulted in a lower risk of radiation-induced coronary events and lung fibrosis than AP-BF.

Table 5 Risk analysis for different organs and endpoints for manual and automated plans

The median number of MUs was 287.7 (239.6–378.9) for Manual-BF, 267.7 (214.9–382.5) for AP-BF and 375.6 (339.9–456.7) for AP-ARC (p < 0.001; AP-ARC > Manual-BF and AP-BF).

With AP, hands-on planning time decreased by an order of magnitude. The mean computation time for the automated procedures (performed on a Server Expert hardware platform with 32 GB RAM; http://incenter.medical.philips.com/doclib/getdoc.aspx?func=ll&objid=10925579&objaction=open) was 25 min (AP-BF) and 40 min (AP-ARC).

Discussion

The most up-to-date and optimized RT techniques applied to mediastinal HL have demonstrated a significant dose reduction to various sensitive critical structures [10,11,12]. Modern TP systems automate many beam parameters, in particular the beam modulation, via inverse planning computations that create IMRT or VMAT plans, so that each treatment plan is highly customized for each patient. However, the mediastinum remains a critical and complicated target area in HL, owing to the heterogeneity of tumour volumes and their position relative to many important OARs, such as the heart and its substructures or the lungs. As a consequence, HL planning optimization entails a high level of complexity, with a wide variation in plan quality that strongly depends on planner skills, as demonstrated for other disease sites [29]. This issue calls for an additional level of automation in HL RT in order to reduce the inter-operator variability of plan quality. In recent years, different automated treatment planning approaches have been proposed and have become commercially available. They show that it is possible to almost fully automate and accelerate this task, improving the speed, consistency and quality of RT plans [30].

One proposed knowledge-based solution relies on the concepts of machine learning and uses a library of historical plans for a given disease site to build a model that can predict achievable DVHs for new patients and guide plan optimization [31]. Another approach instead is based on a multicriteria optimization algorithm which provides a database of Pareto-optimal plans [32]. Pinnacle3 AP algorithm uses an iterative approach of progressive optimization without requiring any prior database of successful plans [26].

In this study we devised a fully automated pipeline for treating female SHL patients using Pinnacle3 AP. First, we designed a learning phase based on a trial-and-error approach to fit an optimization list that could satisfy a number of dosimetric acceptability criteria on the training set patients (as illustrated in the flowchart of Fig. 1). Then, we applied the obtained optimization list as an input for the AP algorithm on an independent validation set of patients.

On the whole, the analysis of the results on the validation set confirmed the behaviours observed in the training phase (Table 2 and Fig. 2). Namely, AP techniques seemed to favour sparing of healthy tissues over target coverage, in agreement with [26]. In particular, the requirement on PTV V107% ≤ 1% was violated when AP was applied to patients 6, 7 and 8 in the validation set. Indeed, in several patients a plan renormalization was necessary to fulfil the requirement of PTV V95% = 95% at the expense of the target high dose region.

Nonetheless, with the AP-ARC technique the V107% never exceeded 5%, while higher V107% values (≥ 12%) were obtained with the AP-BF technique. In this regard, AP-ARC proved able to largely reduce the gap between the manually optimized and AP-BF plans, as also reflected by the HIs (Table 4). On the other hand, AP plans succeeded in satisfying OAR requirements, with the exception of the heart mean dose for one training and two validation patients, and thyroid Dmean and V18Gy for one training patient. However, even in those cases the AP engine was able to outperform manual optimization.

The quantitative assessment of the DVHs (Fig. 4 and Table 4) revealed that, as a general rule, AP schemes performed at least as well as the manual approach. For lungs and heart, the dosimetric advantages translated into a significant reduction of morbidity risk estimates (Fig. 5 and Table 5). In addition, even when no statistically significant differences were found, the observed trends held for all the evaluated variables.

Fig. 5

Comparison of morbidity risk parameters for heart, lungs and thyroid for Manual-BF, AP-BF and AP-ARC (please note that patient number 8 underwent a thyroidectomy)

Of note, further refinements of the AP plans could be expected by running more than one AP optimization cycle.

Analogously, a general trend suggested that AP-ARC outperforms AP-BF, with the only exceptions of lungs V5Gy and breast Dmean, for which AP-BF provided slightly greater sparing. The out-of-phase behavior of the two lung metrics considered (namely V5Gy and V20Gy) translates into a similar result for the estimated radiation fibrosis risk and reflects the well-known conundrum “a lot to a little or a little to a lot” inherent to lung radiobiology.

The better performance of AP applied to the arc beam setting compared with the well-established “butterfly” technique can be explained by the increased number of beam entries, which gives the optimization algorithm more degrees of freedom to exploit in satisfying the objective list. This point is best demonstrated by the higher homogeneity of the target coverage and by the lower heart doses. In addition, the longer beam-on time for AP-ARC plans (by a factor of about 1.5) is outweighed by shorter in-room times compared with AP-BF plans, which involve a non-coplanar beam. This potentially reduces immobilization errors and makes treatments more comfortable. Of note, no difference between AP-ARC and AP-BF was observed in the non-target tissue mean doses.

Moreover, adopting the AP algorithm reduces the hands-on time on the TP system by an order of magnitude.

Summing up, the above results show that we succeeded in defining a procedure that fully automates the TP process for obtaining clinically acceptable SHL plans, despite the high inter-patient target variability (size and position) inherent to the disease. Standardization of the treatment is a direct consequence of the automation, thus guaranteeing the quality of treatment delivered in an arbitrary institution independently of the planner’s skills.

Finally, the flowchart devised for setting a single optimization objective list is not tied to the disease considered here and, as such, can be applied to any tumour site in order to remove the only operator-dependent task left by the Pinnacle3 AP optimization tool.

Conclusions

In this study, we demonstrated the feasibility of a completely automated pipeline based on Pinnacle3 AP for SHL plan optimization. The AP module was able to limit OAR doses, thus producing clinically acceptable plans of high quality without additional user interaction. On the whole, the AP engine combined with the arc technique represented the best option for SHL.