Background

Nasopharyngeal carcinoma (NPC), originating from the mucosal epithelium of the nasopharynx, is a heterogeneous malignancy highly prevalent in South China and Southeast Asia [1, 2]. Because of its concealed location and high radiosensitivity, radiotherapy has been the most effective treatment modality for NPC [1, 3]. Although excellent local control has been achieved with the wide use of intensity-modulated radiation therapy (IMRT), local recurrence remains an important cause of treatment failure in approximately 10% of advanced NPC [4,5,6]. Moreover, for patients with locally recurrent NPC, salvage treatment options are limited and the prognosis is poor, with 5-year overall survival ranging from 28 to 60% in patients with rT3-T4 disease [6,7,8].

The probability of local control is closely correlated with the dosimetric metrics of an IMRT plan. Therefore, some metrics have been recommended for evaluating the feasibility of an IMRT plan [9]. Previous studies have investigated how dosimetric metrics influence the local control rate of NPC patients [10, 11]. However, most of these studies did not consider enough of the significant metrics in the dose-volume histogram (DVH), nor did they exclude the effect of other clinical confounding factors, such as treatment modalities, on the prognosis of NPC. Consequently, reliable dosimetric metrics for IMRT plan evaluation remain scarce, and the association between dosimetric metrics and local recurrence has not yet been well established. It is important to identify the dosimetric metrics most relevant for describing the rationality of a dose distribution and for predicting local recurrence of NPC patients in routine practice.

In addition, owing to the technical advantages of IMRT, NPC treated with IMRT has distinct recurrence characteristics compared with NPC treated with two-dimensional or three-dimensional conformal radiotherapy [12, 13]. Although previous studies have examined local failure patterns of NPC treated with IMRT and indicated that local recurrence mainly occurred in the high-dose area [14, 15], it is critical to examine the recurrent sites and patterns in a larger cohort, which would provide insight into recurrence features, target contouring, and planning optimization of NPC in the IMRT era.

Therefore, the present study aimed at analyzing the effect of dosimetric metrics on local recurrence of NPC patients, and subsequently developing a predictive risk model for local recurrence free survival (LRFS) of patients. Moreover, the recurrent sites and patterns of NPC treated with IMRT were also elucidated.

Methods

Study population

Newly diagnosed NPC patients treated with curative-intent chemo-radiotherapy at West China Hospital between January 2010 and December 2015 were reviewed. The eligibility criteria were as follows: histologically confirmed NPC; no distant metastasis at initial diagnosis; and complete remission (CR) after initial treatment. The main exclusion criteria were regional lymph node recurrence alone, a history of other malignancy, or insufficient treatment or imaging data. The flowchart of patient selection is illustrated in Fig. 1.

Fig. 1

The flowchart of patient selection. NPC, nasopharyngeal carcinoma; IMRT, intensity-modulated radiation therapy; RT, radiotherapy; IC, induction chemotherapy; CC, concurrent chemotherapy; AC, adjuvant chemotherapy; PSM, propensity score matching

All patients were restaged according to the eighth edition of the American Joint Committee on Cancer (AJCC) staging system. This study was approved by the Ethics Committee on Biomedical Research of the hospital, and informed consent was waived.

Target definition and delineation

Target volumes were delineated according to the International Commission on Radiation Units and Measurements (ICRU) report 83 [16] and the treatment protocol of our cancer center. The nasopharynx gross tumor volume (GTVnx) and node gross tumor volume (GTVnd) were determined by physical, endoscopic, and imaging examinations. Positive retropharyngeal lymph nodes were delineated together with the GTVnx. For patients receiving induction chemotherapy (IC), the primary tumor volume before IC was used for GTVnx delineation, and the volume of lymph nodes after IC was used for GTVnd delineation. The high-risk clinical target volume (CTV1) was defined as the GTVnx plus a 5–10 mm margin and the whole nasopharyngeal mucosa. The low-risk clinical target volume (CTV2) was defined as CTV1 plus a 5–10 mm margin, which included the posterior part of the nasopharyngeal cavity, the posterior third of the maxillary sinus, the posterior ethmoid sinus, the inferior part of the sphenoid sinus and cavernous sinus, the skull base, the anterior third of the clivus and cervical vertebra, the parapharyngeal space, and the pterygopalatine fossa. These margins were initially obtained with automatic 3D expansion and then slightly adjusted manually according to tumor characteristics. The clinical target volume for the bilateral lymphatic drainage area (CTVnd) routinely included level II to V nodal regions. The planning target volumes (PTVs) were created by adding a 2–3 mm margin with automatic 3D expansion to the above target volumes.

In this study, two prescribed radiation doses were used for PGTVnx based on patients’ clinical stage. For patients with T4 classification or a bulky primary tumor, a dose of 74 Gy in 33 fractions at 2.24 Gy/fraction was administered to the PGTVnx. For the other patients, a dose of 70 Gy in 33 fractions at 2.12 Gy/fraction was administered. All patients received 70 Gy in 33 fractions to PGTVnd, 60 Gy in 33 fractions to PCTV1, and 56 Gy in 33 fractions to the PCTV2 and PCTVnd.

The planning goal was to deliver at least 95% of the prescription dose to 100% of the PTVs without exceeding the dose tolerance of organs at risk (OARs). We mainly followed the protocol of Radiation Therapy Oncology Group (RTOG) trial 0225 [17] and the protocol from a published study [14]. In short, the ideal maximal point dose was less than 54 Gy for the brainstem, optic chiasm and optic nerve, 45 Gy for the spinal cord, and 65 Gy for the temporal lobe. If these constraints could not be met, the acceptable criteria were less than 60 Gy to 1% of the volume for the brainstem, optic chiasm and optic nerve, less than 50 Gy to 1 cc for the spinal cord, and less than 70 Gy maximal point dose for the temporal lobe.
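
The two-tier OAR criteria above can be read as a small lookup table. The following Python sketch is illustrative only and is not part of the study's planning workflow: the organ keys, the metric labels ("Dmax", "D1%", "D1cc"), and the function name grade_oar are hypothetical, while the dose limits are transcribed from the paragraph above.

```python
# Dose limits in Gy, transcribed from the text; all names are hypothetical.
IDEAL_DMAX = {
    "brainstem": 54.0,
    "optic_chiasm": 54.0,
    "optic_nerve": 54.0,
    "spinal_cord": 45.0,
    "temporal_lobe": 65.0,
}

# Fallback ("acceptable") criteria: organ -> (metric label, limit in Gy).
ACCEPTABLE = {
    "brainstem": ("D1%", 60.0),
    "optic_chiasm": ("D1%", 60.0),
    "optic_nerve": ("D1%", 60.0),
    "spinal_cord": ("D1cc", 50.0),
    "temporal_lobe": ("Dmax", 70.0),
}

def grade_oar(organ, metrics):
    """Grade one organ as 'ideal', 'acceptable', or 'violation'.

    metrics maps metric labels to doses in Gy and must contain 'Dmax'
    plus the fallback metric used for that organ.
    """
    if metrics["Dmax"] < IDEAL_DMAX[organ]:
        return "ideal"
    metric, limit = ACCEPTABLE[organ]
    if metrics[metric] < limit:
        return "acceptable"
    return "violation"

# Hypothetical plan values, for illustration only.
print(grade_oar("brainstem", {"Dmax": 58.0, "D1%": 56.0}))
```

Keeping the ideal and fallback limits in separate tables mirrors the wording of the protocol, where the fallback is consulted only when the ideal point-dose limit cannot be met.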

IMRT was delivered with 6 MV X-ray beams modulated using either a step-and-shoot IMRT or a rotational technique (volumetric modulated arc therapy, VMAT). In addition, the technique of simultaneous integrated boost (SIB) was adopted in our center.

Chemotherapy

Patients with T1-2N0 disease received radiotherapy alone, and the other patients (T3-4/N+) were treated with radiotherapy combined with cisplatin-based chemotherapy. IC, concurrent chemotherapy (CC), and adjuvant chemotherapy (AC) were included in this study. The common IC and AC protocols included PF (cisplatin 80 mg/m2 d1-3 + 5-fluorouracil 800 mg/m2/day d1-5), TPF (docetaxel 60 mg/m2 d1 + cisplatin 60–80 mg/m2 d1-3 + fluorouracil 800 mg/m2/day d1-5), GP (gemcitabine 1000 mg/m2 d1 + cisplatin 80 mg/m2 d1-3), and TP (docetaxel 80 mg/m2 d1 + cisplatin 80 mg/m2 d1-3). CC consisted of cisplatin (80 mg/m2 d1-3) given every three weeks during radiotherapy.

Follow-up

Patients were evaluated weekly during radiotherapy with physical and hematological examinations. After treatment, follow-up visits were scheduled every three months in the first two years and every six months thereafter until death or loss to follow-up (the last follow-up was on Dec 31, 2019). Follow-up included physical examination, nasopharyngoscopy, magnetic resonance imaging (MRI) of the head and neck, computed tomography (CT) of the chest, ultrasound, CT, or MRI of the abdomen, and whole-body bone scan if necessary. Local recurrence was defined as the appearance of a new lesion at the primary site six months or more after the primary tumor had disappeared following radical treatment, and local recurrence free survival (LRFS) was defined as the duration from the date of diagnosis to the date of local recurrence. No patient was lost to follow-up in this study, and the median LRFS of these 493 patients was 58.4 months (Range, 7.6 to 100.6 months).

Propensity score matching

Propensity score matching (PSM) [18] was used to balance clinical variables affecting tumor prognosis between patients with and without local recurrence, so that the baseline characteristics of the two groups were comparable. Variables entered into the PSM model included age, gender, T classification, N classification, IC, CC, and AC. In this study, 1:2 matching was performed (Fig. 1).
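
The text does not state which matching algorithm or caliper was used, so the Python sketch below only illustrates one common way to perform 1:2 matching: greedy nearest-neighbor matching without replacement on propensity scores that are assumed to have been estimated already (for example, by logistic regression on the listed covariates). The function name, patient identifiers, and scores are hypothetical.

```python
def match_one_to_two(case_scores, control_scores, ratio=2):
    """Greedy nearest-neighbor matching without replacement.

    case_scores / control_scores map patient id -> propensity score.
    Returns a dict mapping each case id to a list of matched control ids.
    This is an assumed procedure, not necessarily the one used in the study.
    """
    available = dict(control_scores)  # controls not yet matched
    matches = {}
    # Process cases in ascending score order so the result is deterministic.
    for case_id, score in sorted(case_scores.items(), key=lambda kv: kv[1]):
        nearest = sorted(available, key=lambda cid: abs(available[cid] - score))
        chosen = nearest[:ratio]
        matches[case_id] = chosen
        for cid in chosen:
            del available[cid]  # each control is used at most once
    return matches

# Hypothetical propensity scores, for illustration only.
cases = {"r1": 0.30, "r2": 0.60}
controls = {"c1": 0.28, "c2": 0.33, "c3": 0.58, "c4": 0.65, "c5": 0.90}
print(match_one_to_two(cases, controls))
```

With 44 recurrent patients, a 1:2 ratio yields the 88 matched non-recurrent patients reported for the PSM cohort.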

Dosimetric metrics extraction and definition of failure patterns

In this study, we mainly focused on recurrence of the nasopharyngeal tumor (local recurrence). Dx was defined as the minimum absorbed dose that covers x% of the volume of the target. Dosimetric metrics of the PGTVnx from D5 to D95 in steps of 5% were calculated and extracted from the pretreatment DVH using an in-house script run in the RayStation (RaySearch Laboratories, Sweden) treatment planning system. In addition, D1, D2, D98, D99, Dave (the average dose of the target), Dmin (the minimum dose of the target), and Dmax (the maximum dose of the target) were also extracted. The homogeneity index HI# was defined as D5/D95 according to the report of AAPM Task Group 101 (TG 101) [19], and HI* was defined as (D2-D98)/D50 based on ICRU 83 [16].
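
To make the definitions of Dx, HI#, and HI* concrete, the following Python sketch computes them from a list of per-voxel doses. It is not the in-house RayStation script, which is not described here; it assumes equal-volume voxels so that volume fractions reduce to voxel counts, and the function names and example doses are hypothetical.

```python
import math

def dose_at_volume(voxel_doses, x):
    """Return Dx: the minimum dose received by the hottest x% of the volume.

    Assumes equal-volume voxels (an assumption of this sketch).
    """
    ranked = sorted(voxel_doses, reverse=True)    # hottest voxel first
    k = max(1, math.ceil(x * len(ranked) / 100))  # voxel count covering x%
    return ranked[k - 1]

def homogeneity_indices(voxel_doses):
    """Return (HI#, HI*) using the two definitions quoted in the text."""
    d = {x: dose_at_volume(voxel_doses, x) for x in (2, 5, 50, 95, 98)}
    hi_hash = d[5] / d[95]            # HI# = D5 / D95
    hi_star = (d[2] - d[98]) / d[50]  # HI* = (D2 - D98) / D50
    return hi_hash, hi_star

# Hypothetical target: 100 equal voxels with doses 1, 2, ..., 100 (arbitrary units).
doses = list(range(1, 101))
print(dose_at_volume(doses, 95), homogeneity_indices(doses))
```

A perfectly uniform target would give HI# = 1 and HI* = 0; both indices grow as the spread between near-maximum and near-minimum dose widens.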

For patients with local recurrence, the MRI images obtained at the time of local recurrence were first transferred to RayStation. The pretreatment planning CT served as the basis for registration; that is, the MRI images were moved to be registered with the CT images. Bony, vascular, and muscular structures adjacent to the failure were used to guide the co-registration process, which was repeated until satisfactory visual agreement was achieved between the MRI and CT images. Then, the recurrent tumor volume (RTV) was delineated on the MRI images and copied onto the pretreatment planning CT. Finally, the exact site and extent of each recurrent tumor were compared with the pretreatment planning CT, concentrating on the 95% isodose lines. Doses received by the RTV were calculated and analyzed with the DVH. The patterns of failure were classified as in field failure (at least 95% of the RTV was within the 95% isodose), marginal failure (20% to 95% of the RTV was within the 95% isodose), and outside field failure (less than 20% of the RTV was inside the 95% isodose) [20].
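
The three-way classification can be expressed as a threshold rule on the fraction of the RTV enclosed by the 95% isodose. The Python sketch below is an illustration under stated assumptions: the RTV is represented by equal-volume voxel doses, the 95% isodose is taken as 95% of a reference prescription dose passed in by the caller, and a value of exactly 95% is assigned to in field failure. The function name and example numbers are hypothetical.

```python
def classify_failure(rtv_doses, prescription_dose):
    """Classify a recurrence as 'in field', 'marginal', or 'outside field'.

    rtv_doses: planned doses (Gy) of equal-volume voxels inside the RTV.
    prescription_dose: reference dose (Gy) defining the 95% isodose;
    which prescription level applies is an assumption left to the caller.
    """
    isodose = 0.95 * prescription_dose
    inside = sum(1 for d in rtv_doses if d >= isodose)
    fraction = inside / len(rtv_doses)
    if fraction >= 0.95:
        return "in field"
    if fraction >= 0.20:
        return "marginal"
    return "outside field"

# Hypothetical RTV: 96 voxels at 70 Gy and 4 voxels at 50 Gy.
print(classify_failure([70.0] * 96 + [50.0] * 4, prescription_dose=70.0))
```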

Statistical analysis

Statistical analysis was performed using the SPSS software package (Version 22.0, IBM SPSS Inc) and the R software package (Version 3.5). Categorical variables were compared by the Pearson chi-square test. In univariate analysis, the log-rank test was used for categorical variables, such as age, gender, T classification, N classification, IC, CC, and AC, and a Cox regression model was used for continuous variables, such as dosimetric metrics. For factors with p < 0.05 in univariate analysis, a multivariate Cox regression analysis using a stepwise likelihood-ratio method was performed to identify key dosimetric metrics and develop a model for local recurrence, using the “survival” and “survminer” packages. Survival was estimated by the Kaplan–Meier method, and survival curves of different groups were compared by the log-rank test. A two-tailed p-value < 0.05 was considered significant.
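
The study's analyses were run in SPSS and R. Purely as an illustration of what the Kaplan–Meier method computes, the following dependency-free Python sketch implements the product-limit estimator; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up times (e.g. months); events: 1 = local recurrence
    observed, 0 = censored. Returns [(t, S(t))] at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored patients leave the risk set
    return curve

# Hypothetical follow-up data, for illustration only.
print(kaplan_meier([6, 12, 12, 24, 36], [1, 0, 1, 1, 0]))
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve, which is why the estimate only steps down at observed recurrences.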

Results

Patient characteristics

A total of 493 NPC patients were included in this study. Clinical characteristics and treatment modalities are summarized in Table 1. In detail, 44 patients had local recurrence and 449 patients did not. Before matching, the proportion of patients who received CC was significantly lower in the recurrent group than in the non-recurrent group (p = 0.042). To isolate the effect of dosimetric metrics on tumor recurrence, PSM was used to balance clinical variables that might affect tumor control, and a new cohort, the PSM cohort, was constructed. The new cohort included 44 recurrent patients and 88 non-recurrent patients, with no significant differences in observed baseline variables (p > 0.05) (Table 1).

Table 1 Clinical characteristics of patients with and without recurrence

Feature selection and prediction model development

Table 2 compares dosimetric metrics between recurrent and non-recurrent patients. Significant differences were found between the two groups in Dmax, D1, D2, D95, D98, and D99 (all p < 0.05), and D5 and Dmin approached significance (p = 0.057 and p = 0.073, respectively). Subsequently, a univariate analysis including clinical factors and all dosimetric metrics was conducted in the PSM cohort. None of the clinical factors was significantly associated with local recurrence (Table 3). However, eight dosimetric metrics, namely Dmax, D1, D2, D5, D95, D98, D99, and Dmin, were significantly associated with local recurrence (Fig. 2a and Additional file 1: Table S1), among which D95, D98, D99, and Dmin were protective factors. To identify the dosimetric metrics that most affected local relapse, a multivariate Cox regression was performed on the eight statistically significant variables from the univariate analysis. Only D5 (p = 0.002) and D95 (p < 0.001) were independent factors for predicting local recurrence (Fig. 2b and Additional file 2: Table S2). A predictive model was then constructed from the coefficients of these two dosimetric metrics in the Cox regression analysis, and the risk score was calculated as follows: Risk score = D5 * 0.0019 - D95 * 0.0030.
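
As a minimal sketch, the formula can be written as a two-coefficient linear function. The coefficients are taken from the text; the dose units, and how this raw value maps onto the score range reported in the next section, are not stated here, so the inputs below are hypothetical and only the direction of each effect should be read from the example.

```python
# Coefficients as reported in the text; the names are hypothetical.
COEF_D5 = 0.0019
COEF_D95 = 0.0030

def risk_score(d5, d95):
    """Raw linear combination: D5 * 0.0019 - D95 * 0.0030.

    The dose units are not specified in this section (assumption: both
    inputs use the same unit as in the original Cox model).
    """
    return COEF_D5 * d5 - COEF_D95 * d95

# Hypothetical inputs: raising D5 raises the score, raising D95 lowers it.
base = risk_score(1000, 1000)
print(base, risk_score(1100, 1000) > base, risk_score(1000, 1100) < base)
```

The opposite signs reflect the univariate findings: a higher near-maximum dose (D5) was a risk factor, while a higher near-minimum dose (D95) was protective.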

Table 2 Comparison of dosimetric metrics between recurrent and non-recurrent patients
Table 3 Univariate analysis of LRFS by log-rank test according to clinical factors
Fig. 2

Univariate and multivariate analysis of dosimetric metrics. a Eight dosimetric metrics were significantly correlated with LRFS in the univariate analysis using a Cox regression model; b Two dosimetric metrics were significantly correlated with LRFS in the Cox regression analysis using a stepwise likelihood-ratio method. LRFS, local recurrence free survival

Risk stratification and receiver operating characteristic (ROC) curve analysis

After the risk score was obtained for each patient according to the formula, patients were classified into low- and high-risk groups based on the median value of the score (Median value, 0.885; Range: 0.388–5.179). The distribution of the risk score along with the corresponding local recurrence data is shown in Fig. 3a. Patients with a risk score > 0.885 (high-risk group) tended to have a higher risk of local recurrence. The 3-year LRFS of patients in the high-risk group was significantly lower than that of patients in the low-risk group (66.2% vs 86.4%, p = 0.023) (Fig. 3b). Moreover, time-dependent ROC analysis was used to assess the predictive performance of the risk model. The area under the curve (AUC) of the prognostic signature was 0.706 and 0.681 for 3-year and 5-year LRFS, respectively (Fig. 3c). Furthermore, the AUC of the risk model for predicting local recurrence was higher than that of any other significant dosimetric metric from the univariate analysis (Fig. 3d).
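
The median split and the idea behind the AUC can be sketched in a few lines of Python. Note the simplification: the study used time-dependent ROC analysis, whereas the sketch below computes an ordinary AUC for a binary recurrence label (the probability that a recurrent patient has a higher score than a non-recurrent one, with ties counted as one half). The function names and all values are hypothetical.

```python
from statistics import median

def stratify_by_median(scores):
    """Return (cutoff, labels); scores above the median are 'high' risk."""
    cutoff = median(scores)
    return cutoff, ["high" if s > cutoff else "low" for s in scores]

def auc(scores, recurred):
    """Ordinary (not time-dependent) AUC via pairwise comparison."""
    pos = [s for s, r in zip(scores, recurred) if r]
    neg = [s for s, r in zip(scores, recurred) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and recurrence labels, for illustration only.
scores = [0.5, 0.7, 0.9, 1.2, 2.0, 3.1]
recurred = [0, 0, 1, 0, 1, 1]
print(stratify_by_median(scores), auc(scores, recurred))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect ranking, which gives context for the reported values of roughly 0.68 to 0.71.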

Fig. 3

Risk score calculated by the signature of D95 and D5, Kaplan–Meier survival analysis, and time-dependent ROC curve. a The distribution of risk score and survival status; b Kaplan–Meier analysis estimated LRFS of patients according to the median value of risk score; c ROC curve was plotted for 1-, 3-, and 5-year LRFS; d Compared with other dosimetric metrics for predicting local recurrence, the risk model including D5 and D95 had the highest AUC value. LRFS, local recurrence free survival; ROC, receiver operating characteristic curve; AUC, area under curve

Comparison of HI with the risk model

To investigate the relationship between HI and LRFS of NPC patients, univariate analysis was performed. When patients were divided at the median value of HI# (Median value, 1.09; Range: 1.04–1.20), those with lower HI# had significantly longer LRFS than those with higher HI# (HR 1.86; 95% CI 1.03–3.36; p = 0.042) (Fig. 4a). However, there was no statistical difference in LRFS between patients with lower HI* (Median value, 0.11; Range: 0.05–0.24) and those with higher HI* (HR 1.46; 95% CI 0.81–2.64; p = 0.212) (Fig. 4b). Subsequently, the predictive performance of HI# and the risk model for local recurrence was compared. The AUC of HI# was lower than that of the risk model (0.663 vs 0.679), although the difference was not statistically significant (Fig. 4c). Furthermore, the AUC of HI# was 0.686 and 0.665 for 3-year and 5-year LRFS, respectively (Fig. 4d), which was still lower than that of the risk model (0.706 and 0.681, respectively).

Fig. 4

Kaplan–Meier survival analysis, and time-dependent ROC curve of HI. a Kaplan–Meier analysis estimated LRFS of patients according to the median value of HI#; b Kaplan–Meier analysis estimated LRFS of patients according to the median value of HI*; c Comparison of predictive power between HI# and the risk model; d ROC curve of HI# was plotted for 1-, 3-, and 5-year LRFS. HI, homogeneity index; HI# was defined as D5/D95; HI* was defined as (D2-D98)/D50

Recurrent characteristics

To examine the characteristics of recurrent tumors, the sites invaded by the initial and recurrent tumors were compared. The most common recurrent site was the nasopharyngeal cavity (n = 27, 61.4%), followed by the clivus (n = 23, 52.3%) and the pterygopalatine fossa (n = 18, 40.9%). We then compared the volume and isodose curves of recurrent tumors with those of the corresponding initial tumors. The topographic analysis showed that recurrent lesions in 30 (68.2%), 9 (20.5%) and 5 (11.3%) patients were classified as in field recurrence, marginal recurrence and outside field recurrence, respectively. Representative illustrations of the three types of recurrence are presented in Fig. 5.

Fig. 5

Three different types of recurrent patterns of NPC treated with IMRT. a in field failure; b marginal failure; c outside field failure. Left, pretreatment magnetic resonance imaging (MRI); Middle, the recurrent tumor was transferred from the MRI at the time of recurrence to the planning computed tomography (CT) to present doses to the recurrent sites; Right, MRI at the time of recurrence. The green line represented the initial gross target volume; The pink line represented the recurrent tumor volume; The red color-wash represented 70 Gy, yellow represented 66 Gy, and blue represented 60 Gy

Discussion

In this study, we examined the impact of dosimetric metrics on local recurrence and analyzed recurrent characteristics of NPC patients treated with IMRT. We found that eight dosimetric metrics and HI# were significantly associated with local recurrence of NPC patients while only D95 and D5 were independent prognostic factors. More importantly, a novel model constructed with these two factors could effectively predict the risk of local recurrence. Moreover, we found that in field recurrence was still the main failure pattern of NPC with IMRT, and nasopharynx cavity, clivus, and pterygopalatine fossa were the frequently recurrent sites.

IMRT was a major breakthrough in the treatment of NPC that dramatically improved the local control rate (LCR), with a 5-year LCR of 95% for T1-2 disease and 80%-88% for T3-T4 disease [21,22,23]. The improved LCR was associated with the high target dose coverage and conformity of IMRT plans [24, 25]. However, it is sometimes difficult to balance the risk of serious late injuries against the risk of local recurrence due to inadequate target coverage in an IMRT plan, especially in NPC patients with advanced stage [9]. Additionally, the quality of IMRT planning differs with the physician’s capabilities and personal experience. The dose coverage and uniformity of the target might be inferior in certain patients, thus leading to tumor relapse. Therefore, identifying reliable dosimetric metrics associated with treatment outcomes is important.

In the current study, univariate and multivariate Cox regression analyses were carried out to identify important dosimetric indices for predicting local recurrence. First, we found that T classification was not correlated with LRFS, indicating that T classification alone had less power to divide patients into different risk groups in the IMRT era, which was similar to the results of other studies [14, 15]. However, it should be noted that patients in this study with T4 or bulky primary tumors received a higher prescription dose (74 Gy), and the conclusion might be different if these patients had received a lower dose (70 Gy). Although patients with advanced T classification usually had a larger tumor volume, the sophisticated IMRT technique greatly improved the dose distribution and reduced the proportion of insufficient dose-related recurrence compared with two-dimensional or three-dimensional conformal radiotherapy. In contrast, the univariate analysis found that eight dosimetric metrics were associated with LRFS, among which D5, D2, D1, and Dmax reflected the near-maximum dose of the target volume, while D95, D98, D99, and Dmin reflected the near-minimum dose of the target volume to some degree, suggesting that local failure might be associated with dose homogeneity of the target volume. Similarly, other studies also indicated that both D95 and Dmin were significantly associated with local recurrence [10, 11]. Subsequently, the multivariate analysis demonstrated significant prognostic value of D5 and D95 for the LRFS of NPC patients. A cumulative risk score consisting of these two dosimetric metrics was calculated, which indicated that this two-parameter dosimetric signature independently predicted LRFS in NPC patients. The AUC of the ROC curve exceeded 0.7 when assessing the accuracy of the signature for 3-year LRFS, suggesting that the established risk model was reliable.

In this risk model, D95 and D5 were ultimately incorporated to predict local recurrence. Although ICRU report 83 has recommended D50 as a dose-volume parameter for evaluating IMRT plans [16], it was poorly adopted in academic institutions according to a survey [26]. Furthermore, consistent with other studies, there was no correlation between D50 and local recurrence [27]. In contrast, D95 is a commonly used dose-volume constraint in clinical practice, and some clinical trials use this metric to determine the prescription dose [17]. In fact, the significance of D95 in an IMRT plan is similar to that of D98 to some extent. As D95 increased, the near-minimum dose of the target increased, thus increasing the overall absorbed dose of the target volume, which was helpful for local tumor control. One study suggested that a Dmin to the GTVnx ≥ 54.0 Gy conferred better local control in NPC patients with T3 and T4 disease [11], and another study also indicated that patients who received at least 66.5 Gy to the primary GTV were less likely to have local failure [14]. By contrast, the significance of D5 was similar to that of D2. As D5 increased, the near-maximum dose of the target rose correspondingly. Hence, the dose uniformity of the target decreased, which might be harmful to local tumor control. In essence, the two dose-volume metrics of this risk model determined the shape and trend of the dose curve in the DVH (vertical drop or not) to some extent, reflecting how homogeneous the absorbed-dose distribution in the target was [16]. From this point of view, the risk model developed in this study was reasonable.

This risk model included D5 and D95, which are also the parameters used to calculate HI#. HI# is also a commonly used dosimetric parameter for treatment plan reporting recommended by TG101 [19]. Hence, a univariate analysis was performed to examine the relationship between HI# and local recurrence. We found that patients with higher HI# had significantly shorter LRFS than those with lower HI#. Dose HI reflects the uniformity of the absorbed dose distribution in the target volume. As HI# increased, the dose in the “hot spot” of the target volume increased and the dose in the “cold spot” decreased. This meant that the IMRT plan itself was difficult, and dosimetrists might have sacrificed dose coverage and uniformity of the target to reduce doses to OARs, thus increasing the risk of tumor relapse. In addition, we did not find that HI* was statistically associated with LRFS. A possible reason is that the formula used to calculate HI* according to ICRU 83 includes three parameters, D2, D98, and D50, none of which was an independent prognostic factor in our multivariate analysis, which was also consistent with the study of Wang et al. [27]. Therefore, our study added evidence that HI# might be a more promising parameter for IMRT evaluation than HI*.

Although HI# was shown to be an indicator for predicting local recurrence, its predictive power was lower than that of the risk model according to the AUC, especially for 3-year LRFS. Therefore, compared with HI#, the risk model that we established was preferable for predicting local recurrence of NPC patients.

In addition, the results of one study exploring the influence of target dosimetry on tumor recurrence in NPC differ from ours [27]. That study concluded that D90 was the independent dosimetric parameter for predicting tumor recurrence and that patients with D90 < 101% had a higher incidence of local–regional recurrence than those with D90 > 101%. A possible reason underlying the inconsistent conclusions is that their study focused on both local and regional lymph node recurrence, while ours focused only on local recurrence. Theoretically, the impact of dosimetric metrics on the primary tumor is greater than on regional lymph nodes, because the latter are more influenced by anatomical change and positioning errors during radiotherapy [28,29,30], which might ultimately affect the analysis of the effect of dosimetric metrics on treatment outcome. From this point of view, it might be more reasonable to focus only on local recurrence when analyzing the effect of dosimetric metrics on treatment outcome, and to include more factors when analyzing local–regional recurrence. Nevertheless, more studies are needed to validate these conclusions.

Previous studies have shown that local relapse of NPC mainly occurs in the high-dose area. Yang et al. analyzed 212 NPC patients undergoing IMRT and found that 18 patients developed local recurrence, 15 (83.3%) of which were confirmed as in field failure [31]. Wang et al. also reported that in field failure was the main pattern associated with local–regional recurrence of NPC [15]. The present study further confirmed this conclusion: in field failure was found in 68.2% of recurrent patients, while marginal and outside field failure were uncommon. Taken together, these results suggest that the CTV definition and delineation currently used are large enough, given the low incidence of outside field failure. Hence, further reducing CTV coverage to reduce late complications is an important direction to explore in the IMRT era [31, 32].

This study has several limitations. Although the PSM method was adopted, selection bias was inevitable. In addition, because of the limited number of patients with local recurrence in our center, we did not have enough patients to construct an independent cohort to validate the risk model. Finally, we focused only on the dosimetric metrics of PGTVnx; other targets and OAR sparing might also affect patient outcomes. Given these limitations, more studies, including multicenter studies, are warranted.

Conclusion

Taken together, this study examined the association between dosimetric metrics and clinical outcome. We established a novel risk model that could effectively predict LRFS in patients with NPC, which would benefit patients at high risk of local recurrence. Moreover, our real-world data added evidence that D95, D5, and HI# (coexisting high hot spot and low cold spot) are important metrics for dose constraint and evaluation in IMRT planning.