Introduction

Craniopharyngiomas are locally aggressive tumors that are typically centered in the sellar and suprasellar region, near a number of critical neural and vascular structures mediating endocrinologic, behavioral, and visual functions. Multimodality therapy for these tumors can be challenging, given the significant potential for harm with any intervention involving the structures in this region, as well as the young age of many of the patients with these lesions.

For many years, gross total resection (GTR) was felt to be the treatment of choice, given the better rates of tumor control compared to subtotal resection (STR) alone, and the avoidance of radiotherapy in young patients. Experience with aggressive surgical resection has led some to conclude that a general goal of GTR in all cases might lead to unacceptable rates of endocrinologic and behavioral morbidity, and that GTR could be replaced with subtotal debulking followed by stereotactic conformal radiotherapy [1–4]. This potential benefit must be weighed against the risk of radiation-induced injury to nearby radiosensitive normal structures, notably the optic chiasm [5, 6]. The present study aims to summarize and compare the published literature regarding morbidity resulting from treatment of craniopharyngioma.

Materials and methods

Article selection

Articles were identified via a PubMed search using the key phrases “craniopharyngioma,” alone and in combination with “optic,” “morbidity,” “pituitary,” “hypopituitarism,” “endocrine,” “complications,” “visual,” and “vision.” Inclusion criteria were: (1) All patients had to have follow-up data available. (2) Articles had to have enough information for each patient to be disaggregated. The rationale for the exclusion of aggregated data sets is described below. Exclusion criteria were: (1) Articles that combined patient outcomes of craniopharyngioma with other childhood tumors were excluded, unless there was a clear distinction between the two separate groups of patients. (2) Rathke’s Pouch tumors were excluded from this study. After reviewing these articles, a thorough review of all referenced sources was also performed.

All references that contained disaggregated data specifically addressing complication rates with adequate follow-up in patients who had undergone surgical resection with or without fractionated radiotherapy (fXRT) or stereotactic radiosurgery (SRS) as adjuvant or monotherapy were included in our analysis. In addition, we included patients undergoing surgery via the extended transsphenoidal route in the surgery cohort, where appropriate.

Data extraction

Our searches yielded a total of 274 studies [3, 7–279] (Table 1) reporting data for 8058 non-duplicated craniopharyngioma patients. After analyzing the complication rates in aggregated cohorts, we concluded that complication rates were likely underreported in aggregated data sets (for example, the reported rate of endocrinopathy with GTR was 2.1% in the aggregated data sets, compared with a more realistic 52% in series which presented individual patients separately). Thus, we chose to analyze only the disaggregated data sets.

Table 1 PubMed IDs of identified studies

Disaggregated data useable for analysis were presented for 800 individual patients. Data were stratified into five groups based on treatment paradigm: STR alone, GTR alone, STR plus adjuvant post-operative radiotherapy (STR + XRT), biopsy followed by fractionated radiotherapy (fXRT), or biopsy followed by radiosurgery (SRS) alone. We did not directly compare the radiation only groups (SRS and fXRT) to the surgically treated patients.

Endocrinopathy was defined as the development of any new monohormonal or polyhormonal anterior hypopituitarism, or diabetes insipidus. Vascular injury referred to any gross injury to the circle of Willis or perforating vessel, as well as any reported post-treatment cerebral infarction. Neurologic injury referred to the development of a post-treatment non-endocrine, non-visual neurologic deficit. Visual deterioration was defined as permanent loss or decrease in visual acuity, or a new visual field cut in either eye. Studies which did not present patient data in a way that these variables could reliably be determined were excluded from further analysis.

Statistical analysis

Pearson’s χ2 test was used to analyze for differences in categorical factors. Fisher’s exact test was used if there were fewer than five values per cell. Analysis of variance (ANOVA) was used to evaluate for statistical differences in pre-operative continuous factors, including age and tumor size. Post-hoc between-group analyses were performed using Tukey’s test when the ANOVA demonstrated P < 0.05. All analyses were carried out using SPSS version 16.0 (SPSS, Inc.).
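The test-selection rule described above can be sketched in a few lines. This is an illustration in plain Python, not the SPSS procedure the analysis actually used; the function names and the 2 × 2 counts are hypothetical and are not taken from the data set.

```python
from math import comb


def needs_exact_test(table, threshold=5):
    """Return True if any cell count falls below the threshold (five here)."""
    return any(cell < threshold for row in table for cell in row)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: rows are two groups, columns are complication yes/no.
example = [[3, 1], [1, 3]]
use_fisher = needs_exact_test(example)
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With counts this small the exact test is selected, and the P-value is the sum of the hypergeometric probabilities of all tables at least as extreme as the one observed.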

Logistic regression analysis

Univariate analysis was used to identify covariates which might affect the rate of neurologic, endocrinologic, vascular, or visual complications in these patients. Binary and categorical variables were compared using Pearson’s χ2 test, or the χ2 test for trend, respectively. Cut-offs for variables were determined empirically by first analyzing the data in smaller categories, and then aggregating groups which seemed statistically homogeneous. Variables which impacted rates of complication with P ≤ 0.2 on univariate analysis were included in stepwise binary logistic regression modeling [280]. All odds ratios on multivariate analysis reflect the risk of having a neurologic or endocrine complication compared to the reference group. Reference groups included the STR + XRT cohort for the extent of resection analysis and the large sample size studies for the surgeon experience analysis. The goodness of fit of the regression model was confirmed by demonstrating a non-significant P-value on the Hosmer-Lemeshow test [280, 281].

We tested interaction terms between each of the three variables found to significantly impact complication rates on univariate analysis. The statistical significance of the interactions was assessed with the use of backward stepwise regression, in which statistical significance was estimated by means of the likelihood-ratio test to assess the effect of removing interaction terms for all strata of the given variable [280]. After finding that none of the interaction terms would significantly (unadjusted P > 0.2 for all terms) alter the log likelihood of the regression model if removed, we calculated the adjusted odds ratios without adjusting for interactions.
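For readers less familiar with this type of modeling, the quantities involved can be written compactly. This is the generic textbook form of binary logistic regression and the likelihood-ratio test, not a reproduction of the SPSS output; the symbols x_j, β_j, G, and k are notation introduced here for illustration.

```latex
\begin{align}
\log\frac{P(Y=1\mid x)}{1-P(Y=1\mid x)} &= \beta_0 + \sum_{j}\beta_j x_j \\
\mathrm{OR}_j &= e^{\beta_j} \\
G &= -2\left[\ln L_{\text{reduced}} - \ln L_{\text{full}}\right] \sim \chi^2_k
\end{align}
```

Here Y = 1 denotes a complication, OR_j is the adjusted odds ratio for covariate x_j relative to its reference group, and k is the number of interaction terms removed from the full model. A non-significant G indicates that dropping those terms does not meaningfully change the fit.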

Of note, while tumor size is a variable of interest, it is inconsistently presented in most studies, and we were only able to collect data on tumor size for 87 patients. Because these data were not enough to include in the multivariate regression modeling, we present only univariate data regarding the relationship between tumor size and complications.

Results

Clinical characteristics of included patients

In our data set, 540 patients underwent surgical resection of their tumor. GTR was achieved in 289 cases, while STR was achieved in 251 cases. Of patients receiving STR, 110 patients received fXRT, and 141 did not. Surgical patients in different cohorts did not differ in mean age at the time of surgery, gender distribution, or pre-operative tumor size.

A total of 138 patients received biopsy alone followed by some form of radiotherapy. Of these, 72 received fXRT and 66 received SRS. Age and gender distribution did not differ between patients receiving fXRT and SRS.

The remaining patients did not have adequate data to facilitate our analysis and were excluded. Mean overall follow-up for all patients in these studies was 54 ± 1.8 months.

The mean number of patients per study was 5.8 (range 1–45). Of the patients in the study, 34% came from series of 10 or fewer patients, 22% came from series of 11–20 patients, and 43% came from series larger than 20 patients. The modest sample sizes in many of the included studies are an inevitable consequence of the decision to use only disaggregated data, as larger studies generally provide summary statistics instead of individual patient data and, as discussed below, frequently do not report rates of morbidity in a way which can be realistically analyzed.

The rates of neurologic injury after treatment for craniopharyngioma

New neurologic deficits were reported in 5.1% of patients undergoing surgery (95% CI = 3.3–7.1), and in 2.2% of patients undergoing fXRT or SRS alone (95% CI = 0–4.6) (Table 2). There was no statistically significant difference in the rates of neurologic deficits between patients receiving GTR alone, STR alone, or STR + XRT (6.9 vs. 4.2 vs. 1.8%, P = NS) in the univariate analysis, or between patients receiving fXRT and those receiving SRS (1.4 vs. 3.0%, P = NS) (Fig. 2).

Table 2 Summary of overall rates of morbidity

Interestingly, on multivariate analysis, GTR conferred a significant increase in the risk of neurologic deficits compared to STR + XRT (OR = 5.05, 95% CI = 1.15–22.21, P = 0.03) (Table 5), after controlling for study size.

Gross total resection markedly increases the rate of endocrinopathy

The overall rate of new endocrinopathy for all patients undergoing surgical resection of their mass was 37% (95% CI = 33–41) (Table 2). Table 3 summarizes the rates of individual endocrinopathies. Significant monoendocrinopathies of TSH and ACTH were reported commonly, as was diabetes insipidus (DI). Anterior panhypopituitarism was reported in as many as 11.8% of patients undergoing GTR.

Table 3 Summary of various types of monohormonal and polyhormonal endocrinopathy following treatment of craniopharyngioma

Patients receiving GTR had over 2.5 times the rate of developing at least one endocrinopathy compared to patients receiving STR alone or STR + XRT (52 vs. 19 vs. 20%, χ2 P < 0.00001) (Fig. 1). The rates of endocrinopathy for patients receiving fXRT alone were not reported, and thus these data were not analyzed.
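The three-group comparison above can be approximated from the reported figures. The sketch below is an illustration only: the event counts are reconstructed by applying the reported percentages (52, 19, and 20%) to the cohort sizes (289 GTR, 141 STR alone, 110 STR + XRT) and rounding, so they are assumptions rather than the exact counts in the data set, and the function names are hypothetical.

```python
from math import exp


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value_df2(stat):
    """Upper-tail P-value; for exactly 2 degrees of freedom it is exp(-x / 2)."""
    return exp(-stat / 2)


# Rows: GTR, STR alone, STR + XRT. Columns: endocrinopathy yes / no.
# Counts are approximate reconstructions from the reported percentages.
table = [[150, 139], [27, 114], [22, 88]]
stat, df = pearson_chi2(table)
p_value = chi2_p_value_df2(stat)
```

Under these reconstructed counts the P-value falls far below 0.00001, in line with the reported result.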

Fig. 1

Comparison of 95% confidence intervals of rates of endocrinopathy for patients treated with different modalities for craniopharyngioma

On multivariate analysis, GTR conferred a significant increase in the risk of endocrinopathy compared to STR + XRT (OR = 3.45, 95% CI = 2.05–5.81, P < 0.00001) (Table 5), after controlling for study size and the presence of significant hypothalamic involvement.
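For orientation, a crude (unadjusted) odds ratio can be computed from a 2 × 2 table. The sketch below is not the logistic regression reported in Table 5: it uses counts reconstructed from the reported percentages (about 150 of 289 GTR patients and 22 of 110 STR + XRT patients with endocrinopathy), which are assumptions, and a Woolf (log-based) confidence interval chosen here for simplicity. The function name is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf 95% CI for the 2x2 table [[a, b], [c, d]].

    a, b: events / non-events in the exposed group (here GTR).
    c, d: events / non-events in the reference group (here STR + XRT).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Reconstructed, approximate counts (not the authors' exact data).
crude_or, ci_low, ci_high = odds_ratio_ci(150, 139, 22, 88)
```

The crude estimate comes out at roughly 4.3. It is not directly comparable with the adjusted OR of 3.45, which additionally controls for study size and hypothalamic involvement.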

The rates of vascular injury after surgery for craniopharyngioma

Vascular injury was an uncommon complication of craniopharyngioma surgery, occurring in just two cases (95% CI = 0–0.9) (Table 2), both resulting in ischemic cerebral infarction. No vascular complications were reported in any patients undergoing radiation treatment alone.

The rates of visual decline after treatment for craniopharyngioma

Visual decline was reported in 3.7% of patients undergoing surgery (95% CI = 2.1–5.3), and 8.6% of patients undergoing fXRT or SRS alone (95% CI = 3.9–13.4) (Table 2). There was a statistical trend towards worse visual outcomes in patients receiving XRT after STR compared to GTR or STR alone (GTR = 3.5% vs. STR 2.1% vs. STR + XRT 6.4%, P = 0.11). Visual outcomes did not differ between patients receiving fXRT and those receiving SRS (11.1 vs. 6.1%, P = NS) (Fig. 2).
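The confidence intervals quoted in this section appear consistent with a normal-approximation (Wald) interval for a proportion. That is an inference from the reported numbers, not a method stated in the text. The sketch below assumes 20 events among the 540 surgical patients (3.7%) for visual decline, a count reconstructed from the percentage, together with the two vascular injuries reported among surgical patients; the function name is hypothetical.

```python
from math import sqrt


def wald_ci(events, n, z=1.96):
    """Normal-approximation 95% CI for a proportion, clipped to [0, 1]."""
    p = events / n
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Visual decline after surgery: 20 of 540 is a reconstructed count.
visual_low, visual_high = wald_ci(20, 540)

# Vascular injury after surgery: two cases among 540 surgical patients.
vascular_low, vascular_high = wald_ci(2, 540)
```

Expressed as percentages and rounded to one decimal place, these reproduce the reported 2.1–5.3 and 0–0.9 intervals. The lower bound for rare events is clipped at zero, which is a known weakness of the Wald interval for small proportions.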

Fig. 2

Comparison of 95% confidence intervals of rates of neurologic injury for patients treated with different modalities for craniopharyngioma

Univariate risk factors for morbidity after craniopharyngioma surgery

Table 4 demonstrates univariate comparisons of rates of morbidities with patients stratified by age, tumor size, and extent of hypothalamic involvement. Patient age was not a significant univariate predictor of any form of complication, regardless of where the cut-off was placed (one such analysis is provided in Table 4). Extensive hypothalamic involvement did increase the risk of endocrinologic complications; however, this was not an independent predictor of morbidity when extent of resection was controlled for (Table 5).

Table 4 Univariate analyses of the effect of potential confounding variables on rates of neurologic, endocrinologic, vascular, and visual morbidity in the published literature
Table 5 Multivariate logistic regression demonstrating the risk of neurologic and endocrinologic morbidity controlling for extent of resection, study size, and presence of extensive hypothalamic involvement

As stated in the methods, tumor size was reported for only 87 patients, so we present only univariate data regarding the relationship between tumor size and complications. Patients undergoing surgery for tumors with a size greater than or equal to 3 cm experienced a statistical trend towards a higher rate of neurologic complications compared to patients with smaller tumors (10 vs. 0%, P = 0.06) (Table 4). The rates of endocrinologic, vascular, and visual complications were similar between patients with larger and smaller tumors (Fig. 3).

Fig. 3

Comparison of 95% confidence intervals of rates of visual compromise for patients treated with different modalities for craniopharyngioma

The effect of surgical experience on rates of morbidity following treatment for craniopharyngioma

Surgical experience is generally not specifically reported in craniopharyngioma series in the literature. Using the sample size in these studies as a surrogate for surgical experience, we found a significant reduction in rates of most complications with increasing sample size (Table 4). More specifically, there was a linear decrease in the rates of neurologic complications with increasing study sample size (1–10 patients: 9%, 11–20 patients: 2%, 21+ patients: 1%, P < 0.0001), and a decrease in rates of endocrinopathy favoring studies larger than 20 patients (1–10 patients: 2%, 11–20 patients: 7%, 21+ patients: 1%, P < 0.01). Vascular injuries were also only reported in small series (1–10 patients: 1%, 11–20 patients: 0%, 21+ patients: 0%, P = 0.12); however, this did not reach statistical significance due to the small number of events. Rates of visual complications did not differ between groups of different sample size (1–10 patients: 6%, 11–20 patients: 2%, 21+ patients: 4%, P = 0.45).

On multivariate analysis, studies with small sample sizes remained an independent predictor of increased rates of neurologic morbidity compared to those with large sample size (OR = 2.54, 95% CI = 1.02–6.29, P = 0.04) (Table 5), after controlling for treatment paradigm administered. Study sample size was not an independent predictor of endocrinologic morbidity on multivariate analysis.

Discussion

Currently, there are few definitive studies regarding management strategies for patients with craniopharyngiomas. This study aims to utilize a systematic collection of the published data on craniopharyngioma, to critically evaluate the idea that subtotal resection with adjuvant radiotherapy can serve as a desirable replacement for gross total resection, with lower post-surgical rates of monohormonal and panhormonal hypothalamic/pituitary endocrinopathy.

Based on our analyses, endocrinopathy is a common iatrogenic complication of treatment of craniopharyngiomas with either surgical resection or radiosurgery; not unexpectedly, however, it is more frequent with surgery. Gross total resection markedly increases the rate of hormonal disturbances compared to subtotal resection with or without radiotherapy, with over half of patients experiencing at least one endocrinopathy, and over 10% of patients being left with panhypopituitarism. Even after correcting for as many covariates as the data would allow, GTR still increased the odds of post-operative endocrinopathy by over threefold. These data support the concept that attempting gross total resection in these lesions is a difficult trade of endocrine function for tumor control, and that subtotal resection might be better for the patient’s overall quality of life. Further, previous work focusing on craniopharyngioma suggests that craniopharyngiomas frequently recur despite GTR, and that subtotal resection followed by adjuvant radiotherapy provides similar rates of tumor control to that achieved with gross total resection [282]. Taken together, these data argue that gross total resection of craniopharyngiomas may not be the best treatment option for most patients; however, further work specifically focusing on the quality of life concerns of patients undergoing these treatment paradigms is needed to definitively conclude that STR + XRT is better for patients than GTR.

The remaining argument against adjuvant radiation in this region is the risk of visual deterioration when radiotherapy or radiosurgery is administered in close proximity to the optic apparatus. In our study, all irradiated patient cohorts experienced an increased rate of visual compromise compared to surgery-only cohorts. While the worse visual outcomes with adjuvant radiotherapy following subtotal cytoreductive surgery did not reach statistical significance, it is possible that a lack of statistical power in our cohort masked a difference that a larger study would have demonstrated. It is important to note that this difference might be as large as a 10% increase in visual decline over GTR at the extremes of the confidence intervals, which would be a serious argument in favor of eliminating the need for XRT with GTR. This idea clearly deserves additional inquiry, especially with regard to directing subtotal resections towards creating distance between the tumor and the optic apparatus.

Additionally, our analysis demonstrates that experience with craniopharyngiomas matters, and that experienced surgeons have lower rates of complications, even after correcting for tumor characteristics and surgical philosophy. More specifically, using study size as a surrogate for surgical experience, more experienced surgeons have lower rates of neurologic complications than less experienced surgeons. We speculate that increased familiarity with the pathoanatomy of these tumors, combined with decreased need for brain retraction by experienced surgeons, leads to decreased rates of injury of surrounding neural structures.

Study limitations

While these findings represent a helpful summary of the published literature on this topic, an analysis of published data is limited by the data published by others, and may reflect source study biases. For example, the majority of this data set is derived from self-reported outcomes largely assessed by the treating surgeon and colleagues. It is impossible for us to assess or control for the quality of the data reported in the literature, or the unwillingness to report complications. Such omissions would inevitably change the rates reported in our study. Further, subjectively defined variables, such as histologic diagnosis, extent of resection, and the adequacy of radiation therapy, likely vary between studies, and we cannot independently confirm the validity of these definitions in other groups’ publications.

Finally, due to the diverse range of data presentation, the number of variables able to be studied and controlled for is limited. Variables that might be of interest but are inconsistently presented across studies cannot be reviewed. Specifically, we cannot completely control for the effect of tumor size on rates of complications, because these data are not consistently published.

Conclusion

In this study, we summarized and compared the rates of endocrine, neurologic, vascular, and visual complications reported in craniopharyngioma patients. Given the difficulty in obtaining class 1 data regarding the treatment of this tumor, this study can serve as an estimate of expected outcomes for these patients, and guide decision making until these data become available.