Introduction

The brain can be a sanctuary for metastatic cancer because many anti-cancer drugs are unable to cross an intact blood–brain barrier, allowing tumors to grow even when extracranial disease is effectively treated with chemotherapy. Most authors believe metastatic brain involvement is a growing problem, although the true incidence is unknown [1]. The available estimates are probably too low, as registries are often incomplete, neuroimaging is withheld in asymptomatic patients, and autopsy studies are outdated [1]. Moreover, even contemporary brain imaging technologies have limitations [2–4].

Treatment options for patients with brain metastases include open brain surgery, stereotactic radiosurgery (SRS), whole brain radiotherapy (WBRT), supportive care with corticosteroids alone, or combinations of these. The aim is to relieve symptoms and to maintain intracranial tumor control throughout the course of the disease. The median survival in patients diagnosed with brain metastases is only 4 months, although the variation between patients is considerable [5]. It is therefore crucial to match the appropriate therapy to the right patient. Unwarranted treatment may waste valuable and already limited time for these patients and can cause serious side effects. Ineffective and excessive treatment should also be avoided because aggressive therapy for brain metastases is associated with significant costs [6, 7]. The choice of treatment often depends on the clinical condition of the patient, the number of metastatic lesions, the depth and eloquence of lesions, their mass effect, and tumor size [8–10]. Hence, neurosurgeons usually meet a highly selected patient population.

Several risk stratification scores have therefore been suggested to aid prognostication and to guide treatment strategies [11–14]. Attempts to identify short survival (≤2 and ≤3 months) using various indices alone or in combination have nevertheless been disappointing [15, 16]. The graded prognostic assessment (GPA) has recently been developed to predict survival in cancer patients with brain metastases [14, 17] and its use has been supported [18]. GPA is based on objective and measurable parameters. It has also been validated outside randomized clinical trials [19, 20].

To our knowledge, there are two studies validating GPA in surgically treated patients [20, 21]. One study included patients who underwent both surgery and WBRT, excluding all perioperative deaths and cases of serious surgical morbidity that could warrant only supportive care without WBRT. This selection criterion likely limits external validity. Also, patients who had undergone prior WBRT (which is, for example, standard therapy for all small cell lung carcinomas) or who received adjuvant treatment other than WBRT were not included. This could limit extrapolation of results to an unselected everyday neurosurgical population. The other study included only patients with single brain metastases, likely excluding patients with the worst prognoses. In both studies, patients with reoperations seem to be excluded or at least not accounted for.

In the present study, we aimed to explore the prognostic capabilities of the GPA in an unselected, consecutive, neurosurgical population of patients with brain metastases. Although the GPA instrument was developed to predict patients’ survival, it was also of interest to evaluate whether GPA scores can provide information on the safety of the operation and postoperative functional outcome. Further, we searched, in an exploratory fashion, for potential surgery-related parameters, such as depth and size of lesion, that may possess independent predictive value in patients with brain metastases.

Materials and methods

Methods

We retrospectively included all adult (≥18 years) patients operated on for brain metastases at the Department of Neurosurgery, St. Olav University Hospital, in the 6-year period from January 2004 through December 2009. Patients were followed until death or until 31 December 2010. No patients were lost to follow-up. Data collection was based on review of patient hospital files and image data. We sought to assess postoperative Karnofsky performance status (KPS) based on available records approximately 4 weeks after surgery to allow some time for recovery from transient surgically acquired deficits. We included all adverse events and serious adverse events in relation to the surgical procedure, without attempts to define causality. This is in accordance with Good Clinical Practice Guidelines (http://www.ema.europa.eu: Clinical Safety Data Management: Definitions and Standards for Expedited Reporting). Serious adverse events were defined as any unexpected medical occurrence (at any dose) in the operative period that resulted in death, was life-threatening, required inpatient hospitalization or prolongation of existing hospitalization, or resulted in persistent or significant disability/incapacity. Preoperative tumor volumes were determined using an ellipsoid model (4π × r³/3), as described by others [22]. Early postoperative contrast-enhanced MRI (<48 h) was used to determine resection grades. Gross total resection (GTR) was defined as no visible residual tumor, as opposed to subtotal resection (STR).
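
As an illustration of the ellipsoid volume model mentioned above, the following is a minimal sketch and not the procedure actually used in the study. It assumes that three orthogonal tumor diameters are measured in millimetres on preoperative images; the function name `ellipsoid_volume_ml` and its parameters are hypothetical. When all three diameters are equal, the expression reduces to the spherical form 4π × r³/3.

```python
import math


def ellipsoid_volume_ml(d1_mm, d2_mm, d3_mm):
    """Volume in ml of an ellipsoid, given three orthogonal diameters in mm.

    Hypothetical helper for illustration; the measurement protocol
    (three orthogonal diameters) is an assumption, not taken from the study.
    """
    # Semi-axes are half of each measured diameter.
    a, b, c = d1_mm / 2.0, d2_mm / 2.0, d3_mm / 2.0
    volume_mm3 = 4.0 / 3.0 * math.pi * a * b * c
    # 1 ml = 1 cm^3 = 1000 mm^3
    return volume_mm3 / 1000.0


# A sphere with 20 mm diameter (r = 10 mm): 4*pi*10^3/3 mm^3
print(round(ellipsoid_volume_ml(20, 20, 20), 2))  # 4.19
# An ellipsoid with diameters 30 x 20 x 25 mm
print(round(ellipsoid_volume_ml(30, 20, 25), 2))  # 7.85
```

Reporting the result in millilitres matches the unit used for tumor volumes elsewhere in this paper.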

Study population

A total of 141 surgically treated cases with brain metastases were identified in the study period. Baseline characteristics are presented in Table 1. The mean age was 60.5 years (95% CI, 58.7–62.2) and 70 (50%) of the patients were female. The median number of brain metastases was 1 (range 1–11) and 101 (72%) patients presented with a single brain metastasis. Mean preoperative KPS was 75 (95% CI, 73–77). Eighty-two (58%) had metastatic disease outside the brain. Mean maximal depth of lesion, as measured from the meninges in the craniotomy, was 35 (95% CI, 33–37) mm. Median preoperative volume was 9.610 (range 0.24–83.92) ml.

Table 1 Baseline characteristics

There were 111 (79%) primary operations and 30 (21%) reoperations. The patients underwent different adjuvant treatments after the discovery of brain metastases. Twelve (9%) patients received SRS during follow-up while another 12 patients had undergone SRS for brain metastases prior to open surgery. In Norway, SRS is centralized to another hospital and patients treated with SRS instead of open surgery are therefore not included in this study. Sixty (43%) patients received chemotherapy during follow-up. Seventy-two (51%) patients received WBRT during follow-up while 24 (17%) patients had undergone WBRT prior to surgery. In 23 (16%), open surgery was the only anti-cancer treatment given. In 109 (77%) operations, GTR was achieved. Diagnostic biopsies only were performed in 2 (1%) cases.

Graded prognostic assessment

GPA has recently been developed to predict survival in cancer patients with brain metastases [14, 17]. Four clinical parameters are each assigned one of three possible values (0, 0.5 or 1). The parameters are: age (≥60, 50–59, <50 years), KPS (<70, 70–80, 90–100), number of brain metastases (>3, 2–3, 1), and extracranial metastases (present, not applicable, none). Total scores range between 0 and 4, with higher scores indicating better prognosis. The total score divides patients into four prognostic groups (0–1; 1.5–2.5; 3; 3.5–4).
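
The scoring rules described above can be sketched in code as follows. This is a minimal illustration based only on the cut-offs listed in this section; the function names `gpa_score` and `gpa_group` are hypothetical, and the handling of intermediate KPS values (for example 85, which does not occur on the standard 10-point KPS scale) is an assumption.

```python
def gpa_score(age, kps, n_brain_mets, extracranial_mets):
    """Sum the four GPA components (each 0, 0.5 or 1) for one patient.

    age: years; kps: Karnofsky performance status (0-100);
    n_brain_mets: number of brain metastases (>= 1);
    extracranial_mets: True if extracranial metastases are present.
    """
    if n_brain_mets < 1:
        raise ValueError("GPA requires at least one brain metastasis")

    # Age: >=60 -> 0, 50-59 -> 0.5, <50 -> 1
    age_pts = 0.0 if age >= 60 else 0.5 if age >= 50 else 1.0
    # KPS: <70 -> 0, 70-80 -> 0.5, 90-100 -> 1
    # (assumption: values between 80 and 90 are scored as 0.5)
    kps_pts = 0.0 if kps < 70 else 0.5 if kps < 90 else 1.0
    # Number of brain metastases: >3 -> 0, 2-3 -> 0.5, 1 -> 1
    mets_pts = 0.0 if n_brain_mets > 3 else 0.5 if n_brain_mets >= 2 else 1.0
    # Extracranial metastases: present -> 0, none -> 1 (0.5 is not applicable)
    ecm_pts = 0.0 if extracranial_mets else 1.0

    return age_pts + kps_pts + mets_pts + ecm_pts


def gpa_group(score):
    """Map a total GPA score to one of the four prognostic groups."""
    if score <= 1.0:
        return "0-1"
    if score <= 2.5:
        return "1.5-2.5"
    if score == 3.0:
        return "3"
    return "3.5-4"


# Example: 62-year-old, KPS 80, single metastasis, extracranial disease present
score = gpa_score(age=62, kps=80, n_brain_mets=1, extracranial_mets=True)
print(score, gpa_group(score))  # 1.5 1.5-2.5
```

Because every component is a multiple of 0.5, the sums are exact in floating-point arithmetic and the equality test against 3.0 is safe.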

Three-month mortality

We sought to explore whether GPA score or other possible prognostic factors could help to identify patients who had limited survival after surgery. From a surgical and general point of view, a 3-month expected survival could represent a rough cut-off between worthwhile and futile treatment.

Statistics

All analyses were done with SPSS, v.16.0 (Chicago, IL, USA). The statistical significance level was set to P ≤ 0.05. All tests were two-sided. Q–Q plots were used to assess whether data were normally distributed. Central tendencies are presented as medians (range) when data are skewed, and also for survival because 18 (12.8%) cases were censored. The tests were chosen as follows. When both dependent and independent variables were categorical, we used Pearson’s Chi-square test. Groups with linear data were compared with the Kruskal–Wallis test if there were several subgroups. Changes in KPS (before and after surgery) were analyzed with the paired samples t test. Test properties (sensitivity/specificity) and diagnostic properties (predictive values) were calculated from 2 × 2 tables. Differences in survival were analyzed with the log-rank test (Mantel–Cox), and survival is presented as Kaplan–Meier plots.
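
To illustrate why censored cases call for a Kaplan–Meier estimate of median survival rather than a simple median of follow-up times, the following is a minimal sketch of the product-limit estimator. It is not the SPSS procedure used in this study; the function names and the example data are hypothetical. Exact fractions are used so that the comparison with 0.5 is not affected by floating-point rounding.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per case; events: True if death was observed,
    False if the case was censored (alive at end of follow-up).
    Returns a list of (time, survival) pairs at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all cases (deaths and censorings) sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(n_at_risk - deaths, n_at_risk)
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """Smallest time at which estimated survival is <= 0.5, else None."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


# Hypothetical follow-up in months; False marks a censored case.
times = [1, 2, 3, 4, 5]
events = [True, False, True, True, False]
print(median_survival(kaplan_meier(times, events)))  # 4
```

In this example the case censored at 2 months still contributes to the number at risk at 1 month, which is how censored cases are used rather than discarded.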

Ethics and approvals

The study was approved by the Regional Ethical Committee for Health Region Mid-Norway.

Results

Graded prognostic assessment

The population was grouped into GPA 0–1 (n = 22, 16%), GPA 1.5–2.5 (n = 90, 64%), GPA 3 (n = 19, 14%), and GPA 3.5–4 (n = 10, 7%) according to the prognostic indices.

Overall survival

Median survival time (MST) for the entire population was 7.7 months (range 0.0–78.6). MST in the different prognostic groups is presented in Fig. 1 and Table 2. MST was 6.3 months (range 0.8–23.7) in GPA 0–1, 7.8 months (range 0.2–75.0) in GPA 1.5–2.5, 14.0 months (range 0.0–77.4) in GPA 3, and 18.4 months (range 0.1–63.7) in GPA 3.5–4. This represents an overall significant difference between groups (P = 0.010).

Fig. 1 Kaplan–Meier plots for overall survival between the different GPA groups. a Kaplan–Meier plot for overall survival (n = 141), a significant difference in overall survival between groups (P = 0.010). Censored cases were still alive at end of follow-up. b Kaplan–Meier plot for overall survival in re-operated patients. There was only one patient in the best group and as a result of that we merged the two best groups. Not significant (P = 0.062), probably due to lack of power. Censored cases were still alive at end of follow-up

Table 2 Clinical outcomes and associations with GPA score

3-month mortality

Twenty-four patients (17%) died within 3 months after surgery. There was no significant association between GPA group and 3-month mortality (P = 0.750). Five (23%) patients in the worst prognostic group had died, compared to two (20%) patients in the best prognostic group (Table 2).

Perioperative (30 day) mortality

The perioperative mortality was 7% (n = 10). There was no significant association between perioperative mortality and the GPA group (P = 0.871). One patient (5%) died in the worst prognostic group compared to one (10%) in the best group (Table 2).

Adverse events and change in Karnofsky performance status

Adverse events are presented in Tables 2 and 3. In total, we registered adverse events in 25 (18%) of the operations. Serious adverse events were registered in 11 (7.8%) cases. There were no significant differences between the prognostic groups (P = 0.330). KPS ranged from 10 to 100 preoperatively and from 0 to 100 postoperatively. Mean preoperative KPS was 75 (95% CI, 73–78), and postoperative KPS was 70 (95% CI, 66–74). This represents a significant reduction in functional performance (P = 0.005) assessed approximately 4 weeks after the operation. There were 43 (31%) cases suffering a reduction in KPS, while 63 (45%) experienced no change, and 35 (25%) improved postoperatively. There was no statistical association between GPA groups and change in KPS after surgery (P = 0.558).

Table 3 Adverse events occurring in relation to surgery

Repeated surgery for brain metastasis

Patients undergoing their first brain metastases surgery had MST of 6.7 months (range 0.0–75.0). In our study population, 30 (21%) cases had previously been treated with surgery for brain metastases. This subgroup had MST of 17.2 months (range 1.3–77.4). Each case was grouped into GPA 0–1 (n = 4, 13%), GPA 1.5–2.5 (n = 19, 63%), GPA 3 (n = 6, 20%), and GPA 3.5–4 (n = 1, 3%) according to the prognostic index. As there was only one patient in the best prognostic group, we merged the two best groups in the survival analysis. Survival plot is presented in Fig. 1. In cases with repeated surgery, there was no overall significant (P = 0.062) difference between groups, probably due to lack of statistical power. Re-operated patients did not have significantly different GPA scores compared to first-time operations (P = 0.989).

Prediction of 3-month mortality: Exploratory analyses

We explored the test properties and diagnostic properties of the following preoperative variables: maximal tumor depth, tumor volume, tumor location (infratentorial vs. supratentorial), different cut-offs for GPA, age ≥60 years, number of metastases, and preoperative KPS. The following postoperative variables were also explored: resection grades (GTR vs. STR), new deficits, and adverse events. Results are presented in Table 4. Depth ≥40 mm (P = 0.029) and adverse events (P = 0.001) were significantly associated with 3-month mortality. No clinically useful predictor for 3-month mortality was found. Sensitivity for detecting 3-month mortality was highest, at 63%, for GPA 0–2, extracranial metastases, and preoperative KPS ≤70. The highest positive predictive values were found in patients who experienced new deficits or adverse events postoperatively, at 29% and 40%, respectively.
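
For readers less familiar with these measures, the following minimal sketch shows how test properties (sensitivity, specificity) and diagnostic properties (predictive values) follow from a 2 × 2 table. The function name `diagnostic_properties` is hypothetical and the counts in the example are invented for illustration; they are not data from this study.

```python
def diagnostic_properties(tp, fp, fn, tn):
    """Test and diagnostic properties from the four cells of a 2 x 2 table.

    tp: predictor present, died within 3 months
    fp: predictor present, survived beyond 3 months
    fn: predictor absent, died within 3 months
    tn: predictor absent, survived beyond 3 months
    """
    def ratio(numerator, denominator):
        # Return None when a margin is empty instead of dividing by zero.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of deaths flagged
        "specificity": ratio(tn, tn + fp),  # share of survivors not flagged
        "ppv": ratio(tp, tp + fp),          # share of flagged who died
        "npv": ratio(tn, tn + fn),          # share of unflagged who survived
    }


# Hypothetical counts, for illustration only.
props = diagnostic_properties(tp=10, fp=30, fn=10, tn=50)
print(props["sensitivity"], props["specificity"], props["ppv"])  # 0.5 0.625 0.25
```

The example shows how a predictor can detect half of the early deaths and still have a low positive predictive value when early death is uncommon, which is the pattern discussed for the candidate predictors in this section.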

Table 4 Usefulness of different characteristics in predicting 3-month mortality

Discussion

Prognostication in patients with brain metastases is challenging, often making it difficult to refrain from aggressive treatment. Various treatment options may be associated with important differences in both side effects and effects within the time frame important for the individual patient. Thus, caregivers for patients with brain metastasis need a reliable prognostic marker for deciding how to treat the individual patient. From a surgical point of view, it would be of interest to identify patients with short survival accurately, avoiding over- and under-treatment as pointed out by Nieder et al. [21, 23].

In the present study we have demonstrated possible uses and limitations of GPA in a consecutive neurosurgical series of brain metastases. It is our belief that if expected survival is short (e.g., less than 3 months), open surgery should preferably be avoided in most patients, as other treatment options may provide symptomatic relief with fewer risks and lower costs in the following weeks, possibly without the need for hospitalization. Perhaps to no surprise, GPA predicted neither perioperative mortality nor mortality within 3 months. This finding is consistent with earlier attempts to predict unfavorable outcome with scoring systems (using ≤2 and ≤3 months survival as cut-off) in patients treated with WBRT [15, 16, 23, 24]. According to our findings, GPA is not robust enough to aid selection of treatment strategies for individual patients in a neurosurgical setting. We are less convinced of the capabilities of GPA than Sperduto, who claims: “GPA indices provide the clinician with an easy and valuable tool to distinguish which patient warrants aggressive therapy and which patient would be better served by hospice” [7].

Overall survival and surgical mortality

GPA split the patients into four different prognostic groups as demonstrated in our study. This has previously been demonstrated in several trials with various treatments [14, 17, 19, 25]. Although GPA seems like a fairly reliable tool to predict longer term survival in patients with brain metastases regardless of pattern of care, the frequent outliers and the inability to predict short-term survival limit the clinical usefulness in the individual patient.

Perioperative mortality was 7% in our series, which is comparable to other studies in an unselected neurosurgical population [26]. Our findings suggest that GPA is not suitable for predicting surgical mortality.

Adverse events and change in Karnofsky performance status

The surgical treatment was associated with significant risk, with almost 20% of cases experiencing some kind of adverse event. This is somewhat more frequent than reported previously, although different methods for registration make direct comparisons difficult [26]. We evaluated the postoperative KPS score rather early (~4 weeks) in an attempt to minimize the effects of local or systemic disease progression, chemotherapy, or radiation therapy. If patients with expected short survival achieve early improvement in quality of life (QoL) after surgery, it may be worth the risk and cost of surgical treatment. According to our results, however, the average patient does not improve after surgery as measured with KPS. Physical performance may nevertheless improve or remain stable for longer with longer follow-up as a result of improved local control in the central nervous system. In the well-known randomized study by Patchell et al., patients treated with surgery and WBRT were functionally independent longer than the group receiving WBRT alone [27]. The present study was not designed to answer how surgery affects QoL compared to other treatment options, but rather whether changes in functional status could be predicted by GPA. According to our results, GPA does not give additional prognostic information concerning postoperative physical performance level.

Repeated surgery for brain metastases

There is no consensus on how to treat recurrent brain metastases. However, several studies demonstrate improved functional outcome and survival in patients treated with repeated surgery [26, 28]. The selection process for repeated surgery is probably more restrictive than for the initial surgery. This is probably why this group had an MST approximately 10 months longer than the pooled population in our study. Interestingly, the GPA scores were similar between those with initial surgery and reoperations. Thus, although re-operated patients may be subject to stricter selection, this was not reflected in the score. Differences in tumor biology might explain why some patients with a more indolent disease course live long enough to experience recurrent CNS metastases while being in good enough condition for repeated surgery. As the number of re-operated patients is low (n = 30) in our study, the statistical power is weak (risk of type II error), but we still detected a near-significant difference and divergent survival plots (Fig. 1b) between GPA groups. Thus, it seems likely that GPA scores are also predictive of survival in patients operated on for recurrent brain metastases.

Exploratory analyses in prediction of 3-month mortality

Because GPA scores failed to predict short-term survival and there is an apparent need for better predictors of short survival in patients undergoing open surgery for brain metastases, we found it natural to search among traditional surgical parameters. The postoperative predictors cannot be used in the selection of patients, but they may influence surgical strategy. The results are from data-driven post-hoc analyses in which we chose the most appropriate cut-offs for prediction of 3-month mortality, and they should therefore be interpreted with caution. Table 4 demonstrates the clinical capabilities and limitations of the different clinical characteristics. Of the preoperative parameters, depth of lesion predicted early mortality better than GPA: a cut-off at 40 mm maximal depth had 50% sensitivity and was significantly associated with 3-month mortality. Depth may be related to the surgical trauma, and deep-seated lesions may perhaps be more safely treated with SRS. The clinical usefulness, however, is very limited, with a positive predictive value of only 27%. A recent study among glioma patients demonstrated shorter survival in patients with acquired aphasia and motor deficits [29]. This finding could also be of interest in other patients with intracranial tumors. Avoiding complications and new deficits in these patients with advanced disease is critical for several reasons. First, as indicated by the results in Table 4, complications and new deficits are probably associated with shorter survival. Second, new deficits are certainly related to impaired QoL [30]. Third, most complications prolong hospitalization, require treatment, and thereby restrict the lives of patients with very limited time left. Lastly, readmissions and longer hospitalization due to treatment-related complications add to already inflated health budgets [31].

Study limitations

The external validity of our findings will naturally depend much on the patient population elsewhere, since referral traditions and treatment strategies in patients with brain metastases may vary between institutions. As pointed out in a recent review, one small RCT and three retrospective cohort studies have evaluated surgical resection alone compared to surgery plus postoperative WBRT for the initial management of a single brain metastasis [9]. Fewer patients who received postoperative WBRT experienced a recurrence in the brain compared to those who had surgical resection alone, but convincing results are lacking concerning overall mortality. At present, there is insufficient evidence to make treatment recommendations for patients with poor performance scores, advanced systemic disease, or multiple brain metastases [9]. The neurocognitive toll and potential benefits of WBRT have so far been little explored in this setting. In our study population, WBRT was withheld in a few patients with expected longer term survival, due to excellent and long-term control of the extracranial disease. In some patients with expected poor prognoses, such as patients with KPS <70 (including perioperative deaths) or poor control of extracranial disease, WBRT was also not given, as the main concern was not local recurrence. This, among other local treatment and selection strategies, may have influenced results. However, we still believe the external validity of the results is high for institutions offering open surgical treatment for brain metastases. There are also potential biases associated with retrospective evaluation of one's own data. The data-driven post-hoc analyses carried out in the search for potential clinically useful predictors clearly carry a risk of false positive findings.

Conclusion

GPA scores hold prognostic properties in a population of patients operated on for brain metastases. However, GPA did not predict short-term mortality, limiting its clinical usefulness in a neurosurgical population. There was no association between GPA and complications or change in physical performance postoperatively. The prognostic indices cannot be used alone to decide whether surgery is warranted on an individual basis, or to evaluate the risks and benefits of surgery.