Introduction

Incidence of patients with brain metastases (BM) is increasing due to an aging population, advanced imaging techniques to detect smaller metastases, and better systemic treatments resulting in improved survival and higher risk of developing BM [1, 2]. Due to increasing life expectancy, maintaining good neurocognitive functioning (NCF) as long as possible is of growing importance. BM can have unfavorable effects on NCF by affecting healthy brain tissue and brain connectivity [3].

Historically, Whole Brain Radiation Therapy (WBRT) has been the main treatment for BM, targeting the entire brain with a low radiation dose [4]. Nowadays, stereotactic radiosurgery (SRS) offers delivery of a precisely localized, high dose of radiation, while sparing healthy tissue in the rest of the brain. SRS has better local tumor control and therefore a beneficial effect on NCF, compared to WBRT [5,6,7,8,9,10,11,12]. Nevertheless, SRS might still damage healthy brain tissue in the vicinity of the BM [8, 13]. Both the linear accelerator (LINAC) and the Gamma Knife (GK) are used for SRS. For LINAC, margins of 1–2 mm are used to create the planning target volume (PTV) [14]. GK is a dedicated system designed for intracranial radiosurgery, and no margins are needed. Furthermore, the dose fall-off of GK is steeper than that of LINAC, and the radiation dose to healthy brain tissue is therefore lower. On average, GK reduces the normal brain volume receiving ≥ 12 Gray (V12 Gy) by approximately 20% compared to LINAC [15, 16]. In theory, this leads to fewer negative effects on NCF, although no studies have yet compared this directly.

Studies evaluating changes in NCF at the individual patient level concluded that, in LINAC patients with 1–4 BM, NCF was maintained compared to their pre-treatment level up to 6 months after SRS [17]. Up to 9 months, NCF was maintained or improved compared to pre-treatment levels among GK patients with 1–10 BM [17, 18]. Earlier studies used limited neuropsychological tests [19] and a relatively insensitive method to measure neurocognitive change, without taking practice effects into account [17]. Moreover, in a study by Schimmel et al. (2021), patients could ‘compensate’ for neurocognitive decline on one test by cognitively improving on another test, which potentially masks neurocognitive decline [18]. Therefore, this prospective study aimed to assess neurocognitive decline in patients with BM up to 6 months after GK or LINAC SRS, by using a recommended neuropsychological test battery and a sensitive approach to assess neurocognitive change at the individual level that adjusted for practice effects. Furthermore, predictors of neurocognitive decline were investigated. We hypothesized that patients would show neurocognitive decline after SRS, in particular after LINAC.

Methods

Study population

At the Netherlands Cancer Institute (NKI), neuropsychological assessment (NPA) in BM patients is conducted as part of routine clinical care. BM patients who received GK or LINAC SRS, who had baseline (pre-SRS) NPA between September 2016 and March 2020, and who completed follow-up NPA at 3 and 6 months were included. Patients were excluded when they (i) did not give consent to use their data for scientific research, (ii) underwent resection or (iii) received additional radiotherapy (including ‘staged’ GK). The local Institutional Review Board approved the study protocol (approved on 10-03-2020, IRBd20-089).

Procedures

Pre-treatment NPA was scheduled in the week preceding SRS. Follow-up NPAs were scheduled on the same day as the diagnostic MRI scan, approximately 3 and 6 months after SRS. At each time point, NPA was conducted by a trained test leader.

Treatment

Until 2018, all patients at the NKI were treated with a LINAC Synergy Agility™ (Elekta A.B., Stockholm, Sweden). Patients underwent a computed tomography (CT) scan in a supine position using a mask for fixation [20]. The CT scan was registered with a 1-mm-thick contrast-enhanced T1-weighted MRI scan, which was used for delineation of the gross tumor volume (GTV) of BM. The PTV was created by expanding the GTV by 2 mm [14]. The dose prescribed to the PTV was 18–24 Gy in a single fraction, 21–27 Gy in three fractions, or 25–30 Gy in five fractions. A treatment plan was created such that a minimum of 95% of the PTV received 100% of the prescribed dose [20].

From February 2018 onward, patients were treated with the Leksell Gamma Knife® Icon™ (Elekta A.B., Stockholm, Sweden). Immobilization was performed using a mask or rigid headframe. Patients received a dose of 18–25 Gy with 99–100% coverage of the target; no setup margin was used [20].

Measures

NCF was assessed with a test battery consisting of six neurocognitive tests, yielding ten outcomes. Three tests were based on recommendations of the International Cognition and Cancer Task Force (ICCTF) [21]: (1) Hopkins Verbal Learning Test-Revised (HVLT-R) immediate recall, delayed recall and recognition [22], (2) Trail Making Test (TMT) part A and B [23] and (3) Letter Fluency (LF) [24]. Additionally, three tests measuring important daily cognitive functions were included: (1) Grooved Pegboard (GP) dominant and non-dominant hand [25], (2) Digit Span (DS) [26] and (3) Boston Naming Test (BNT) [27] (Suppl. Table 1).

Sociodemographic and clinical characteristics were retrieved from patients’ medical health records. Tumor volume of GK patients was derived from Leksell GammaPlan® for each BM separately, from which total tumor volume was calculated. GTV of LINAC patients was derived from Pinnacle SmartEnterprise version 9.10 (Philips Healthcare, Andover, MA, USA) treatment planning system. Tumor progression was defined as any new, contrast-enhancing lesion on follow-up MRI.

Statistical analysis

Analyses were done for GK and LINAC patients separately. Independent samples t-tests and Pearson’s chi-square tests were carried out to compare characteristics of both groups.

Pre-SRS NCF scores were calculated by converting raw individual scores into standardized Z-scores corrected for age and, where suitable, for education and gender, using published normative data or the Dutch normative database of the Advanced Neuropsychological Diagnostics Infrastructure (ANDI) [28] (Suppl. Table 1). Lower Z-scores indicate worse NCF. Patients were classified as impaired when they scored below Z = − 1.5 [29] on ≥ 2 out of 10 test outcomes.
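The classification rule above can be illustrated with a minimal sketch. It is not the software used in the study: the function names and example values are hypothetical, and the age, education and gender corrections are assumed to be already built into the normative mean and SD supplied for each outcome.

```python
def z_score(raw, norm_mean, norm_sd, higher_is_better=True):
    """Standardize a raw score against normative data.

    The sign is flipped for timed outcomes (e.g. completion time in
    seconds) so that a lower Z-score always means worse performance.
    """
    z = (raw - norm_mean) / norm_sd
    return z if higher_is_better else -z


def is_impaired(z_scores, cutoff=-1.5, min_outcomes=2):
    """Impaired when at least `min_outcomes` Z-scores fall below `cutoff`."""
    return sum(z < cutoff for z in z_scores) >= min_outcomes


# Hypothetical patient with ten test outcomes (values invented for illustration).
example = [0.2, -1.7, -0.4, 0.9, -2.1, -0.3, 0.0, -1.2, 0.5, -0.8]
print(is_impaired(example))
```

Flipping the sign for timed outcomes is a design choice of this sketch; normative databases may already return Z-scores in a common direction.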

To assess change in NCF, differences in scores were calculated from pre-SRS to 3 months and from pre-SRS to 6 months. We used the Iverson Reliable Change Index (RCI), which takes into account test–retest reliability, the standard deviation (SD) on the first and second measurement, and practice effects. Performance on a test was classified as “changed” when it fell outside the 90% confidence interval: improved when the RCI value was above + 1.645, declined when the RCI value was below − 1.645, and stable when the RCI value did not exceed these limits. We derived the total number of tests on which patients declined.
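As a hedged illustration of this procedure, the sketch below uses one common formulation of a practice-adjusted RCI: the standard error of the difference is built from the standard errors of measurement at both time points, and the mean retest gain in a reference sample is subtracted from the observed change. All function names, parameter names and numbers are assumptions made for illustration; the study's own reference values are not reproduced here.

```python
import math


def standard_error_of_difference(sd1, sd2, reliability):
    """Combine the standard errors of measurement of both assessments."""
    sem1 = sd1 * math.sqrt(1.0 - reliability)
    sem2 = sd2 * math.sqrt(1.0 - reliability)
    return math.sqrt(sem1 ** 2 + sem2 ** 2)


def rci(score1, score2, ref_mean1, ref_mean2, sd1, sd2, reliability):
    """Observed change minus the reference practice effect, in SE units."""
    practice_effect = ref_mean2 - ref_mean1
    se_diff = standard_error_of_difference(sd1, sd2, reliability)
    return ((score2 - score1) - practice_effect) / se_diff


def classify(rci_value, cutoff=1.645):
    """Label a change using the two-sided 90% interval (z = 1.645)."""
    if rci_value > cutoff:
        return "improved"
    if rci_value < -cutoff:
        return "declined"
    return "stable"


def count_declines(rci_values, cutoff=1.645):
    """Number of test outcomes on which a patient declined."""
    return sum(classify(v, cutoff) == "declined" for v in rci_values)


# Hypothetical test where higher scores are better (all numbers invented).
value = rci(score1=25, score2=18, ref_mean1=24, ref_mean2=26,
            sd1=5, sd2=5, reliability=0.75)
print(round(value, 2), classify(value))
```

For outcomes where a higher raw score means worse performance (for example completion times), the sign of the RCI would need to be reversed before classification.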

Binary logistic regression was used to identify risk factors for neurocognitive decline. The dependent variable was neurocognitive decline, defined with the commonly used threshold of decline in RCI scores on ≥ 2 out of 10 test outcomes [21, 30]. The predictive factors considered were: age, education level, type of SRS (GK or LINAC), pre-SRS Karnofsky Performance Score (KPS), BM volume, number of BM, tumor progression and neurocognitive impairment pre-SRS [31,32,33]. Contribution of associated factors was taken into account when logistic regression had a P value < 0.05 and was significant at the level of 0.01. Over time, technical improvements have made it easier to irradiate more BM, which has led to broadened indications for GK [34]. Patients with an increasing number of BM are now treated with GK, and a significant difference in the number of irradiated BM between the GK and LINAC groups was therefore expected. In additional analyses, GK and LINAC patients with ≥ 6 BM were excluded to match patient characteristics between the two radiotherapy techniques. All analyses were conducted using IBM SPSS Statistics, version 25.0.

Results

Patient characteristics

Pre-SRS, 194 patients were included (Fig. 1), of whom 98 (50%) completed NPA at 3 months and 69 (36%) at 6 months. Reasons for dropout were death, finding the NPA too burdensome, or a pause of NPA due to SARS-CoV-2. Median overall survival (OS) of the 121 GK patients who completed NPA pre-SRS was 11.0 months (SD = 8.0), with a 1-year survival rate of 37.3%. For the 73 LINAC patients, median OS was 15.5 months (SD = 15.1), with a 1-year survival rate of 38.8%.

Fig. 1

Flow-chart of excluded participants. Pre-SRS before treatment with Gamma Knife or Linear accelerator Stereotactic Radiosurgery; FU follow-up. Note A pause of neuropsychological assessment due to SARS-CoV-2 took place between March 2020 and June 2021. Consequently, patients were not able to do a follow-up neuropsychological assessment in this period

Characteristics are summarized in Table 1. We present results of the 69 patients who completed NPA at 3 and 6 months. Forty patients were treated with GK and 29 patients with LINAC (mean age 59.1 ± 10.8 years and 57.7 ± 9.6 years, respectively). The GK group had a mean number of 4.8 BM, with a mean GTV of 4.1 cm³. Four patients (10%) developed new distant BM at 3 months and 3 (7.5%) at 6 months. Median OS was 16 months (SD = 5.7) and the 1-year survival rate was 84%.

Table 1 Pre-SRS characteristics separately for patients treated with GK and LINAC

The LINAC group had a mean number of 1.8 BM, with a mean GTV of 6.2 cm³. Three patients (10%) developed new distant BM at 3 months and 4 (14%) at 6 months. Median OS was 17 months (SD = 16.1) and the 1-year survival rate was 97%.

Characteristics of patients who completed NPA at 3 months versus patients who completed NPA at 3 and 6 months are shown in Suppl. Table 2. There were statistically significant differences in primary tumor, BM symptoms at diagnosis, timing of BM diagnosis and systemic therapy between responders and non-responders at 3 months. No differences were reported at 6 months.

Neurocognitive functioning pre-SRS

Twenty-three percent of GK patients and 17% of LINAC patients showed neurocognitive impairment pre-SRS according to our pre-defined criteria (against an expected 14% in a healthy population [30]). NCF pre-SRS mean z-scores are shown in Table 2 and additional data from complete case analyses at 3 months are presented in Suppl. Table 3. The GK group showed lowest mean scores and most frequent impairments for verbal memory, fine motor skills and language. The LINAC group showed lowest mean scores and most frequent impairments for executive functioning, verbal memory, fine motor skills and language.
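A base rate of this size can be reproduced with a short calculation. The sketch below assumes that the ten test outcomes are statistically independent and normally distributed, which is a simplification (outcomes from the same test are correlated); whether reference [30] derived the 14% in this way is not stated here. The function name is hypothetical.

```python
from math import comb
from statistics import NormalDist


def prob_at_least(k, n, p):
    """Binomial probability of at least k hits in n independent trials."""
    return 1.0 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))


# Chance that a healthy person scores below Z = -1.5 on a single outcome.
p_low = NormalDist().cdf(-1.5)

# Chance of meeting the impairment criterion (at least 2 of 10 outcomes).
p_impaired = prob_at_least(2, 10, p_low)
print(round(p_low, 4), round(p_impaired, 3))
```

Under these assumptions roughly 7% of healthy people fall below the cutoff on any single outcome, and about 14% do so on two or more of ten outcomes.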

Table 2 Neurocognitive functioning pre-SRS in patients who completed NPA at 6 months (n = 69)

Change in neurocognitive functioning on test level

GK patients showed the highest percentages of decline between pre-SRS and 3 months or pre-SRS and 6 months on tests of attention, executive functioning, verbal memory, and fine motor skills (Fig. 2(2.1)). The percentage of patients with declined NCF decreased between 3 and 6 months on all tests.

Fig. 2

Individual neurocognitive changes at test level over 6 months after SRS for patients treated with 2.1 Gamma-Knife (pre-SRS—T6: n = 39–40) and 2.2 LINAC (pre-SRS—T6: n = 27–29). Numbers are expressed in percentages. T3 3 months; T6 6 months; TMT-A Trail Making Test A; DS digit span; TMT-B Trail Making Test B; LF letter fluency; HVLT-IR Hopkins Verbal Learning Test-Immediate Recall; HVLT-DR Hopkins Verbal Learning Test-Delayed Recall; HVLT-Recog Hopkins Verbal Learning Test-Recognition; GP-D Grooved Pegboard-Dominant; GP-ND Grooved Pegboard Non-dominant; BNT Boston Naming Test

LINAC patients showed the highest percentages of decline largely in the same domains: executive functioning, verbal memory, and fine motor skills (Fig. 2(2.2)). In general, the percentage of patients with declined NCF increased between 3 and 6 months.

After excluding GK and LINAC patients with ≥ 6 BM, we observed in the GK group that percentage of decline per test dropped for both pre-SRS vs. 3 months and for pre-SRS vs. 6 months comparisons. This drop was most pronounced for tests of fine motor skills. In the LINAC group, results for pre-SRS vs. 3 months and pre-SRS vs. 6 months comparisons remained essentially the same, as only one patient with 6 or more BM was omitted from the analysis (Suppl. Fig. 1). Additional results of percentage of decline from complete case analyses at 3 months are shown in Suppl. Fig. 2.

Cumulative change in neurocognitive functioning on patient level

In the GK group, 38% declined on ≥ 2 out of 10 tests at 3 months (Fig. 3(3.1)) and 23% declined at 6 months. In the LINAC group, 10% declined on ≥ 2 out of 10 tests at 3 months and 24% at 6 months (Fig. 3(3.2)). For descriptive purposes, improvement of NCF is shown in Suppl. Fig. 3. After excluding GK and LINAC patients with ≥ 6 BM, percentage of decline dropped for both pre-SRS vs. 3 months and pre-SRS vs. 6 months comparisons in the GK group (Suppl. Fig. 4). In the LINAC group, results for pre-SRS vs. 3 months and pre-SRS vs. 6 months comparisons remained essentially the same, as only one patient was omitted from the analysis (Suppl. Fig. 4). Additional results of percentage of decline from complete case analyses at 3 months are shown in Suppl. Fig. 5.

Fig. 3

Individual cumulative neurocognitive decline at patient level over 6 months after SRS, for 40 patients treated with Gamma Knife (3.1) and 29 patients treated with the LINAC (3.2)

Prognostic factors of NCF decline

At 3 months, the overall model had an R² of 0.50 and was considered significant (P < 0.001). High age [Odds Ratio (OR) 1.07; 95% Confidence Interval (95%CI) 1.00–1.13; P = 0.059], low education level (OR 25.03; 95%CI 3.80–164.79; P = 0.001) and type of SRS (OR 0.17; 95%CI 0.03–0.85; P = 0.031) significantly predicted neurocognitive decline (see Table 3). Pre-SRS KPS score, BM volume, number of BM, tumor progression and neurocognitive impairment pre-SRS were not significant predictors.
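For readers who want to relate the reported odds ratios to the underlying regression coefficients, the sketch below back-calculates the log-odds coefficient and its standard error from an OR and its 95% CI. It assumes a Wald-type interval that is symmetric on the log-odds scale, which is an assumption about how the interval was computed; the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def coef_from_or(odds_ratio, ci_low, ci_high):
    """Recover the log-odds coefficient and its SE from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return beta, se


def wald_ci(beta, se):
    """95% CI for the OR, symmetric on the log-odds scale."""
    return math.exp(beta - Z95 * se), math.exp(beta + Z95 * se)


# Low education level at 3 months: OR 25.03; 95% CI 3.80-164.79.
beta, se = coef_from_or(25.03, 3.80, 164.79)
low, high = wald_ci(beta, se)
print(round(beta, 2), round(se, 2), round(low, 2), round(high, 1))
```

The wide interval corresponds to a large standard error on the log-odds scale, as would be expected when a predictor category contains few patients.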

Table 3 Binary logistic regression analyses of predictive factors for neurocognitive decline at 3 and 6 months post-SRS

At 6 months, the overall model had an R² of 0.32 and was considered significant (P = 0.004). High age (OR 1.09; 95% CI 1.02–1.17; P = 0.009) was a significant predictor of neurocognitive decline. No other predictors were found.

Discussion

This prospective study aimed to assess neurocognitive decline in patients with BM up to 6 months after GK or LINAC SRS, using a recommended battery of neuropsychological tests and a sensitive approach to assess neurocognitive decline at the individual level that adjusts for practice effects. Patients treated with either GK or LINAC showed decline in NCF.

The most commonly affected functions prior to SRS were executive function, memory, fine motor skills and language. Neurocognitive decline over 3 and 6 months post-SRS was observed in the same cognitive functions, and in attention. Higher age was a strong predictor of neurocognitive decline. Furthermore, low education level and type of SRS (GK or LINAC) were predictors of neurocognitive decline at 3 months.

We found that 38% of the GK group declined in NCF at 3 months and 23% at 6 months. In the LINAC group, 10% declined at 3 months and 24% at 6 months. The GK and LINAC groups differed significantly in number of BM: 4.8 in the GK group and 1.8 in the LINAC group. This could partly be explained by broader indications for GK, allowing treatment of patients with an increasing number of BM. GK was introduced at the NKI in February 2018. We were not able to correct for the change in treatment indications for SRS, which likely influenced the results presented in this study. After excluding GK and LINAC patients with ≥ 6 BM, differences in decline of NCF between the GK and LINAC groups remained. The detrimental impact on NCF may be partially transient in the GK group but not in the LINAC group. This could potentially be caused by two factors: (1) A higher number of BM in the GK group can have a more diffuse negative impact on brain functioning and lead to a reduced cognitive reserve. When this reserve is implicated (e.g., with the presence of edema), it may have a greater influence on NCF of GK patients than it would have on LINAC patients with fewer BM. We showed that, in general, a comparable percentage of patients in the GK and LINAC groups showed impaired NCF pre-SRS. A diminished cognitive reserve can become more apparent over time, causing the diffuse impact on the brain to have a stronger influence on NCF over time. This could explain why GK patients have more impaired NCF at 3 months than LINAC patients. (2) Systemic therapy can influence NCF. In particular, 10% of GK patients received chemotherapy, compared to 0% of LINAC patients. Previous research reported that chemotherapy negatively influences NCF [35]. Overall, there is no straightforward explanation for these observations.

Our results are not in line with two recent studies that reported no decline in patients after SRS. We believe, however, that these studies may have underestimated the rate of cognitive decline due to methodological choices. In the first study, by van der Meer et al., 55 LINAC patients were followed up to 6 months after treatment [17]. The authors concluded that half of the patients showed impaired NCF prior to SRS, mainly in the domain of verbal memory, and that patients maintained their pre-SRS NCF level. Changes in neurocognitive functioning were evaluated per domain, thus averaging across test outcomes within a domain. In doing so, impaired test scores can be masked by unimpaired test scores. Furthermore, a relatively insensitive method was used to measure neurocognitive change, as NCF was analyzed without taking into account test–retest (including practice) effects.

In a second study, by Schimmel et al. [18], 92 patients were followed up to 9 months after GK. The authors concluded that 15–55% of patients showed impaired NCF pre-SRS, mainly in the domains of attention, executive functioning, and fine motor skills, and that patients maintained or improved their pre-SRS NCF level. NCF was corrected for test–retest effects, but the interpretation of neurocognitive decline that was used could result in an underestimation of decline for two reasons: (1) classification of decline on some tests could be lifted by improved performance on other tests, and (2) statistical testing for differences between the proportions of patients and controls classified as declined, as conducted by the authors, could be viewed as an overcorrection, because cognitive changes in patients were already defined based on reliable change intervals in the control sample.

We aimed to overcome underestimation of neurocognitive decline in this study by using a sensitive and commonly used method that corrects for practice effects; we did not additionally compare test outcomes with a control sample, in order to prevent overcorrection. Also, our criterion for neurocognitive decline, which does not allow compensation by improved tests, is in line with recommended criteria stated by, for example, the ICCTF [21]. Furthermore, neuropsychological tests were administered as part of usual care, which provides a representative sample of patients with BM.

Nevertheless, this study has limitations to consider. We excluded patients who did not complete ≥ 1 follow-up NPA post-SRS (n = 96). These patients are expected to have worse NCF pre-SRS. Although we showed that neurocognitive impairment pre-SRS was not a predictor of neurocognitive decline, potentially low test scores in this excluded group could be predictive of neurocognitive decline. Furthermore, we excluded patients who received volume-staged GK. Consequently, GK patients with large BM volumes are underrepresented in this study, while they presumably will suffer from decline in NCF.

Further research should investigate whether the differences between GK and LINAC SRS in their effect on NCF, as suggested in this study, result from different effects of the two types of treatment or from selection bias, e.g., due to exclusion of staged GK patients. This should be investigated among larger groups of patients and over longer time spans (> 6 months post-SRS). Furthermore, matching on BM volume can give more insight into differences in patient characteristics between GK and LINAC SRS [32].

In conclusion, patients with BM show decline in NCF over time when they are treated with GK or LINAC SRS. This suggests that SRS has an influence on NCF of the individual patient. If results of this study are confirmed in future studies, it is recommended to routinely assess neurocognitive functioning in patients with BM when they are treated with GK or LINAC SRS in order to determine risk–benefit aspects of SRS.