Introduction

Delayed cerebral ischemia (DCI) is one of the most severe complications of aneurysmal subarachnoid hemorrhage (aSAH) and occurs in approximately 30% of patients [1, 2]. The extent of subarachnoid blood on the admission computed tomography (CT) scan is one of the strongest predictors of DCI [3,4,5]. However, it is not clear which radiological grading scale has the strongest association with DCI. Over the years, several radiological grading scales have been introduced [6]. The first scale was introduced by Fisher et al. in 1980 [7]. This four-grade scale scores the amount of blood in the cisterns as absent (grade 1), diffuse thin (< 1 mm) (grade 2), thick (> 1 mm) (grade 3), or thin amount of blood in the cisterns with presence of intraventricular (IVH) or intraparenchymal (IPH) hemorrhage (grade 4). Later, an adjusted version of the Fisher scale was published: the modified Fisher scale [8, 9]. This scale introduced a separate grade for patients with both a thick layer of blood in the cisterns and presence of IVH. An alternative, more extensive but more time-consuming scale is the Hijdra sum score, which grades the amount of blood in each cistern separately [4].

Several studies have assessed the association of these radiological grading scales with the occurrence of DCI [4, 7,8,9,10]. However, prior to 2010, DCI had not been unambiguously defined [11]. When the Fisher scale was introduced, DCI was believed to be caused by vasospasm, and angiographic vasospasm was therefore frequently used as an end point. Presently, DCI is regarded as a multifactorial process that results in neurological deterioration [12]. The question remains how well the grading scales are associated with this current clinical definition of DCI [11]. We aimed to perform a systematic review to assess the associations of the Fisher scale, the modified Fisher scale, and the Hijdra sum score with clinical DCI.

Methods

Search strategy

We performed this review according to the PRISMA statement [13]. We searched MEDLINE and EMBASE for records published between 1980 and 20 June 2017. The search terms “subarachnoid hemorrhage,” “Fisher,” “Hijdra,” “score/grade,” and “computed tomography” were used. Furthermore, as numerous terms for DCI are used in the literature, we included the following terms for DCI: “delayed cerebral ischemia,” “delayed cerebral infarction,” “delayed cerebral deficit,” “symptomatic vasospasm,” “cerebral vasospasm,” “delayed ischemic neurological deficit,” “delayed ischemic neurological deterioration,” and “cerebral infarction.” For the full search strategy, see Online supplemental text 1.

Study selection

Studies that met the following criteria were included: (1) SAH proven on CT images; (2) aneurysm shown on an angiographic study; (3) amount of blood graded on admission CT by either the Fisher scale, modified Fisher scale, or Hijdra sum score; (4) outcome was clinical DCI, defined by at least the occurrence of new focal neurological signs and/or a decrease in consciousness, with or without angiographic vasospasm or new infarct on follow-up CT scan; and (5) an association between the radiological grade and DCI was reported. If two studies used (part of) the same dataset and assessed the same radiological scale, the study with the largest patient population was included. We excluded studies that included mainly patients younger than 18 years, studies in a language other than English, Dutch, or German, case reports, reviews, and conference abstracts. One reviewer (W.S.) screened all titles and abstracts for eligibility for full-text review. Two reviewers (W.S. and E.L.) independently reviewed all full texts for inclusion in the review. The final decision on inclusion was made through consensus between the two reviewers.

Data extraction

Data extraction from the included studies was independently performed by two reviewers (W.S. and E.L.) using a standardized form. Crude data on radiological grade and occurrence of DCI were extracted along with, if available, the odds ratios (OR) for the occurrence of DCI. If the ORs were not available, ORs for DCI were calculated from the crude data in case these data were provided. Furthermore, we extracted whether an association between the radiological grade and DCI was found, both in univariable and multivariable analysis. Other collected variables were age, sex, and treatment modality (clipping or coiling). Differences in extracted data were evaluated by the two reviewers and resolved by consensus.
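The calculation of an OR from crude data can be illustrated with a minimal sketch. It assumes a 2 × 2 table of patients with and without DCI in a higher and a lower (reference) radiological grade, a Wald-type confidence interval on the log scale, and a 0.5 continuity correction when a cell is empty. The function name, the correction, and the counts in the usage lines are illustrative assumptions and are not taken from the included studies.

```python
import math


def odds_ratio_with_ci(dci_high, no_dci_high, dci_ref, no_dci_ref, z=1.96):
    """Odds ratio for DCI (higher grade vs. reference grade) with a Wald CI.

    Hypothetical helper: counts are crude numbers of patients per cell.
    """
    cells = [dci_high, no_dci_high, dci_ref, no_dci_ref]
    # Assumed handling of empty cells: add 0.5 to every cell
    # (Haldane-Anscombe correction) so the OR and its variance are defined.
    if any(count == 0 for count in cells):
        cells = [count + 0.5 for count in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative (made-up) counts: 40/100 with DCI in the higher grade,
# 20/100 with DCI in the reference grade.
or_value, ci_low, ci_high = odds_ratio_with_ci(40, 60, 20, 80)
print(f"OR {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With these made-up counts the OR is (40 × 80) / (60 × 20) ≈ 2.67; the interval widens as the cell counts shrink.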

Quality assessment

Risk of bias was assessed by two reviewers independently (W.S. and D.V.) using the Newcastle–Ottawa quality assessment scale (NOS) for cohort and case–control studies and the Cochrane Handbook risk of bias tool for randomized controlled trials [13, 14]. Cohort studies could be awarded a maximum of 9 points. Randomized controlled trials could be awarded a maximum of 7 points. If an item of the scale was not applicable, the maximum number of points was lowered accordingly. Points could be awarded for selection, comparability, and outcome assessment. The study population was considered unselected if no selection was made based on clinical condition, co-morbidities, or treatment of the aneurysm. A follow-up period for DCI of at least 2 weeks was considered sufficient. Follow-up was considered adequate if a DCI status (DCI or no DCI) was available for at least 80% of the included patients. Differences in risk of bias assessment between reviewers were resolved through consensus.

Statistical analyses

Mean age (standard deviation), percentage male sex, number of patients in each radiological grade, and occurrence of DCI per radiological grade were calculated using data from all studies that provided the crude data.

Pooled ORs with 95% confidence intervals for DCI were calculated for the radiological scales, both dichotomized and per grade on the radiological scale, using Mantel–Haenszel statistics. The Fisher and modified Fisher scale were dichotomized at grade 2 and the Hijdra sum score at grade 23. Pooled ORs were only calculated if multiple studies reported crude data on patients for all grades of the radiological scale. The risk for DCI was compared between all grades of each radiological scale using each grade as reference in separate analyses. A random effects model was applied to calculate the pooled ORs. Statistical heterogeneity of the studies included in the meta-analysis was assessed using the I² statistic. An I² of 0–30% was considered no relevant heterogeneity, 30–50% moderate heterogeneity, 50–75% substantial heterogeneity, and 75–100% considerable heterogeneity [14]. The possibility of publication bias was assessed by plotting the effect estimate of each study included in the meta-analysis against the inverse of its standard error, thus creating a funnel plot. Funnel plots were visually assessed for asymmetry, which could indicate publication bias.
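The pooling steps described above can be sketched as follows. This is a simplified illustration and not the Review Manager implementation: it computes a Mantel–Haenszel pooled OR, a DerSimonian–Laird random-effects pooled OR with inverse-variance weights, and the I² statistic. Review Manager's Mantel–Haenszel random-effects option may estimate heterogeneity slightly differently. All function names and the example tables are hypothetical, every cell is assumed to be non-zero, and the handling of the shared category boundaries (30, 50, and 75%) is an assumption.

```python
import math

# Each table is (DCI high grade, no DCI high grade, DCI low grade, no DCI low grade).


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


def random_effects_pool(tables, z=1.96):
    """DerSimonian-Laird random-effects pooled OR, 95% CI, and I2 (in %)."""
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in tables]
    variances = [1 / a + 1 / b + 1 / c + 1 / d for a, b, c, d in tables]
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(tables) - 1
    scale = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add the between-study variance tau2.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se), i2


def classify_i2(i2):
    """Heterogeneity category; boundary values go to the higher category (assumed)."""
    if i2 < 30:
        return "no relevant"
    if i2 < 50:
        return "moderate"
    if i2 < 75:
        return "substantial"
    return "considerable"


# Made-up example tables for three studies.
example = [(40, 60, 20, 80), (30, 50, 15, 65), (55, 70, 25, 90)]
pooled_or, low, high, i2 = random_effects_pool(example)
print(f"MH OR {mantel_haenszel_or(example):.2f}")
print(f"Random-effects OR {pooled_or:.2f} ({low:.2f}-{high:.2f}), "
      f"I2 {i2:.0f}% ({classify_i2(i2)} heterogeneity)")
```

When all studies report the same OR, Q is zero, the between-study variance is zero, and the random-effects estimate coincides with the fixed-effect estimate.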

A sensitivity analysis was performed comparing studies that used clinical signs only to diagnose DCI to studies that used both a radiological and clinical definition.

Analyses were performed using SPSS version 24.0.0.1 (IBM SPSS Statistics for Windows, Armonk, NY: IBM Corp) and Review Manager 5.3.5 (RevMan, Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014).

Results

Study characteristics

The MEDLINE and EMBASE search yielded, after removal of duplicates, 6766 records. After screening of titles and abstracts, 274 full-text articles were screened for inclusion. Of the full-text articles, 221 studies were excluded. The main reasons for exclusion were the following: amount of blood not assessed with the Fisher, modified Fisher, or Hijdra sum score (n = 38), clinical DCI not one of the assessed outcomes (n = 114), and no association between amount of blood and DCI could be determined (n = 51) (Fig. 1).

Fig. 1 PRISMA flowchart

Fifty-three studies were included, comprising 13,612 patients with a mean age of 52.4 (SD 3.5) years, of whom 66% were female. Table 1 shows the study characteristics of the included studies.

Table 1 Study characteristics

In total, 50 cohort studies, one case–control study, and two randomized controlled trials were included. DCI occurred in 29% of the patients. In 28 studies (53%), the diagnosis of DCI was based on clinical signs in combination with angiographic vasospasm. In the remaining 25 studies (47%), the diagnosis of DCI was based on clinical signs only. Insufficient uniform data could be extracted on treatment modality.

None of the included studies received the maximum score on the quality assessment scale. The mean number of awarded points was 5.2 (SD 1.6). The items for which points were most often awarded were comparability of the cohorts (82%) and assessment of DCI (98%). However, most studies did not use an unselected study cohort (69%), did not demonstrate that DCI was absent at baseline (93%), and did not use a follow-up period of at least 2 weeks (56%) (Table 1).

Fisher scale

The Fisher scale was used in 42 studies (79%), including 10,429 patients. Of these 42 studies, 25 studies (5701 patients) reported patients for all Fisher grades and provided detailed data on the number of patients in each grade. A mean percentage of 6 (SD 6; range 0–23) was classified as Fisher 1, 18 (SD 9; range 0–37) as Fisher 2, 51 (SD 17; range 17–83) as Fisher 3, and 27 (SD 14; range 5–73) as Fisher 4. The distribution of patients in each Fisher grade in these studies is shown in Online supplemental Fig. 1.

Crude data on the number of patients that developed DCI in each Fisher grade were extracted from 25 studies (7222 patients). A mean percentage of 13 (SD 7; range 0–29) of patients with Fisher 1, 24 (SD 10; range 0–43) of patients with Fisher 2, 37 (SD 11; range 23–96) of patients with Fisher 3, and 30 (SD 13; range 0–70) of patients with Fisher 4 developed DCI.

The Fisher grade was significantly associated with DCI in 23 of 39 studies (62%) that performed a univariable analysis. Of the 29 studies that performed a multivariable analysis, the Fisher scale was independently associated with DCI in 16 studies (55%).

Of the 25 studies (5701 patients) that reported patients for all Fisher grades, 14 studies (4413 patients) provided data on occurrence of DCI that allowed calculation of a pooled OR by dichotomized Fisher grade and 10 studies (3939 patients) provided data that allowed calculation of a pooled OR per Fisher grade. The pooled OR (95% CI) for DCI was 2.22 (1.54–3.19) for Fisher 3 to 4 compared to Fisher 1 to 2. Compared to Fisher 1, the pooled OR (95% CI) for DCI was 1.53 (1.01–2.32) for Fisher 2 (Fig. 2a), 3.21 (1.87–5.49) for Fisher 3 (Fig. 2b), and 2.21 (1.49–3.28) for Fisher 4 (Fig. 2c). Patients with Fisher 4 had a significantly lower risk of DCI compared to patients with Fisher 3 (OR (95% CI) 0.60 (0.37–0.97)) (Table 2).

Fig. 2 a Forest plot comparing risk of DCI of Fisher 2 with Fisher 1 patients. b Forest plot comparing risk of DCI of Fisher 3 with Fisher 1 patients. c Forest plot comparing risk of DCI of Fisher 4 with Fisher 1 patients.

Table 2 Pooled odds ratios for development of delayed cerebral ischemia

There was no relevant statistical heterogeneity between the studies: the I² ranged between 0 and 20%. Visual inspection of the funnel plot showed a symmetrical distribution of the study effects (Online supplemental Fig. 2).

Modified Fisher scale

The modified Fisher scale was used in 11 studies (21%), including 3941 patients. Seven studies (2476 patients) included patients for all modified Fisher grades and provided data on the number of patients in each grade (Online supplemental Fig. 1). The majority of the studies grouped modified Fisher 0 and 1 together. A mean percentage of 20 (SD 4; range 11–29) was classified as modified Fisher 0–1, 11 (SD 4; range 0–20) as modified Fisher 2, 30 (SD 7; range 14–46) as modified Fisher 3, and 39 (SD 10; range 34–61) as modified Fisher 4.

Crude data on the number of patients that developed DCI in each modified Fisher grade were extracted from 5 studies (2085 patients). A mean percentage of 21 (SD 7; range 5–24) of patients with modified Fisher 0–1, 26 (SD 9; range 0–33) of patients with modified Fisher 2, 30 (SD 9; range 5–36) of patients with modified Fisher 3, and 42 (SD 9; range 34–83) of patients with modified Fisher 4 developed DCI.

The modified Fisher grade was significantly associated with DCI in 7 of 8 studies (88%) that performed a univariable analysis and in 8 of 9 studies (89%) that performed a multivariable analysis.

Of the 5 studies that reported patients for all modified Fisher grades, 4 (1959 patients) provided data on occurrence of DCI that allowed calculation of a pooled OR by dichotomized modified Fisher grade. The pooled OR (95% CI) for DCI was 2.31 (1.40–3.81) for modified Fisher 3–4 compared to modified Fisher 0–2. No pooled OR per grade increase on the modified Fisher scale was calculated because, of the three studies (1616 patients) that provided data on the occurrence of DCI in all modified Fisher grades, one study included 84% of the patients.

Hijdra sum score

The Hijdra sum score was used in 6 studies (11%), including 1197 patients. Mean Hijdra score in 4 studies (1048 patients) was 19 (SD 2; range 17–23). In two studies (289 patients) that reported the mean Hijdra score stratified by occurrence of DCI, mean Hijdra score was 24 (SD 2; range 22–26) in patients with DCI and 14 (SD 1; range 15–15) in patients without DCI.

The Hijdra score was significantly associated with DCI in both studies that performed univariable analysis and in all four studies that performed multivariable analysis. None of the studies provided crude data that allowed for calculating a pooled OR.

Sensitivity analysis

A sensitivity analysis was performed in the meta-analysis of the Fisher scale. Four studies (3254 patients) diagnosed DCI using clinical signs only, whereas the other 6 studies (673 patients) combined clinical signs with angiographic vasospasm. In studies that used clinical signs only to diagnose DCI, both Fisher 3 (OR (95% CI) 2.48 (1.70–3.60)) and Fisher 4 (OR (95% CI) 2.11 (1.23–3.61)) were significantly associated with DCI. In studies combining clinical signs with angiographic vasospasm to diagnose DCI, Fisher 3 (OR (95% CI) 8.79 (2.08–37.12)) was significantly associated with DCI (Online supplemental Table 1).

Discussion

In this systematic review, pooled analysis of the Fisher scale showed that all Fisher grades have an increased risk of clinical DCI compared to Fisher 1, with Fisher 3 patients having the highest risk of DCI. Nevertheless, the highest rate of DCI was seen in patients with modified Fisher 4. For the modified Fisher scale and Hijdra sum score, no pooled ORs per individual grade could be calculated. Fisher grades were significantly associated with DCI in fewer studies compared to the other radiological scales. The results of this review should be interpreted with care as the risk of bias was high in the included studies.

To our knowledge, this is the first systematic review assessing the association of radiological scales for grading SAH with DCI. One narrative review reported an overview of all available radiological scales for predicting vasospasm until 2006; however, no associations with DCI were presented in this review [6]. A recent systematic review assessed all risk factors for DCI showing that smoking is an established predictor of DCI; however, the radiological scales were not assessed in this review [49].

Fisher et al. showed that patients with Fisher 3 had the highest risk of vasospasm, a well-known cause of DCI [7]. Our review showed that patients with Fisher 3 also have the highest risk of clinical DCI (OR 3.2). The pooled analysis further showed that patients with Fisher 4 have a significantly lower risk of DCI compared to Fisher 3 patients. The main difference between Fisher 3 and Fisher 4 is the presence of blood outside the subarachnoid space in Fisher 4, with extension to the ventricles (IVH) or parenchyma (IPH). The thickness of SAH does not play a role in the difference between these grades. Typically, most Fisher 4 patients will have a larger quantity of SAH, and part of this blood is “escaping” to the ventricular system. However, even in patients with a perimesencephalic hemorrhage, blood can migrate into the 4th ventricle, upgrading these patients, who have a low tendency to develop DCI, to Fisher grade 4. This might explain the lower rate of DCI in Fisher grade 4 versus grade 3 patients. Furthermore, due to the presence of an IPH, Fisher 4 patients may be in poorer clinical condition, which masks clinical deterioration [66]. The Fisher scale lacks a separate grade for patients with both thick SAH and IVH, which the modified Fisher scale does have. As such, patients with both thick SAH and IVH cannot be categorized in a Fisher grade.

In our sensitivity analysis, we analyzed whether the strength of the association between the Fisher scale and DCI depends on the definition of DCI. The OR for DCI in Fisher 3 patients was higher in studies that used clinical deterioration and angiographic vasospasm to diagnose DCI, compared to those using clinical signs only. However, this high OR may be largely due to the inclusion of the results from the original paper by Fisher et al., in which angiographic vasospasm was part of the definition of DCI. In this study, all patients that developed DCI were graded as Fisher 3 [7]. Furthermore, confidence intervals were broad, so no firm conclusions can be drawn based on these data.

We could not collect sufficient comparable data to calculate pooled ORs per grade of the modified Fisher scale and Hijdra sum score. We believe there are several reasons for this. Considerably fewer studies used these scales. Even after the introduction of the modified Fisher scale in 2006, the Fisher scale was the most frequently used scale in the studies included in this review. As this was also the case for studies published in the last 5 years, this is not only due to the later introduction of the scale. The Hijdra sum score may be too extensive and cumbersome for widespread usage, as this score requires grading of the amount of blood in ten basal cisterns and four ventricles separately. Another reason for not pooling the data is the large variability in data reporting. The usage of a wide range of definitions for DCI resulted in the exclusion of many studies that did not use a clinical definition of DCI. Furthermore, various categories of radiological grades were frequently grouped together, and a variety of effect estimates were used.

One of the most important strengths of this review is the presentation of the first formal meta-analysis on the association of the Fisher scale with DCI. The usage of clinical DCI as end point resulted in more homogeneous outcome data. The use of clinical DCI as end point is also a limitation of this study. This diagnosis may be more subjective and have a higher interobserver variation compared to radiological DCI. Furthermore, clinical DCI cannot be diagnosed in comatose patients. However, as follow-up imaging is often not routinely performed, the clinical relevance of radiological DCI remains questionable. Finally, as the risk of publication bias could not be evaluated by a funnel plot for all studies, publication bias may have occurred.

This review shows several advantages of the modified Fisher scale over the Fisher scale for clinical practice. Firstly, even though no pooled OR could be calculated, modified Fisher 4 patients had the highest rate of DCI. This may show that this scale identifies the group that has the highest risk of DCI. Secondly, the modified Fisher scale shows an increasing risk of DCI per grade increase on the scale, whereas the Fisher scale does not. The latter is counter-intuitive for a scale that was developed to predict DCI and makes the Fisher scale less suitable for clinical practice. Thirdly, the Fisher scale was significantly associated with DCI in just over half of the included studies. This may show the limited ability of the Fisher scale to differentiate between patients with a high and a low risk of DCI. In contrast, the other scales were more frequently associated with DCI. For the Hijdra sum score, not enough comparable data could be collected. Thus, the possibility that the Hijdra sum score has a stronger association with DCI than the Fisher and modified Fisher scales cannot be ruled out. Recently, new radiological approaches such as the Barrow Neurological Institute SAH Grading Scale, incorporation of the modified Graeb score into the modified Fisher scale, and quantification of SAH volume have been introduced [67, 68,69,70,71]. These approaches measure the amount of blood in a more quantitative manner and have shown high associations with DCI and reduced interobserver variability.

Conclusion

This review shows that the Fisher scale, modified Fisher scale, and Hijdra sum score are all associated with clinical DCI. The risk of DCI, however, does not increase with increasing Fisher grade, as opposed to the other grading scales. Furthermore, the modified Fisher scale was more commonly significantly associated with clinical DCI than the Fisher scale, which may advocate the use of the modified Fisher scale in future SAH-related studies.