Take-home message

In adult comatose resuscitated patients, clinical, biochemical, neurophysiological, and radiological tests predict poor neurological outcome within the first week after cardiac arrest with a low false-positive rate. Most predictors, however, have low sensitivity and a substantial risk of bias.

Key messages from the 2020 evidence review

This review follows those published previously in 2013 and will inform the upcoming European Resuscitation Council and European Society of Intensive Care Medicine (ERC-ESICM) Guidelines on Post-Resuscitation Care. Given the complexity and volume of evidence, we have summarised our main findings in Table 1 Key messages.

Table 1 Key messages from the 2020 evidence update


More than 80% of patients who are admitted to an intensive-care unit (ICU) after resuscitation from out-of-hospital cardiac arrest (OHCA) are comatose [1] because of hypoxic–ischaemic brain injury (HIBI), and about two-thirds of them will die before hospital discharge [2,3]. The majority of these deaths result from withdrawal of life-sustaining treatment (WLST) because of a predicted poor neurological outcome [4]. Accurate neuroprognostication is, therefore, essential in these patients, to provide correct information for their relatives, and to avoid both inappropriate WLST and prolonged treatment of patients with no chance of neurologically meaningful survival.

In 2013, for the first time, the evidence concerning predictors of poor neurological outcome in comatose survivors of cardiac arrest (CA) was systematically reviewed by the European Resuscitation Council (ERC) [5,6] and the European Society of Intensive Care Medicine (ESICM) [7]. In 2014, the evidence from these reviews was summarised by a joint ERC-ESICM Writing Group and co-published as an Advisory Statement in the official journals of the respective societies [8,9]. Finally, in 2015, this evidence was incorporated into the Guidelines on Post-Resuscitation Care, co-published by the ERC and ESICM [10,11].

After the publication of these reviews, the number of studies of neuroprognostication after cardiac arrest has increased substantially, while the treatment standards, the outcome definition, and even the interpretation of some tests have changed. For this reason, a new review focusing on more recent studies is required. Evidence from the present review will inform the 2021 ERC-ESICM Guidelines on Post-Resuscitation Care. This project was endorsed by both the ERC and ESICM on June 19, 2019.

Like the previous 2013 reviews, the present review deals with predictors of poor neurological outcome after cardiac arrest. Predictors of good neurological outcome will be evaluated in a subsequent review.


This is a systematic review of prognostic accuracy studies. The review protocol was registered on PROSPERO (CRD 42019141169) on September 11, 2019. Data reporting is consistent with the recommendations included in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [12]. PRISMA does not provide a specific format for prognostic accuracy studies; however, given the similarity of prognostic accuracy studies to diagnostic accuracy studies, the recommended format for diagnostic test accuracy (PRISMA-DTA) was adopted [13].

Population, intervention, comparator, outcome, time frame, and study design (PICOTS) framework

Based on the PICOTS template, the review question was formulated as follows: In adult patients who are comatose following resuscitation from cardiac arrest in all settings (P), does the use of predictors based on clinical examination, electrophysiology, serum biomarkers, or neuroimaging (I) recorded within 1 week after cardiac arrest (T), allow accurate prediction of poor outcome (O)? We selected prognostic accuracy studies (S), i.e., those in which sensitivity and false-positive rate (FPR) of the index test were reported. The accuracy of the index test was assessed by comparing the predicted outcome with the final outcome, which represented the comparator (C).

Eligibility criteria

Patient population

All studies on adult (≥ 16 years) patients who were comatose following successful resuscitation from cardiac arrest (either in-hospital or out-of-hospital) were considered for inclusion. Patients defined as unconscious, unresponsive, and/or having a Glasgow Coma Score (GCS) ≤ 8 at the time of study enrolment were considered as comatose. Studies including non-comatose patients or patients in hypoxic coma from causes other than cardiac arrest (e.g., respiratory arrest, carbon monoxide intoxication, drowning, and hanging) were excluded.

Interventions (index tests)

Four types of index tests were assessed: clinical examination, biomarkers, electrophysiology, and neuroimaging. Clinical examination included every clinical sign that can be collected during a bedside neurological examination, e.g., brainstem reflexes, even if measured using instrumental recording (e.g., automated pupillometry). Biomarkers included chemical markers of brain injury that are measured in the blood, such as neuron-specific enolase (NSE). Index tests based on electrophysiology were those which require the recording of a bioelectrical signal from the central nervous system. They included electroencephalogram (EEG) and evoked potentials (EPs). We also considered for inclusion studies in which the EEG signal was analysed to produce a summary variable, such as bispectral index (BIS). Brain imaging techniques included computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound, the latter limited to optic nerve sheath diameter (ONSD) measurement.

For assessment of prognostic accuracy, an abnormal result of the index test was considered as a positive result. This included: for clinical examination, the absence of brainstem reflexes, the presence of motor response ≤ 2, or the presence of myoclonus/status myoclonus; for biomarkers, a biomarker blood level above a given threshold; for EEG, the presence of either change in the EEG background, such as a low voltage, or superimposed pathological activity, such as seizures; for somatosensory-evoked potentials, the absence of the N20 wave; for imaging studies, the presence of cerebral oedema, either assessed qualitatively, or quantitatively (e.g., using the grey matter/white matter density ratio on brain CT).


In compliance with current recommendations [14,15], poor neurological outcome was defined as severe neurological disability, persistent vegetative state, or death, corresponding to a Cerebral Performance Category (CPC) [16] from 3 to 5 or a modified Rankin Scale score (mRS) [17] from 4 to 6. Studies adopting CPC 4–5 or mRS 5–6 as a threshold for poor outcome were also considered for inclusion, but the certainty of their evidence was lowered because of indirectness. Wherever possible, the study authors were contacted to recalculate test accuracy with a CPC 3–5 or mRS 4–6 threshold. Based on current standards, the minimum accepted timing for neurological outcome measurement was at hospital discharge or 1 month after cardiac arrest [15]. We did not set any maximum timing. However, the vast majority of prognostication studies evaluate neurological outcome within 12 months after cardiac arrest [15].

Study design (S)

We included only studies where sensitivity and FPR could be calculated, i.e., those where the 2 × 2 contingency table of true/false negatives and positives for prediction of poor outcome was reported or could be calculated from reported data. Studies where the test result was expressed on a continuous scale were included if a threshold allowing dichotomisation and, therefore, calculation of a contingency table was provided; when multiple thresholds were available, the threshold minimising test FPR was chosen. This was based on the assumption that both clinicians and patients/relatives would prioritise minimising the risk of a falsely pessimistic prediction, which may result in an inappropriate WLST.
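As a minimal sketch of the rule described above (not the authors' analysis code), the following Python fragment computes sensitivity and FPR from a 2 × 2 table and, when several thresholds are available, selects the one with the lowest FPR. The names `Table2x2` and `pick_threshold`, the tie-breaking rule (higher sensitivity among equal FPRs), and all counts are assumptions made for illustration only.

```python
from typing import NamedTuple


class Table2x2(NamedTuple):
    """Counts for prediction of poor outcome (positive test = abnormal result)."""
    tp: int  # positive test, poor outcome
    fp: int  # positive test, good outcome
    fn: int  # negative test, poor outcome
    tn: int  # negative test, good outcome


def sensitivity(t: Table2x2) -> float:
    """Proportion of poor-outcome patients with a positive test."""
    return t.tp / (t.tp + t.fn)


def false_positive_rate(t: Table2x2) -> float:
    """Proportion of good-outcome patients with a positive test (1 - specificity)."""
    return t.fp / (t.fp + t.tn)


def pick_threshold(tables: dict[float, Table2x2]) -> float:
    """Return the threshold with the lowest FPR.

    Ties are broken by the higher sensitivity; this tie-break is an
    assumption of the sketch, not a rule stated in the review.
    """
    return min(
        tables,
        key=lambda thr: (false_positive_rate(tables[thr]), -sensitivity(tables[thr])),
    )


# Hypothetical counts for one continuous predictor at three thresholds
example = {
    33.0: Table2x2(tp=40, fp=3, fn=20, tn=37),
    60.0: Table2x2(tp=30, fp=0, fn=30, tn=40),
    90.0: Table2x2(tp=15, fp=0, fn=45, tn=40),
}
best = pick_threshold(example)
```

In this made-up example, two thresholds reach 0% FPR, and the sketch keeps the one that retains more sensitivity.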

Study type

Clinical studies written in English, French, German, or Spanish, and published as full-text articles, were considered for inclusion. We excluded reviews, case reports, case series, studies including fewer than 10 patients, letters, editorials, conference abstracts, and studies published in abstract form. In case of overlapping populations on the same index test, the study with the larger population was included.

Search strategy

MEDLINE via PubMed, EMBASE, Web of Science, and the Cochrane Database of Systematic Reviews were searched using the strings included in Table E1 of the Electronic Supplementary Material [ESM]. The reference lists of relevant papers and narrative reviews were searched for additional studies. The automatic alert system of the databases was activated to identify further studies published during the process of data extraction and analysis. The most recent search of the previous systematic reviews on neuroprognostication [5,6] was launched on May 31, 2013. We, therefore, searched studies published from January 1, 2013 onwards. We launched the last search on April 10, 2020.

Records extracted from information sources and from additional sources were entered into reference management software (EndNote X6, Thomson Reuters, Inc.) to be screened. Two authors (SC, SD’A) performed an independent eligibility assessment. Disagreements were resolved by consensus and, if necessary, after consultation with a content expert.

Data extraction

Data extraction was performed by two authors (SC, SD’A) using a standardised form. When needed, the authors of the original studies were contacted to retrieve missing data.

For each study included in the final analysis, the following data were extracted: study design; enrolment period; number of patients; patients’ age and presenting cardiac arrest rhythm; percentage of patients undergoing targeted temperature management (TTM), target temperature, duration, and cooling method; description of index test and the criteria for a positive test, including a threshold where applicable; timing of the index test recording; blinding of the index test and outcome assessors; criteria for WLST; neurological outcome measure and timing of its assessment; number of patients with poor neurological outcome; number of patients with true-/false-positive and true-/false-negative test result; major confounders, as defined in the evidence appraisal tool. For continuous variables, the thresholds used to calculate test performance and the area under the receiver-operating characteristic (ROC) curve were recorded.

Evidence appraisal

Two study authors (CH, MK) rated the methodological quality of the included studies using the Quality in Prognosis Studies (QUIPS) tool [18]. The tool (ESM Appendix 1) was customised by adding some specific items pertinent to neuroprognostication after CA, such as self-fulfilling prophecy or confounding from sedation. In studies on prognostication of poor outcome, self-fulfilling prophecy is a bias that occurs when the treating team is not blinded to the results of the outcome predictor, so that WLST decisions are influenced by or based on the predictor being investigated. Given the importance of the risk of self-fulfilling prophecy [14], the bias was graded as high when the index test was not assessed blindly or when the index test was part of the WLST criteria, regardless of the presence of other limitations. Grading was performed for individual predictors rather than the study as a whole, because risk of bias may be predictor-dependent and may differ between predictors within a study. Disagreement between quality assessors was resolved by consensus. For one included study, one of the quality assessors was also a co-author. Therefore, the quality assessment was assigned to another assessor.

Data analysis

According to the method described by Wan et al. [19], demographic data (e.g., age) were merged where they were presented divided by subgroups. The contingency two-by-two tables reporting true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) were extracted from each study. Sensitivity and FPR (1 − specificity), along with their 95% confidence intervals (95% CIs), were calculated using the Clopper–Pearson method [20]. To assess prognostic accuracy, poor neurological outcome was considered as a positive result of the reference standard. Coupled forest plots of sensitivity and FPR at comparable time points and outcome definitions were plotted and the presence of heterogeneity was detected by visual assessment. Publication and reporting bias were not assessed because of the lack of specific recommendations for this type of study [21]. All analyses were conducted with R version 3.6.1, using the ‘rmeta’ package.
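For readers unfamiliar with the Clopper–Pearson ("exact") interval, the following Python sketch shows one way to compute it from the binomial distribution using only the standard library. It is an illustration under stated assumptions, not the code used for this review (which was run in R): the function names and the bisection approach are choices made here for self-containment.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(g) -> float:
    """Bisection on [0, 1] for a function g that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided exact confidence interval for a proportion x / n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2 (0 when x == 0)
    lower = 0.0 if x == 0 else _root(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2 (1 when x == n)
    upper = 1.0 if x == n else _root(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# FPR interval: x = false positives, n = patients with good outcome (FP + TN)
fpr_ci = clopper_pearson(0, 10)
# Sensitivity interval: x = true positives, n = patients with poor outcome (TP + FN)
sens_ci = clopper_pearson(5, 10)
```

When no false positives are observed (x = 0), the upper bound reduces to 1 − (α/2)^(1/n), which is why a 0% FPR observed in a small sample can still carry a wide confidence interval.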


Study selection (Fig. 1)

The initial search identified 3790 records from online databases, while 73 records were identified through forward search. After duplicate removal and abstract screening, a total of 199 studies were considered for full-text assessment, of which 94 fulfilled inclusion criteria. A list of excluded studies with reasons is reported in ESM Table E2.

Fig. 1 Flowchart of study selection

Study characteristics

The characteristics of the 94 included studies (30,200 patients) are listed in Table 2. TTM was used in 100% of patients in 68/94 (72%) studies, while 22 studies reported a percentage of TTM treatment ranging from 17.5% to 94%. In one study TTM was not used, while in three studies, this information was not available. Target temperature varied across studies (ESM Table E3). Fifteen studies [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36] were included for clinical examination, 12 for biomarkers [37,38,39,40,41,42,43,44,45,46,47,48], 26 for electrophysiology [49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74], and 15 for imaging [75,76,77,78,79,80,81,82,83,84,85,86,87,88,89], while 26 studies [90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115] were included for more than one prognostic modality. In all but four studies [43,88,99,104], poor outcome was reported as CPC 3–5.

Table 2 Study characteristics

Sensitivity and FPR for index tests based on clinical examination, biomarkers, electrophysiology, and imaging are reported in Tables 3–54. In 82/94 (87.2%) studies, sensitivity and specificity were reported, while in 12/94 (12.8%) studies, they were calculated based on reported data. QUIPS evidence profiles of included studies and predictors are reported in ESM Appendix 2. Almost all individual predictors were assessed as being at moderate or high risk of bias, mainly in the domain of study participation (e.g., a study conducted in a convenience sample rather than in an inception cohort) and of confounding (e.g., interference from sedation, lack of blinding, or use of the index test for WLST). Pooling was not performed because of heterogeneity, particularly in the domain of timing of assessment for both index test and outcome.

Table 3 Accuracy of clinical examination. Absent standard pupillary light reflex (s-PLR)
Table 4 Accuracy of clinical examination. Automated pupillometry: qPLR
Table 5 Accuracy of clinical examination. Automated pupillometry: neurological pupil index (NPi)
Table 6 Accuracy of clinical examination. Absent corneal reflex (CR)
Table 7 Accuracy of clinical examination. Absent pupillary and corneal reflexes
Table 8 Accuracy of clinical examination. Absent oculocephalic reflex (OCR)
Table 9 Accuracy of clinical examination. Absent gag and cough reflex
Table 10 Accuracy of clinical examination. Motor score (M = 1)
Table 11 Accuracy of clinical examination. Motor score (M ≤ 2)
Table 12 Accuracy of clinical examination. Myoclonus
Table 13 Accuracy of clinical examination. Status myoclonus
Table 14 Biomarkers. Neuron-specific enolase (NSE)
Table 15 Biomarkers. S-100B protein
Table 16 Biomarkers. Other tests
Table 17 Evoked potentials. Bilaterally absent N20 SSEP wave
Table 18 Evoked potentials. Unilaterally absent + unilaterally low-voltage (1) N20 SSEP wave
Table 19 Evoked potentials. Absent pain-related middle-latency SEPs
Table 20 Evoked potentials. Absent brainstem auditory-evoked potentials (BAEPs)
Table 21 Evoked potentials. Visual-evoked potentials (VEPs)
Table 22 Electroencephalogram (EEG). Background. Isoelectric (1)
Table 23 Electroencephalogram (EEG). Background. Suppressed—ACNS-defined (1)
Table 24 Electroencephalogram (EEG). Background. Suppressed—undefined (cEEG)
Table 25 Electroencephalogram (EEG). Background. Suppressed background (ACNS-defined) with periodic discharges
Table 26 Electroencephalogram (EEG). Background. Low voltage—ACNS-defined (1)
Table 27 Electroencephalogram (EEG). Background. Burst suppression—ACNS-defined (1)
Table 28 Electroencephalogram (EEG). Background. Burst suppression—synchronous (1)
Table 29 Electroencephalogram (EEG). Background. Burst suppression—heterogeneous (1)
Table 30 Electroencephalogram (EEG). Background. Burst suppression (non-ACNS or undefined)
Table 31 Electroencephalogram (EEG). Background. Burst suppression (amplitude-integrated EEG) (1)
Table 32 Electroencephalogram (EEG). Background. Burst suppression or burst attenuation, ACNS-defined (1)
Table 33 Electroencephalogram (EEG). Background. Discontinuous—ACNS-defined (1)
Table 34 Electroencephalogram (EEG). Background. Not continuous (1)
Table 35 Electroencephalogram (EEG). Background. No return to continuous normal voltage
Table 36 Electroencephalogram (EEG). Background. Unreactive EEG background
Table 37 Electroencephalogram (EEG). Discharges and seizures. EEG—rhythmic/periodic discharges
Table 38 Electroencephalogram (EEG). Discharges and seizures. Sporadic, non-rhythmic/periodic discharges—ACNS-defined
Table 39 Electroencephalogram (EEG). Discharges and seizures. Unequivocal seizures—ACNS-defined
Table 40 Electroencephalogram (EEG). Discharges and seizures. Seizures—non-ACNS-defined
Table 41 Electroencephalogram (EEG). Discharges and seizures. Seizures or status epilepticus
Table 42 Electroencephalogram (EEG). Discharges and seizures. Status epilepticus (SE)
Table 43 Electroencephalogram (EEG). Discharges and seizures. Stimulus-induced, rhythmic, periodic, or ictal discharges (SIRPIDs)
Table 44 Electroencephalogram (EEG). Combination of unfavourable EEG patterns. Highly malignant EEG patterns
Table 45 Electroencephalogram (EEG). Combination of unfavourable EEG patterns. Malignant EEG patterns
Table 46 Electroencephalogram (EEG). EEG-derived indices. Bispectral index (BIS)
Table 47 Electroencephalogram (EEG). EEG-derived indices. Burst-suppression amplitude ratio (BSAR) (1)
Table 48 Electroencephalogram (EEG). EEG-derived indices. Background continuity index (BCI) (1)
Table 49 Imaging. Brain CT: grey matter/white matter ratio (GWR)
Table 50 Imaging. Brain CT: other indices
Table 51 Imaging. Brain MRI: changes in diffusion-weighted imaging (DWI)
Table 52 Imaging. Brain MRI: apparent diffusion coefficient (ADC)
Table 53 Imaging. Brain MRI: gradient-recalled echo score (GRE score)
Table 54 Imaging. Optic nerve sheath diameter (ONSD)

Clinical examination (Tables 3–13 and ESM Fig. 1)

In 19 studies, a bilaterally absent standard pupillary light reflex (s-PLR) immediately after return of spontaneous circulation (ROSC) had high sensitivity but also a high FPR for prediction of poor neurological outcome. A 0% FPR for s-PLR was achieved at 48 h from ROSC in some studies, and became consistent after 4 days from ROSC, with sensitivities ranging from 17.9 to 35.7% (Table 3).

PLR was investigated quantitatively using automated infrared pupillometry. Similarly to an absent s-PLR, a minimal or absent percentage change of pupil size after light stimulation (qPLR), evaluated in three studies, achieved 0% FPR at 48 h from ROSC or later (range 0–4.3%) (Table 4). Conversely, a low neurological pupil index (NPi) had 0% FPR for poor outcome from 24 h after ROSC. Its thresholds varied from 0 to 2.4 (Table 5).

A bilaterally absent corneal reflex (CR) was described in 11 studies. A bilaterally absent CR after 4 days from ROSC consistently predicted a poor outcome. Sensitivity ranged from 23.1 to 40.5% (Table 6). Absence of PLR in combination with the absence of CR on admission to hospital [28] or within 72 h [90] predicted poor neurological outcome with FPR ranging from 0 to 9.3% and sensitivity ranging from 27.1 to 28.4% (Table 7). The absence of other brainstem reflexes, such as oculocephalic, gag, and cough, had 0% FPR starting from 48 h after ROSC. However, the precision of these estimates was low, because they were based on few studies (Tables 8, 9).

An absent or extensor motor response was described in 18 studies. An absent (M = 1) or absent or extensor (M = 1–2) motor response to pain had high sensitivity but low specificity for poor outcome. FPR was 5% or more even at 7 days after ROSC (Tables 10, 11).

Clinical myoclonus was described in six studies [26,32,96,104,109,111]. Only one of these studies provided a definition for myoclonus [109]. Timing of assessment ranged from ≤ 24 h to 96 h after ROSC. FPR ranged from 0 to 22.2% (Table 12). In two studies [27,35], status myoclonus ≤ 24 h or within 7 days of ROSC predicted poor outcome at hospital discharge or 6 months with 0–0.2% FPR (Table 13).

Biomarkers (Tables 14, 15, 16)

Neuron-specific enolase (NSE) was evaluated in 16 studies. High blood values of NSE predicted poor outcome with 0% FPR at 24, 48, and 72 h from ROSC in almost all studies. The relevant threshold values ranged from 39.8 to 172 µg/L, from 34 to 120 µg/L, and from 33 to 79 µg/L, respectively. Sensitivities ranged from 7.6% to 56%, from 24.6% to 60.2%, and from 39.3% to 52.6%, respectively. Two studies documented 0% FPR for NSE at 4 days [42,47], two at 5 days [42,105], and one at 7 days after ROSC [105] (Table 14).

In four studies [37,39,42,44], high blood values of S-100B protein predicted poor outcome with 0% FPR from immediately after ROSC to 72 h after ROSC. The relevant threshold values varied widely (Table 15).

High blood values of glial fibrillary acidic protein (GFAP) at 48 h [38], and serum tau protein [40] or neurofilament light chain (NFL) at 24, 48, and 72 h after ROSC [41,43] predicted poor outcome with 0% FPR (Table 16). Sensitivity of NFL was higher than that of the other biomarkers and exceeded 50% at all timings.


Evoked potentials (Tables 17–21)

A bilateral absence of the N20 wave of short-latency somatosensory evoked potentials (SSEPs) was described in 18 studies. Bilaterally absent SSEPs from the day of the arrest up to 6 days after ROSC were almost invariably associated with poor neurological outcome (Table 17 and ESM Fig. 2). However, in two small studies, FPR of this sign was 50% [62] at 24–48 h and 25% at 24–72 h [94]. In a multicentre study on 201 patients [95], one false-positive result was observed (FPR 2.6%).

In two studies conducted on a multicentre cohort [69,110], a low-voltage N20 SSEP wave on one side associated with an absent N20 wave on the opposite side was used for prediction of poor neurological outcome at 12, 24, and 72 h from ROSC. FPR for this sign was 0%. Sensitivity ranged from 3.8 to 8.4% (Table 18).

Bilaterally absent brainstem auditory evoked potentials (BAEPs) [99,104] at 72 h or more after arrest, or a bilateral absence of pain-related middle-latency SEPs within 72 h from ROSC [74], or of visual-evoked potentials (VEPs) at 48–96 h from ROSC [91] also predicted poor outcome with 0% FPR. However, these findings were based on only a few studies (Tables 19, 20, 21).

Electroencephalography (EEG) (Tables 22–48 and ESM Fig. 3)

Several studies investigated EEG-based indices. For clarity, we subdivided them into four groups: (1) indices based on EEG background; (2) indices based on pathological superimposed activities (discharges or seizures); (3) indices based on a combination of these criteria; (4) indices based on a computerised analysis of the EEG signal. We also subdivided the studies regarding whether they adhered to the standardised terminology for critical care EEG from the American Clinical Neurophysiology Society (ACNS) [116] or not. A summary of the ACNS EEG terminology is reported in ESM Table E4.

EEG background (Tables 22–36)

In one study [69], an isoelectric EEG, defined as all activity below 2 µV, predicted poor outcome at 12 and 72 h from ROSC with 0% FPR and high precision (95% CIs 0–4%) (Table 22). In six studies, a suppressed EEG background defined according to the ACNS terminology (i.e., all activity of the record < 10 µV; see ESM Table E4) almost invariably predicted a poor neurological outcome [52,53,61,68,69,72], especially after 16–24 h from ROSC. In studies using continuous EEG recording, sensitivity of suppressed EEG background progressively decreased over the first 48 h after ROSC [61,68], and between ≤ 24 h and 36–72 h after ROSC (Tables 23, 24). In two studies conducted in two subpopulations of the same cohort [52,72], a suppressed background on routine EEG at a median of 76–77 h from ROSC had 0% FPR for poor outcome (sensitivity 14.2% [52] and 25% [72]).

In three large studies [52,68,72], the presence of a suppressed EEG background with superimposed periodic discharges invariably predicted a poor neurological outcome in a time range between ≤ 24 h and 120 h from ROSC. Sensitivity of this sign was low, in most cases below 10% (Table 25).

A low-voltage EEG, defined according to ACNS [116] as all activity below 20 µV, was assessed in four studies and predicted poor outcome with 0–9% FPR from 6 to 72 h from ROSC in one study [68]. However, in another study [53], FPR for this sign was 78%. In two other studies [52,72], a low-voltage EEG at a median of 76–77 h from ROSC predicted poor outcome with 11–12% FPR (Table 26).

In three studies, burst suppression, defined according to ACNS terminology as more than 50% of the record consisting of suppression alternated with bursts [116], predicted poor outcome with 0% FPR from within 8–48 h from ROSC in one study [61] and with ≤ 1.5% FPR at a median of 76–77 h from ROSC in two other studies [52,72] (Table 27). In a study using continuous EEG [68], burst suppression predicted poor outcome with 0% FPR from 6 to 96 h after ROSC, but only if it was restricted to the sub-type of highly epileptiform bursts (ACNS-defined) or identical bursts (‘synchronous’ burst suppression). Conversely, the heterogeneous (i.e., ‘non-synchronous’) burst-suppression sub-type predicted poor outcome with 0–1% FPR only after 24 h or more from ROSC. The sensitivity of both these burst-suppression sub-types progressively decreased over time (Tables 28, 29). In other studies [50,62,109,114], burst suppression (undefined, or not ACNS-defined) predicted poor outcome with inconsistent accuracy both within 24 h and at 24–72 h after ROSC (Table 30). In one study [65] on 130 patients, burst suppression detected on amplitude-integrated EEG (aEEG) at a median of 4.3 (1.9–7.4) h after ROSC had 1.8 [0–9.7]% FPR and 74.7 [63.3–84]% sensitivity for poor neurological outcome (Table 31).

The presence of burst attenuation or burst suppression within 72 h from ROSC [51,53,69] and discontinuous or not continuous background within 120 h from ROSC [53,68,69] (Tables 32, 33, 34) was much less specific than burst-suppression for predicting poor outcome. We defined “not continuous” as an EEG pattern where suppression was present for more than 10% of the recording. This differed from “discontinuous”, defined according to ACNS terminology, where suppression is present for more than 10% but less than 50% of the recording.
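The relationship between the two labels can be made explicit with a small Python sketch. It encodes only the percentages quoted in the paragraph above; the function name and the handling of records with exactly 10% or 50% suppression are assumptions of this illustration and are not taken from the ACNS terminology.

```python
def background_labels(suppression_pct: float) -> dict[str, bool]:
    """Map the share of the record spent in suppression to the two labels used in the text."""
    return {
        # ACNS-defined, as quoted above: more than 10% but less than 50%
        "discontinuous": 10 < suppression_pct < 50,
        # Broader label used in this review: more than 10%, with no upper limit
        "not_continuous": suppression_pct > 10,
    }


moderate = background_labels(30)  # satisfies both definitions
severe = background_labels(70)    # satisfies only "not continuous"
```

Under these definitions, every discontinuous record is also "not continuous", but the reverse does not hold: a record with 50% or more suppression falls only in the broader category.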

In two studies [65,73], lack of recovery of a continuous normal voltage background within 36 h from ROSC on amplitude-integrated EEG [65], or within 60 h on continuous EEG [73], predicted poor outcome with 0 [0–5]% FPR (Table 35).

The absence of EEG background reactivity assessed from 6 h to a median of 77 (53–102) h after ROSC had 0% FPR in one small study [50]. In the other nine studies [51,53,57,59,63,72,90,96,111], FPR ranged from 58.3 [27.7–84.8]% at 6–12 h after ROSC [59] to 11.1 [0.3–48.2]% at 48–72 h after ROSC [53]. Definitions and the stimuli used to elicit EEG background reactivity were inconsistent among studies (Table 36).

Discharges and seizures (Tables 37–43)

The presence of rhythmic or periodic discharges on EEG was assessed in ten studies and predicted poor neurological outcome with low FPR in most studies. FPR was lowest when discharges were recorded early, especially within 24 h from ROSC (range 0–2.8%), while it was generally higher from 48 h onwards (Table 37). In 8/10 studies, the definitions of rhythmic/periodic discharges were based on the ACNS terminology. However, the prevalence of discharges was not included in most definitions.

Four studies [53,61,69,110] assessed the accuracy of sporadic, non-rhythmic/periodic discharges on EEG from 0–8 up to 72 h from ROSC. Their specificity was lower than that of rhythmic/periodic discharges (Table 38).

In four studies [53,61,68,72], the presence of seizures defined according to ACNS (unequivocal seizures; see ESM Table E4) from 0–8 h to 96 h after ROSC was invariably associated with poor neurological outcome, except in one [68], where FPR was 2.6 [0.1–13.5]% at 96 h from ROSC (Table 39). In two other studies, not adopting ACNS definitions [51,109], the presence of seizures predicted poor outcome with 0% FPR. However, these definitions were inconsistent (Table 40).

The presence of status epilepticus (SE) alone [51,54,56,62,65] or in combination with seizures [52,112] from a median of 16.6 (11.3–16.4) h to 5 days after ROSC was often, but not invariably, associated with poor outcome (FPR range 0–17.4%; Tables 41, 42). The definition of SE was inconsistent across studies, and missing in one study [62]. In one of these studies [56], SE predicted poor outcome with 0% FPR only when it originated from a discontinuous or burst-suppression background rather than from a continuous or nearly continuous background.

Stimulus-induced, rhythmic, periodic, or ictal discharges (SIRPIDs) predicted poor outcome with 0 [0–6.1]% FPR within 24 h and with 2.1 [0.1–11.1]% FPR between 24 and 72 h from ROSC in one study [49] (Table 43).

Combinations of unfavourable EEG features (Tables 44, 45)

Thirteen studies [52,55,57,59,68,70,72,90,93,98,104,110,111] combined EEG features to assess FPR of poor neurological outcome and labelled these as highly malignant patterns. These patterns included isoelectric or low-voltage EEG, suppression, burst suppression (either generic or restricted to the identical bursts subtype), and generalised or continuous periodic discharges in combination with suppression. A few studies also included burst attenuation at 12 h [90], unreactive EEG background [104], or SE [104]. Most of these features were defined according to ACNS terminology.

For the sake of clarity, wherever possible we broke down these “highly malignant” patterns into their respective EEG components, whose accuracy has already been reported in the previous paragraphs and tables. Table 44 describes the results of studies for which only aggregate data were available, while ESM Table E5 compares the various combinations of unfavourable EEG components corresponding to all the “highly malignant” EEG patterns included in this review.

The FPR for “highly malignant” patterns was always below 10% and achieved 0% in most studies. Timings ranged from 6–12 h [59] to 72 h [98] after ROSC. Sensitivity varied widely, and it was generally higher at earlier, rather than later timings after ROSC (Table 44).

Six studies presented combinations of EEG features defined as malignant or intermediate, indicating that these combinations were more favourable than highly malignant patterns (ESM Table E6). The corresponding features included a discontinuous or unreactive background, a reversed antero-posterior gradient, alpha coma, periodic or rhythmic patterns, or discharges not associated with suppressed background, and seizures. However, some studies included features described as highly malignant in other studies, such as suppression [61,101], low voltage [52,57,72,90], and burst suppression [57,61,101]. The accuracy of these patterns is reported in Table 45. The FPR of malignant patterns was generally higher and less consistent than that of highly malignant patterns, ranging from 0% [54,72] to 51.8% [57].

EEG-derived indices (Tables 46, 47, 48)

Three studies [58,66,71] assessed bispectral index (BIS) for prediction of poor outcome at timings ranging from 6 to 48 h after ROSC. BIS thresholds ranged from 0 to 23, and FPR from 0 to 15.8% (Table 46).

One study [67] assessed two EEG-derived indices: the burst-suppression amplitude ratio (BSAR), defined as the mean amplitude ratio between non-suppressed and suppressed EEG segments, and the background continuity index (BCI), defined as the fraction of EEG not spent in suppressions (amplitudes < 10 µV for ≥ 0.5 s). For both indices, a threshold for 0% FPR at 12, 24, 48, and 72 h after ROSC was identified, with decreasing sensitivities over time (Tables 47, 48).
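
The two definitions above can be illustrated with a minimal sketch. It assumes that a per-sample amplitude envelope has already been extracted from the EEG and is expressed in microvolts, the usual EEG scale; how the cited study [67] estimated amplitude is not described here, so the envelope, the function names, and the sample values are all hypothetical.

```python
def suppression_mask(envelope_uv, fs_hz, thresh_uv=10.0, min_dur_s=0.5):
    """Mark samples in runs below thresh_uv that last at least min_dur_s."""
    min_len = int(round(min_dur_s * fs_hz))
    mask = [False] * len(envelope_uv)
    start = None
    # A sentinel above threshold closes a run that reaches the end of the data
    for i, amp in enumerate(list(envelope_uv) + [float("inf")]):
        if amp < thresh_uv:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                for j in range(start, i):
                    mask[j] = True
            start = None
    return mask

def bci(envelope_uv, fs_hz):
    """Background continuity index: fraction of samples not in suppression."""
    mask = suppression_mask(envelope_uv, fs_hz)
    return 1.0 - sum(mask) / len(mask)

def bsar(envelope_uv, fs_hz):
    """Mean amplitude of non-suppressed samples over that of suppressed ones."""
    mask = suppression_mask(envelope_uv, fs_hz)
    supp = [a for a, m in zip(envelope_uv, mask) if m]
    non = [a for a, m in zip(envelope_uv, mask) if not m]
    if not supp or not non:
        return None  # the ratio is undefined without both kinds of segment
    return (sum(non) / len(non)) / (sum(supp) / len(supp))

# Hypothetical 3 s envelope sampled at 10 Hz: a 1 s suppression counts,
# a 0.3 s dip below threshold does not.
envelope = [50.0] * 10 + [5.0] * 10 + [50.0] * 5 + [5.0] * 3 + [50.0] * 2
print(bci(envelope, fs_hz=10), bsar(envelope, fs_hz=10))
```

In this sketch, short dips that do not meet the duration criterion are counted as non-suppressed, which is one possible reading of the definition.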

Imaging (Tables 49, 50, 51, 52, 53, 54)

Grey matter/white matter ratio (GWR) on brain CT (Table 49)

Fifteen studies assessed the accuracy of the ratio between the densities (in Hounsfield units) of grey matter and white matter (GWR), on brain CT [27,75,78,80,81,82,83,86,87,100,101,102,103,110,112,115]. Densities were sampled in various components of the basal ganglia (caudate, putamen, and thalamus) for the grey matter, in the corpus callosum and in the posterior limb of the internal capsule for the white matter, and in the cerebrum for both (centrum semiovale, upper convexity area). In 10/15 studies, the majority of patients underwent brain CT within 2 h from ROSC [75,78,80,81,82,100,101,102,103,115] and in all but one [103] study, 0% FPR for poor outcome was reported. However, the relevant GWR threshold values varied widely, even within studies using the same technique. In two studies from the same group, the investigators assessed the accuracy of GWR separately for cardiac arrests from a cardiac cause [81] and for those from a non-cardiac cause [82]. Results showed that in arrests from a cardiac cause, the sensitivities corresponding to 0% FPR were markedly lower than in arrests from a non-cardiac cause.
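
As a minimal illustration of the ratio described above, the sketch below averages hypothetical Hounsfield-unit samples from grey-matter and white-matter regions of interest and divides one mean by the other. The regions follow the basal ganglia sampling described in the text, but the density values and the function name are assumptions, and the included studies differed in which regions they averaged.

```python
from statistics import mean

def grey_white_ratio(grey_hu, white_hu):
    """GWR: mean grey-matter density divided by mean white-matter density."""
    return mean(grey_hu) / mean(white_hu)

# Hypothetical densities in Hounsfield units
grey = {"caudate": 36, "putamen": 37, "thalamus": 35}
white = {"corpus_callosum": 29, "posterior_limb_internal_capsule": 28}

gwr = grey_white_ratio(grey.values(), white.values())
print(round(gwr, 2))  # 1.26
```

Because loss of grey-white differentiation lowers the ratio towards 1, a prediction of poor outcome corresponds to a GWR below a chosen threshold; no threshold is hard-coded here because the reported values varied widely.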

Other indices based on brain CT (Table 50)

In two studies [85,103], cerebral oedema on brain CT was assessed qualitatively using visual evaluation [85] or semi-quantitatively using the Alberta Stroke Program Early CT Score (ASPECTS) bilaterally [103]. Both techniques showed a low FPR for poor outcome. In one of these studies [85], visual assessment of oedema was more accurate both in terms of sensitivity and FPR when brain CT was performed between 24 h and 7 days from ROSC than within 24 h.

In another study on brain CT [88], the gradient between the density of the grey matter and the white matter (differences between the cerebral cortex and white matter: DCW) in three cortical and subcortical areas was calculated. A gradient below 5.5 Hounsfield units between the average values of the grey and the white matter predicted poor outcome with 61.9% sensitivity and 0% FPR.

Brain MRI (Tables 51, 52, 53)

The predictive value of the changes in diffusion-weighted imaging (DWI) on brain MRI was assessed qualitatively in six studies [76,77,78,101,108,115]. In all these studies, both sensitivities and specificities of diffuse DWI changes varied widely (Table 51).

The accuracy of apparent diffusion coefficient (ADC) on brain MRI was assessed in four studies [79,84,97,115] using three different methods (Table 52). In one study [84], the mean (global) ADC value of the brain was assessed. Its thresholds for 0% FPR for poor outcome were ≤ 726 × 10⁻⁶ mm²/s within 48 h from ROSC and ≤ 627 × 10⁻⁶ mm²/s between 48 h and 7 days from ROSC, respectively. In this and another study [115], the percentage of brain volume with a low ADC (defined as an ADC value ≤ 400 × 10⁻⁶ mm²/s) was assessed at a median of 3 h from ROSC [115], or within 48 h and between 48 h and 7 days from ROSC [84]. Thresholds for 0% FPR of the percentage of low-ADC brain volume were 4.3%, 2.5%, and 1.66%, respectively. Results were confirmed in a similar study [97] using higher low-ADC thresholds (from 500 to 650 × 10⁻⁶ mm²/s) at days 1–7 from ROSC. Finally, a fourth study [79] evaluated the ADC reduction in various areas of the brain, both in terms of lowest global mean or minimum ADC, or in terms of maximum cluster size of low ADC. For each technique, thresholds for 0% FPR varied according to the studied brain area (see Table 52).
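
The percentage of brain volume with a low ADC can be sketched as a simple voxel count. The example assumes that voxels have equal volume and have already been restricted to a brain mask; the ADC values are hypothetical and are expressed in units of 10⁻⁶ mm²/s, and the function name is illustrative only.

```python
def low_adc_volume_percent(adc_voxels, cutoff=400.0):
    """Percentage of voxels with ADC at or below the cutoff.

    Assumptions: equal-volume voxels inside a brain mask,
    ADC expressed in units of 10^-6 mm^2/s.
    """
    if not adc_voxels:
        raise ValueError("no voxels supplied")
    low = sum(1 for v in adc_voxels if v <= cutoff)
    return 100.0 * low / len(adc_voxels)

hypothetical_voxels = [350.0, 400.0, 450.0, 800.0]
print(low_adc_volume_percent(hypothetical_voxels))  # 50.0
```

Raising the cutoff, as in the study using 500 to 650 × 10⁻⁶ mm²/s, classifies more voxels as low-ADC, so percentage thresholds obtained with different cutoffs are not directly comparable.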

A preliminary study on 39 patients [77] assessed the predictive value of two-dimensional T2-weighted gradient echo (GRE) MRI, a surrogate marker of cerebral venous oxygenation. A value of 3 for the GRE score had 0% FPR and 81.5% sensitivity for poor neurological outcome (Table 53).

Optic nerve sheath diameter (ONSD) (Table 54)

Three studies [75,80,89] assessed the diameter of the optic nerve sheath for prognostication after cardiac arrest. In one study [89] an ONSD of 5.11 mm measured using ultrasound before TTM had 56.4% sensitivity and 21.4% FPR for prediction of poor neurological outcome. In two other studies, an ONSD of 6.21 mm [80] and 7 mm [75] measured using brain CT at around 1 h after ROSC had 0% FPR for poor outcome.


Our systematic review shows that in comatose survivors of cardiac arrest, clinical, biochemical, neurophysiological, and imaging tests used within the first week after ROSC predict poor neurological outcome with a very low rate of falsely pessimistic predictions. Most predictors, however, have a low sensitivity and a substantial risk of bias. To facilitate a comparison between the major findings of the previous reviews and those of the current review, a comparison table is provided as ESM Table E7.

Clinical examination

Ocular reflexes

In our review, the FPR for prediction of poor neurological outcome of both s-PLR and corneal reflex decreased progressively with time and was almost consistently 0% from days 5 to 6 after ROSC. The presence of false-positive results (i.e., absent ocular reflexes in patients who subsequently achieve good neurological outcome) in the first 4 days after ROSC may have been caused by transient brain stem dysfunction reflecting the natural recovery from HIBI [117,118]. However, another reason for this result may have been interference from sedatives, opioids, or neuromuscular blocking drugs (for reflexes based on a striated muscle response, such as the corneal reflex) used during TTM. Although in standard TTM protocols patients are rewarmed within 36 h from ROSC [119], sedatives or opioids are often not discontinued after rewarming, for various reasons (e.g., to facilitate mechanical ventilation, or to treat seizures) [120]. Even when the clinical examination is conducted off sedation and opioids, residual interference from sedation because of reduced drug clearance from hypothermia [121,122,123,124] or concomitant organ dysfunction cannot be excluded.

Another reason for the false-positive results observed with the conventional ocular reflexes may have been the lack of standardisation in their assessment. Unlike automated pupillometry, the characteristics of the stimulating light source (i.e., intensity, distance from the eye, and duration) of s-PLR are not standardised. In addition, the assessment of the pupillary response is based on visual observation, which is prone to subjectivity. In a study [125] on interrater variability of s-PLR, the agreement for reactivity (reactive versus fixed pupils) was only moderate (κ = 0.64; 95% CI 0.58–0.71). In that study, among 189 observations of a fixed pupil made by two practitioners, only 94 (49.7%) were scored as fixed by both practitioners, and only 58 (33.3%) were confirmed by automated pupillometry. In one study included in our review [106], the pupils of 5/79 comatose patients who eventually recovered were found to be unreactive at 72 h from ROSC by s-PLR (FPR 6%) but not by automated pupillometry. In all these patients, pupil size was small and its reduction after light stimulus was on average 0.25 mm, which may explain why it went undetected by visual inspection. In the study on interrater variability of s-PLR mentioned above [125], agreement on reactivity was lower when the pupil size was less than 6.0 mm. Residual effects of opioids may reduce pupil size and potentially hamper the assessment of s-PLR, while this has not been observed with automated pupillometry [126].
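
Interrater agreement of the kind quoted above is usually summarised with a kappa statistic. The following is a minimal sketch assuming Cohen's kappa for two raters (the specific kappa variant used in the cited study [125] is not stated here); the ratings are hypothetical.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    if expected == 1.0:
        return float("nan")  # undefined when both raters use a single label
    return (observed - expected) / (1 - expected)

# Hypothetical pupil assessments by two practitioners
a = ["fixed", "fixed", "reactive", "reactive"]
b = ["fixed", "reactive", "reactive", "reactive"]
print(cohen_kappa(a, b))  # 0.5
```

Kappa corrects raw agreement for chance, so two raters who agree on 75% of observations, as in this toy example, can still show only moderate agreement.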

In the present review, we included recent evidence on automated pupillometry that was not available in the 2013 review. In three studies we included [31,33,106], the absence of PLR measured with NPi using an automated pupillometer predicted poor neurological outcome with 0% FPR as early as 24 h from ROSC, which did not occur in any of the included studies assessing s-PLR. However, NPi thresholds for 0% FPR were inconsistent across these three studies, ranging from 0 to 2.4. NPi is a score of pupillary reactivity based not only on percentage constriction, but also on a series of dynamic pupillary variables such as pupil constriction and dilation velocity and latency. NPi is calculated using a proprietary algorithm, and its relationship with percentage constriction (qPLR), which can also be evaluated using pupillometry, is unclear. In studies included in our review, qPLR did not achieve 0% FPR at 24 h from ROSC, even at qPLR = 0%.

Like the pupillary reflex, the corneal reflex is prone to interference from sedative drugs, and may be affected by neuromuscular blocking drugs, since its effector is a striate (orbicularis oculi) muscle. Variations in assessment techniques of corneal reflex have been documented, as well. A recent survey [127] of 959 respondents raised concerns about the technique used to elicit the corneal reflex: 148 (26%) physicians reported applying the stimulus on the temporal conjunctiva rather than on the cornea.

Current guidelines [128] based on our previous review recommend evaluating s-PLR and corneal reflexes no earlier than 72 h from ROSC. Results of the present review suggest that 0% FPR is achieved later, at 96–120 h (days 4–5) after ROSC. For pupillary reflex, an earlier prediction may be achieved with automated pupillometry. The use of pupillometry is desirable, to standardise PLR assessment and provide comparable results. However, future studies are needed to identify a consistent qPLR or NPi threshold for 0% FPR.

Motor score

In line with our previous reviews, an absent or extensor motor response (M ≤ 2) was not 100% specific for poor neurological outcome, even when evaluated 7 days after ROSC. Its sensitivity, however, was high (above 60% at 72 h or later in almost all studies). The presence of M ≤ 2 in a patient who is comatose at 72 h or later after ROSC, in the absence of confounders (e.g., sedation), is the current entry point of the prognostication algorithm recommended in the ERC-ESICM guidelines [128]. Like the corneal reflex, the motor response is prone to interference from sedatives and neuromuscular blocking drugs. None of the studies we included ruled out a residual effect of neuromuscular blocking drugs using an objective measure such as median nerve stimulation.

Myoclonus and status myoclonus

Myoclonus is a clinical phenomenon consisting of sudden, brief, involuntary jerks caused by muscular contractions or inhibitions. According to their distribution, myoclonic jerks can be focal, multifocal, or generalised; they can also be spontaneous or stimulus-induced [129]. The occurrence of clinical myoclonus early after cardiac arrest is an unfavourable prognostic sign, but it does not rule out neurological recovery. In a 2015 retrospective registry study [130], among 471 post-arrest patients with clinical myoclonus (however defined), 44 (9%) had good neurological outcome at discharge, a percentage equal to that observed in our previous review [131]. In the six studies included in the present review, the rate of false-positive predictions ranged from 0 to 22% (Table 12). Among these studies, only one [109] provided a definition of myoclonus, described as “a brief, involuntary twitching of a muscle or a group of muscles”.

A prolonged period of generalised myoclonic jerks is commonly described as status myoclonus. However, there is no consensus on the distribution or duration of myoclonic jerks to qualify as status myoclonus. In the 2014 ERC-ESICM Advisory Statement on neurologic prognostication after cardiac arrest [9], we suggested that in comatose survivors of cardiac arrest, status myoclonus should be defined as a continuous and generalised myoclonus persisting for 30 min or more. Among the two studies reporting “status myoclonus” in the present review, one [27] used a similar definition.

In our previous reviews [5,6], status myoclonus had a worse prognosis than myoclonus. The same was observed in the present review, where the presence of status myoclonus was associated with only one false-positive result. However, in one of the two studies we included, the occurrence of an early (≤ 24 h) status myoclonus in combination with a bilaterally absent N20 SSEP wave was a criterion for WLST. The 2015 ERC-ESICM guidelines on post-resuscitation care [128] suggest using status myoclonus occurring within 48 h after ROSC for predicting poor outcome after cardiac arrest, but include it among the less robust predictors, because of its inconsistent definitions. Based on the results of the present review, this caution remains.

In studies using a case–control design and not included in the present review [129,130,132], some favourable prognostic features associated with clinical recovery after post-resuscitation myoclonus have been identified. Favourable clinical features included the distribution of myoclonus: focal/multifocal (three limbs or less) versus generalised (all four limbs, plus axial musculature) [129], or asynchronous (asymmetric) vs. synchronous (symmetric and bilateral), variable (different body parts at different times) vs. stereotyped, and distal vs. axial [133]. Electroencephalographic features suggesting a potentially favourable outcome included a reactive [130,134] and/or continuous vs. burst-suppression EEG background [130,132]. More specifically, a continuous EEG background with narrow, vertex spike-wave discharges synchronised with multi-focal myoclonic jerks [132,134] has been described in association with Lance–Adams syndrome (LAS), a relatively benign form of post-anoxic myoclonus that can evolve towards a slow neurological recovery in about 50% of cases. However, the prognostic significance of electrographic correlates of myoclonus is uncertain [32]. The ERC-ESICM Advisory Statement on prognostication suggests recording an EEG in patients with post-arrest status myoclonus, both to identify an associated epileptiform activity and to detect signs associated with potential recovery.


In accordance with the findings of our previous reviews [5,6], results of studies included in the present review showed a high variability of threshold values for 0% FPR for biomarkers. Major reasons for this variability include the use of different measurement methods [135] and the presence of extracerebral sources of biomarkers (haemolysis and neuroendocrine tumours for NSE [136,137]; muscle and adipose tissue for S-100B [138]).


In some studies included in our review [44,105], the NSE thresholds for 0% FPR were very high, with a concurrent low sensitivity, especially at 24 and 48 h after ROSC (see Table 14). This was because of the presence of outliers in the group of patients with good neurological outcome. For instance, in the largest NSE study we included [44], the highest NSE values observed at 24 h and 48 h in a patient with good neurological outcome (threshold for 0% FPR) were 107 ng/ml and 120 ng/ml, respectively. However, 99% of the values in this group of patients were below 77 ng/ml and 69 ng/ml, respectively, and 95% were below 50 ng/ml and 43 ng/ml, respectively. The choice of the NSE threshold to adopt in clinical practice for prognostication purposes depends on the trade-off between achieving a minimal FPR and the need to avoid excessively decreasing sensitivity.
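
The effect of a single outlier on the 0% FPR threshold can be shown with a small worked example. The NSE values below are hypothetical and are not drawn from any included study; the sketch assumes that poor outcome is predicted when the measured value exceeds the threshold, so that the threshold for 0% FPR equals the highest value observed in a patient with good outcome.

```python
def accuracy_at(threshold, good_values, poor_values):
    """Sensitivity and FPR when values above the threshold predict poor outcome."""
    fpr = sum(v > threshold for v in good_values) / len(good_values)
    sensitivity = sum(v > threshold for v in poor_values) / len(poor_values)
    return sensitivity, fpr

# Hypothetical NSE values (ng/ml); the good-outcome group has one outlier
good = [12, 15, 18, 20, 22, 25, 28, 30, 35, 107]
poor = [20, 40, 60, 80, 110, 150]

# Threshold for 0% FPR: highest value seen in a patient with good outcome
zero_fpr_threshold = max(good)
print(zero_fpr_threshold, accuracy_at(zero_fpr_threshold, good, poor))

# Ignoring the outlier lowers the threshold and raises sensitivity,
# at the price of one false-positive prediction
relaxed_threshold = sorted(good)[-2]
print(relaxed_threshold, accuracy_at(relaxed_threshold, good, poor))
```

In this toy dataset, moving the threshold from 107 to 35 ng/ml raises sensitivity from 2/6 to 5/6 while FPR rises from 0% to 10%, which is the trade-off described above.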

On visual analysis, the scatterplots and the ROC curves of NSE measured at consecutive timings in the studies we included [42,44,47,48] showed that the accuracy of NSE was higher at 48–72 h than at 24 h, confirming the results of older studies [102,139]. This was because median NSE values were higher at 48–72 h than at 24 h from ROSC in patients with poor neurological outcome, while they remained stable or decreased in patients with good neurological outcome, suggesting that the individual NSE trends over time may have an additional prognostic value (ESM Table E8). Indeed, in one study [92], both a 1.7 ratio between NSE values at 48 h and those at 24 h, and a 1.3 ratio between values at 72 h and 24 h had 0% FPR for prediction of poor neurological outcome. In two studies, an increase of individual values between 24 and 72 h [44] or between 24 and 96 h [47] predicted poor outcome with less than 10% FPR. However, in a substudy of one of these investigations [140], neither the 24–48 h nor the 48–72 h increase in NSE significantly improved prediction over a single measurement at 48 h in the same patients (area under the ROC curve 0.83 and 0.58 vs. 0.83, respectively).

In contrast to our previous reviews [5,6], we could assess the predictive value of NSE after 72 h, even if this was based on a limited number of studies (two at 4 and 5 days; one at 7 days). Results of these studies showed that NSE can achieve 0% FPR even when evaluated at these later times. Thresholds for 0% FPR were lower than those reported at 24–72 h, possibly related to the NSE kinetics, whose half-life is around 24–30 h [141]. In one study [47], an increase of NSE values greater than 3.8 ng/ml between day 3 and day 4 predicted poor outcome with 66.7% sensitivity and 3.6% FPR, suggesting that NSE values may continue to increase after day 3 in patients with poor neurological outcome.


In our review, S-100B has been evaluated in fewer studies than NSE. Results of the four studies we included showed that prediction of poor neurological outcome with 0% FPR based on blood values of S-100B during the first 72 h after ROSC is possible. However, the corresponding thresholds varied widely. The ratio between the highest and the lowest threshold for 0% FPR documented at each time point was 52 at 24 h, 23 at 48 h, and 9 at 72 h.

Interestingly, in two studies we included [37,42], the accuracy of S-100B and NSE was measured in the same cohort of patients. At 0% FPR, sensitivity of S-100B was lower than that of NSE. Given the lack of proven advantages over NSE, its limited availability, and its largely inconsistent thresholds, a wide clinical application of S-100B for neuroprognostication after cardiac arrest appears at present unlikely.

Other biomarkers

In the present review, we included limited evidence regarding three biomarkers not included in our previous reviews: glial fibrillary acidic protein (GFAP), serum tau protein, and neurofilament light chain (NFL). NFL has been evaluated in two studies [41,43], one of which included measurements on days 5 and 7 after ROSC [43]. In both these studies, sensitivity of NFL at each time point was always higher than that reported for GFAP and serum tau protein (see Table 16). The NFL thresholds for 0% FPR were very different between these two studies, possibly reflecting different measurement techniques or outcome definitions. The low concentrations of NFL, measured in the range of pg/mL, make its measurement technically challenging.

In conclusion, the results of our review show that blood values of several biomarkers can be assessed to predict poor outcome accurately after cardiac arrest. Among the biomarkers we included, NSE is the most widely documented. When compared with other predictors after cardiac arrest, biomarkers have important advantages: they are not affected by sedation or paralysis, can be evaluated blindly, thereby limiting the risk of self-fulfilling prophecy, and provide a quantitative result. Limitations include potential interference from extracerebral sources, the use of different measurement techniques, and reduced availability in comparison with other index tests [142]. Moreover, since their concentrations are expressed as continuous variables, achieving a consistent threshold for 0% FPR is difficult.


Short-latency somatosensory-evoked potentials (SSEPs)

In line with our previous reviews [5,6,7], in almost all the studies we included a bilaterally absent SSEP N20 wave predicted poor neurological outcome with 0% FPR. This was documented from the very early phase (6–12 h) up to a median of 5 days after ROSC (Table 17). However, two studies reported a false positive rate of 50% [62] and 75% [94] for SSEP. In both of these studies, only a few patients were assessed with SSEP, which may have amplified the observed false positive rate. Nevertheless, false-positive predictions have occasionally been reported in other studies included in both this [95,96] and in previous reviews [5,143]. In some cases, the cause of a false-positive result was an incorrect reading of the SSEP record [143]. Studies assessing the interobserver agreement of SSEP reading in patients resuscitated from cardiac arrest showed that the risk of misclassification is higher in the presence of artefacts or a low amplitude of the N20 wave [144,145,146]. Interrater agreement among SSEP readers increases significantly after neuromuscular blockade [145].

Along with a bilaterally absent s-PLR, a bilaterally absent N20 SSEP wave is considered to be the most robust predictor of poor neurological outcome in comatose survivors of cardiac arrest [9,128], and is also the most extensively studied. Moreover, in comparison with s-PLR, SSEPs have a higher sensitivity [5] and are not affected by sedation. However, they can be affected by a self-fulfilling prophecy bias. Indeed, in several studies we included in our review [68,93,95,96,99,106], SSEPs were part of the criteria for WLST. In two of these studies, the rate of WLST based on bilaterally absent SSEP was 50% [64] and 82% [95]. Interestingly, however, and differently from previous reviews, we included recent studies conducted in communities where, for legal and/or cultural reasons, no formal WLST was performed. A false-positive prediction from SSEPs was not observed in any of these studies [59,69,91,101,110]. Nevertheless, the treating team was not blinded to the SSEP results in these studies either; therefore, a risk of self-fulfilling bias cannot be completely excluded.

The absence of an N20 SSEP wave on one side combined with a low-voltage N20 SSEP wave on the other side had 0% FPR for prediction of poor neurological outcome in two studies from the same group conducted on a multicentre cohort [69,110], and its combination with a bilaterally absent N20 SSEP increased the overall sensitivity by 9–10%. This finding, however, needs external validation from further studies.

Other potentials

We found limited evidence concerning other types of evoked potentials in our review. Two studies showed that bilateral absence of BAEPs early (≤ 72 h) [104] or after 72 h from ROSC [99] was a very specific but not a sensitive predictor of poor neurological outcome after cardiac arrest. This low sensitivity is probably because BAEPs assess the function of the brainstem, which is relatively resistant to post-anoxic brain injury. Apart from the small study sample, the evidence from these two studies is limited by indirectness, since both defined poor neurological outcome as CPC 4 or 5.

Other evoked potentials in our review showed remarkable features. For instance, pain-related middle-latency SEPs had a very high sensitivity (79.4 [62.1–91.3]%) along with 0% FPR for prediction of poor outcome. However, these results were reported in only one study [74].


In comparison with our 2013 review [5], the amount of available evidence concerning EEG has increased enormously. Continuous EEG monitoring has become more common for monitoring the recovery of brain function in cardiac arrest patients who are deeply sedated and often paralysed to maintain TTM. In addition to its potential for early prognostication, EEG has an important role in diagnosing seizures and in monitoring the effectiveness of antiepileptic treatment.

An important step towards standardising EEG assessment in recent years has been the introduction of the ACNS terminology [116], providing a uniform definition for EEG patterns in critical care, many of which are commonly observed in post-anoxic coma. Most of the studies included in this review followed the ACNS terminology, which was not used in any of the studies included in our previous review. The main EEG features assessed in these studies included background continuity and voltage, reactivity, and superimposed discharges.

EEG background continuity and voltage

According to ACNS, the EEG background voltage is categorised as normal, low-voltage (most or all activity < 20 µV), or suppressed (all activity < 10 µV). In terms of continuity, it is categorised as continuous, nearly continuous, discontinuous, burst-attenuation/burst suppression, or suppressed (see ESM Table E4).

In our review, a suppressed EEG background defined according to the ACNS terminology showed a trend towards lower FPR and/or lower sensitivity over time, especially when comparing results from studies that included continuous EEG monitoring [61,68]. This suggests that EEG suppression is transient in the very early phase (12–24 h) after cardiac arrest. Nevertheless, its sensitivity was still 14% [52] and 25% [72] at a median of 76–77 h after ROSC in two studies conducted in separate subpopulations of the TTM trial.

The most important EEG alteration in background continuity observed in our review was burst suppression. In three studies [52,61,72], the presence of this pattern at any time from 0–8 h to a median of 76–77 h after ROSC almost invariably predicted poor outcome. However, in another study [68], burst suppression from 6 to 96 h after ROSC had 0% FPR only when the bursts were highly epileptiform or identical (“synchronous” burst suppression); if these characteristics were absent (“heterogeneous” burst suppression), FPR was higher in the early phase (8–9% within 12 h from ROSC), decreased markedly at 24 h (1.4%), and reached 0% only at 72 h from ROSC. “Highly epileptiform bursts” is a standard ACNS term (ESM Table E4), while “identical bursts” and “heterogeneous” or “synchronous” characteristics of burst suppression are not. Burst suppression with identical bursts has been assessed as part of combined, “highly malignant” EEG patterns after cardiac arrest (see below) both by the same group of investigators who initially defined it [90,98] and by others [57], always showing 0% FPR from 24 to 72 h after ROSC.

A consistent decrease in the prevalence of burst suppression with time after ROSC was observed for both synchronous and heterogeneous burst suppression [68] in patients with poor neurological outcome (Tables 27, 28, 29; see also ESM Fig. 3), and FPR also decreased over time. These data suggest that, similarly to suppression, burst suppression is a transient pattern both in patients with good neurological outcome (with the consequent risk of false-positive predictions when used very early after arrest) and in those with poor neurological outcome (with a consequent decrease in sensitivity over time).

A progression of EEG background towards continuity and normal voltage in the early phase after arrest has been demonstrated in most patients with a final good outcome [73,147,148]. This is probably due to the general process of brain recovery after HIBI. An additional factor for this may be confounding from sedative drugs used to facilitate TTM in the first 24–36 h after arrest. In a recent study on comatose resuscitated patients [149], stopping sedation with propofol and/or midazolam (maximum doses 3–3.5 mg/kg/h and 63–68 µg/kg/h, respectively) after TTM was associated with a significant increase of background continuity on quantitative EEG both in patients with good outcome and in those with poor outcome. In adults undergoing general anaesthesia, propofol can induce burst suppression at a dose of 14–29 mg/kg/h [150]. This dose is much higher than that used for continuous sedation in the ICU. However, propofol doses above 10 mg/kg/h are often used in ICU patients with drug-resistant seizures [151], a common occurrence in post-anoxic coma.

EEG background reactivity

Based on ACNS terminology, EEG reactivity is a change in amplitude or frequency, including attenuation, after stimulation. The appearance of SIRPIDs is not considered standard reactivity, while muscle or eye blink artefacts do not qualify as reactivity (ESM Table E4). ACNS does not indicate any quantitative criteria for amplitude or frequency changes to define reactivity, nor a standard stimulus to assess reactivity. In our review, almost all studies included both the amplitude and frequency criteria, but only two of them explicitly excluded myogenic artefacts or SIRPIDs. There was a wide variation in the stimuli used (auditory, noxious, somatosensory, tactile, and visual) to elicit reactivity (Table 36).

The specificity for poor outcome of an unreactive EEG background was much lower than that of suppression or burst suppression. Variability in the methods of stimulation or assessment of reactivity may have contributed to this result. However, in one study [63] where a standardised somatosensory stimulus was used to elicit EEG reactivity, the FPR of this sign was 25%. A recent expert consensus on EEG reactivity proposed a standardisation of both the stimulus type (clapping, calling the patient’s name, and applying nail bed pressure) and duration (at least 5 s), and recommended repeating the test three times to ensure reproducibility [152]. Assessment of reactivity is also prone to interrater variability. In one study we included, the kappa for reactivity among the investigators was 0.74 [55] (ESM Table E9), but it was only 0.26 in another study [153].

Superimposed discharges

In the current review we included a total of ten studies on rhythmic or periodic discharges, a pattern that was not documented in our previous reviews. According to the ACNS terminology, periodic and rhythmic discharges are repetitions of a waveform of relatively uniform morphology and duration with (periodic) or without (rhythmic) a quantifiable inter-discharge interval between consecutive waveforms (ESM Table E4). Most of the studies we included adopted the ACNS definition and included periodic discharges only, or both periodic and rhythmic discharges. The sensitivity of rhythmic discharges in the very early phase after ROSC was below 10% [52,72], but it increased later [61,68]. Conversely, FPR was very low in the first 24 h after ROSC, but increased later. In a study using continuous EEG [73], time to appearance of epileptiform activity was inversely correlated with the likelihood of poor outcome, while the opposite was observed with the time to recovery of a continuous normal voltage background. This can have implications in the timing for recording and assessment of these predictors in the clinical setting.

In three large studies [52,68,72] adopting the ACNS terminology, continuous or generalised periodic discharges on a suppressed EEG background were 100% predictive of poor neurological outcome within a time range of 12–120 h from ROSC. This pattern was a combination of two important unfavourable EEG features, and it has been included within the “highly malignant” patterns (see below).

In our previous review, the presence of seizures predicted poor neurological outcome with 0% FPR. However, the evidence was limited to three studies, and definitions were inconsistent. In the present review, seizures were documented in six studies, four of which used a standardised (ACNS) definition. Seizures were 100% specific for poor neurological outcome in all studies we included, except in one [68] where one patient with seizures occurring at 96 h recovered. Sensitivity was generally low, below 20% in almost all cases (Table 39).

The presence of prolonged seizures is generally defined as “status epilepticus” (SE). However, the ACNS terminology does not include this term, and there is no consensus on the duration and continuity of discharges to qualify a pattern as SE, nor on whether SE should include only unequivocal seizures or may also include periodic discharges. SE was described in several studies included in our review. Its definitions were, however, inconsistent or even absent (Table 42). Sensitivities varied widely, probably reflecting differences in classifications, or differences in timing and duration of EEG recording. Although SE (however defined) was usually associated with poor neurological outcome, this association was not consistent. Some specific EEG features of SE described in the studies we included suggest a potential for recovery. These include preserved reactivity, late appearance (after TTM) [154], or presence of a continuous or nearly continuous background [56]. These findings confirmed evidence [155,156] included in our previous review.

In all studies on seizures or SE, the prevalence of this index may have been affected by sedatives and/or antiepileptic treatment. Moreover, it is unclear whether seizure activity is a result of severe HIBI or whether it may, in turn, cause further brain damage through massive excitotoxicity, in which case aggressive treatment would be justified. The TELSTAR trial [157] (NCT02056236) is randomising patients with SE to either medical treatment to suppress all electrographic seizure activity or no treatment, and may provide some answers to this question.

“Highly malignant” and “malignant” EEG patterns

Most “highly malignant” patterns included features that have already been described and discussed separately in this review, notably suppression with or without superimposed periodic discharges, and burst suppression. Other features included an isoelectric or low-voltage EEG background, and, occasionally, burst attenuation or SE. FPR for these combined patterns was low. In one of these studies [57], the two most commonly described “highly malignant” patterns were compared in the same population. Results showed that a pattern including isoelectric or low-voltage background or burst suppression with identical bursts had 0% FPR for poor outcome at 24 and 48 h from ROSC, while a pattern including suppression with or without periodic discharges or burst suppression [72] had an FPR of 9.4% and 4.2% at these two respective time points. Since data in that study were provided only as an aggregate, we could not identify which component (i.e., suppression or burst suppression) was associated with false positives. We suggest that future studies assessing combined EEG patterns for prognostication should report the accuracy of individual EEG features separately wherever possible.

“Malignant” patterns had higher FPRs than “highly malignant” patterns. Their interrater reliability was also lower. In one study [153], interrater agreement was κ = 0.42 for “malignant” patterns vs. κ = 0.71 for “highly malignant” patterns. The wide variability of definitions and classifications across these patterns prevents further analysis.

In conclusion, specific EEG patterns have a remarkable potential for early neurological prognostication after cardiac arrest. Evidence from our review indicates that both suppression (with or without superimposed discharges) and burst suppression are accurate predictors of poor neurological outcome, especially after 24 h from ROSC. ACNS-defined seizures predict poor outcome with very low FPR starting from the first 12–24 h after ROSC. Both these timings are earlier than those currently recommended for prognostication using EEG [9,128]. On the other hand, currently recommended EEG-based predictors, such as an unreactive EEG background or status epilepticus, lack standardisation, which may partly explain their inconsistent accuracy. This suggests that the criteria for the use of EEG for prognostication in the current ERC-ESICM guidelines will need revision.

According to a survey conducted in 2015 [142], EEG is the most widely used tool for neurological prognostication in European countries. Bedside availability and its added value for interpreting myoclonus and detecting and treating post-anoxic seizures are important advantages. Although interference from sedative drugs cannot be excluded, its impact on EEG prognostic accuracy is probably limited [149]. A more important issue in the implementation of EEG as a prognostic test is interrater reliability. This was moderate (κ from 0.64 to 0.71) for highly malignant EEG patterns, both when defined as isoelectric, low-voltage, or burst suppression with identical bursts [70,98], and when defined as suppression or burst suppression according to ACNS terminology [153]. Interrater agreement for background continuity was moderate (κ ranging from 0.54 to 0.80) in two recent studies [53,68]. However, agreement may be affected by operator experience. In one study we included [53], the interrater reliability of EEG reactivity measured by Cohen’s kappa was 0.87 among experienced neurophysiologists, but it was only 0.35 between a senior neurophysiologist and a junior neurophysiologist (ESM Table E9).
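As an illustration of how the interrater statistic quoted above is obtained, the following minimal Python sketch computes Cohen's kappa for two raters. The ratings are hypothetical and are not taken from any study in this review; the function name is chosen for illustration only.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical EEG reactivity ratings (1 = reactive, 0 = unreactive).
senior = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
junior = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
kappa = cohen_kappa(senior, junior)
```

In this made-up example the raters agree on 7 of 10 recordings, but half of that agreement is expected by chance, so kappa is considerably lower than the raw agreement.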

EEG-derived indices

EEG is a complex signal, and to provide an objective, standardised analysis, some automated techniques have been developed. Specifically for post-anoxic coma, in one study included in our review [67], two quantitative indices of EEG background continuity were assessed. These are background continuity index (BCI), which measures the fraction of EEG not spent in suppression, and burst-suppression amplitude ratio (BSAR), i.e. the mean amplitude ratio between non-suppressed and suppressed EEG segments. For both indices, threshold values for 0% FPR were identified from 12 to 72 h after ROSC. However, these thresholds varied widely across the time points. Both BCI and BSAR were calculated off-line in 5-min EEG epochs after careful removal of artefacts. Although promising, these indices await further validation.
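The two definitions above can be sketched in Python as follows. This is only an illustration of the verbal definitions given here, not the algorithm used in the cited study [67]: the 10 µV suppression cut-off, the use of equal-length segments, and the computation of BSAR as a ratio of mean amplitudes are all assumptions, and the amplitudes are hypothetical.

```python
def continuity_indices(segment_amplitudes_uv, suppression_threshold_uv=10.0):
    """Return (BCI, BSAR) for one artefact-free epoch.

    segment_amplitudes_uv: amplitude (in microvolts) of each equal-length
    EEG segment. The 10 microvolt cut-off is an illustrative assumption.
    """
    if not segment_amplitudes_uv:
        raise ValueError("at least one segment is required")
    suppressed = [a for a in segment_amplitudes_uv if a < suppression_threshold_uv]
    active = [a for a in segment_amplitudes_uv if a >= suppression_threshold_uv]
    # BCI: fraction of the epoch not spent in suppression.
    bci = len(active) / len(segment_amplitudes_uv)
    # BSAR: mean non-suppressed amplitude divided by mean suppressed amplitude.
    # It is left undefined (None) when either class of segment is absent.
    if not suppressed or not active:
        return bci, None
    mean_suppressed = sum(suppressed) / len(suppressed)
    if mean_suppressed == 0:
        return bci, None
    return bci, (sum(active) / len(active)) / mean_suppressed

# Hypothetical epoch alternating between bursts and suppression.
bci, bsar = continuity_indices([40, 5, 50, 5, 30, 5, 5, 40])
```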

Bispectral index (BIS) is a commercially available device for quantitative EEG analysis used mainly to monitor the depth of anaesthesia. However, it has also been used to assess the severity of post-anoxic brain injury. BIS is based on a proprietary algorithm and is calculated as a weighted sum of several EEG parameters (including time-domain and frequency-domain features). Its results are expressed as a dimensionless number. In our previous review, a BIS value of 6 or less during TTM had 0% FPR for poor outcome in two studies [158,159], but in another study [160] its FPR was 17%. In the present review, BIS was evaluated in three studies, at time points between 6 and 48 h. Thresholds for 0% FPR varied between < 2.4 at 6 h and < 10.5 at 36 h. The clinical value of BIS monitoring in resuscitated comatose patients is at present uncertain.


The findings on brain imaging studies performed after cardiac arrest that are associated with severe HIBI are cytotoxic oedema and—less prominently—vasogenic oedema [161]. Cytotoxic oedema is a cellular swelling due to metabolic disruption which occurs early and is more pronounced in the grey matter due to its higher metabolic activity. This appears on brain CT as a reduced density with loss of the grey matter/white matter (GM/WM) interface.

In most of the studies we included, the focus was on measurement of the ratio of the GM and WM densities (GWR) in specified regions of interest (ROIs) with the aim of standardising the evaluation of brain oedema as an attenuation in the GM/WM interface not readily detected by visual inspection. As in our previous review [5], a low GWR on brain CT was an early and accurate predictor of poor neurological outcome after cardiac arrest. All but one of the studies reported a GWR threshold below which no neurological recovery was observed, and in most of them, this prediction was made within 2 h from ROSC.

There is no consensus on normal ranges for GWR in the human brain. In studies included in our review, the GWR thresholds for 0% FPR varied widely (Table 49). A first explanation for this finding is the variability of the techniques used to calculate GWR. In most studies, GWR was calculated between GM and WM areas within the basal ganglia region (for instance, between caudate nucleus, putamen, or thalamus [for the GM], and corpus callosum or the posterior limb of the internal capsule [for the WM]). In others, measurements were made within the cerebrum, typically between the medial cortex and the medial white matter at the level of both the centrum semiovale and the high convexity. Most studies [75,78,80,81,82,100,101,103,115] combined these two measurements by calculating an average GWR, i.e. (GWR basal ganglia + GWR cerebrum)/2. The relevant thresholds for 0% FPR of average GWR ranged from 1.07 to 1.23.
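The averaging described above can be written out as a short Python sketch. The Hounsfield unit values are hypothetical, and taking each regional GWR as the ratio of the mean GM density to the mean WM density over the selected ROIs is an assumption; individual studies differ in which ROIs they use and how they combine them.

```python
def regional_gwr(grey_hu, white_hu):
    """Ratio of mean grey-matter to mean white-matter density (Hounsfield units)."""
    if not grey_hu or not white_hu:
        raise ValueError("each region needs at least one ROI measurement")
    mean_grey = sum(grey_hu) / len(grey_hu)
    mean_white = sum(white_hu) / len(white_hu)
    return mean_grey / mean_white

def average_gwr(bg_grey, bg_white, cerebrum_grey, cerebrum_white):
    """Average GWR = (GWR basal ganglia + GWR cerebrum) / 2."""
    gwr_basal_ganglia = regional_gwr(bg_grey, bg_white)
    gwr_cerebrum = regional_gwr(cerebrum_grey, cerebrum_white)
    return (gwr_basal_ganglia + gwr_cerebrum) / 2

# Hypothetical ROI densities in Hounsfield units.
example_gwr = average_gwr(
    bg_grey=[36, 34],        # e.g. caudate nucleus, putamen
    bg_white=[28, 28],       # e.g. corpus callosum, internal capsule
    cerebrum_grey=[33, 33],  # medial cortex at two levels
    cerebrum_white=[30, 30], # medial white matter at two levels
)
```

Because the result depends on which ROIs are chosen and how they are averaged, a threshold derived with one measurement scheme cannot be assumed to transfer to another.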

Other potential explanations for the observed variability of GWR thresholds include differences in scanning protocols, radiation dose, spatial resolution, and reconstruction techniques between scanners and software [162]. Finally, as for other predictors, an additional source of variability may be represented by variations in case mix, notably, in the aetiology of arrest. Results from two studies included in our review show that the sensitivity of GWR was lower in patients with arrest from cardiac aetiology [81] vs. those with arrest from non-cardiac aetiology [82].

Intracerebral post-anoxic cytotoxic oedema reduces water diffusivity and is detected on brain MRI as a hyperintensity on diffusion-weighted imaging (DWI) with corresponding low apparent diffusion coefficient (ADC) values. In two small studies included in our previous review [163,164], restricted diffusion in both cortex and basal ganglia had 0% FPR for poor neurological outcome. In six studies included in this review [76,77,78,101,108,115], DWI sensitivities and FPRs varied widely (from 26.9 to 92.6% and from 0 to 44.3%, respectively). The most likely reason is that definitions of DWI changes were inconsistent across studies.

ADC enables a quantitative measurement of diffusion changes on brain MRI. Lower ADC values identify restricted diffusion. However, there is no universal consensus on the best technique to assess reduced ADC within the brain after HIBI. In the four studies assessing ADC included in our review [79,84,97,115], three methods for ADC measurement were described: the mean global or regional ADC value of the brain [79,84], the proportion of voxels with low ADC [84,97,115], and the maximum size of the MRI clusters with minimum ADC [79]. In all these studies, ADC thresholds for 0% FPR were identified, often with sensitivities above 50%. These thresholds, however, were inconsistent across different areas of the brain within the same study and the same technique [79]. In the two studies assessing global ADC, the percentage of voxels with low ADC (however defined) above which FPR was 0% ranged from 1.66 to 10%. The corresponding thresholds for defining low ADC ranged from 400 × 10⁻⁶ mm²/s to 650 × 10⁻⁶ mm²/s. As for GWR, the results of ADC measurements may vary according to both the equipment and the protocol used [161]. One of the studies we included [97] aimed to prospectively validate the threshold of 10% of the brain tissue below ADC of 650 × 10⁻⁶ mm²/s found in a single-centre study [165] of the same group of investigators and included in our previous review [5]. The study confirmed 0% FPR for this threshold. In two studies [84,115], the same low-ADC definition was used at three different time points from 3 h to day 7 after ROSC. Their results suggest that the threshold percentage of low-ADC voxels for 0% FPR decreases over time.
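The "proportion of voxels with low ADC" approach can be illustrated with a minimal Python sketch. The cut-off of 650 × 10⁻⁶ mm²/s and the 10% criterion are the values validated in [97]; the voxel values, the function name, and the use of "more than 10%" as the decision rule are illustrative assumptions, and a real analysis would first require brain segmentation and exclusion of cerebrospinal fluid and artefacts.

```python
def low_adc_percentage(adc_values_mm2_per_s, cutoff_mm2_per_s=650e-6):
    """Percentage of brain voxels whose ADC is below the cut-off (mm^2/s)."""
    if not adc_values_mm2_per_s:
        raise ValueError("at least one voxel is required")
    n_low = sum(v < cutoff_mm2_per_s for v in adc_values_mm2_per_s)
    return 100 * n_low / len(adc_values_mm2_per_s)

# Hypothetical brain of 20 voxels, 3 of which show restricted diffusion.
voxels = [500e-6] * 3 + [800e-6] * 17
percent_low = low_adc_percentage(voxels)

# Assumed decision rule: index test positive if more than 10% of voxels are low.
test_positive = percent_low > 10
```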

In addition to cytotoxic oedema, HIBI also causes vasogenic oedema, due to a disruption of the blood–brain barrier, resulting in the accumulation of water and osmotically active substances in the extracellular space. Vasogenic oedema usually appears later than cytotoxic oedema. On a CT scan, it may be evident as an effacement of the cortical sulci and reduced ventricle size. It may also cause an engorgement of the cerebral venous sinuses which may be mistaken for a subarachnoid haemorrhage, the pseudo-subarachnoid sign [161]. In one of the studies included in our review [85], generalised cerebral oedema on brain CT detected on visual inspection by an experienced radiologist was a very specific sign of a poor prognosis after cardiac arrest although with limited sensitivity. This evaluation is obviously dependent on the radiologist’s expertise.

Vasogenic oedema may potentially lead to increased intracranial pressure (ICP) after cardiac arrest. Increased optic nerve sheath diameter (ONSD) is a non-invasive indicator of increased ICP [166]. Three studies included in our review assessed ONSD (see Table 54), all within a few hours after ROSC. One of these studies used ocular ultrasound [89], the other two used brain CT [75,80]. In studies using brain CT, an increased ONSD was 100% accurate for prediction of poor outcome, with a moderate intraclass correlation coefficient (ICC) for interrater reliability (from 0.80 to 0.83) (ESM Table E10). However, in another study not included in our review [167], an increased ONSD measured using MRI did not predict poor neurological outcome. At present, results of ONSD assessment after cardiac arrest should be considered as a preliminary, hypothesis-generating investigation.

The optimal timing for performing imaging after cardiac arrest has not yet been defined. In most studies included in our review, cytotoxic oedema was detected within 1–2 h after ROSC using brain CT and within 2–3 days after ROSC using DWI/ADC. However, in other studies [78,115], MRI was performed at a median of 3 h after ROSC. In a study not included in our review [168], changes on DWI MRI predicted poor neurological outcome within 6 h from ROSC. The time at which brain CT and MRI are performed may depend on clinical and organisational issues rather than on their accuracy. Brain CT is often performed shortly after admission in cardiac arrest patients, so that neurological causes of arrest such as intracranial haemorrhage can be ruled out [169,170], especially when early PCI may be indicated. In contrast, MRI may be less feasible in the early phase after cardiac arrest when there is maximal haemodynamic instability [3]. Studies included in our review showed that prediction with GWR on brain CT can be obtained up to 72 h [87] after ROSC. In one study mentioned above [85], “eye-balling” assessment of generalised oedema on brain CT predicted poor outcome with 56.5 [47–65.7]% sensitivity and 0 [0–10.1]% FPR from 24 h to 7 days after ROSC. A few studies where the assessment was repeated over time suggest that the sensitivity of both brain CT [85] and MRI [84] may increase when these studies are performed after 24 and 48 h, respectively.

Predictive indices based on imaging have considerable advantages. Unlike clinical examination and EEG, they are not prone to interference from sedative drugs; they can also be assessed blindly. Moreover, imaging studies provide additional information on the spatial distribution and the time course of HIBI. However, they also have important limitations. First, standardisation of their recording and assessment techniques is still lacking. Second, although the number of available studies has increased in comparison with our 2013 review, the majority of these studies are single centre and have a retrospective design, and in most of them, imaging was performed at the discretion of the treating physician, potentially creating a selection bias. Third, because of the longer acquisition times when compared with brain CT and potential interference with monitoring devices, MRI may not be feasible in the most unstable patients, potentially creating an additional selection bias. Fourth, despite being quantitative, both GWR and DWI depend on manual selection of ROIs in the brain, which may introduce subjectivity in the assessment. Some of the studies we included documented an excellent or good interrater reliability for GWR (intraclass correlation up to 0.95 [0.94–0.96]; see ESM Table E10). However, this information was missing in studies on MRI, except one study on GRE score, where intraclass correlation was 0.95 (0.91–0.97). Importantly, only one of the studies we included [97] was a validation study of a previously identified index test. Although prognostic indices based on imaging were very accurate in centres where specific experience is available, future multi-centre validation studies using comparable measurement techniques are warranted to address their reproducibility. 
Analysis by artificial intelligence algorithms, such as support vector machines and other machine-learning methods, is a potential avenue for further investigation and standardisation of quantitative data from imaging studies in the near future [161].

Study limitations

Some limitations of our review should be acknowledged. First, our review was focused only on prediction of poor neurological outcome. This is because the vast majority of the currently available indices assess the severity of HIBI and as such are aimed at predicting poor outcome. However, some indices of neurological recovery have been identified in studies included in our review. These include, for instance, recovery of a continuous and reactive EEG early after arrest [65,69,107]. We will evaluate the accuracy of these predictors in a subsequent review.

Second, we assessed only predictors whose results could be reported as a binary variable, so that sensitivity and FPR could be calculated. Therefore, predictors whose results were not dichotomised were not included in our analysis. These included predictors whose performance was reported only as an area under the receiver-operating characteristic (ROC) curve, or as a difference in mean value of the predictor between good and poor outcome subgroups. However, these non-binary measures of predictor performance, although informative, are of limited use for clinical decisions.
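To make the two accuracy measures explicit, the sketch below computes sensitivity and FPR from a 2 × 2 table in which a positive index test means "poor outcome predicted". The counts are hypothetical. The second function gives the exact (Clopper-Pearson) upper confidence limit when no false positives are observed; it is included only to show why a reported FPR of 0% still carries uncertainty, and it is an assumption that this is comparable to the interval method used for the estimates in this review.

```python
def sensitivity_and_fpr(tp, fp, fn, tn):
    """Sensitivity and false-positive rate for prediction of poor outcome.

    tp: test positive, poor outcome    fp: test positive, good outcome
    fn: test negative, poor outcome    tn: test negative, good outcome
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both outcome groups must contain patients")
    sensitivity = tp / (tp + fn)   # proportion of poor outcomes detected
    fpr = fp / (fp + tn)           # proportion of good outcomes misclassified
    return sensitivity, fpr

def zero_event_upper_limit(n, confidence=0.95):
    """Exact upper limit of a two-sided interval when 0 of n events occur."""
    if n <= 0:
        raise ValueError("n must be positive")
    alpha = 1 - confidence
    return 1 - (alpha / 2) ** (1 / n)

# Hypothetical study: 100 patients with poor outcome, 50 with good outcome.
sens, fpr = sensitivity_and_fpr(tp=30, fp=0, fn=70, tn=50)
fpr_upper = zero_event_upper_limit(50)
```

With 50 patients in the good-outcome group and no false positives, the upper limit of the FPR is still about 7%, which is why the size of the good-outcome group matters when a predictor is reported as having 0% FPR.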

Third, our review assessed outcome predictors individually, and not in combination. This was because the number of predictors considered was high, and the analysis of their combinations would not have been feasible in a single review. In addition, we think that assessing the reliability of single predictors is the first necessary step before designing a multimodal approach.

Fourth, as we mentioned in the Discussion, the use of TTM, sedatives, or neuromuscular blockers may have affected the accuracy of some index tests, especially those based on clinical examination or EEG. However, these factors could not be assessed separately in our review. One of the reasons was that several studies included a mix of TTM-treated and non-TTM-treated patients. Moreover, the use of neuromuscular blockers and sedatives was insufficiently documented in the studies that we included. Evaluating the effect of TTM-related factors will require access to individual patient data from studies and more complete information on these variables. The international multicentre Targeted Hypothermia Versus Targeted Normothermia After Out-of-hospital Cardiac Arrest (TTM-2) Trial (ClinicalTrials.gov Identifier: NCT02908308) will likely provide sufficient information in this regard.

Fifth, based on current recommendations [14,15], we restricted our review to studies assessing neurological outcome from hospital discharge/1 month onwards. However, there is evidence showing that a minority of patients with an initially poor neurological status may progress towards clinical recovery during the first few months after hospital discharge [171,172,173]. For this reason, assessing outcome at 6 months or later may be considered as more appropriate for neurological prognostication. Unfortunately, among the studies we included, only one [114] assessed the predictive value of individual index tests at multiple outcome time points, so that we could not test this hypothesis in our review.

Sixth, while our quality assessment protocol identified several potential sources of bias in the included studies, publication bias and selective outcome reporting were not part of this evaluation. This risk for diagnostic and prognostic studies is substantial, especially because, unlike RCTs, registration for this kind of study is not mandatory [174]. However, there is no consensus on how publication or reporting bias should be assessed in these studies. Tests like funnel plot asymmetry, designed primarily for randomised controlled trials, are not appropriate [21].

Finally, our review is limited to neurological prognostication. Although HIBI is the major cause of death in resuscitated comatose patients, other causes include cardiovascular instability [3], and multiple organ failure [4,175]. This is important considering that, in most studies, poor outcome is driven primarily by CPC 5 (death), which can be easily confounded by non-neurological factors. A recent consensus statement [14] recommended that causes of death should be carefully described in studies on neurological prognostication after cardiac arrest. A comprehensive prognostic approach after cardiac arrest should consider the role of extracerebral factors as well as patient characteristics such as age, comorbidities, and functional status, using a multivariable approach.


In line with our previous reviews, accumulated recent evidence confirmed that the bilateral absence of pupillary reflexes, corneal reflexes, and N20 wave of somatosensory evoked potentials are reliable indicators of poor prognosis after cardiac arrest. New evidence has emerged supporting the use of EEG patterns based on standardised terminology from the ACNS. Threshold values for reliable prediction of poor outcome using GWR on brain CT or serum values of biomarkers varied extensively across studies, indicating an unmet need for standardisation of these methodologies. Most of the studies included in this review were biased by lack of blinding and, for predictors based on clinical examination and EEG, by potential confounding from sedation. None of the indices which we evaluated enabled prediction of poor neurological outcome with absolute certainty, and a multimodal approach still appears to be the most prudent prognostication strategy.