Although serial clinical examinations represent a central part of neurological evaluation and the foundation for all “neuromonitoring,” this has not been well studied. Several different “coma” scales have been developed and tested for validity, reliability, and accuracy among varying diagnostic groups (e.g., traumatic brain injury, TBI), but none of these have compared monitoring strategies using serial neurological examinations with strategies not including these examinations. The related assessments of pain, agitation, and delirium (PAD) have received recent attention among general intensive care unit (ICU) patients, but less so among neurocritical care patients.

Methods and Search Criteria

Using the PubMed database, we conducted a literature search that included the terms: “coma” OR “Glasgow Coma Scale” OR “GCS” OR “FOUR score” OR “Full Outline of Unresponsiveness” AND “brain injury” OR “traumatic brain injury” OR “TBI” OR “subarachnoid hemorrhage” OR “SAH” AND “intensive care.” We restricted article language to English and did not consider unpublished data or abstracts. A second search was performed using the keywords: “fixed pupil” OR “dilated pupil” OR “blown pupil” AND “brain injury” OR “traumatic brain injury” OR “TBI” OR “subarachnoid hemorrhage” OR “SAH” AND “intensive care.” An additional search was performed using the following keywords: “pupillomet*” AND “brain injuries,” which yielded seven references. Finally, we searched for eligible studies using the following keywords: “delirium” OR “pain” OR “sedation” OR “agitation” AND “brain injury” AND “(intensive OR critical) care” AND “English Language,” which yielded 330 references. These titles and abstracts were reviewed, as were personal files, reference lists of review articles, and reference lists in eligible studies for additional trials.
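The search strings above do not show how the OR and AND terms were grouped. As a minimal sketch, the first search could be assembled as follows, assuming that the synonyms for each concept are combined with OR inside parentheses and that the concept groups are then joined with AND. This grouping and the helper names `any_of` and `all_of` are illustrative assumptions and are not taken from the original search protocol.

```python
def any_of(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def all_of(groups):
    """Join already-parenthesized concept groups with AND."""
    return " AND ".join(groups)


# Terms copied from the first search described in the text.
coma_terms = [
    "coma",
    "Glasgow Coma Scale",
    "GCS",
    "FOUR score",
    "Full Outline of Unresponsiveness",
]
injury_terms = [
    "brain injury",
    "traumatic brain injury",
    "TBI",
    "subarachnoid hemorrhage",
    "SAH",
]
setting_terms = ["intensive care"]

# Assumed grouping: (coma terms) AND (injury terms) AND (setting).
query = all_of([any_of(coma_terms), any_of(injury_terms), any_of(setting_terms)])
print(query)
```

Writing the parentheses explicitly makes the intended logic independent of how a search engine resolves mixed OR and AND operators.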

Study Selection and Data Collection

We independently reviewed citations, abstracts, and full-text articles to select eligible studies. We excluded: (a) review articles; (b) case reports or case series with ≤5 patients; (c) experimental studies; and (d) studies on pediatric ICU populations (<18 years). Data were abstracted using a predefined abstraction spreadsheet, according to the PICO system.

Review Endpoints

The end-points of this review were to answer the following questions related to clinical assessment of brain-injured patients:

Consciousness Scales

  • Should assessments with clinical coma scales be routinely performed in comatose adult patients with acute brain injury?

  • For adult comatose patients with acute brain injury, is the GCS score more reliable than the FOUR score in the clinical assessment of coma?

  • Is the FOUR score a better predictor of clinical outcomes compared to the GCS score?

Pain, Agitation, and Delirium

  • Which pain scales have been validated and shown to be reliable among patients with brain injuries cared for in neurocritical care units (NCCU)?

  • Which pain scales have been validated and shown to be reliable among patients with severe disorders of consciousness (minimally conscious state, MCS, and unresponsive wakefulness syndrome, UWS)?

  • Which “sedation” scales are valid and reliable in brain-injured patients cared for in neurocritical units?

  • What other sedation strategies may lead to improved outcomes for brain-injured patients?

  • Which delirium scales are valid and reliable in brain-injured patients cared for in neurocritical units?

Consciousness Scales

Summary of the Literature

Glasgow Coma Scale

The Glasgow Coma Scale (GCS), introduced in the 1970s [1], is commonly reported as a single number summing the three components. Though widely studied and incorporated into many scoring systems, interrater reliability of the GCS has been inconsistent [2–5]. These studies report a wide range of κ scores (ranging, for example, from 0.39 to 0.79 in one study). Disagreement tends to be higher between professions (nursing vs. medical practitioners), for the motor score, among inexperienced staff, and for patients with intermediate scores. Disagreement is lowest within specialist professional groups (e.g., neurocritical care nurses), for the verbal component, and when assessing alert or drowsy subjects.
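For readers less familiar with the κ statistic quoted above, the following minimal sketch computes unweighted Cohen's κ for two raters as (observed agreement − chance agreement) / (1 − chance agreement). The ratings are invented for illustration and do not come from any of the cited studies, and some of those studies report weighted κ, which this sketch does not implement.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of patients")
    n = len(rater_a)
    # Proportion of patients on whom the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal score frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical GCS motor scores given to the same ten patients by two raters.
nurse_scores = [6, 5, 4, 6, 3, 5, 6, 4, 5, 6]
physician_scores = [6, 5, 5, 6, 3, 4, 6, 4, 5, 5]
print(round(cohen_kappa(nurse_scores, physician_scores), 2))
```

In this invented example the raters agree on 7 of 10 patients, yet κ is considerably lower than 0.7 because part of that agreement is expected by chance.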

The best immediate post-resuscitation GCS sum score or the GCS encountered in the field by paramedics has been studied as a prognostic marker. The sum GCS on ED arrival is a strong predictor of in-hospital mortality (area under the ROC curve, AUC, of 0.91) and need for neurosurgical intervention (AUC of 0.87) [6], with the eye score the weakest predictor and the sum score the best. An initial GCS sum score of 3 is associated with poor clinical outcomes in TBI (mortality 50–76 %) [7–9]. However, outcome is largely influenced by the extent of brainstem injury, particularly as reflected in pupillary light responses, a finding not captured by the GCS. The GCS sum score is associated with outcomes in posterior circulation acute ischemic strokes [10, 11], though this is not a consistent finding [12]. The GCS is a good predictor of outcome in post-cardiac arrest patients treated with therapeutic hypothermia; GCS >4 after sedation was stopped predicted a favorable outcome with a sensitivity of 61 %, positive predictive value of 90 %, and AUC of 0.81 [13].

Concerns have been raised about the accuracy of GCS scoring in intubated patients and those receiving analgesics, sedatives, and paralytics since verbal scores cannot be assessed in these patients. There are varying approaches to this problem, such as assigning the lowest possible score or adding “T” to the sum of the motor and eye components. Nearly 80 % of 166 studies reviewed did not report how they handled untestable GCS features such as intubation or swollen eyelids [14]. A linear regression model derived from a cohort of non-neurologic patients (most with a GCS sum score of 15) was developed to predict the verbal score based on eye and motor response [15, 16]; this has not gained wide acceptance. A survey of 71 Level I trauma centers showed that only 55 % could identify patients receiving neuromuscular blockade (NMB), and 63 % could identify intubated patients [17]. Furthermore, data from a large academic trauma center in the UK showed decreasing correlation between admission GCS and clinical outcomes over time, perhaps reflecting that GCS ratings are less accurate as use of analgesics, sedatives, and NMB has become more common [18].
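The two approaches to an untestable verbal component mentioned above can be sketched as follows. The component ranges (eye 1–4, verbal 1–5, motor 1–6) are those of the standard GCS; the function name `gcs_sum`, the `untestable` parameter, and its two option labels are hypothetical and chosen only for this illustration.

```python
from typing import Optional

# Standard GCS component ranges (minimum, maximum).
EYE_RANGE = (1, 4)
VERBAL_RANGE = (1, 5)
MOTOR_RANGE = (1, 6)


def _check(name: str, value: int, bounds: tuple) -> None:
    """Raise ValueError if a component score falls outside its allowed range."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} score must be between {low} and {high}")


def gcs_sum(eye: int, motor: int, verbal: Optional[int], untestable: str = "suffix") -> str:
    """Return the GCS sum score as text.

    Pass verbal=None when the verbal response cannot be tested (e.g., intubation).
    untestable="suffix" reports eye + motor followed by "T";
    untestable="lowest" substitutes the lowest possible verbal score.
    """
    _check("eye", eye, EYE_RANGE)
    _check("motor", motor, MOTOR_RANGE)
    if verbal is not None:
        _check("verbal", verbal, VERBAL_RANGE)
        return str(eye + motor + verbal)
    if untestable == "suffix":
        return f"{eye + motor}T"
    if untestable == "lowest":
        return str(eye + motor + VERBAL_RANGE[0])
    raise ValueError("untestable must be 'suffix' or 'lowest'")


print(gcs_sum(eye=2, motor=5, verbal=None))                       # 7T
print(gcs_sum(eye=2, motor=5, verbal=None, untestable="lowest"))  # 8
```

The same intubated patient is reported as "7T" under one convention and as 8 under the other, which illustrates why studies that do not state their handling of untestable components are difficult to compare.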

FOUR Score

The full outline of unresponsiveness (FOUR) score, introduced in 2005, provides additional information not captured by the GCS including details about brainstem reflexes and respiratory drive and an opportunity to recognize the locked-in syndrome [19]. The FOUR score has excellent interrater reliability and validity equivalent to the GCS (overall κ statistic of 0.82), and may discriminate better among severe consciousness disorders. Among patients with the lowest sum GCS of 3, 25 % have the lowest FOUR score of zero, and scores range from 0 to 8 in that subset [19]. The FOUR score has been further validated in the medical ICU [20], the ED [21], and among ICU nurses with varying neurologic experience [22]. The FOUR score performed better than the GCS for exact inter-rater agreement, but similar for agreement within ±1 score point [23]. Another study involving 907 critically ill patients showed a weighted κ of 0.92 which was similar whether the patient was mechanically ventilated or not [24].

A pooled analysis of four prospective validation studies showed an AUC of 0.88 for the total FOUR score and 0.87 for the GCS score in predicting outcome [25] and for patients with sum GCS of 3, a FOUR score of >2 provided maximum sensitivity and specificity for the prediction of in-hospital mortality. In another study, no patient with a FOUR score ≤4 at exam days 3–5 after cardiac arrest survived the hospitalization, and a two-point improvement in FOUR score in serial examinations (but not the GCS) was associated with survival. Sensitivities, specificities, positive, and negative predictive values were comparable between the two scales for cardiac arrest [26]. The FOUR score predicted mortality and poor functional outcome in one TBI study [27] and performed comparably with the GCS in another study [28].

Assessment of Pupils

A fixed dilated pupil in the setting of supratentorial brain injury is thought to represent brain herniation with third nerve and brainstem compression, though evidence of this pathology is absent in some cases [29]. The odds for poor outcomes are increased approximately 7-fold among patients with bilateral nonreactive pupils, and 2.5- to 3-fold with a unilaterally non-reactive pupil [30]. Patients whose pupils are non-reactive have a 68 % mortality vs. 7 % in those with brisk pupillary responses. With a sum GCS of 3, mortality ranges from 22 to 75 % if pupils were reactive, increasing to 80–100 % if pupils were fixed and dilated [7, 9, 31]. A poor functional outcome (GOS 1–3) occurs in 98.6 % of those with bilateral fixed dilated pupils, 72.4 % with a unilateral fixed pupil, and 74.5 % with bilateral reactive pupils [7]. Factors such as external facial and eye trauma, prior eye surgery, and administration of anticholinergic medications could confound this assessment and must be taken into account when evaluating pupillary reactivity. All patients with acute brain injury deserve aggressive resuscitation on presentation and the duration of pupil non-reactivity and potential surgical evacuation of acute mass lesions should be considered before deeming the prognosis unfavorable, as pupil examination can be dynamic and non-reactivity is occasionally reversible [32, 33].

Pupil size and reactivity are typically measured subjectively with a flashlight. However, significant inter-examiner variability affects standard pupil assessments [34–36]. Several newer devices (e.g., NeurOptics, Colvard, Procyon) measure pupil diameter, and some incorporate infrared imaging, digital image capture, and automated measurement of device-specific calculations such as the minimum and maximum pupil diameter, percent decrease in response to photostimuli, and constriction velocity, among other variables [37, 38]. These devices have been tested widely in many populations, but less extensively among brain-injured patients. In that group, they have been shown to detect impaired pupillary responses during herniation or other clinical events [37, 39], to improve accuracy, sensitivity, and reproducibility [36, 38], and to provide device-specific metrics such as the Neurological Pupil index (NPi) [40]. Additional research is necessary to confirm any potential benefits from these devices in caring for brain-injured patients.


Coma scales allow a more objective measure of neurologic examination, facilitate communication, assist in outcome prediction, and aid in documenting injury severity. The GCS, considered the standard coma scale, is incorporated in many clinical scoring tools but newer studies raise concerns about variability in GCS assessment and accurate categorization of intubated patients. The FOUR score provides additional information about brainstem reflexes and respiratory drive, but has not been as systematically studied, particularly relative to clinical outcomes. The FOUR score and GCS have comparable good to excellent inter-rater reliability, but both can be confounded by sedatives and NMB medications. The FOUR score may have an advantage because it does not include a verbal score and does include pupillary assessment that is most resistant to sedative effects.

Pain, Agitation, and Delirium Assessment

Summary of the Literature: Assessing Pain for Brain-Injured Patients

Pain remains a common symptom among ICU patients [41], and recent practice guidelines for ICU PAD strongly recommended that all adult ICU patients be routinely monitored for pain [42] using patient self-report with the Numeric Rating Scale 0–10 (NRS) as the preferred initial approach.

For patients unable to self-report, the guidelines recommend using either the Behavioral Pain Scale (BPS) [43] or the Critical Care Pain Observation Tool (CCPOT) [44].

Brain-injured patients in NCCU are known to experience more significant pain than initially presumed, and if undertreated, the quality of recovery may be reduced [45]. In addition, a diagnosis of coma, vegetative state or unresponsive wakefulness state (UWS), and MCS, may further impact pain perception by these patients, and even more significantly, recognition of that pain by clinicians. Noxious stimuli can activate key nodes in the pain matrix in these patients [46] suggesting possible pain perception and clinicians should treat patients with severe disorders of consciousness as if they could perceive pain [47].

Neurocritical care patients can often assess their own pain using a tool such as the NRS, which can be elicited in 70 % [48], with the BPS assessable in the remainder; both show good inter-rater agreement (0.92 and 0.83, respectively). Assessing pain in patients with severe disorders of consciousness such as MCS and UWS is a greater challenge, but is possible with the Nociception Coma Scale (NCS), which assesses components similar to those of the BPS and CCPOT [49], with good to excellent concurrent validity and inter-rater agreement. More recent studies suggest that the visual subscale does not discriminate noxious stimuli, and its exclusion in the revised scale (NCS-R) increased sensitivity from 46 to 73 %, with a specificity of 97 % and accuracy of 85 % [50]. A score of 4 on the NCS-R was identified as a threshold value to detect a response to noxious stimuli.
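As a rough sketch of how the NCS-R threshold might be applied, the example below assumes that the revised scale sums three subscales (motor, verbal, and facial), each scored 0–3, and that a total at or above 4 counts as a response. The text states only that 4 is the threshold value, so the "at or above" reading, the subscale ranges, and both function names are assumptions made for this illustration.

```python
def ncs_r_total(motor: int, verbal: int, facial: int) -> int:
    """Sum the three NCS-R subscales (assumed here to be scored 0-3 each)."""
    for name, value in (("motor", motor), ("verbal", verbal), ("facial", facial)):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} subscale must be between 0 and 3")
    return motor + verbal + facial


def suggests_response_to_noxious_stimulus(total: int, threshold: int = 4) -> bool:
    """Apply the threshold; treating 'at or above' as positive is an assumption."""
    return total >= threshold


# Hypothetical observation: flexion withdrawal, groaning, grimace.
total = ncs_r_total(motor=2, verbal=1, facial=2)
print(total, suggests_response_to_noxious_stimulus(total))
```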


The recent PAD guidelines place an increased emphasis on pain recognition and treatment before dealing with sedation, and almost all neurocritical care patients, even those with severe impairments of consciousness, can be assessed for pain.

Summary of the Literature: Assessing Sedation for Brain-injured Patients

Sedation for neurocritical care patients is paradoxically necessary yet fundamentally undesirable since it may cloud accurate neurological assessment [51]. Recent extensive psychometric testing suggests that the Richmond Agitation–Sedation Scale (RASS) and the Sedation–Agitation Scale (SAS) scored the highest for validity, reliability, feasibility, and relevance [52], in keeping with recommendations of the 2013 PAD guidelines that these two scales be used to assess ICU sedation [42].

The bispectral index (BIS) monitor shows excellent correlations with the RASS and SAS scales in neurocritical care patients [53], both with and without sedative medications. Adding BIS monitoring to the Ramsay scale [54] resulted in nearly 50 % less propofol usage and reduced use of high-dose propofol, and was associated with a shorter time to wakening. However, the relative contributions of drug-induced sedation and underlying neurological disease to BIS values remain uncertain, and the technique is not widely used in brain-injured patients.

The risks, benefits, and role of sedation interruption or wake-up tests for brain-injured patients remain uncertain. A meta-analysis of five randomized trials of daily sedation interruption in 699 non-neurologic patients showed no reduction in duration of mechanical ventilation, length of ICU and hospital stay, or mortality [55]. Sedation interruption in patients without neurological disease may result in higher daily doses of sedatives with higher nurse ratings of workload but no difference in time to extubation or lengths of stay [56]. Among neurocritical care patients with TBI or subarachnoid hemorrhage (SAH), propofol interruption results in ICP increases [57], though these are of uncertain clinical significance. Helbok et al. studied 20 severely brain-injured patients with multimodal neuromonitoring during interruption of sedation [58]; only one new neurologic deficit was detected (2 %), but one-third of wake-ups were aborted due to ICP crisis, agitation, or desaturation.


Though not as extensively tested among the neurocritical population, the superior psychometrics associated with the RASS and SAS have been confirmed in many ICU patient groups, and both scales have been applied to these patients in multiple studies. The addition of processed EEG systems to ICU sedation likely has its greatest benefits in more deeply sedated patients particularly those receiving intermittent or continuous NMB. In these patients, routine clinical assessment is less reliable. Additional study is required before strong recommendations can be made.

Summary of the Literature: Assessing Delirium in Brain-injured Patients

Delirium in general ICU patients is associated with increased mortality, prolonged ICU and hospital length of stay, and long-term cognitive impairment [42]. Routine monitoring for delirium with either the Confusion Assessment Method for the ICU (CAM–ICU) or the Intensive Care Delirium Screening Checklist (ICDSC) was strongly recommended by the 2013 PAD Guidelines.

Delirium assessment with the CAM–ICU is feasible in some neurocritical care patients; a delirium incidence of 43 % was reported in one stroke unit [59]. However, the generalizability of these data is limited because 55 % of admitted patients were excluded due to higher NIH stroke scale scores and lower GCS scores, only 7 % required mechanical ventilation, and only 38 % received any doses of analgesia or sedation. Among 114 patients with intracerebral hemorrhage, the CAM–ICU was positive in 27 % of patients and was predictive of poor outcome (modified Rankin score >2) at 28 days, but not at 3 or 12 months, and was predictive of poor quality of life [60]. A multicenter study of 151 neurocritical care patients (including 43 % mechanically ventilated) revealed that delirium assessments with the ICDSC could be performed 76 % of the time, with an incidence of delirium of 14 % [48].

Unlike the CAM–ICU [61], the ICDSC recommends that changes in wakefulness and attention directly attributable to recent sedative medication not be scored as positive ICDSC points [62], an important distinction given the increasing concern that delirium assessment can be confounded by residual sedation [41, 48, 63, 64].


Defining and treating delirium among ICU patients remains challenging and fraught with potential confounders, including persisting sedation and progression of underlying neurological issues. Patients were 10.5 times more likely to be scored delirious (P < 0.001) if the CAM–ICU assessment was performed before (when median RASS score was −2) rather than after daily sedation interruption [64]. Outcomes with sedation-related delirium were similar to those of patients who never had delirium, while delirium not related to sedation was associated with much worse outcomes. Such confounding of delirium assessment may be minimized by only assessing patients with a SAS level of at least 3 (follow commands), patients with at least 3 of the 4 Kress wakefulness criteria [65], or a RASS of at least −1 (given the findings of Haenggi and Patel).
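The wakefulness thresholds suggested above can be expressed as a simple pre-screening check. This is a minimal sketch, assuming that meeting any one of the three thresholds is sufficient (the text joins them with "or") and using the standard scale ranges (SAS 1–7, RASS −5 to +4). The function and parameter names are hypothetical, and the sketch is not a validated clinical tool.

```python
from typing import Optional


def awake_enough_for_delirium_screen(
    sas: Optional[int] = None,
    rass: Optional[int] = None,
    kress_criteria_met: Optional[int] = None,
) -> bool:
    """Return True if any supplied measure meets the threshold quoted in the text.

    Thresholds: SAS >= 3, RASS >= -1, or at least 3 of the 4 Kress criteria.
    Measures left as None are simply not considered.
    """
    checks = []
    if sas is not None:
        if not 1 <= sas <= 7:
            raise ValueError("SAS ranges from 1 to 7")
        checks.append(sas >= 3)
    if rass is not None:
        if not -5 <= rass <= 4:
            raise ValueError("RASS ranges from -5 to +4")
        checks.append(rass >= -1)
    if kress_criteria_met is not None:
        if not 0 <= kress_criteria_met <= 4:
            raise ValueError("Kress criteria met must be between 0 and 4")
        checks.append(kress_criteria_met >= 3)
    if not checks:
        raise ValueError("supply at least one wakefulness measure")
    return any(checks)


# A patient at RASS -2 (the median level before sedation interruption in [64])
# would be deferred; the same patient at RASS -1 would be screened.
print(awake_enough_for_delirium_screen(rass=-2))  # False
print(awake_enough_for_delirium_screen(rass=-1))  # True
```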