Background

Nearly one third of patients admitted to an intensive care unit (ICU) will develop delirium [1], which is associated with sedation–analgesia management issues [2,3,4], a longer duration of mechanical ventilation, longer stays in the ICU and hospital, and an increased risk of death and of long-term neurocognitive dysfunction [1, 5]. Guidelines recommend the routine use of validated clinical tools for the early recognition and treatment of delirium by medical and nursing ICU teams, even if they are not expert neuropsychologists [6].

Among the delirium diagnosis tools that can be used by ICU clinicians in routine practice, the Confusion Assessment Method for the ICU (CAM-ICU) [7, 8] and the Intensive Care Delirium Screening Checklist (ICDSC) [9] have been extensively studied for more than 15 years, demonstrating good psychometric properties in a research setting [6]. In 2014, the CAM-ICU and its training manual were updated to avoid any misinterpretation by users (Table 1). Also, the original version of the CAM-ICU [7, 8] was validated against the American Psychiatric Association’s fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Differences between the 4th and the new 5th versions (DSM-5) regarding delirium assessment are still under debate [10, 11].

Table 1 Principal changes made in the 2014 updated version of the CAM-ICU training manual

The primary objective of this study was to measure the ability of the 2014 updated version of the CAM-ICU to diagnose delirium according to the most updated neuropsychological reference standard, i.e., the DSM-5 method. Secondary objectives were (1) to measure inter-observer agreement for the CAM-ICU, and (2) within the context of a comprehensive investigation of delirium assessment in a real-life intensive care setting, to compare the diagnostic accuracy of the CAM-ICU to the ICDSC and to physician, resident and nurse recognition of delirium, as well as to common orientation questions and to the patient’s own impression of feeling delirious.

Methods

Ethics and consent

The protocol was approved by an independent ethics committee [Comité de Protection des Personnes (CPP) Sud Méditerranée.IV (N°ID-RCB: 2015-A01084-45; Protocol version 1: 06/23/2015)] and was conducted in accordance with the Declaration of Helsinki (clinicaltrials.gov: NCT02760446). Written consent was required from the patient or the legally authorized representative or a proxy/surrogate decision maker (patient’s next of kin) who gave consent on the patient’s behalf, followed by the patient’s consent as soon as they could communicate.

Population

The study took place in the 16-bed medical–surgical ICU of the University of Montpellier Saint Eloi Hospital, an academic tertiary-care hospital, from November 2015 to April 2016. All consecutive French-speaking patients ≥ 18 years old were eligible for enrollment if they had a Richmond Agitation Sedation Scale (RASS) ≥ −3 [12,13,14]. Exclusion criteria were preexisting cognitive disorder/psychosis (baseline cognitive status is often unknown early in the ICU stay, precluding accurate evaluation of change in mental status, a key feature of delirium), visual/hearing loss without helpers, pregnancy (according to French law), patients under tutelage, withdrawal of consent or a change in clinical status that would preclude complete cognitive testing.

Study conduct

All consecutive patients admitted to our ICU were screened by the ICU research team every morning, including weekends, until they met the inclusion criteria, over a period of 5 months (November 2015–March 2016). After obtaining consent and enrolling the patient, the ICU research team contacted one of two neuropsychological experts participating in the study to independently perform a neuropsychological assessment of delirium. Figure 1 summarizes the timing of delirium assessments by the neuropsychological experts and the ICU research team.

Fig. 1

Study design. The order of assessments by the research team was determined to check both the patient’s eligibility and the presence of some CAM-ICU and ICDSC features (i.e., fluctuating course of mental status assessed by RASS ratings). ICDSC was assessed after CAM-ICU because ICDSC included some CAM-ICU features (i.e., inattention). RASS Richmond Agitation Sedation Scale, CAM-ICU Confusion Assessment Method for the Intensive Care Unit, ICDSC Intensive Care Delirium Screening Checklist, DSM-5 5th version of the Diagnostic and Statistical Manual of Mental Disorders

Data collection

Delirium

Delirium was assessed once, on the same day, in five ways that occurred as close together in time as possible but strictly independently of each other. Separate clinical research forms were used to ensure independence between observers.

1. ICU delirium tools: CAM-ICU and ICDSC

The ICU research team used the French versions of the 2014 updated CAM-ICU training manual and the ICDSC [9, 15]. The CAM-ICU was assessed by two independent investigators to estimate inter-observer agreement.

2. Expert neuropsychological assessment of delirium (the reference standard)

The neuropsychological experts were members of the speech and language therapy team, usually in charge of neuropsychological testing in the neurology/neurosurgery/neuro-ICU departments of the Neurosciences University Hospital of Montpellier. A standardized method for diagnosing delirium was used based on the DSM-5 [16] using the Montreal Cognitive Assessment (MOCA) [17], Dubois’ 5-word test [18], Language Screening Test (LAST) [19], with helpers for intubated ICU patients (see Additional file 1: Supplemental Digital Content).

3. Bedside–clinician assessment of delirium

When immediately available, the patient’s bedside ICU team (i.e., the patient’s nurse, resident and attending physician) were contacted by the ICU research team to get their personal feeling about the presence or absence of delirium.

4. The 3 simple orientation/memory questions for the assessment of delirium

The ICU research team also assessed delirium by asking three simple questions commonly used to assess delirium at our institution: Where are you? What day is it today? Who is the president? (because long-term memory is frequently altered in delirium) [20]. The number of incorrect and absent response(s) was recorded.

5. The patient’s own feeling

At the end of testing, the patients were asked by the ICU research team if they had the impression they were confused.

Demographic and medical data

Age, gender, comorbidities and the reason for ICU admission were recorded. The Simplified Acute Physiology Score II (SAPS-II) [21] and the Sequential Organ Failure Assessment (SOFA) score [22] were calculated within 24 h after ICU admission and upon enrollment. If enrollment occurred before 24 h, the SAPS-II took into account the worst value available during the 24 h preceding enrollment. Therapeutics such as sedation, mechanical ventilation and the use of vasopressors were recorded upon enrollment.

Data presentation and statistical analysis

Psychometric properties of the CAM-ICU

Validity (primary endpoint)

The performance of the CAM-ICU for diagnosing delirium was assessed by measuring the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) according to standardized definitions [23, 24]. Expert assessments were used as the reference standard.

Reliability

The kappa coefficient was calculated between the two ICU research investigators. Kappa coefficients above 0.80, 0.60 and 0.40 are considered as measuring ‘near perfect,’ ‘strong’ and ‘moderate’ levels of agreement [25], respectively.
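As an illustration of the reliability measure described above, the following is a minimal sketch of Cohen's kappa for two binary raters, with a helper that applies the thresholds quoted in the text. The function and argument names are hypothetical and are not taken from the study's SAS code. Because the inter-observer 2 × 2 table is not reported, the example uses the CAM-ICU versus expert counts implied by the Results (34 concordant positives, 7 expert-positive/CAM-ICU-negative, 0 CAM-ICU-positive/expert-negative, 67 concordant negatives).

```python
def cohen_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Cohen's kappa for two raters (A and B) giving a binary rating."""
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n          # observed agreement
    a_pos = (both_pos + a_pos_b_neg) / n          # share rated positive by A
    b_pos = (both_pos + a_neg_b_pos) / n          # share rated positive by B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Label a kappa value using the thresholds quoted in the text [25]."""
    if kappa > 0.80:
        return "near perfect"
    if kappa > 0.60:
        return "strong"
    if kappa > 0.40:
        return "moderate"
    return "below moderate"


# Rater A = CAM-ICU, rater B = expert; counts implied by the Results section.
kappa = cohen_kappa(both_pos=34, a_pos_b_neg=0, a_neg_b_pos=7, both_neg=67)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

With these counts the sketch gives a kappa of about 0.86, in line with the agreement reported between the CAM-ICU and the expert assessments.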

The diagnostic performance of other methods commonly used to assess delirium (ICDSC, bedside clinician assessments, 3-question test, as well as patients’ impressions)

The sensitivity, specificity, PPV and NPV were also calculated using the expert assessments as the reference standard. To compare all five methods for diagnosing delirium, kappa coefficients were calculated between the expert assessments and the other methods. Kappa coefficient comparisons between methods were made using the Z-test [26]. A p value of < 0.05 was considered statistically significant.

Power analysis

The study sample size was determined in relation to the primary endpoint. For expected values [8] of sensitivity and specificity at 85 and 75%, respectively, a desired level of precision set at 10% and the prevalence of delirium set at 50%, the number of patient inclusions required for achieving appropriate power would be 95. Taking into account possible post-enrollment exclusions, 115 patients needed to be enrolled in the study. The prevalence of delirium ranged from 30 to 90% in the literature [1]. Thus, we set the prevalence of delirium at 50% which is conservative regarding the number of patients needed to be analyzed, in order to maximize the power.

Data presentation

Quantitative data are shown as medians and 25th–75th percentiles. Data were analyzed using SAS version 9.2 (SAS Institute, Cary, NC).

Results

A total of 108 patients were included for analysis among the 115 patients enrolled in the study. A Standards for Reporting of Diagnostic Accuracy (STARD) diagram for patient enrollment is shown in Additional file 1: Supplemental Digital Content. Table 2 summarizes patient demographic and medical characteristics. Delirium was diagnosed by the neuropsychological experts for 41 of the 108 patients (38%).

Table 2 Demographic and medical characteristics of the 108 patients included for analysis

Validation of the 2014 updated CAM-ICU

A positive CAM-ICU was found for 34 (31%) patients. Compared to expert assessments, 7 of the 108 CAM-ICU ratings were misclassified; all 7 were false negatives, and there were no false positives. The CAM-ICU therefore had a sensitivity of 83% [95% confidence interval 71–94], a specificity of 100% [100–100], a PPV of 100% [100–100] and a NPV of 91% [84–97].
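The figures above can be checked from the 2 × 2 table implied by the text (41 expert-diagnosed cases, 7 false negatives, no false positives, 108 patients). The sketch below is illustrative only: the function name is hypothetical, and the normal-approximation (Wald) interval is an assumption, since the method used for the confidence intervals is not stated. It happens to reproduce the reported bounds after rounding.

```python
import math


def diagnostic_accuracy(tp, fn, fp, tn, z=1.96):
    """Sensitivity, specificity, PPV and NPV, each with a Wald 95% interval.

    Returns a dict mapping each measure to (estimate, lower, upper).
    Assumes every denominator is non-zero.
    """
    def proportion(successes, total):
        p = successes / total
        half_width = z * math.sqrt(p * (1 - p) / total)
        return p, max(0.0, p - half_width), min(1.0, p + half_width)

    return {
        "sensitivity": proportion(tp, tp + fn),
        "specificity": proportion(tn, tn + fp),
        "ppv": proportion(tp, tp + fp),
        "npv": proportion(tn, tn + fn),
    }


# Counts implied by the text: TP = 41 - 7, FN = 7, FP = 0, TN = 108 - 41.
results = diagnostic_accuracy(tp=34, fn=7, fp=0, tn=67)
for name, (estimate, low, high) in results.items():
    print(f"{name}: {estimate:.0%} [{low:.0%}-{high:.0%}]")
```

Note that a Wald interval collapses to a single point when the estimate is exactly 100%, which is consistent with the [100–100] bounds shown for specificity and PPV.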

To measure the inter-observer reliability of the CAM-ICU, 98 patients were assessed by a second investigator. For the ten remaining patients, a second assessment was impossible because of changes in vigilance status or clinical condition. The kappa coefficient for inter-observer reliability was 0.87 (SD ± 0.06), which exceeds the 0.80 threshold for near perfect agreement. There were no significant differences between the 5 ICU investigators and the 2 neuropsychological experts (kappa coefficients ranging from 0.82 ± 0.1 to 0.88 ± 0.2). First and second CAM-ICU investigators obtained similar agreement with experts’ assessments (kappa coefficients = 0.86 ± 0.1 and 0.85 ± 0.1, respectively).

Diagnostic performance of commonly used methods for assessing delirium

Table 3 presents the performance of delirium recognition via the CAM-ICU and the ICDSC, by nurses, residents and physicians, and via the three simple questions. The CAM-ICU demonstrated good performance, while the 3 simple questions demonstrated poor performance. The 3 simple questions demonstrated the highest sensitivity for a false or absent response, but with the lowest specificity and PPV. The ICDSC and clinicians’ diagnosis demonstrated similar performances.

Table 3 Patients’ clinical diagnosis and simple orientation questions

Figure 2 shows the graphic representation of kappa coefficients for each of the methods used to assess delirium. The kappa coefficient measured the agreement between each of the methods and the assessment by the neuropsychological experts using DSM-5 criteria (reference standard). There was a significant difference between the level of agreement found for the CAM-ICU (kappa 0.86 ± 0.05) and that found for other methods (p < 0.047), except the ICDSC (0.69 ± 0.07, p = 0.054, Z-test). Detailed data regarding CAM-ICU/ICDSC procedures are provided in Additional file 1: Supplemental Digital Content.
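The comparison between two kappa coefficients can be sketched as a simple Z-test. This is a hedged illustration, not the authors' calculation: it assumes that the reported "±" values can be used as standard errors, that the two estimates can be treated as independent, and it uses the rounded values quoted above. The function name is hypothetical.

```python
import math


def compare_kappas(kappa_a, se_a, kappa_b, se_b):
    """Two-sided Z-test for the difference between two kappa coefficients.

    Assumes independent estimates with known standard errors.
    """
    z = (kappa_a - kappa_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# CAM-ICU (0.86 +/- 0.05) versus ICDSC (0.69 +/- 0.07), rounded values.
z, p = compare_kappas(0.86, 0.05, 0.69, 0.07)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With rounded inputs the result sits very close to the 0.05 threshold and need not match the reported p = 0.054 exactly; the exact procedure of reference [26] and the unrounded estimates would be needed to reproduce it.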

Fig. 2

Agreement between different delirium assessment methods and the neuropsychological experts’ reference rating using the DSM-5 criteria. This figure shows the graphic representation of kappa coefficients and their standard deviations for each of the methods used to assess delirium. The kappa coefficient measured the agreement between each of the methods and the assessment by the neuropsychological experts using DSM-5 criteria (reference standard). For simple questions, we did not decide a priori how to analyze the answers. Because some patients answered some questions but not others, we decided a posteriori to analyze these data following two approaches: including all patients and including only the patients able to answer all the questions. Several thresholds were tested, i.e., delirium was defined in all patients if they gave at least 1 or 2 false or no response(s), or, among the patients who were able to answer all simple questions, if the patients gave at least 1 or 2 false response(s). There was a significant difference (p < 0.047) between the CAM-ICU and each of the other methods, except the ICDSC (p = 0.054). There were significant differences between “all methods from CAM-ICU to ≥ 1 false response to simple questions” and “patient’s own impression of feeling delirious,” as well as between “all methods from CAM-ICU to nurse diagnosis” and “≥ 2 false responses to simple questions” or “patient’s own impression of feeling delirious.” *: Significant difference (p < 0.05); CAM-ICU Confusion Assessment Method for the Intensive Care Unit, ICDSC Intensive Care Delirium Screening Checklist

The 3 simple questions and the patient’s own impression had the lowest agreements with experts and demonstrated significant differences with other methods.

Patient’s own impression of feeling delirious

Among the 108 patients, 77 (71%) were able to answer the question as to whether they felt delirious or not (all had a RASS level ≥ −1). Among these patients, 27 (35%) answered that they were. The patient’s sensitivity for recognizing delirium compared to expert assessments was 38%, with a specificity of 66%, a PPV of 22% and a NPV of 80%.
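Two different error rates can be derived from these percentages, and they are easy to conflate. The short sketch below, using only the rounded values reported above and hypothetical variable names, separates the share of "yes" answers that were wrong (1 − PPV) from the share of non-delirious patients who answered "yes" (1 − specificity).

```python
# Rounded values reported for the patient's own impression.
ppv = 0.22
specificity = 0.66

# Share of patients reporting delirium who were not delirious per the experts.
false_discovery_proportion = 1 - ppv
# Share of non-delirious patients who nevertheless reported feeling delirious.
false_positive_rate = 1 - specificity

print(f"1 - PPV = {false_discovery_proportion:.0%}")
print(f"1 - specificity = {false_positive_rate:.0%}")
```

The first quantity is 78% and the second 34%; only the second is a false-positive rate in the usual sense.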

Discussion

The main finding of this study is that the 2014 updated version of the CAM-ICU was valid compared to the DSM-5 reference standard, with strong inter-observer agreement. Agreement with the experts’ opinion did not differ significantly between the CAM-ICU and the ICDSC. The CAM-ICU had superior performance for diagnosing delirium compared to the bedside–clinicians’ opinion, as well as compared to simple questions that are commonly used to assess delirium. Patients’ impressions of feeling delirious were not accurate: 78% of the patients who reported feeling delirious were not delirious according to the experts (PPV of 22%).

Delirium is multifactorial and frequent in critically ill ICU patients [6, 27,28,29,30,31,32,33,34,35]. It is diagnosed in 10–90% of ICU patients, depending on the diagnosis tool, the timing of assessment (during or after interrupting sedation), as well as the frequency of assessment (one-point assessment for validation studies or throughout the ICU stay) [6, 36]. A recent review of 42 studies estimated the prevalence of delirium at 5280 (31.8%) out of 16,595 critically ill patients [1]. The prevalence of delirium in the present study is close to this result: 38% according to neuropsychological assessment and 31% according to the CAM-ICU. Although frequent, delirium is under-recognized by both physicians and nurses [37]. Compared to the CAM-ICU, clinician sensitivity for diagnosing delirium is about 30% [38, 39]. The recognition of delirium by physicians and nurses in the present study was better, with a sensitivity of nearly 70%, suggesting that there might have been an increase in clinician awareness regarding delirium in the ICU over the past decade [40]. However, with PPVs under 80% and NPVs under 90% in the present study, clinicians should still use an ICU delirium tool to improve their diagnostic performance, according to current guidelines [6]. In the present study, agreement with expert diagnosis was not significantly different between the CAM-ICU and the ICDSC. In previous studies, the pooled sensitivity and specificity were 76 and 96%, respectively, for the CAM-ICU, and 80 and 75% for the ICDSC [41]. Sensitivity and specificity were slightly higher for both tools in the present study. This could be due to the study setting where a research team with experience in conducting research in the area of sedation–analgesia was available to conduct this psychometric study. Indeed, performance measurements for ICU delirium tools are higher in research settings than in real life [41]. 
However, the original study validating the CAM-ICU reported higher sensitivity and specificity than in the present study, with a sensitivity ranging from 93 to 100% and a specificity ranging from 98 to 100% [8]. This could be due to differences in the studied populations. In the original study [8], patients had a median Glasgow score of 7 at enrollment, while in the present study, 80% of patients had a RASS level ≥ 0, suggesting they were more alert. It has been reported that the CAM-ICU could have a lower sensitivity in alert patients, possibly because of better cognitive function [42, 43]. Because delirium prevalence and recognition may depend on the level of consciousness, some authors recommend stratifying delirium assessments for sedation score using a cutoff of RASS − 2 [44]. In the present study also, when taking into account only the 99 patients who had a RASS level of > −2 for a sensitivity analysis, the delirium prevalence was lower than in the overall population (32% instead of 38% according to the experts, 25% instead of 31%, according to the CAM-ICU). The CAM-ICU had a slightly lower sensitivity (78%, instead of 83%), while conserving the same specificity, positive and negative predictive values. Similar findings were obtained when excluding from the analysis the 12 patients who were sedated (8 of them having a RASS level of > −2).

Our study has several limitations. First, all the methods for diagnosing delirium were assessed within a short time. This may have tired patients and decreased their cognitive function. The research team planned the assessments within the space of an hour to increase the chance of measuring delirium at the same time for a given patient (Fig. 1). The agreement between the neuropsychological expert and the CAM-ICU was not significantly different whether the expert performed the assessment before or after the research team (kappa coefficient 0.86 ± 0.1 vs 0.85 ± 0.1, respectively). Second, apart from the ICDSC, other validated delirium tools [42] were not performed, in order to keep the duration of assessment feasible. Likewise, the ICDSC could have demonstrated higher sensitivity and specificity if it had been performed in more alert patients, and by the patient’s clinicians rather than by the research team. To perform the ICDSC, the research team took into account all nursing/medical charts (Fig. 1) but performed only “punctual” cognitive evaluations instead of evaluations over a nursing shift. The ICDSC was not performed by the patient’s clinicians to avoid any bias regarding their raw opinion about the presence or the absence of delirium. In other words, clinicians did not use a validated delirium tool, which should be taken into account when interpreting the data. The primary goal of the present study was to validate the 2014 updated version of the CAM-ICU. Measuring the psychometric properties of the ICDSC was only informative because it is a second recommended tool for assessing delirium and therefore frequently used throughout the world [6]. Moreover, the present study demonstrated no significant difference between the ICDSC and CAM-ICU regarding the agreement between the ICU research team and the neuropsychological experts (Fig. 2). However, this study was not powered to detect this difference. 
A longer period of evaluation could have resulted in a higher sensitivity for the ICDSC. Similarly, repeated measurements of delirium over a longer period of time could have led to a higher sensitivity. Regarding the expert’s assessment, DSM-5 interpretations and use as a reference standard for delirium are a source of debate and thus may vary according to the assessor [10, 11]. Finally, the causes of delirium were not investigated because this was outside the scope of this psychometric study. Sepsis was present in 44% of patients at admission, and 11% of patients received sedatives at enrollment. Thus, a few intubated patients were included, due to a strategy of “early-sedation-interruption.” This study should be repeated in different settings/ICU populations.

Study strengths include the reference standard method used by experts to diagnose delirium, which was provided for the first time in detail to facilitate study reproducibility (see Additional file 1: Supplemental Digital Content). Aside from the expert assessment, a pragmatic approach for diagnosing delirium was also evaluated in order to reflect real-life situations in intensive care. This included nurse, resident and physician diagnoses, as well as commonly used simple orientation/memory questions. These questions are not appropriate for diagnosing delirium. This might be due to patient disorientation, which can be related to the environment (absence of windows). The recommendation to use a validated delirium tool is reinforced by the fact that the ICU team is used to conducting clinical research and quality improvement projects in the area of agitation, sedation and analgesia [45,46,47]. Even in such an “a priori” favorable setting for the early recognition of delirium, bedside–clinicians still need to use a validated tool during their routine practice, repeatedly during the day and throughout the ICU stay. This is paramount for treating the factors associated with delirium [6, 27,28,29,30,31,32,33,34,35] as soon as possible, especially when taking into account the negative outcomes associated with delirium [5, 6]. A comprehensive approach [48] integrating delirium management with analgesia, sedation, mechanical ventilation, mobility/exercise and family engagement/empowerment has shown a positive impact on increasing ventilator-free days [49, 50], decreasing delirium incidence [49,50,51] and improving hospital mortality [51].

Finally, the study investigated the patient’s ability to recognize delirium. Though delusional memories are frequent in ICU survivors, they have not been investigated during hospitalization in the ICU setting [52,53,54]. Recent studies found no significant association between delirium in the ICU and mental disorders in survivors [55,56,57]. However, memories of being delirious in the ICU are associated with anxiety [56]. The link between delirium recollection, feelings of being delirious while in the ICU (which is possibly theoretically wrong or too abstract for some patients) and long-term psychological outcomes thus requires further exploration.

Conclusions

The 2014 updated version of the CAM-ICU is a valid tool for delirium diagnosis in a research setting in critically ill patients according to the DSM-5 criteria used by neuropsychological experts. It demonstrated high inter-observer reliability, and better performance for diagnosing delirium in ICU patients than physicians, residents and nurses, despite many years of increased awareness regarding delirium in the ICU. Future studies should investigate the discrepancies between validated methods to diagnose delirium (DSM-5, CAM-ICU, ICDSC) and the ICU team. Moreover, the patient’s own ability to report delirium might be inaccurate. Ethics committees should pay attention to delirium assessment when checking patients’ ability to consent to participate in ICU studies [58]. This also suggests that delusional memories reported by survivors should be investigated in relation to a valid assessment of delirium during the ICU stay. In the ICU, patients should be asked whether they feel delirious, reassured if they are not delirious, and cared for with attention to what could make them feel so.