Key Summary Points

Why carry out this study?

Accurate assessment of level of consciousness and pain during continuous sedation until death is important to ensure patient comfort. Assessments of patient comfort during continuous sedation until death are usually made by behavior-based observational scales.

Recently, however, a number of studies from the neurosciences have shown that sometimes consciousness and pain are undetectable with these traditional behavioral methods.

In this study, we wanted to determine whether subjective caregiver assessments of consciousness and pain during continuous sedation until death would be confirmed by objective neurophysiological measures.

What was learned from the study?

Subjective caregiver assessments showed very poor agreement with objective neurophysiological measures of consciousness and pain.

The sole use of behavior-based observational scales to make assessments of comfort during continuous sedation until death appears unreliable.

Our results suggest that assessments of patient comfort could have been improved by including objective monitoring of level of consciousness and pain.

Digital Features

This article is published with digital features, including a summary slide, to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.13102412.

Introduction

When palliative care patients enter the phase where symptoms become refractory, continuous sedation until death (CSD) may be the only treatment option left. This involves administering medications to induce decreased or absent awareness in order to relieve intractable suffering at the end of life. However, assessing comfort in palliatively sedated patients is difficult, and a number of problems have been identified questioning the reliability of these assessments [1]. Problems with the assessment of awareness or comfort in dying patients, and with the titration of drugs have been reported [2].

To assess the level of comfort of these unconscious patients, subjective behavior-based observational scales are usually used. However, a number of studies have recently questioned the accuracy and validity of such scales in these situations [3]. Although several efforts have been made to improve these observational scales for the palliative patient group, the main problem with observational assessment methods for palliatively sedated patients is related to the medication itself. The medications used to induce CSD have an impact on motor responsiveness, while the traditionally used assessment scales are based on inferences from this responsiveness; this method of assessment could therefore be unreliable, and patient suffering may remain undetected. Studies in different patient groups that critically reviewed awareness consistently reported that persons were, in contrast to what was assumed by the caregivers, not always (completely) unaware. For example, several studies have shown that patients diagnosed with a vegetative state (“unresponsive wakefulness syndrome”) did show some (minimal) clinical signs of conscious awareness in about 40% of the cases [4]. Unresponsiveness, which commonly accompanies unconsciousness, does not automatically imply unawareness, and measures that rely on a person’s ability to react to stimulation may be misleading and could contribute to an uncomfortable death that goes unrecognized [5,6,7].

More objective methods are needed to improve the assessments of awareness and comfort during CSD. A few studies have used the Bispectral Index (BIS) monitor for this, a commonly utilized EEG-based device to assess depth of sedation when administering sedative, hypnotic, or anesthetic agents during surgical and medical procedures, but the algorithm used to calculate the Bispectral Index is proprietary [8]. Alternatively, monitoring devices such as the NeuroSense monitor (NeuroWave Systems Inc.) to assess the hypnotic depth of anesthesia and the Analgesia Nociception Index (ANI) monitor (Mdoloris Medical Systems SAS) to assess the analgesia/nociception balance have open protocols.

The feasibility and potential advantages of using these devices to improve assessments of consciousness and pain/discomfort during CSD have been demonstrated in a case report, where the authors suggested that more research is needed [9]. In the present study, we evaluate whether subjective behavioral assessments of consciousness and pain/discomfort during CSD were confirmed by objective measures of the NeuroSense and ANI monitors.

Methods

Design and Setting

This prospective observational study was performed from 2017 to 2019. Because of the complexity of the problem and the explorative nature of the study, we conceived it as a multi-case study in which the setting and participants were deliberately chosen. The study is part of the COMPAS study (COMfort in PAlliative Sedation); for full details we refer to the published protocol [10]. The protocol was approved by the biomedical ethics committee of the University and University Hospital of Brussels (VUB/UZ Brussel; BUN 14320136504) and registered at ClinicalTrials.gov (ID NCT03273244) [11]. The study was performed in accordance with the principles of the Declaration of Helsinki of 1964 and its subsequent revisions. All study information and patient consent forms were approved by the ethics committee. Written informed consent was obtained from the patient or his/her substitute decision-maker.

Study Participants

Patients were recruited in an academic hospital, a loco-regional hospital, and two homes for the elderly, all located in Belgium. Patients were deliberately selected to reflect variability regarding setting and medical conditions.

Inclusion Criteria

Patients were included if their treating physician considered that:

  1. they were in their last week of life,

  2. they had conditions that might, when not treated, cause high levels of distress, and

  3. CSD was to be started.

Procedures

For this prospective study, we combined caregivers’ subjective observational assessment of pain and discomfort with objective measures of depth of sedation and pain/discomfort as produced by NeuroSense and ANI monitors. The NeuroSense monitor displays two frontal EEG signals and calculates a number of parameters including the bilateral WAVcns index (Wavelet Anesthetic Value for the Central Nervous System) ranging from 100 (awake) to 0 (flat EEG). The lower the index, the lower the likelihood of consciousness: after a standard bolus-based propofol induction, 95% of patients lose consciousness under a WAVcns level of 72 [12]. A WAVcns value in the 40–60 range represents an adequate depth of sedation for surgery, while a value of 60 or more represents an increasing risk of waking up [13]. In cases where the two WAVcns indices (each for one hemisphere) differed, we chose the higher value.
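
The WAVcns rule described above (take the higher of the two hemispheric indices, then compare it with the cut-off of 60) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text: the function names and the handling of missing readings via `None` are assumptions, not part of the NeuroSense monitor or of the study software.

```python
from typing import Optional

# Cut-off taken from the text: 60 or more is treated as possible consciousness.
WAVCNS_CUTOFF = 60.0


def combined_wavcns(left: Optional[float], right: Optional[float]) -> Optional[float]:
    """Return the higher of the two hemispheric WAVcns indices.

    None stands for a missing reading; that convention is an assumption of
    this sketch, not something the monitor or the study defines.
    """
    readings = [value for value in (left, right) if value is not None]
    for value in readings:
        if not 0 <= value <= 100:
            raise ValueError(f"WAVcns must lie in 0-100, got {value}")
    return max(readings) if readings else None


def possibly_conscious(wavcns: float) -> bool:
    """True when the index is at or above the cut-off of 60."""
    return wavcns >= WAVCNS_CUTOFF


# Hypothetical readings: the hemispheres disagree, so the higher value is used.
index = combined_wavcns(55, 63)
print(index, possibly_conscious(index))
```

Choosing the higher index is the conservative option here, since it errs toward flagging possible consciousness.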

The ANI monitor continuously monitors heart rate variability (HRV) and transforms this into an analgesia nociception index (ANI 0–100), which assesses parasympathetic activity as a measure of nociception. A recent study showed that ANI is effective in detecting pain in deeply sedated critically ill patients [14]. ANI is a non-invasive tool based on the analysis of the respiratory fluctuations of heart rate that mainly reflect the variability in the parasympathetic tone and so is likely useful to assess pain and discomfort in non-communicative patients. The lower the index, the higher the likelihood of pain. The ANI monitor provides two values: mean-ANI (ANIm), an average calculated over the previous 4 min, and instant ANI (ANIi), an average calculated over a shorter period of time (64 s). Unless otherwise indicated, ANIi was used in our study. An ANI value between 50 and 70 is considered adequate analgesia [15, 16]. A study by Le Guen et al. in conscious patients showed that ANI values < 50 indicate moderate pain, corresponding to a score of > 30 on a visual analogue scale (VAS) for pain [17]. When ANIm is > 70, it is considered safe to decrease doses of opioids.
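
As a rough illustration of how the ANI thresholds quoted above fit together, the following Python sketch maps a pair of ANIi and ANIm readings to one of three labels. The function name, the label strings, and the order in which the rules are checked are assumptions made for this example; the thresholds (50 and 70) come from the text.

```python
def interpret_ani(ani_instant: float, ani_mean: float) -> str:
    """Map ANIi and ANIm readings to a coarse, illustrative label.

    Rules taken from the text: ANI < 50 suggests pain, and ANIm > 70 suggests
    that opioid doses could be lowered. Checking ANIi first is an assumption
    of this sketch.
    """
    for value in (ani_instant, ani_mean):
        if not 0 <= value <= 100:
            raise ValueError(f"ANI must lie in 0-100, got {value}")
    if ani_instant < 50:
        return "possible pain"
    if ani_mean > 70:
        return "consider lowering opioid dose"
    return "no pain indicated"


# Hypothetical readings, not study data.
for instant, mean in [(42, 60), (65, 60), (80, 75)]:
    print(instant, mean, interpret_ani(instant, mean))
```

The sketch is deliberately coarse: it does not model trends over time, which a clinician would also take into account.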

For each patient, nurses were asked to score three numeric rating scales (NRS) on the patient’s level of consciousness (no consciousness–full consciousness), comfort (no pain–very severe pain) and ability to communicate (no communication possible–full communication possible). The NRS is a segmented numeric version of the visual analog scale (VAS) in which a respondent selects a whole number (0–10 integers) best reflecting the presence of the measured quality. Nurses were asked to do this at the moment they would normally attend to the patient while blinded to the monitor outputs. Each day the principal investigator assessed the patient with observational scales, using one scale that is mentioned in the Flemish palliative sedation guideline and three other scales that have been proposed in the literature:

  1. CPOT (Critical-Care Pain Observation Tool), a tool specifically developed for use in patients with limited consciousness [18].

  2. RASS (Richmond Agitation-Sedation Scale) [19].

  3. M-ESAS (Modified Edmonton Symptom Assessment Scale, validated for a Flemish palliative care population) [20].

  4. BPS (Behavioral Pain Scale) or BPS-NI (Behavioral Pain Scale Non-Intubated) [21].

Statistical Analysis

Descriptive analyses were performed for all included data. Results were reported as proportions or median values. All analyses were conducted with the use of SPSS Software version 25 (IBM). We dichotomized NRS scores as well as the monitor outputs. Similar to the study by Masman et al., we used a cut-off of 4 to dichotomize the NRS scores by caregivers to determine the presence of the measured quality (0–3: the measured quality is completely or almost completely absent, 4–10: the measured quality is moderately to maximally present) [22]. We used these results to calculate the specificity and sensitivity of the caregiver subjective assessments when compared to objective monitoring by WAVcns and ANI. We then created scatterplots where we indicated the false negatives (FN), true negatives (TN), false positives (FP), and true positives (TP). For assessments of consciousness, these are defined as FN = (WAVcns ≥ 60; NRSc ≤ 3), TN = (WAVcns < 60; NRSc ≤ 3), FP = (WAVcns < 60; NRSc > 3) and TP = (WAVcns ≥ 60; NRSc > 3). For assessments of pain, these are defined as FN = (ANI < 50; NRSp ≤ 3), TN = (ANI ≥ 50; NRSp ≤ 3), FP = (ANI ≥ 50; NRSp > 3) and TP = (ANI < 50; NRSp > 3).

Consequently, NRS assessments of consciousness ≤ 3 (not conscious) should conform with WAVcns < 60, and NRS assessments of pain ≤ 3 (no pain) with ANI ≥ 50.
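
The dichotomization and the resulting four categories can be written out as a short Python sketch. It treats the monitor as the reference and the caregiver NRS score as the test, using the cut-offs given above (NRS ≥ 4, WAVcns ≥ 60, ANI < 50). The function names and the example pairs are hypothetical and are not study data.

```python
from collections import Counter

NRS_CUTOFF = 4  # NRS 0-3: quality (almost) absent; 4-10: moderately to maximally present


def classify(reference_positive: bool, nrs: int) -> str:
    """Label one paired assessment as TP, FN, FP, or TN."""
    caregiver_positive = nrs >= NRS_CUTOFF
    if reference_positive:
        return "TP" if caregiver_positive else "FN"
    return "FP" if caregiver_positive else "TN"


def classify_consciousness(wavcns: float, nrs_c: int) -> str:
    """Consciousness is taken as present when WAVcns >= 60."""
    return classify(wavcns >= 60, nrs_c)


def classify_pain(ani: float, nrs_p: int) -> str:
    """Pain is taken as present when ANI < 50."""
    return classify(ani < 50, nrs_p)


# Hypothetical (monitor value, NRS score) pairs for illustration only.
consciousness_pairs = [(72, 2), (45, 1), (50, 6), (80, 7)]
counts = Counter(classify_consciousness(w, n) for w, n in consciousness_pairs)
print(dict(counts))
```

The resulting counts are the inputs for sensitivity, specificity, PPV, NPV, accuracy, and κ.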

However, since multiple measurements were taken from the same patients, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and interrater reliability (κ) were computed at the individual level when applicable and pooled together.

The rank correlation between observational scales (CPOT, RASS, M-ESAS, and BPS) and monitor outputs was calculated using Kendall's tau.
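
For readers unfamiliar with Kendall's tau, a minimal pure-Python sketch follows. The text does not state which variant was used; the tau-b variant is assumed here because ordinal scale scores typically contain ties. The data in the example are invented for illustration.

```python
from itertools import combinations
from math import nan, sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, with a correction for ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        return nan  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Invented example: hypothetical scale scores against hypothetical monitor values.
scale_scores = [-4, -3, -3, -1, 0]
monitor_values = [48, 55, 62, 70, 66]
print(round(kendall_tau_b(scale_scores, monitor_values), 3))
```

A quadratic pairwise loop is acceptable for the small samples involved here; statistical packages use faster algorithms for large data sets.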

Role of the Funding Source

Funding for this research was provided by a government grant (G.0566.15N) from the Research Foundation—Flanders (FWO). The sponsor had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Results

Patient Characteristics

Twelve patients were enrolled in the study (see Table 1). Consent was granted in every instance, by the patient or by their substitute. There were six female and six male patients; their average age was 76 years (range, 49–91 years). This resulted in 108 caregiver NRS assessments of pain, consciousness, and ability to communicate, and 32 assessments by RASS, CPOT, BPS, and M-ESAS. In the majority of cases, several nurses were involved in the different NRS assessments of each patient.

Table 1 Patient characteristics (N = 12)

Assessments of Consciousness

The scatterplots show how many of the subjective assessments were confirmed by objective indices (see Fig. 1a, b); 46% (n = 46) of caregiver subjective assessments of consciousness were contradicted by neurophysiological assessments (Fig. 1a). Forty-two NRSc assessments indicating no consciousness (≤ 3) disagreed with WAVcns (≥ 60), and four NRSc assessments indicating consciousness (> 3) disagreed with WAVcns (< 60). This cut-off value (60) is indicated by the black horizontal line in the scatterplot. Of the 100 valid WAVcns assessments measured, 55 were above or equal to 60 (55%). Sensitivity and specificity of caregivers’ subjective assessments of consciousness were 23.6 and 91.1%, respectively, with a PPV of 76.5%, an NPV of 49.4%, an accuracy of 54.0%, and an interrater reliability (κ) of 0.13. Similar to the omnibus test, caregiver subjective assessments are highly specific yet show poor sensitivity, NPV, PPV, accuracy, and κ values (see Table 2).

Fig. 1

a Scatterplot of WAVcns by NRSc. WAVcns wavelet anesthetic value for the central nervous system, NRSc numeric rating scale for consciousness, FN false negatives, TP true positives, TN true negatives, FP false positives. b Scatterplot of ANI by NRSp. ANI analgesia nociception index, NRSp numeric rating scale for pain, FN false negatives, TP true positives, TN true negatives, FP false positives

Table 2 Comparison of objective and subjective assessments

Assessments of Pain

Caregiver subjective assessments of pain were contradicted by neurophysiological assessments in 8.7% of cases (n = 9) (Fig. 1b). Five NRSp assessments indicating no pain (≤ 3) disagreed with ANI (< 50), and four NRSp assessments indicating pain (> 3) disagreed with ANI (≥ 50). The black horizontal line in the scatterplot indicates the cut-off value of 50. Of the 104 valid ANI assessments measured, 8 were < 50 (7.7%), indicating insufficient analgesia. Of the corresponding ANIm values, 90 were > 70 (85.6%), indicating a possible overdosage of opioids. Omnibus sensitivity and specificity of caregivers’ subjective assessments of pain were 0 and 94.79%, respectively, with a PPV of 0%, an NPV of 91.92%, an accuracy of 88%, and an interrater reliability (κ) of − 0.063. Analogously, caregivers’ subjective assessments of pain prove to be highly specific and quite accurate at correctly identifying patients who are not in pain according to ANI assessments. In general, however, κ values indicate that objective and subjective assessments tend to show very poor agreement.

Correlation between Observational Assessments and Neurophysiological Measures

Median (IQR) WAVcns and ANI values during subjective NRS caregiver assessments (pain, consciousness, and communication) and the accompanying Kendall rank correlation coefficients are given in Table 3. Correlations were 0.18, 0.33, and 0.41 for NRS assessments compared with WAVcns values, and − 0.02, − 0.03, and − 0.08 for NRS assessments compared with ANI values, which indicates negligible to low positive correlations and negligible correlations, respectively [23].
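
The verbal labels attached to these coefficients can be reproduced with a simple lookup. The bands used below (below 0.30 negligible, 0.30 to 0.50 low, 0.50 to 0.70 moderate, 0.70 to 0.90 high, 0.90 and above very high) are a commonly used rule of thumb and are assumed here; the text does not spell out the bands used in reference [23], so they may differ in detail.

```python
def describe_correlation(r: float) -> str:
    """Return a verbal label for a correlation coefficient (assumed bands)."""
    if not -1 <= r <= 1:
        raise ValueError(f"correlation must lie in [-1, 1], got {r}")
    size = abs(r)
    if size < 0.30:
        return "negligible"
    if size < 0.50:
        label = "low"
    elif size < 0.70:
        label = "moderate"
    elif size < 0.90:
        label = "high"
    else:
        label = "very high"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


# Coefficients reported in the text for NRS versus WAVcns and NRS versus ANI.
for r in (0.18, 0.33, 0.41, -0.02, -0.03, -0.08):
    print(r, describe_correlation(r))
```

Under these assumed bands the labels match the wording in the text: negligible to low positive for WAVcns and negligible for ANI.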

Table 3 Correlation between subjective and objective assessments

For 11 of the 12 patients, we scored RASS, CPOT, BPS, and M-ESAS (one patient died before the observational scales could be scored). Table 4 shows the rank correlations between neurophysiological measures and the observational scales that are suggested in the literature to assess level of consciousness and pain during CSD. All correlations found were negligible, except for the RASS, which showed a significant low positive correlation with WAVcns (r = 0.36; p = 0.014).

Table 4 Correlation between observational scales and objective assessments

Discussion

From a clinical point of view, caregiver observational assessments are important to inform medication adjustments in sedated patients. Two important questions in this regard are: is the patient experiencing pain, and is the patient sufficiently sedated? We therefore compared the caregivers’ subjective assessments with the objective monitor values according to the standards mentioned in the literature. Our results show that subjective observational caregiver assessments of level of consciousness and pain agree poorly with objective assessments of NeuroSense and ANI monitors.

The two basic measures of quantifying the diagnostic accuracy of a test or assessment are the sensitivity and specificity [24]. Sensitivity is the ability of a test or assessment to detect the condition when it is truly present, whereas specificity is the probability of a test or assessment to exclude the condition in patients who do not have it [25]. Sensitivity is given by the ratio of TP/(TP + FN), and specificity is given by the ratio of TN/(TN + FP).
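
The two ratios can be expressed directly in code. The sketch below is a generic illustration with invented counts; returning NaN when a denominator is zero is a choice made for this example, not something prescribed by the text.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN): share of truly positive cases that the test detects."""
    total = tp + fn
    return tp / total if total else math.nan


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP): share of truly negative cases that the test excludes."""
    total = tn + fp
    return tn / total if total else math.nan


# Invented counts for illustration only.
print(sensitivity(tp=30, fn=10))
print(specificity(tn=45, fp=15))
```

Both measures condition on the reference standard, so neither depends on how common the condition is in the sample.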

Sensitivity and specificity of caregivers’ subjective assessments of consciousness were 23.6 and 91.1%, respectively. This means that caregivers detected the presence of consciousness in only 23.6% of the assessments in which consciousness was present according to objective neurophysiological monitoring. Conversely, in 76.4% of those assessments, caregivers believed the patient to be deeply unconscious when in fact this was not the case. In 91.1% of the assessments in which consciousness was absent according to the objective reference standard, caregivers accurately detected this absence. Positive predictive value is the probability that a patient has the condition given that the assessment result is positive [given as the ratio TP/(TP + FP)]; when caregivers judged a patient to be conscious, objective monitoring confirmed this in 76.5% of instances. Negative predictive value [given as the ratio TN/(TN + FN)] is the probability that the patient does not have the condition given that the assessment result is negative; when caregivers judged a patient to be deeply unconscious, objective monitoring confirmed this in only 49.4% of instances. In other words, caregivers’ judgments that a patient is deeply unconscious are accurate about half the time.
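
The predictive values, accuracy, and Cohen's κ can be recomputed from a 2 × 2 table as a plausibility check. The counts used below are reconstructed, not quoted: FN = 42 and 55 reference-positive assessments out of 100 are stated in the Results, which gives TP = 13, while FP = 4 and TN = 41 are inferred from the reported 46 disagreements and should be read as an assumption. The function names are hypothetical.

```python
import math


def ppv(tp: int, fp: int) -> float:
    """TP / (TP + FP): probability the condition is present given a positive call."""
    return tp / (tp + fp) if (tp + fp) else math.nan


def npv(tn: int, fn: int) -> float:
    """TN / (TN + FN): probability the condition is absent given a negative call."""
    return tn / (tn + fn) if (tn + fn) else math.nan


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all assessments in which caregiver and monitor agree."""
    return (tp + tn) / (tp + fp + tn + fn)


def cohen_kappa(tp: int, fp: int, tn: int, fn: int) -> float:
    """Agreement beyond chance between caregiver calls and the monitor."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return (observed - expected) / (1 - expected) if expected != 1 else math.nan


# Reconstructed consciousness counts (FP and TN are inferred, see lead-in).
tp, fn, fp, tn = 13, 42, 4, 41
print(round(100 * ppv(tp, fp), 1))
print(round(100 * npv(tn, fn), 1))
print(round(100 * accuracy(tp, fp, tn, fn), 1))
print(round(cohen_kappa(tp, fp, tn, fn), 3))
```

With these counts the PPV, NPV, and accuracy match the reported 76.5%, 49.4%, and 54.0%, and κ comes out at about 0.14, close to the reported 0.13.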

For detecting pain/discomfort, sensitivity and specificity were 0 and 94.79%. This means that caregivers detected the presence of pain/discomfort in a patient accurately in 0% of all assessments when pain/discomfort was truly present according to ANI monitoring. In 94.79% of assessments, caregivers accurately detected the absence of pain/discomfort, when this was truly absent according to the objective reference standard.

Subjective caregiver assessments have a positive predictive value for pain/discomfort of 0% compared with objective ANI monitoring, and a negative predictive value of 91.92% (probability of correctly identifying the absence of pain/discomfort).

There is always a trade-off that needs to be considered: a highly sensitive test (or with high PPV) is beneficial for the patient, meaning that if he/she is conscious or in pain, the subjective method should be able to recognize it allowing proper sedation of the patient. However, it will also generate more false positives, implying medication use when not necessary, according to objective monitoring.

The latter may raise ethical issues for the physician (palliative sedation could be perceived as euthanasia, and some physicians may develop “morphinophobia”). With this in mind, a highly specific test will reduce the number of false positives but increase the number of false negatives; put differently, the method will reliably recognize patients who are unconscious or pain-free, but not those who are conscious or in pain.

When assessing unconscious patients, a distinction is generally made between the level of consciousness and the subjective experience of the patient [26, 27]. The former is assessed by the contact the unconscious person appears to have with the outside world, and is quantifiable by means of observational scales. The subjective experience is the patient’s own experience of unconsciousness and may differ from the observed level of consciousness [26]. An unconscious patient may therefore experience pain, fear, etc., and not manifest visible clinical signs, such as restlessness or grimacing [28]. It is therefore important that we assess not only the level of consciousness but also possible nociception. Using the ANI monitor during CSD allowed us to detect not only insufficient analgesia in 7.7% of the assessments but also possible analgesic overdosing in 85.6%. Palliative sedation guidelines require that the principle of proportionality be followed (meaning the level of sedation should be the lowest necessary to provide adequate relief of suffering) [29, 30]. These results show that using the ANI monitor to assess the absence of pain and the risk of over- or under-administration of analgesics might be a substantial improvement compared to subjective assessments based on observation alone.

Our results further confirm findings by Barbato et al. and Masman et al., who found observational scales measuring the level of comfort and sedation in unresponsive palliative care patients to be unreliable when compared with the results of a processed-EEG monitor (BIS) [22, 28]. In their study, Barbato et al. found a significant correlation of 0.42 between BIS and RASS; in our study, we found a significant though low correlation of 0.36 between WAVcns and RASS. Considering that WAVcns is validated for guiding anesthesia during surgical procedures and has been proven superior for continuous sedation monitoring in critically ill intensive care unit patients compared to standalone observational measures such as RASS, we see no arguments to question its superiority during CSD [31].

Using monitoring devices during CSD allows better titration of medication and makes it possible to adapt a standard protocol to the specific needs of the patient, which may be influenced by factors such as underlying medical condition, medical and pharmacological antecedents, habituation effects, prior substance abuse, etc. To our knowledge, this is at present the best possible way to ensure a safe and quality approach to assess patient comfort as mandated by palliative sedation guidelines. Our data suggest that monitoring devices should be considered as the preferred method for guiding comfort assessments during CSD; feasibility and acceptability for the caregivers and family members have already been demonstrated [32]. This will not only improve quality of care for the dying patient but also protect caregivers from being accused of over- or undersedating the patient, by providing objective measures. There is also a role to be played here by public health officials and palliative care associations, who should invest in further developing and implementing these technologies in palliative care with a view to preventing an unwanted death, where suffering may remain underdetected or underappreciated.

Strengths and Limitations

A major strength of this study is that we not only assessed level of consciousness but also possible nociception. To our knowledge, this is the first study having assessed this in a sample of patients being continuously sedated until death. Another strength is its high ecological validity: even in our limited sample of 12 patients, we obtained 108 assessments and were nonetheless able to observe that care could be improved by including objective measures such as WAVcns and ANI, thereby demonstrating the validity of the concept.

A limitation of this study is its small sample size. This was due to the fact that the target group was very difficult to reach; often the decision to switch to CSD was made only at the last moment, which made it difficult to recruit participants (e.g., three potential participants who had given permission to participate in the study died before the measurements could be started). In addition, for some substitute decision-makers, it was emotionally too difficult to give permission to participate. Although the small sample size does not allow us to make statistical generalizations, we were able to demonstrate the added value of the concept in assessing patient comfort during CSD.

Another potential limitation concerns the epistemological status of what is defined as subjective and objective. In this study, we considered the caregivers’ assessments to be subjective and neurophysiological measures to be objective; however, in epistemologic literature, objectivity is described as inter-subjectivity, meaning several observers come to the same result [33]. Our choice of measurement instruments may thus influence the potential for inter-subjective cross-validation and logically raises the question whether the choice of other neurophysiological indices and other ways of assessing patient comfort would produce different results. This is particularly important as patient self-reporting is considered the gold standard for assessing pain. The question “Is the patient still experiencing any kind of consciousness or pain, during continuous sedation until death?” cannot be tested according to the Popperian paradigm, since a hypothesis is only testable if it can be falsified (if it is possible to reject it). It is similar to the so-called hard problem of consciousness. Waking up the patient, which would be immoral considering CSD is administered as a last resort to treat refractory symptoms, would also not solve this problem, as the patient would no longer be sedated in that case. Hence it is a “wicked” problem, because falsification is impossible [34]. That means, epistemologically speaking, we have to consider the next best thing: approaching the supposed phenomenological essence as closely as we can, or in other words, making the most likely inference to the best explanation. In epistemology, this is known as abductive reasoning, which can be distinguished from deductive and inductive reasoning. Abductive reasoning is a form of synthetic inference through which meaningful underlying patterns of selected phenomena are recognized to comprehend a complex reality and expand scientific knowledge [35].

Although the hypothesis regarding absence of awareness during CSD cannot be falsified, there are a number of strong induction-based arguments that can be used to support the currently best explanation: a (limited) number of studies have shown that processed EEG monitoring (WAVcns, BIS) and HRV monitoring (ANI) can be used in unconscious deeply sedated patients who afterwards woke up again (and who could self-report retrospectively), or in deeply sedated critically ill patients, to assess level of consciousness and absence of pain, and that keeping monitoring values within a certain range provided the required comfort during nociceptive procedures [14, 15, 21, 36]. In addition, a qualitative study showed that the use of these monitoring devices during CSD is acceptable to both professional caregivers (physicians and nurses) and family members and is considered an added value by them, which strengthens our inter-subjective (transdisciplinary) approximation of what we consider to be objective assessments [32]. These arguments lead to the currently most likely inference that the assessment of level of consciousness and pain during continuous sedation until death can indeed be improved by including objective monitoring devices. Future studies, however, could include different monitoring systems and apply different neurophysiological paradigms in combination with several independent observers who make assessments simultaneously, to further clarify whether these observations concur with each other.

Implications for Practice

Neurophysiological assessments with indices such as ANI and WAVcns seem to be able to detect consciousness and absence of pain that otherwise remain undetected or underappreciated by caregivers’ subjective assessments. Therefore, we suggest that these monitors might be important tools to improve assessment of patient comfort during CSD and avoid unnecessary (and undetected) suffering. Additionally, monitoring devices measure continuously, while caregivers make periodic assessments, potentially missing or underappreciating moments of in-between suffering. Future implementation strategies will have to take into account the need for dedicated training of caregivers; depending on the level of specialization in each setting (intensive care unit, palliative care unit, general hospital department, nursing home, at home, etc.) and the level of expected complexity, it can be useful to consider which caregiver (nurse, specialized nurse, specialized doctor) is best suited to be deployed in a particular setting and what training needs are associated with this.

Conclusions

In our sample, subjective caregiver assessments of level of consciousness and pain during CSD tend to show very poor agreement with objective assessments by monitors to assess depth of sedation and absence of pain. Our findings show that the sole use of behavior-based observational scales to make assessments during CSD is unreliable. Objective monitoring has uncovered several discrepancies which, at the very least, call into question the current method of making assessments of patient comfort during CSD. We suggest future research should focus on further exploring which monitoring systems and neurophysiological paradigms are best to improve assessments during CSD and how this can be implemented in practice.