Introduction

Community-acquired pneumonia (CAP) is a frequent acute condition characterized by a wide range of possible presentations and outcomes. An estimated 40–80% of patients present with mild pneumonia, have a low risk of death (1–3%), and can safely be treated as outpatients [1]. Around 20–60% require hospitalization because of severity, decompensated comorbidity, or social reasons [2, 3]. Of these, around 10% have to be admitted to the ICU [4]. This relatively small group of patients is at high risk of death, reaching 30–40%. Therefore, the decision about the treatment setting is a key issue in the initial management of patients with CAP.

Scoring systems have been derived and validated to aid the clinician in this regard. The pneumonia severity index (PSI) and the CURB-65 and its variations (CURB, CRB-65) have been shown to provide comparable predictions of pneumonia severity using death as the endpoint, with the PSI being somewhat more sensitive in predicting mild and the CURB-65 in predicting severe pneumonia [5–7]. However, both are poor tools for guiding decisions about the need for ICU admission [8–10].

In parallel, scores specifically for prediction of severe CAP (SCAP) requiring ICU admission have been evaluated. Today, several scores are available to guide this decision. The last update of the IDSA/ATS guidelines advocates revision of the modified ATS score (herein called the IDSA/ATS score) [11]. However, none of these scores has satisfactorily resolved the issue of identifying patients with severe pneumonia. In particular, none is predictive for patients at risk of early deterioration, who might be at highest risk of being overlooked. In the following, we provide an overview of the available scores, their predictive power, and their limitations, and suggest a perspective beyond scoring systems for identification of patients presenting with severity criteria.

Scores for severe CAP

The ATS guideline of 1993 was the first to suggest ten criteria predictive for SCAP requiring ICU treatment [12]. These criteria were derived from studies evaluating prognostic factors for in-hospital death. In 1998, we showed that these criteria were 98% sensitive but only 32% specific, and derived a prediction rule using patients admitted to a respiratory ICU as the reference. This rule [presence of two or three minor criteria (systolic blood pressure <90 mmHg, multilobar involvement, PaO2/FiO2 <250) or one of two major criteria (requirement of mechanical ventilation, presence of septic shock), later referred to as the modified ATS rule] had sensitivity of 78%, specificity of 94%, positive predictive value of 75%, and negative predictive value (NPV) of 95% [13]. This rule was adopted as the modified ATS rule in the ATS guidelines from 2001 [14]. A subsequent validation study in the same setting achieved sensitivity of 69%, specificity of 97%, positive predictive value of 87%, and negative predictive value of 94% in predicting admission to the ICU [8].
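The structure of the modified ATS rule described above can be written out in a few lines. The following Python sketch is illustrative only: the class and function names are hypothetical, the thresholds are those quoted in the text, and "two or three minor criteria" is read as "at least two".

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Findings needed for the modified ATS rule (field names are hypothetical)."""
    systolic_bp_mmhg: float
    multilobar_involvement: bool
    pao2_fio2: float
    mechanical_ventilation: bool = False
    septic_shock: bool = False


def modified_ats_positive(p: Presentation) -> bool:
    """Return True if the modified ATS rule, as quoted in the text, is met."""
    # Minor criteria: thresholds as stated in the text.
    minor = [
        p.systolic_bp_mmhg < 90,
        p.multilobar_involvement,
        p.pao2_fio2 < 250,
    ]
    # Major criteria: either one alone is sufficient.
    major = [p.mechanical_ventilation, p.septic_shock]
    return sum(minor) >= 2 or any(major)


if __name__ == "__main__":
    # Hypothetical patient: hypotensive with multilobar infiltrates.
    patient = Presentation(systolic_bp_mmhg=85, multilobar_involvement=True, pao2_fio2=300)
    print(modified_ats_positive(patient))
```

Writing the rule this way makes its dichotomous nature explicit: a patient with a single minor criterion is classified exactly like a patient with none.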

However, in two subsequent validation studies from the USA and one from Australia, sensitivity of this rule ranged from 44% to 92%, specificity from 72% to 95%, positive predictive value from 26% to 71%, and negative predictive value from 88% to 99% [9, 10, 15]. Thus, high negative predictive value was the only consistent finding when comparing the performance of the rule. However, a predictive rule for SCAP should also display high positive predictive value. Likelihood ratios would be preferable to predictive values, since they do not depend on the prevalence of severe pneumonia, which differs across populations.
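The difference between predictive values and likelihood ratios can be made concrete with the standard definitions. The Python sketch below uses the sensitivity and specificity reported for the derivation cohort (78% and 94%); the function names and the two prevalence values are assumptions chosen only for illustration.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Positive and negative predictive values at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    sens, spec = 0.78, 0.94  # derivation cohort values quoted in the text
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
    # Hypothetical prevalences of severe pneumonia, not taken from any study.
    for prevalence in (0.05, 0.20):
        ppv, npv = predictive_values(sens, spec, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With these inputs the positive likelihood ratio is about 13 and the negative likelihood ratio about 0.23 at any prevalence, whereas the positive predictive value rises from roughly 41% to roughly 76% when the assumed prevalence increases from 5% to 20%.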

Two alternative rules using another reference have been proposed. España et al. [16] derived a prediction rule from a combined reference including mechanical ventilation, septic shock, and mortality. The variables of the score were also grouped into six minor criteria (confusion, urea ≥30 mg/dL, respiratory rate >30/min, multilobar bilateral infiltrates, PaO2 <54 mmHg or PaO2/FiO2 <250, and age ≥80 years) and two major criteria (arterial pH <7.35 or systolic blood pressure <90 mmHg). At least two minor criteria or one major criterion predicted SCAP. In the validation group, this rule achieved sensitivity of 84%, specificity of 60%, positive predictive value of 22%, and negative predictive value of 97%. This predictive performance was not better than that of the modified ATS score [area under the curve (AUC) 0.72 versus 0.71]. In a subsequent validation study, these criteria had a comparable AUC of 0.75 [17]. Charles et al. [18] derived a predictive rule with the need for intensive respiratory or vasopressor support (IRVS), regardless of the setting in which this treatment was applied, as the outcome of interest. SMART-COP is an acronym for systolic blood pressure, multilobar extension of infiltrates, albumin, respiratory rate, tachycardia, confusion, oxygen, and arterial pH; the score assigns two points (systolic blood pressure, oxygen, arterial pH) or one point (all others) to each criterion. A score ≥3 identified 92% of patients who needed IRVS. It provided sensitivity of 58–85%, specificity of 46–75%, and AUC of 0.72–0.87 in five independent external validation cohorts [18]. Although these figures are favorable, they do not appear to outperform the modified ATS score.
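The point weighting of SMART-COP described above can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the text does not give the cut-off for each individual criterion, so the function takes the set of criteria already judged to be met, and all identifiers are hypothetical.

```python
from typing import Iterable

# Points per criterion as described in the text: two points for systolic
# blood pressure, oxygen, and arterial pH; one point for all others.
SMART_COP_POINTS = {
    "systolic_bp": 2,
    "multilobar": 1,
    "albumin": 1,
    "respiratory_rate": 1,
    "tachycardia": 1,
    "confusion": 1,
    "oxygen": 2,
    "ph": 2,
}


def smart_cop_score(criteria_met: Iterable[str]) -> int:
    """Sum the points of the criteria that the clinician has judged to be met."""
    met = set(criteria_met)
    unknown = met - set(SMART_COP_POINTS)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(SMART_COP_POINTS[name] for name in met)


def smart_cop_positive(criteria_met: Iterable[str], threshold: int = 3) -> bool:
    """Apply the cut-off quoted in the text (score of 3 or more)."""
    return smart_cop_score(criteria_met) >= threshold


if __name__ == "__main__":
    # Hypothetical patient meeting the blood pressure and confusion criteria.
    met = {"systolic_bp", "confusion"}
    print(smart_cop_score(met), smart_cop_positive(met))
```

Because three of the eight criteria carry two points, a single heavily weighted finding plus any other criterion is enough to reach the threshold, while two lightly weighted findings are not.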

To increase the performance of the predictive rule, the last update of the IDSA/ATS guidelines recommends use of the IDSA/ATS rule, which includes nine minor criteria instead of the three in the modified ATS rule [11]. In the meantime, this rule was validated. In our study, we found sensitivity of 71% and specificity of 88%, similar to the modified ATS rule (sensitivity 66%, specificity 90%) in predicting ICU admission [19]. Likewise, in a study from Greece including only patients with pneumococcal CAP, the 2007 IDSA/ATS criteria performed as well as the 2001 modified ATS rule in predicting ICU admission. Both demonstrated high sensitivity (90%) and NPV (97%) [20].

Table 1 summarizes the performance of the different scores. AUC values for the modified ATS score ranged between 0.68 and 0.9, and no score was superior. In contrast, the performance of the pneumonia severity scores was generally lower. AUC values for high-risk PSI classes range between 0.61 and 0.75, and for CURB-65 ≥3 between 0.58 and 0.73. PSI consistently had lower specificity, whereas CURB-65 had lower sensitivity (Table 2). No study found pneumonia severity scores equivalent to ICU admission scores in predicting ICU admission or intensified treatment.

Table 1 Performance of predictive rules for severe community-acquired pneumonia (CAP)
Table 2 Performance of pneumonia severity rules [PSI IV+V and CURB, British Thoracic Society (BTS) criteria or CURB-65 ≥3] for prediction of ICU admission

Since major criteria are increasingly being regarded as less helpful for the clinician, attention has turned to the predictive potential of the nine minor criteria listed in the IDSA/ATS guideline update [11]. Two further studies focused on these minor criteria, using ICU admission as reference for severe CAP. These criteria predicted ICU admission with area under the curve of 0.88 (95% confidence interval 0.85–0.90), which improved to 0.90 (95% confidence interval 0.88–0.92) with weighting. Competing models (España criteria and SMART-COP) had area under the curve of 0.76 to 0.83. Using four rather than three minor criteria improved positive predictive value from 54% to 81%, with negative predictive value remaining stable (94% and 92%) [22]. Phua et al. [21] found sensitivity, specificity, and positive and negative predictive values of the minor criteria of 58.3%, 90.6%, 52.9%, and 92.3%, respectively, for ICU admission.

Only recently, a prediction rule for ICU admission on days 1–3 of emergency department (ED) presentation was derived for patients presenting with no obvious reason for immediate ICU admission (not requiring immediate respiratory or circulatory support). The risk of early admission to ICU index (REA-ICU index) comprises 11 criteria independently associated with ICU admission: male gender, age younger than 80 years, comorbid conditions, respiratory rate of 30 breaths/min or higher, heart rate of 125 beats/min or higher, multilobar infiltrate or pleural effusion, white blood cell count less than 3 or above 20 G/L, hypoxemia [oxygen saturation <90% or arterial partial pressure of oxygen (PaO2) <60 mmHg], blood urea nitrogen of 11 mmol/L or higher, pH less than 7.35, and sodium less than 130 mEq/L. The REA-ICU index stratified patients into four risk classes with risk of ICU admission on days 1–3 ranging from 0.7% to 31%. The area under the curve was 0.81 [95% confidence interval (CI) = 0.78–0.83] in the overall population [23]. However, this index has not been validated in independent cohorts. In particular, its impact on mortality has not been assessed.

Critical appraisal

The presented scores all provide a reasonable reflection of CAP severity. However, the range of performance in different settings is troubling. Furthermore, none seems to be superior in a clinically relevant manner. Thus, it is time to reflect anew on the reasons behind these conflicting findings. Four issues have to be considered in this regard.

Methods of derivation: the reference outcome used

The reference “admission to the ICU” is biased by local admission policies, which considerably limits the applicability of predictive rules in other treatment settings. Such policies vary not only between nations and regions but also between the levels of care and specialities of hospitals. Thus, it is not surprising that a rule derived in one treatment setting performs nearly identically when validated in the same setting but shows variable performance in others [8–10, 13, 15]. Another important point is that, since the late 1990s, intermediate care units have been introduced into most hospitals, some even as specialized respiratory units, and noninvasive ventilation has become standard of care for respiratory failure, sometimes delivered on the respiratory ward. The availability of this type of care outside the ICU blurs the seemingly sharp difference between nonsevere and severe pneumonia in both directions: it may include patients with pneumonia who formerly would have qualified as candidates for ICU admission (thereby reducing specificity of severity criteria) but also those without severe respiratory or hemodynamic compromise who might profit from some monitoring in the initial phase of management (thereby reducing sensitivity of severity criteria). It is for these reasons that the reference “admission to the ICU” is no longer valid but also cannot simply be replaced by expanding it to “admission to the ICU or intermediate care unit.”

The alternative suggested by España et al. [16] (mechanical ventilation, septic shock, and mortality as reference outcome) seems logical, since it relies on the three most severe outcomes of CAP. However, this approach is weakened by two critical issues. First, taken as it is, this reference excludes other presentations of SCAP, e.g., acute respiratory failure not requiring mechanical ventilation and severe sepsis. Second, as with all composite outcomes, different factors might predict each separate outcome. Mortality is a problematic outcome, since escalation of care may not always be offered to the deteriorating patient. It reflects many factors unrelated to severity that might have contributed to death, e.g., comorbidity, delayed timing or inadequacy of treatment, and complications not related to pneumonia. The principal criticism, however, relates to the obvious circular reasoning inherent in this approach. It is a protopathic bias to include in the prediction tool what it is meant to predict (such as major criteria). Rather than breaking the prediction endpoint down into a set of criteria with the same meaning as the endpoint itself (a sort of alternative definition of the prediction endpoint), the prediction tool should be based on signs and symptoms that are related to the prediction endpoint from an epidemiologic and pathophysiological perspective but are not equivalent to it [24]. In contrast, the approach of Charles et al. [18] seems methodologically sound. This approach circumvents the problem of the treatment setting as reference by defining the critical endpoints independently of the treatment setting. IRVS might be applied in the ICU or intermediate care unit, or even (with the exception of invasive ventilation) in an experienced regular ward. Unexpectedly, this approach did not result in higher and more consistent predictions of SCAP in external settings than the previous ones. Several reasons may account for this finding. First, since a rule has to be applied dichotomously, it is open to failures due to the limited predictive power of the variables included. Second, failures in patient management, i.e., failure to initiate IRVS or inadequate indications for IRVS, cannot be excluded. In fact, IRVS itself is not an unequivocally objective reference, since the indications at least for ventilatory support are to some extent dependent on clinical judgment.

Populations evaluated

The most important confounder, however, may be the population evaluated. Hidden treatment restrictions applied to the elderly, those with multiple comorbidities, and those with severe disability may preclude IRVS, although the rule may predict its application. Although some studies expressly indicate that patients with pneumonia as a terminal event of a chronic disabling condition were excluded, the number of patients excluded by this criterion is never specified, nor are there valid criteria to define which patients meet this criterion. In fact, data from a nationwide quality assurance program in Germany indicate that treatment restrictions (no ventilatory support) were applied in around 85% of patients who died [25]. As long as no criteria for futility are available and agreed upon for patients with CAP, it will be impossible to account for this potential bias in the derivation and validation of rules for SCAP.

Variables included

All predictive rules use similar criteria to define severe CAP. In fact, all criteria can be grouped into three categories: those reflecting acute respiratory failure, severe sepsis/septic shock, and radiographic spread (Table 3). In the original list of severity criteria of the 1993 ATS guidelines, criteria can additionally be divided into those present at admission and those that might also develop in the course of the acute disease.

Table 3 Severity criteria for assessment of severe community-acquired pneumonia (CAP)

Surprisingly, simple criteria can reflect highly complex pathophysiological processes strikingly well; e.g., increased respiratory rate or low PaO2/FiO2 can reflect acute respiratory failure, and low systolic blood pressure and mental confusion can reflect severe sepsis. On the other hand, adding more criteria to a small set of criteria that already reflects these conditions does not substantially increase the predictive power of a severity rule. Since the prevalence of each criterion reflecting severe sepsis is limited, the addition of more such criteria results in higher thresholds and, as a consequence, loss of overall sensitivity of the rule. This is why it makes little sense to inflate the modified ATS rule as done in the last update of the IDSA/ATS guidelines.

Another problem is the division into minor and major criteria. Major criteria such as mechanical ventilation and septic shock represent a type of self-fulfilling prophecy, since they predict SCAP in patients who already receive treatment for the most severe complications of CAP. In fact, major criteria might only be useful for patient classification purposes in studies. Minor criteria, however, have limited sensitivity, regardless of the number of criteria included. Currently, the España and the SMART-COP rule are the only rules that do not include major criteria, but they also suffer from lack of sensitivity, as is evident in at least some validation cohorts.

Time course of pneumonia severity

Finally, none of the predictive rules accounts appropriately for the time course of pneumonia severity. In the modified ATS rule, criteria are classified as those which have to be present at admission and evolutionary criteria (at admission or during follow-up). However, only major criteria are provided as evolutionary criteria. This structure mixes patients who meet the criteria at admission with those who meet them during follow-up, but the latter only when major criteria are present. Thus, patients with severe CAP during follow-up not requiring mechanical ventilation or not having septic shock are not identified.

For a predictive rule for SCAP to be useful for a clinician, it is essential that it allows for severity assessment at admission as well as at any time during follow-up. Patients who are at risk for experiencing severe CAP during the course of the illness are particularly important to identify, since such a course may be prevented even outside the ICU. In fact, the prognostic window of such patients might be particularly wide, probably far wider than for patients with overt septic shock [26, 27].

Rethinking SCAP

Having these caveats in mind, it is highly improbable that any modification of currently available predictive rules for SCAP will result in substantially and consistently higher performance. Instead, a new reflection about the following two key questions is mandatory:

What is SCAP?

From a pathophysiological view, pneumonia can become severe because of two processes. First, alveolar infectious inflammation may result in serious ventilation–perfusion mismatches, with dead space ventilation up to 50% as well as shunt up to 20%, i.e., acute respiratory failure [28]. Second, infection might induce a systemic inflammatory response syndrome with severe hypoperfusion and multiorgan failure, i.e., severe sepsis and/or septic shock. Some respiratory and hemodynamic compromise may occur without progressing to organ failure. However, there seems to be a relatively short transition leading from a status of clinical stability to vital threat. In fact, severe sepsis and septic shock in patients with CAP were present already at presentation in the emergency room in 71% of severe sepsis cases and 44% of septic shock cases, respectively. In that study, most acute organ dysfunction was present at presentation or occurred on day 1, with renal dysfunction occurring earlier than other organ dysfunction [29].

No additional reasons for SCAP are known. However, several complications may be part of the pathogenesis of one or both of these reasons. The development of large pleural effusions, abscess or empyema formation, infectious metastases, and pulmonary embolism are prominent examples. Moreover, comorbidities may decompensate during the course of the disease. Only recently, myocardial infarction has been found to be a frequent event during CAP [30]. Preexisting cardiac, pulmonary, renal, and hepatic disease as well as diabetes mellitus are all prominent potential causes that may aggravate the course of pneumonia and increase mortality.

Death from CAP seems to be caused by pneumonia itself in around 50% of cases and by decompensated comorbidities in the other 50% [4]. The relative importance of iatrogenic damage such as ventilator-associated pneumonia (VAP) and catheter-related infections is unknown.

What are the specific needs of the clinician in order to recognize patients with SCAP?

Clinicians do not need to be reminded that patients with mechanical ventilation and/or septic shock in fact have SCAP. In daily practice, clinicians need to identify those patients who might profit from any type of intensified treatment. All criteria that have been captured in predictive rules reflecting acute respiratory failure and/or septic shock might be useful; however, it seems that a small set of criteria is sufficient to assess the presence of these conditions (Table 2).

Unfortunately, there are very few studies investigating predictors of progression to SCAP. Baseline clinical stability is difficult to define, and to our knowledge there is no reference defining stability in the context of pneumonia. For the purpose of defining SCAP, it may be assumed in the absence of acute respiratory failure and severe sepsis or septic shock. If one accepts such an assumption, it is not known whether an increased respiratory rate below, or a low systolic blood pressure above, the threshold is a risk factor for development of SCAP within the following days. Although there are data about the risk of the development of parapneumonic empyema [31], the analysis did not report predictors for development of empyema causing SCAP. Systemic inflammatory response syndrome (SIRS) was not found to be a predictor for progression to severe sepsis in CAP [29]. There are no data about the relative risk of comorbidities to decompensate during a pneumonia episode.

How should we assess the presence of SCAP?

Much has been learnt during the evaluation of predictive scores for SCAP. As it stands today, it is possible to predict SCAP with sensitivity of around 70% and specificity of around 80–90% using the modified ATS score or its most recent variation, the IDSA/ATS rule; however, ranges throughout settings are wide, and positive predictive values might be very low. The SMART-COP rule, although methodologically more convincing, does not achieve substantially better predictions.

The critical issue is whether a prediction rule with such performance is useful for the clinician. We argue that the available severity scores do not meet the principal needs of severity assessment.

  1. All severity rules lack sufficient sensitivity. This is particularly troublesome since all patients who might benefit from intensified treatment should be identified.

  2. Severity scores are focused on vital sign abnormalities and do not specifically weigh the potential contribution of complications or decompensated comorbidity to pneumonia severity.

  3. None of the severity scores is sensitive for the lower extreme in the spectrum of severe pneumonia, i.e., patients at risk of SCAP. This is a concern since SCAP is expected to follow a bell-shaped rather than linear response to treatment, resulting in the greatest likelihood of treatment benefit in those patients with moderate organ dysfunction (i.e., severe sepsis and not septic shock, acute respiratory failure and not severe sepsis, etc.) [32].

Overall, it seems that scores for SCAP are rigid and poorly sensitive tools which might impede alert clinical assessment based on informed pathophysiological knowledge including all available diagnostic evidence [32]. Therefore, we advocate a change in approach to patients with severe pneumonia, and the following issues seem mandatory in this regard:

  1. Severity assessment should not refer to the need for ICU admission, nor even necessarily to intermediate care admission. Instead, it should aim at identifying patients in need of any type of intensified monitoring and treatment, i.e., monitoring of vital signs and oxygenation, wherever such monitoring is performed. Intensified treatment includes monitoring of respiratory rate, oximetry, heart rate, and blood pressure. Patients with signs of acute respiratory failure may receive oxygen supplementation and/or noninvasive ventilatory support, and those with hypotension early goal-directed treatment, mainly fluid resuscitation. Patients with major criteria are the only ones who have an absolute indication for admission to the ICU (Fig. 1).

    Fig. 1 Selection of treatment setting and intensified treatment in patients with community-acquired pneumonia (CAP). Intensified treatment includes: monitoring of respiratory rate, oximetry, heart rate, and blood pressure; in case of acute respiratory failure, oxygen supplementation and/or noninvasive ventilation (NIV); in case of severe sepsis, early goal-directed treatment (mainly fluid resuscitation). It may be applied in the ICU, intermediate care unit or even in experienced wards. Absolute indications for ICU admission are only mechanical ventilation, presence of septic shock, and possibly complications as well as unstable comorbidity. Criteria reflecting acute respiratory failure and/or severe sepsis and septic shock are provided in Table 3.

  2. Severe community-acquired pneumonia is principally caused by acute respiratory failure and severe sepsis/septic shock, and severity assessment is basically aimed at recognition of these two conditions. Thus, all criteria reflecting acute respiratory failure and hemodynamic compromise may be used to assess severity. However, it is important to remember that thresholds are only an aid to recognizing critical conditions, and that values that narrowly miss thresholds may nonetheless reflect transition to SCAP. This is particularly true for younger patients, who might compensate for acute respiratory failure much more effectively.

  3. It is important to keep in mind the dynamic nature of pneumonia-associated inflammation. The first 24–72 h are the period of greatest vulnerability to pneumonia progression and death [24]. There is some evidence that patients with delayed transfer to the ICU have worse outcomes than those transferred initially [33]. Therefore, patients should continuously be reevaluated for presence of the two conditions.

  4. Beyond recognition of the two conditions causing SCAP, patients should be carefully evaluated for presence of pneumonia-related complications and decompensated comorbidities. These may be an additional indication for ICU admission.

  5. After initial evaluation, it might be important in elderly and severely disabled patients to decide whether treatment restrictions should be applied, i.e., whether the patient should not receive intensified treatment. This is important in both directions, in order to prevent unethical overtreatment as well as inappropriate hidden limitations of care.

This concept does not imply a return to more subjective evaluation. Evidently, validated predictive rules, preferably the modified ATS rule, the IDSA/ATS rule, or the SMART-COP rule, can be used as an aid for the clinician in deciding who should be admitted to the ICU. However, presently available scores do not seem to be sufficiently sensitive to the complexity behind severe pneumonia. At present, it seems more appropriate to apply the whole spectrum of severity criteria within a sound clinical assessment that follows the pathophysiological basis of pneumonia-associated inflammation and identifies hazards from complications and comorbidities, both at admission and during follow-up.