Lung, Volume 189, Issue 3, pp 225–232

Routine Laboratory Tests can Predict In-hospital Mortality in Acute Exacerbations of COPD


  • Alex C. Asiimwe
    • School of Health Sciences and Social Work, University of Portsmouth
  • Fraser J. H. Brims
    • Centre for Respiratory Research, University College London
  • Neil P. Andrews
    • Portsmouth Hospitals NHS Trust
  • Dave R. Prytherch
    • Centre for Healthcare Modelling and Informatics, University of Portsmouth
  • Bernie R. Higgins
    • School of Health Sciences and Social Work, University of Portsmouth
  • Sally A. Kilburn
    • School of Health Sciences and Social Work, University of Portsmouth
    • Portsmouth Hospitals NHS Trust
    • Respiratory Centre, Queen Alexandra Hospital

DOI: 10.1007/s00408-011-9298-z

Cite this article as:
Asiimwe, A.C., Brims, F.J.H., Andrews, N.P. et al. Lung (2011) 189: 225. doi:10.1007/s00408-011-9298-z


Chronic obstructive pulmonary disease (COPD) has a rising global incidence and acute exacerbation of COPD (AECOPD) carries a high health-care economic burden. Classification and regression tree (CART) analysis is able to create decision trees to classify risk groups. We analysed routinely collected laboratory data to identify prognostic factors for inpatient mortality with AECOPD from our large district hospital. Data from 5,985 patients with 9,915 admissions for AECOPD over a 7-year period were examined. Randomly allocated training (n = 4,986) or validation (n = 4,929) data sets were developed and CART analysis was used to model the risk of all-cause death during admission. Inpatient mortality was 15.5%, mean age was 71.5 (±11.5) years, 56.2% were male, and mean length of stay was 9.2 (±12.2) days. Of 29 variables used, CART analysis identified three (serum albumin, urea, and arterial pCO2) to predict in-hospital mortality in five risk groups, with mortality ranging from 3.0 to 23.4%. C statistic indices were 0.734 and 0.701 on the training and validation sets, respectively, indicating good model performance. The highest-risk group (23.4% mortality) had serum urea >7.35 mmol/l, arterial pCO2 >6.45 kPa, and normal serum albumin (>36.5 g/l). It is possible to develop clinically useful risk prediction models for mortality using laboratory data from the first 24 h of admission in AECOPD.


COPD, Exacerbations, Mortality, Risk, Decision tree analysis


Chronic obstructive pulmonary disease (COPD) is the only major cause of death with a rising global incidence. A recent report from the Centers for Disease Control and Prevention estimates that 1 in 20 deaths in the USA have COPD as an underlying cause [1]. In the UK, COPD is the second most common cause of emergency admission to hospital, consuming over half of the £500 million a year spent on COPD in the UK [2]. In the US, annual excess health-care expenditures are estimated at nearly $6,000 for every COPD patient [1].

There has been recent interest in the identification of outcome predictors for inpatient and 30-day mortality following an acute exacerbation of chronic obstructive pulmonary disease (AECOPD) [3–5], although these studies have focused on clinical and social parameters alone.

There has been renewed interest within the National Health Service (NHS) in reducing hospital admissions for chronic diseases such as COPD by using clinical indicators to screen those requiring admission, some of which may be liable to misinterpretation. Objective laboratory data would be a useful adjunct to clinical assessment for identifying patients whose high risk of mortality warrants hospital admission. Once a patient is admitted, such data could guide medical decision-making, provide prognostic information to patients and their families, and allow clinicians to implement more individualized treatment strategies [6]. High-risk patients could potentially be given intensive treatment [7], and low-risk patients could be considered for early discharge or not admitted at all.

Clinical prediction tools have historically used multivariate logistic regression (MLR) to predict binary outcomes in a wide variety of medical disciplines [8, 9], but the number of predictors and complex mathematical functions limit its use in routine clinical practice. Classification and regression tree (CART) analysis provides an alternative approach as it maximises sensitivity by identifying patients truly at risk and minimises misclassification of low-risk patients and thus can stratify patients into different levels of risk [10–13]. It produces a simple decision tree that can be more accurate and easier to apply at the bedside, and missing data do not restrict its utility. We sought to use CART to develop a simple clinical tool to identify patients at high risk of death during admission.



We retrospectively studied 9,915 admissions (from 5,985 inpatients) with AECOPD to Portsmouth Hospitals NHS Trust over a 7-year period. All admissions whose primary reason for admission to the hospital was AECOPD [International Classification of Diseases (ICD) 10th Revision, Codes J40-44] were identified and all data available to the clinicians on the pathology and haematology database within 24 h of admission were extracted. Those exacerbations that occurred within 21 days of a previous one were excluded from the analysis [14]. Patient records were matched across databases using a method previously described by our group [15].

Reliability of Data

Confirmation of diagnosis of COPD and an admission for AECOPD was sought by analysis of patient records, crosschecking with databases held in the hospital, including the Respiratory Clinic, spirometric records, and patient-related hospital letters; these sources were selected for their reliability and ease of access to investigators. For example, the medical case notes of 981 patients with COPD known to the hospital Respiratory Clinic (which held spirometric and other records on these patients) were examined to confirm their diagnosis, date of admission, date of discharge, and date of death if applicable. COPD was accepted as a diagnosis if age >40 years, there was demonstrable airflow obstruction with an FEV1/FVC <0.70, and a smoking history of >20 pack-years. An exacerbation was accepted if the patients were admitted having reported an acute and sustained worsening of symptoms over their usual stable state, with symptoms including breathlessness, cough, increased sputum production, and change in sputum colour, in the absence of another cause for the presentation. In addition to inspection of other clinical databases, we examined an additional randomly selected 200 case records whose details were not available on any clinical database.
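The three acceptance criteria for a COPD diagnosis stated above can be written as a single check. The following is a minimal sketch only; the function name and argument names are hypothetical and are not taken from the study's software.

```python
def meets_copd_criteria(age_years, fev1_litres, fvc_litres, pack_years):
    """Return True when all three criteria quoted in the text hold.

    Criteria (from the text): age > 40 years, FEV1/FVC < 0.70,
    and a smoking history of > 20 pack-years.
    The function and parameter names are illustrative assumptions.
    """
    if fvc_litres <= 0:
        raise ValueError("FVC must be positive")
    ratio = fev1_litres / fvc_litres
    return age_years > 40 and ratio < 0.70 and pack_years > 20


# Example with invented values: FEV1/FVC = 1.2 / 2.4 = 0.5
print(meets_copd_criteria(65, 1.2, 2.4, 30))
```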

Outcomes and Potential Predictors

The primary outcome of interest was in-hospital mortality from any cause following AECOPD. Both the NHS tracing system and the hospital’s Patient Administrative System (PAS) were searched for deaths from any cause. The potential predictors were routinely available blood test data in the first 24 h of hospital presentation. Variables extracted from the databases included demographic characteristics and laboratory tests (haematology and biochemistry) which included renal and liver function, arterial blood gases (ABG), systemic inflammation [white cell count (WCC), platelets, C-reactive protein (CRP)], and cardiac injury (CK, AST, troponin I) tests.

Data Analysis and Statistical Methods

Information was extracted from the databases and stored in FoxPro ver. 9.0 (Microsoft Corp., Redmond, WA, USA). Data were summarised using means and standard deviation (SD) for continuous variables and frequencies and percentages for categorical variables. The data were divided randomly into model training and validation sets. CART analyses then produced a decision tree to predict mortality during admission on the training set, and the resulting decision tree was tested on the remaining validation set to identify subgroups of patients at higher risk of death during admission [16]. To avoid complications of repeated measures, only data from the last admission of each patient was used. CART algorithms were performed using the CART 6.2 program (California Statistical Software, Salford Corp, San Diego, CA, USA). Multiple decision trees were identified with differing ranges of variables, accuracy, sensitivity, and specificity. Models with the optimum numbers of variables and a tradeoff between sensitivity and specificity were chosen for the final analyses in order to ensure maximum clinical applicability and relevance.
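The random division into training and validation sets can be illustrated with a short sketch. It assumes each admission is allocated independently with probability 0.5, which would explain sets of similar but unequal size; the actual allocation procedure, the seed, and the function name below are assumptions rather than details given in the text.

```python
import random


def split_admissions(admission_ids, seed=0):
    """Allocate each admission at random to the training or validation set.

    Assumption: an independent 50/50 draw per admission. The seed is an
    arbitrary illustrative value so that the split is reproducible.
    """
    rng = random.Random(seed)
    training, validation = [], []
    for admission in admission_ids:
        if rng.random() < 0.5:
            training.append(admission)
        else:
            validation.append(admission)
    return training, validation


# 9,915 admissions, identified here by hypothetical integer ids
train, valid = split_admissions(range(9915), seed=2011)
print(len(train), len(valid))
```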

Assessing Model Performance

The C statistic, a measure of the discriminative power of the predictive model (and numerically equivalent to the area under the ROC curve), was used as an index of model performance [9]. We also calculated sensitivity, specificity, and negative and positive predictive values (NPV and PPV, respectively) to further assess the performance of the decision tree.
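As an illustration of what the C statistic measures, it can be computed as the proportion of (death, survivor) pairs in which the patient who died was assigned the higher predicted risk, with ties counted as one half. The sketch below is a plain pairwise implementation for small inputs, not the routine used in the study; the example scores and outcomes are invented.

```python
def c_statistic(predicted_risk, died):
    """Concordance (area under the ROC curve) by pairwise comparison.

    predicted_risk: predicted probability of death for each admission
    died: 1 if the patient died during admission, otherwise 0
    """
    deaths = [r for r, d in zip(predicted_risk, died) if d == 1]
    survivors = [r for r, d in zip(predicted_risk, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for risk_death in deaths:
        for risk_survivor in survivors:
            if risk_death > risk_survivor:
                concordant += 1.0
            elif risk_death == risk_survivor:
                concordant += 0.5  # a tie counts as half a concordant pair
    return concordant / (len(deaths) * len(survivors))


# Invented example: four admissions, two deaths
print(c_statistic([0.234, 0.202, 0.030, 0.074], [1, 0, 0, 1]))
```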

The study was approved by Isle of Wight, Portsmouth, and South East Hampshire Local Ethics Committees.


Reliability of Diagnosis and Mortality Data

A total of 5,985 patients were identified, with 9,915 admissions for analysis. Of the 981 patients from the Respiratory Clinic database with spirometry compatible with COPD (FEV1/FVC <70%), 510/5,985 (8.5%) were included in the study period as having had an admission for AECOPD. A further 932/5,985 (15.6%) had compatible spirometry with COPD on databases from nonrespiratory departments. Of the additional 200 patients randomly selected to confirm the diagnosis of admission for AECOPD, all had features consistent with a diagnosis of AECOPD, though 61 patients did not have spirometric confirmation either due to death during admission, default from follow-up, or no previous or subsequent record of spirometry in the hospital despite clinical and radiological features and a smoking history consistent with COPD, and treatment with inhalers prior to admission.

Overall, 1,581 (510, 932, and 139) patients of the 5,985 (26.4%) had a diagnosis of COPD confirmed with matching dates of diagnosis and discharge. A comparison was then made of the cohort with confirmation of COPD diagnosis (with our criteria outlined above; n = 1,581) and the remainder with no proven diagnosis (n = 4,404). There were no differences in age, gender, death in hospital (242 vs. 679; P = 0.916), overall mortality (375 vs. 1,109; P = 0.248), or 1-year mortality (354 vs. 1,403; P = 0.298).

Baseline Characteristics

One thousand seventy-three (17.9%) patients had no known previous admission with AECOPD. There were 931 (15.5%) deaths during admission. The mean age (±SD) at time of admission was 71.5 (±11.5) years, and mean length of stay was 9.2 (±12.2) days. Computer-generated random numbers were used to allocate admissions to training and validation data sets. There were 4,986 (50.3%) admissions (from 3,640 patients) in the training set and 4,929 (49.7%) admissions (from 3,590 patients) in the validation set. Characteristics for both cohorts are presented in Table 1. There were no statistically significant differences in any of the parameters studied between the two groups.
Table 1 Demographic and blood test characteristics in the training and validation sets

Variable                        Training set^a (N = 4,986)    Validation set^a (N = 4,929)
Age (years)                     71.3 (11.7)                   71.4 (11.3)
Length of stay                  9.2 (3.2)                     9.1 (1.2)
N (%)                           1,847 (56.9)                  1,768 (55.9)
Blood count
[label missing]                 13.6 (2.0)                    13.6 (1.9)
White cell count                11.7 (8.4)                    11.5 (6.2)
[label missing]                 291.0 (10.0)                  291.1 (10.1)
[label missing]                 137.8 (4.1)                   137.9 (4.2)
[label missing]                 4.3 (0.5)                     4.3 (0.6)
[label missing]                 8.1 (5.2)                     8.1 (5.2)
[label missing]                 110.8 (68.3)                  110.2 (71.7)
Liver function
Alkaline phosphatase            108.4 (69.8)
Total bilirubin                 12.0 (7.1)
Aspartate transaminase (AST)    35.9 (64.0)
Total protein                   68.8 (7.0)
[label missing]                 39.1 (5.0)
Arterial gases
[label missing]                 11.4 (6.3)                    11.5 (6.0)
Base excess                     1.7 (4.8)                     1.5 (4.8)
[label missing]                 25.8 (4.0)                    25.7 (4.0)
[label missing]                 6.1 (2.0)                     6.2 (2.2)
[label missing]                 92.7 (7.5)                    92.9 (6.9)
[label missing]                 7.3 (0.1)                     7.3 (0.1)
Cardiac injury
Creatine kinase                 145.7 (3.0)                   151.4 (5.2)
Troponin I                      7.3 (10.9)                    7.0 (9.4)
C-reactive protein              67.9 (7.1)                    69.5 (8.1)
[label missing]                 0.9 (0.8)                     0.9 (0.8)
Corrected calcium               2.3 (0.1)                     2.2 (0.1)
[label missing]                 7.2 (3.4)                     7.3 (3.4)

^a Data are mean (SD)

Classification and Regression Tree (CART)

The CART was obtained by binary recursive partitioning from the training set (Fig. 1) and tested on the validation set (Fig. 2). Five groups were generated which categorised patients ranging from high to low risk of death. Among the 29 variables selected for analysis, the CART method identified only three variables to predict in-hospital mortality in AECOPD admissions. Albumin dichotomised at a level of 36.5 g/l was the best single discriminator between deaths and survivors during admission. The other variables utilised to classify patients into risk groups were serum urea and paCO2. Details on the rules generated are shown in Table 2.
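A single step of binary recursive partitioning can be sketched as follows: for one variable, every candidate threshold is tried and the one giving the lowest weighted impurity in the two resulting subgroups is kept. The sketch assumes the Gini impurity as the splitting criterion (the text does not state which criterion was used), and the albumin values and outcomes are toy numbers invented for illustration, not study data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Find the threshold t minimising the weighted Gini impurity of
    the two subgroups value <= t and value > t.

    Returns (threshold, weighted_impurity); threshold is None when no
    split improves on the unsplit group.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        threshold = (low + high) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (invented): serum albumin in g/l and death during admission
albumin = [30, 32, 34, 36, 37, 40, 42, 44]
died = [1, 1, 0, 1, 0, 0, 0, 0]
threshold, impurity = best_split(albumin, died)
print(threshold, impurity)
```

A full tree repeats this search over all candidate variables within each resulting subgroup until a stopping rule is met.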
Fig. 1

Decision tree model depicting risk of death during admission in AECOPD admissions (training set). Albumin, g/l; urea, mmol/l; pCO2, kPa
Fig. 2

Decision tree model depicting risk of death during admission in AECOPD admissions (validation set). Albumin, g/l; urea, mmol/l; pCO2, kPa

Table 2 Stratification of patients by CART analyses into different risk groups to predict in-patient mortality in admissions for AECOPD

Risk category (left to right): High risk, Medium risk, Low risk

                                    Risk 1             Risk 2             Risk 3             Risk 4             Risk 5
Training set: n/N (% mortality)     62/265 (23.4%)     197/976 (20.2%)    38/216 (17.6%)     83/1,118 (7.4%)    72/2,411 (3.0%)
Validation set: n/N (% mortality)   52/245 (21.2%)     183/978 (18.7%)    34/201 (16.9%)     116/1,170 (9.9%)   84/2,335 (3.6%)
Albumin (g/l)
Urea (mmol/l)
paCO2 (kPa)

n = number of deaths within category; N = total number within category

Risk groups are stratified by mortality proportion in each group, with the highest mortality assigned Risk 1 (high risk) through to Risk 5 (low risk)

Every patient fulfilled the criteria of one of the five mutually exclusive subgroups at the leaves of the decision tree (a leaf corresponds to a subgroup that is not further subdivided). The predicted likelihood of mortality during admission is also reported for each leaf. For example, all patients admitted with serum albumin ≤36.5 g/l had a 20.2% predicted and 18.7% actual risk of death during admission.

Performance of Decision Tree

Discriminative performance of the decision tree produced a C statistic of 0.734 on the training set and 0.701 on the validation set. The overall accuracy of the algorithm on the training and validation data sets was 74.1 and 72.5%, respectively. The ROC curves for both data sets are provided in Fig. 3. The decision tree demonstrated 65.7% sensitivity and 74.1% specificity on the training set and 57.4% sensitivity and 74.1% specificity on the validation set. The NPV and PPV on the training set were 95.6 and 20.4%, respectively, and similarly 95.6 and 18.9% on the validation set.
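The way these measures follow from the group counts in Table 2 can be sketched as below. The sketch assumes that Risk 1 to Risk 3 are treated as predicted deaths and Risk 4 and Risk 5 as predicted survivors; this cut-off is an assumption, as the text does not state it. Under that assumption the results are close to the reported values, although not every figure matches exactly.

```python
# (deaths, total) for Risk 1..Risk 5, copied from Table 2
TRAINING = [(62, 265), (197, 976), (38, 216), (83, 1118), (72, 2411)]
VALIDATION = [(52, 245), (183, 978), (34, 201), (116, 1170), (84, 2335)]


def performance(groups, n_positive_groups=3):
    """Sensitivity, specificity, PPV, NPV, and accuracy from group counts.

    Assumption: the first n_positive_groups risk groups are counted as
    predicted deaths and the remaining groups as predicted survivors.
    """
    positive = groups[:n_positive_groups]
    negative = groups[n_positive_groups:]
    tp = sum(deaths for deaths, _ in positive)
    fp = sum(total - deaths for deaths, total in positive)
    fn = sum(deaths for deaths, _ in negative)
    tn = sum(total - deaths for deaths, total in negative)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


for name, groups in (("training", TRAINING), ("validation", VALIDATION)):
    result = performance(groups)
    print(name, {key: round(value, 3) for key, value in result.items()})
```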
Fig. 3

ROC curve for decision tree to predict in-hospital mortality in AECOPD admissions using CART. Training set (continuous line) C statistic = 0.734 (95% CI = 0.723–0.756); validation set (broken line) C statistic = 0.701 (95% CI = 0.687–0.763)


This is the first study to show that decision tree analyses of laboratory data collected routinely on admission for AECOPD can identify patients at high and low risk of mortality during an admission. Using only three variables (albumin, paCO2, and urea) with discrimination paths and interaction between risk groups, we have identified a clinical algorithm that could be used to stratify AECOPD admissions into five groups in which the risk of inpatient mortality varied from very low (3.0%) to very high (23.4%).

Clinical Relevance and Application

This algorithm may provide clinicians with a simple, easily interpreted method of assessing risk of inpatient mortality during AECOPD at the bedside. A major strength of this study is that the model developed uses variables that are readily available to clinicians within the first 24 h of presentation to hospital; thus, key decisions can be made by the admitting physicians after identifying high- or low-risk patients. By utilising just three common laboratory variables, such an algorithm can easily be placed on a physician’s personal digital assistant (PDA) or hospital computer system to assist clinicians. Correct application of this model to specific clinical situations could assist clinicians in identifying those at highest risk of in-hospital death and thus guide the appropriate health-care provision and treatment of such patients. It must be acknowledged at this point that reasons for severity of AECOPD and factors surrounding suitability for safe discharge are multifactorial and are not encompassed in this model. Information from this algorithm should become part of a decision-making process and not be used in isolation.

Factors Associated with High Mortality Risk

The overall in-hospital mortality of 15.5% compares with other studies where mortality rates have varied between 5.2 and 42% [3, 5, 17–20]. We further identified two groups similarly at high risk for inpatient mortality; in one a high urea (>7.35 mmol/l), high albumin (>36.50 g/l), and high paCO2 (>6.45 kPa) conferred an approximately 23% mortality risk, and in the other, low albumin (≤36.50 g/l) alone conferred an approximately 20% risk of death during admission.
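The two high-risk rules described above can be expressed as a short function. This is a partial sketch only: the rules for the remaining three risk groups are not given in the text, so the function returns None for any patient who falls outside the two stated rules, and it does not handle missing values. The function and parameter names are hypothetical.

```python
def known_high_risk_group(albumin_g_l, urea_mmol_l, paco2_kpa):
    """Apply only the two high-risk rules stated in the text.

    Risk 2: albumin <= 36.5 g/l (about 20% mortality in the training set).
    Risk 1: albumin > 36.5 g/l, urea > 7.35 mmol/l, and paCO2 > 6.45 kPa
            (about 23% mortality in the training set).
    Returns None when neither rule applies, because the other branches
    of the tree cannot be recovered from the text.
    """
    if albumin_g_l <= 36.5:
        return "Risk 2"
    if urea_mmol_l > 7.35 and paco2_kpa > 6.45:
        return "Risk 1"
    return None


# Invented example values
print(known_high_risk_group(albumin_g_l=40.0, urea_mmol_l=8.0, paco2_kpa=7.0))
```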

A raised urea level in any acute medical condition may be considered an indirect marker of nonspecific systemic illness, while albumin will fall in acute systemic illness, and both may also represent an underlying poor nutritional state prior to admission. A poor nutritional state and low body mass index have been previously linked with poor outcome in COPD [21–23]. The observation that low albumin conferred a higher risk of mortality in our cohort is consistent with results from other studies of AECOPD [17, 18, 24–26]. Hypercapnia is usually associated with more severe COPD and reflects marked ventilation-perfusion inequality and an inability of an individual to adequately eliminate excess carbon dioxide with increased ventilatory drive. It occurs commonly in combination with hypoxaemia, although in this study a low paO2 was not found to be important when combined with the other parameters.

Data Accuracy and Model Performance

In this study we have considered the misclassifications of diagnosis, admission, and outcome data, some of which are described in other studies [27]. The data for the outcome were complete in all patients, and although the data for each laboratory value varied, at least 75% of the data was available for the admissions. Furthermore, the accuracy and validity of diagnosis of COPD and AECOPD were reviewed and supported in 1,642 (27.4%) cases by combinations of spirometry, case notes, imaging, and patients’ clinical letters, and in 1,581 (26.4%) by spirometry and case records. It is possible that administrative information included errors in coding or wrong dates, although such miscoding would have resulted in an underestimate of the magnitude of the effect of the outcome.

The models showed high specificity and negative predictive values but relatively reduced sensitivity and low positive predictive values. The negative predictive value falls, and the positive predictive value rises, as the prevalence of the outcome increases; in this case the relatively low prevalence of the outcome (mortality rate of 15.5%) would explain the high negative and low positive predictive values. These values also indicate the applicability of the tests; potential models with much higher sensitivity and specificity were considered, but these resulted in more than 20 subgroups of risk, which would make such models unnecessarily complex and therefore clinically useless. The C statistic is a measure of the discriminative ability of the model, and values between 0.7 and 0.8, as reported in this study, would be considered to indicate good discrimination [9]. Previous studies with prediction models in other diseases using routine data have reported C statistic indices of below 0.7, suggesting that our models have performed comparatively well [28–31].
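The dependence of the predictive values on prevalence follows from Bayes' theorem and can be illustrated numerically. The sketch below holds sensitivity and specificity fixed at the training-set values reported above and varies the prevalence; the prevalence values other than 15.5% are arbitrary illustrative choices.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the training set
for prevalence in (0.05, 0.155, 0.40):
    ppv, npv = predictive_values(0.657, 0.741, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With fixed sensitivity and specificity, a rarer outcome yields a lower PPV and a higher NPV.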

Study Limitations

Although the CART models developed are able to identify patients with AECOPD who are likely to have poor inpatient outcome, we have not included measures of social deprivation, physical function, and factors precipitating the admission into the data set, e.g., virus infections [32]. Future studies combining comprehensive laboratory investigations with clinical and social parameters may yield further useful prediction models. Our data relate to a single-centre UK population, thus requiring further validation on a different population to ensure applicability.


We developed and validated a simple decision tree based on routinely collected laboratory data at the time of emergency admission that is able to correctly classify subgroups of AECOPD inpatients at high and low risk of death during admission. This type of modelling uses only three variables and is arguably simple to apply at the bedside.


We are grateful to the Departments of Clinical Coding, Cardiology, Haematology, and Biochemistry for providing the data and to Sumita Kerley for her assistance in data collection. This project was supported by a student bursary (for AA) from the University of Portsmouth.


All authors declare that there are no competing interests related to this manuscript.

Copyright information

© Springer Science+Business Media, LLC 2011