Routine Laboratory Tests can Predict In-hospital Mortality in Acute Exacerbations of COPD
Chronic obstructive pulmonary disease (COPD) has a rising global incidence and acute exacerbation of COPD (AECOPD) carries a high health-care economic burden. Classification and regression tree (CART) analysis is able to create decision trees to classify risk groups. We analysed routinely collected laboratory data to identify prognostic factors for inpatient mortality with AECOPD from our large district hospital. Data from 5,985 patients with 9,915 admissions for AECOPD over a 7-year period were examined. Randomly allocated training (n = 4,986) or validation (n = 4,929) data sets were developed and CART analysis was used to model the risk of all-cause death during admission. Inpatient mortality was 15.5%, mean age was 71.5 (±11.5) years, 56.2% were male, and mean length of stay was 9.2 (±12.2) days. Of 29 variables used, CART analysis identified three (serum albumin, urea, and arterial pCO2) to predict in-hospital mortality in five risk groups, with mortality ranging from 3.0 to 23.4%. C statistic indices were 0.734 and 0.701 on the training and validation sets, respectively, indicating good model performance. The highest-risk group (23.4% mortality) had serum urea >7.35 mmol/l, arterial pCO2 >6.45 kPa, and normal serum albumin (>36.5 g/l). It is possible to develop clinically useful risk prediction models for mortality using laboratory data from the first 24 h of admission in AECOPD.
Keywords: COPD; Exacerbations; Mortality; Risk; Decision tree analysis
Chronic obstructive pulmonary disease (COPD) is the only major cause of death with a rising global incidence. A recent report from the Centers for Disease Control and Prevention estimates that 1 in 20 deaths in the USA has COPD as an underlying cause. In the UK, COPD is the second most common cause of emergency admission to hospital, and these admissions consume over half of the £500 million a year spent on COPD. In the US, annual excess health-care expenditures are estimated at nearly $6,000 for every COPD patient.
There has been recent interest in the identification of outcome predictors for inpatient and 30-day mortality following an acute exacerbation of chronic obstructive pulmonary disease (AECOPD) [3–5], although these studies have focused on clinical and social parameters alone.
There has been renewed interest within the National Health Service (NHS) in reducing hospital admissions for chronic diseases such as COPD, using several clinical indicators to screen those requiring admission, some of which may be liable to misinterpretation. Objective laboratory data would be a useful adjunct to clinical assessment, helping to identify those at greatest risk of mortality and therefore most in need of hospital admission. Once a patient is admitted, such data could guide medical decision-making, provide prognostic information to patients and their families, and allow clinicians to implement more individualized treatment strategies. High-risk patients could potentially be given intensive treatment, and low-risk patients could be considered for early discharge or not admitted.
Clinical prediction tools have historically used multivariate logistic regression (MLR) to predict binary outcomes in a wide variety of medical disciplines [8, 9], but the number of predictors and the complex mathematical functions involved limit its use in routine clinical practice. Classification and regression tree (CART) analysis provides an alternative approach as it maximises sensitivity by identifying patients truly at risk and minimises misclassification of low-risk patients and thus can stratify patients into different levels of risk [10–13]. It produces a simple decision tree that can be more accurate and easier to apply at the bedside, and missing data do not restrict its utility. We sought to use CART to develop a simple clinical tool to identify patients at high risk of death during admission.
We retrospectively studied 9,915 admissions (from 5,985 inpatients) with AECOPD to Portsmouth Hospitals NHS Trust over a 7-year period. All admissions for which AECOPD was the primary reason for admission to the hospital [International Classification of Diseases (ICD) 10th Revision, Codes J40-44] were identified, and all data available to the clinicians on the pathology and haematology database within 24 h of admission were extracted. Exacerbations that occurred within 21 days of a previous one were excluded from the analysis. Patient records were matched across databases using a method previously described by our group.
Reliability of Data
Confirmation of diagnosis of COPD and an admission for AECOPD was sought by analysis of patient records, crosschecking with databases held in the hospital, including the Respiratory Clinic, spirometric records, and patient-related hospital letters; these sources were selected for their reliability and ease of access to investigators. For example, the medical case notes of 981 patients with COPD known to the hospital Respiratory Clinic (which held spirometric and other records on these patients) were examined to confirm their diagnosis, date of admission, date of discharge, and date of death if applicable. COPD was accepted as a diagnosis if age >40 years, there was demonstrable airflow obstruction with an FEV1/FVC <0.70, and a smoking history of >20 pack-years. An exacerbation was accepted if the patients were admitted having reported an acute and sustained worsening of symptoms over their usual stable state, with symptoms including breathlessness, cough, increased sputum production, and change in sputum colour, in the absence of another cause for the presentation. In addition to inspection of other clinical databases, we examined an additional randomly selected 200 case records whose details were not available on any clinical database.
Outcomes and Potential Predictors
The primary outcome of interest was in-hospital mortality from any cause following AECOPD. Both the NHS tracing system and the hospital’s Patient Administrative System (PAS) were searched for deaths from any cause. The potential predictors were routinely available blood test data in the first 24 h of hospital presentation. Variables extracted from the databases included demographic characteristics and laboratory tests (haematology and biochemistry) which included renal and liver function, arterial blood gases (ABG), systemic inflammation [white cell count (WCC), platelets, C-reactive protein (CRP)], and cardiac injury (CK, AST, troponin I) tests.
Data Analysis and Statistical Methods
Information was extracted from the databases and stored in FoxPro ver. 9.0 (Microsoft Corp., Redmond, WA, USA). Data were summarised using means and standard deviation (SD) for continuous variables and frequencies and percentages for categorical variables. The data were divided randomly into model training and validation sets. CART analyses then produced a decision tree to predict mortality during admission on the training set, and the resulting decision tree was tested on the remaining validation set to identify subgroups of patients at higher risk of death during admission. To avoid complications of repeated measures, only data from the last admission of each patient were used. CART algorithms were performed using the CART 6.2 program (California Statistical Software, Salford Corp, San Diego, CA, USA). Multiple decision trees were identified with differing ranges of variables, accuracy, sensitivity, and specificity. Models with the optimum numbers of variables and a tradeoff between sensitivity and specificity were chosen for the final analyses in order to ensure maximum clinical applicability and relevance.
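The two steps described above, random allocation into training and validation sets and the search for a cut-off that separates deaths from survivors, can be sketched in a few lines. This is a minimal illustration only: it assumes an equal random split and the Gini impurity criterion, neither of which is stated in the text, and the function names and toy values are hypothetical rather than taken from the CART 6.2 program.

```python
import random


def random_split(records, seed=0):
    """Shuffle a copy of the records and cut it in half (equal halves are an assumption)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def gini(deaths, total):
    """Gini impurity of a node with a binary outcome (died / survived)."""
    if total == 0:
        return 0.0
    p = deaths / total
    return 2.0 * p * (1.0 - p)


def best_threshold(values, died):
    """Find the cut-off 'value <= threshold' that minimises weighted Gini impurity.

    Returns (threshold, weighted_impurity); threshold is None if no split
    improves on the unsplit node.
    """
    pairs = sorted(zip(values, died))
    n = len(pairs)
    total_deaths = sum(d for _, d in pairs)
    best = (None, gini(total_deaths, n))
    left_n = 0
    left_deaths = 0
    for i in range(n - 1):
        value, outcome = pairs[i]
        left_n += 1
        left_deaths += outcome
        next_value = pairs[i + 1][0]
        if next_value == value:
            continue  # cannot place a cut-off between identical values
        right_n = n - left_n
        right_deaths = total_deaths - left_deaths
        score = (left_n * gini(left_deaths, left_n)
                 + right_n * gini(right_deaths, right_n)) / n
        if score < best[1]:
            best = ((value + next_value) / 2.0, score)
    return best


# Toy data (hypothetical): one laboratory value per admission, 1 = died.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_died = [1, 1, 0, 0]
train, valid = random_split(range(10), seed=1)
threshold, impurity = best_threshold(toy_values, toy_died)
```

A full tree repeats this search over every candidate variable and then recursively within each resulting subgroup; pruning and the handling of missing values are omitted here.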
Assessing Model Performance
The C statistic, a measure of the discriminative power of the predictive model (and numerically equivalent to the area under the ROC curve), was used as an index of model performance. We also calculated sensitivity, specificity, and negative and positive predictive values (NPV and PPV, respectively) to further assess the performance of the decision tree.
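As an illustration of these performance measures, the sketch below computes the C statistic as the proportion of death/survivor pairs in which the death was assigned the higher predicted risk (ties counted as half), together with sensitivity, specificity, PPV, and NPV from a binary high-risk flag. The function names, the toy risks, and the 0.2 cut-off are hypothetical and are not the study's values.

```python
def c_statistic(scores, outcomes):
    """Concordance over all (death, survivor) pairs; ties count as half."""
    deaths = [s for s, y in zip(scores, outcomes) if y == 1]
    survivors = [s for s, y in zip(scores, outcomes) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                concordant += 1.0
            elif d == s:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


def confusion_metrics(flagged_high_risk, outcomes):
    """Sensitivity, specificity, PPV and NPV for a binary high-risk flag."""
    tp = sum(1 for f, y in zip(flagged_high_risk, outcomes) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged_high_risk, outcomes) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged_high_risk, outcomes) if not f and y == 1)
    tn = sum(1 for f, y in zip(flagged_high_risk, outcomes) if not f and y == 0)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy example (hypothetical): leaf-level predicted risks and observed outcomes.
toy_scores = [0.2, 0.2, 0.03, 0.03, 0.23, 0.23]
toy_outcomes = [1, 0, 0, 0, 1, 0]
toy_c = c_statistic(toy_scores, toy_outcomes)
toy_metrics = confusion_metrics([s >= 0.2 for s in toy_scores], toy_outcomes)
```

Because a decision tree assigns the same predicted risk to everyone in a leaf, many pairs are tied, which is why ties need explicit handling.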
The study was approved by Isle of Wight, Portsmouth, and South East Hampshire Local Ethics Committees.
Reliability of Diagnosis and Mortality Data
A total of 5,985 patients were identified, with 9,915 admissions for analysis. Of the 981 patients from the Respiratory Clinic database with spirometry compatible with COPD (FEV1/FVC <70%), 510/5,985 (8.5%) were included in the study period as having had an admission for AECOPD. A further 932/5,985 (15.6%) had compatible spirometry with COPD on databases from nonrespiratory departments. Of the additional 200 patients randomly selected to confirm the diagnosis of admission for AECOPD, all had features consistent with a diagnosis of AECOPD, though 61 patients did not have spirometric confirmation either due to death during admission, default from follow-up, or no previous or subsequent record of spirometry in the hospital despite clinical and radiological features and a smoking history consistent with COPD, and treatment with inhalers prior to admission.
Overall, 1,581 (510, 932, and 139) patients of the 5,985 (26.4%) had a diagnosis of COPD confirmed with matching dates of diagnosis and discharge. A comparison was then made of the cohort with confirmation of COPD diagnosis (with our criteria outlined above; n = 1,581) and the remainder with no proven diagnosis (n = 4,404). There were no differences in age, gender, death in hospital (242 vs. 679; P = 0.916), overall mortality (375 vs. 1,109; P = 0.248), or 1-year mortality (354 vs. 1,403; P = 0.298).
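The text does not name the test behind these P values. As a hedged illustration, the sketch below assumes a Pearson chi-square test on the 2×2 table of counts, with one degree of freedom and no continuity correction; under that assumption the in-hospital death and overall mortality comparisons come out close to the reported values. The function name is hypothetical.

```python
import math


def chi_square_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-square (1 df, no continuity correction) and its P value."""
    a, b = events_a, total_a - events_a   # group A: events, non-events
    c, d = events_b, total_b - events_b   # group B: events, non-events
    n = total_a + total_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts from the text: confirmed COPD (n = 1,581) vs. unconfirmed (n = 4,404).
_, p_in_hospital = chi_square_2x2(242, 1581, 679, 4404)
_, p_overall = chi_square_2x2(375, 1581, 1109, 4404)
print(round(p_in_hospital, 3), round(p_overall, 3))
```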
[Table: Demographic and blood test characteristics in the training and validation sets. Columns: Validation set (N = 4,986); Training set (N = 4,929). Row labels recoverable from the source: length of stay, white cell count, aspartate transaminase (AST).]
Classification and Regression Tree (CART)
[Table: Stratification of patients by CART analyses into different risk groups to predict in-patient mortality in admissions for AECOPD. Columns: Training set, n/N (% mortality); Validation set, n/N (% mortality).]
Every patient fulfilled the criteria of one of the five mutually exclusive subgroups at the leaves of the decision tree (a leaf corresponds to a subgroup that is not further subdivided). The predicted likelihood of mortality during admission is also reported for each leaf. For example, all patients admitted with serum albumin ≤36.5 g/l had a 20.2% predicted and 18.7% actual risk of death during admission.
Performance of Decision Tree
This is the first study to show that decision tree analyses of laboratory data collected routinely on admission for AECOPD can identify patients at high and low risk of mortality during an admission. Using only three variables (albumin, paCO2, and urea) with discrimination paths and interaction between risk groups, we have identified a clinical algorithm that could be used to stratify AECOPD admissions into five groups in which the risk of inpatient mortality varied from very low (3.0%) to very high (23.4%).
Clinical Relevance and Application
This algorithm may provide clinicians with a simple, easily interpreted, method of assessing risk of inpatient mortality during AECOPD by the bedside. A major strength of this study is that the model developed uses variables that are readily available to clinicians within the first 24 h of presentation to hospital, thus key decisions can be made by the admitting physicians after identifying high- or low-risk patients. By utilising just three common laboratory variables, such an algorithm can easily be placed on a physician’s personal digital assistant (PDA) or hospital computer system to assist clinicians. Correct application of this model to specific clinical situations could assist clinicians in identifying those at highest risk of in-hospital death and thus guide the appropriate health-care provision and treatment of such patients. It must be acknowledged at this point that reasons for severity of AECOPD and factors surrounding suitability for safe discharge are multifactorial and are not encompassed in this model. Information from this algorithm should become part of a decision-making process and not be used in isolation.
Factors Associated with High Mortality Risk
The overall in-hospital mortality of 15.5% compares with other studies where mortality rates have varied between 5.2 and 42% [3, 5, 17–20]. We further identified two groups similarly at high risk for inpatient mortality; in one a high urea (>7.35 mmol/l), high albumin (>36.50 g/l), and high paCO2 (>6.45 kPa) conferred an approximately 23% mortality risk, and in the other, low albumin (≤36.50 g/l) alone conferred an approximately 20% risk of death during admission.
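The two high-risk groups described above can be written as a simple rule, which is the form in which such an algorithm might sit on a hospital computer system. This is a sketch only: the thresholds are those quoted in the text, but the function name and return labels are hypothetical, and the remaining three groups of the tree are not detailed here, so they are returned as a single unspecified category rather than guessed.

```python
def known_high_risk_group(albumin_g_l, urea_mmol_l, paco2_kpa):
    """Return which of the two high-risk groups described in the text applies.

    Only the two groups whose criteria appear in the text are encoded; all
    other admissions fall into 'other', whose sub-structure is not given.
    """
    if albumin_g_l <= 36.5:
        # Low albumin alone: approximately 20% risk of death during admission.
        return "low albumin"
    if urea_mmol_l > 7.35 and paco2_kpa > 6.45:
        # Albumin >36.5 g/l with raised urea and paCO2: approximately 23% risk.
        return "high urea and high paCO2"
    return "other"


# Hypothetical admissions (values invented for illustration).
print(known_high_risk_group(albumin_g_l=34.0, urea_mmol_l=5.0, paco2_kpa=5.0))
print(known_high_risk_group(albumin_g_l=40.0, urea_mmol_l=9.0, paco2_kpa=7.0))
print(known_high_risk_group(albumin_g_l=40.0, urea_mmol_l=5.0, paco2_kpa=5.0))
```

Missing laboratory values are not handled in this sketch; a bedside implementation would need an explicit rule for them.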
A raised urea level in any acute medical condition may be considered an indirect marker of nonspecific systemic illness, while albumin will fall in acute systemic illness, and both may also represent an underlying poor nutritional state prior to admission. A poor nutritional state and low body mass index have been previously linked with poor outcome in COPD [21–23]. The observation that low albumin conferred a higher risk of mortality in our cohort is consistent with results from other studies of AECOPD [17, 18, 24–26]. Hypercapnia is usually associated with more severe COPD and reflects marked ventilation-perfusion inequality and an inability of an individual to adequately eliminate excess carbon dioxide with increased ventilatory drive. It occurs commonly in combination with hypoxaemia, although in this study a low paO2 was not found to be important when combined with the other parameters.
Data Accuracy and Model Performance
In this study we have considered possible misclassification of diagnosis, admission, and outcome data, some of which is described in other studies. The data for the outcome were complete in all patients, and although completeness varied between laboratory variables, at least 75% of the values were available for the admissions. Furthermore, the accuracy and validity of diagnosis of COPD and AECOPD were reviewed and supported in 1,642 (27.4%) cases by combinations of spirometry, case notes, imaging, and patients’ clinical letters, and in 1,581 (26.4%) by spirometry and case records. It is possible that administrative information included errors in coding or wrong dates, although such miscoding would have resulted in an underestimate of the magnitude of the effect of the outcome.
The models showed high specificity and negative predictive values but relatively reduced sensitivity and low positive predictive values. The negative predictive value falls, and the positive predictive value rises, as the prevalence of the outcome increases; the relatively low prevalence of the outcome in this case (mortality rate of 15.5%) would therefore explain the high negative and low positive predictive values. These values also indicate the applicability of the tests; potential models with much higher sensitivity and specificity were considered, but these resulted in more than 20 subgroups of risk, which would make such models unnecessarily complex and therefore of little clinical use. The C statistic is a measure of the discriminative ability of the model, and values between 0.7 and 0.8, as reported in this study, would be considered to indicate good discrimination. Previous studies with prediction models in other diseases using routine data have reported C statistic indices of below 0.7, suggesting that our models have performed comparatively well [28–31].
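The dependence of the predictive values on prevalence follows directly from Bayes' rule and can be checked numerically. In the sketch below, the 15.5% prevalence is the in-hospital mortality reported in this study, whereas the sensitivity (0.5) and specificity (0.9) are hypothetical values chosen only for illustration and are not the model's results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test at the study's prevalence and at a higher one.
ppv_study, npv_study = predictive_values(0.5, 0.9, prevalence=0.155)
ppv_high, npv_high = predictive_values(0.5, 0.9, prevalence=0.5)
print(round(ppv_study, 2), round(npv_study, 2))
print(round(ppv_high, 2), round(npv_high, 2))
```

With sensitivity and specificity held fixed, lowering the prevalence raises the NPV and lowers the PPV, which is the pattern described above.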
Although the CART models developed are able to identify patients with AECOPD who are likely to have a poor inpatient outcome, the data set did not include measures of social deprivation, physical function, or factors precipitating the admission, e.g., virus infections. Future studies combining comprehensive laboratory investigations with clinical and social parameters may yield further useful prediction models. Our data relate to a single-centre UK population and therefore require further validation in a different population to ensure applicability.
We developed and validated a simple decision tree based on routinely collected laboratory data at the time of emergency admission that is able to correctly classify subgroups of AECOPD inpatients at high and low risk of death during admission. This type of modelling uses only three variables and is arguably simple to apply at the bedside.
We are grateful to the Departments of Clinical Coding, Cardiology, Haematology, and Biochemistry for providing the data and to Sumita Kerley for her assistance in data collection. This project was supported by a student bursary (for AA) from the University of Portsmouth.
All authors declare that there are no competing interests related to this manuscript.