Introduction

Clinical epidemiology involves answering questions about the diagnosis, prognosis, treatment, and burden of human diseases. There are many population-based studies on the incidence and outcome of cardiovascular, traumatic, infectious, and neoplastic diseases. In the United States, the National Center for Health Statistics maintains data on the incidence and mortality of hundreds of diseases [1]. Similarly, the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute maintains high-quality data on cancer incidence and survival from selected areas across the United States [2]. Unfortunately, similar data are not readily available for the critical care syndromes of acute lung injury (ALI), sepsis, and multiple organ failure. There is a good reason for this lack of data: doing epidemiology in the ICU is hard.

Challenges to epidemiology in the ICU: studying a place in the hospital

The epidemiology of critical illness is based on looking for disease in the ICU. This common assumption already imposes a bias that may cause researchers to miss some patients with ALI and sepsis who are cared for in hospital wards or emergency departments [1, 3]. The geographic constraint of studying a place in the hospital poses other problems for the epidemiologist. In the absence of an oversupply of intensive care beds, the number of available ICU beds bounds, in an important way, the incidence and outcome of critical illness syndromes. The epidemiology of critical illness in developing nations with few ICU beds, or in a country that implicitly withholds intensive care from the elderly, differs considerably from that in countries that use their medical resources more liberally [1]. Since many of the causes of ALI and sepsis are iatrogenic (organ transplantation, cardiac surgery, intensive chemotherapy), the use of these procedures also determines the epidemiology. Even decisions about the use of positive end-expiratory pressure, volume resuscitation, and the number of arterial blood gases performed may affect the “diagnosis” of ALI [4]. Epidemiologists routinely capitalize on interregional variations in diet and other exposures to study risk factors for diseases. However, in the case of critical illness syndromes, local medical habits and bed availability may overshadow other comparisons.

Challenges to epidemiology in the ICU: studying syndromes

Critical illness syndromes are defined by a complex combination of physiological, biochemical, and clinical criteria. Although much is made of the limitations of the existing operational definitions for ALI and sepsis, syndromes can be studied rigorously. The field of psychiatry routinely uses syndromic definitions to identify patients for clinical research. While these definitions may not be mechanistically satisfying, they are superior to the ones used in critical care because they have empirically demonstrated their reliability and validity. Reliability is the ability of a definition to identify the same patients with repeated testing by different observers or by the same observer over time. The diagnostic criteria for ALI are known to have poor reliability [5, 6, 7]. A simple Medline search (schizophrenia/diagnosis and reliability or observer variation) shows over 90 articles on the reliability of the diagnosis of schizophrenia. Similar searches for acute respiratory distress syndrome (ARDS) and sepsis find nine and seven articles, respectively, which, with the exception of the radiographic criteria for ALI, do not specifically evaluate aspects of the syndromes’ definitions.
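Reliability in the sense described above is often summarized with a chance-corrected agreement statistic. The following is a minimal sketch, assuming two hypothetical observers who each judge whether the same ten patients meet a syndrome definition (1 = meets criteria, 0 = does not). Cohen's kappa is used here only as one common choice of statistic; the function name and the ratings are illustrative and are not drawn from any of the cited studies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters judging the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of patients on whom the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: does each of ten patients meet the definition?
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 0, 1, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the observers agree on 7 of 10 patients, but half of that agreement would be expected by chance alone, so kappa is about 0.4. Raw percent agreement would overstate how reproducible the definition is.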

Validity is the ability of a definition to distinguish between those persons who truly have the disease and those who do not. This is straightforward in neoplastic diseases where biopsy serves as a gold standard. However, it is a challenge in critical illness because we do not have gold-standard diagnostic tests. There are rigorous ways to study syndromes whose definitions have unknown validity. One option is sensitivity analysis, where the results are reanalyzed using various assumptions under the hypothesis that if the results persist using different definitions, the findings are likely to be true even if the definitions are imprecise. This method was used by Cook and colleagues [8] in their randomized clinical trial of ranitidine vs. sucralfate to evaluate the agents’ effects on ventilator-associated pneumonia. Another option is used by social scientists who explore other forms of validity in the absence of a gold standard. These include face, content, predictive, and concurrent validity [9]. Intensivists implicitly rely on these concepts without referring to them by name. For example, Doyle et al. [10] used their observation that ALI vs. ARDS does not predict mortality to argue that ALI and ARDS are similar processes. The ALIVE investigators use the same reasoning and opposite data to argue that ARDS provides a more “homogeneous” patient population [11]. Both groups of investigators implicitly rely on the concept of predictive validity, namely that if different definitions predict different outcomes, they identify patient populations that differ in important ways. Unfortunately, criteria that predict different outcomes do not reliably distinguish different mechanisms or responses to therapy. Inferior wall myocardial infarctions have a different mortality than anterior wall myocardial infarctions, but they represent the same mechanism of disease and respond to similar treatments. Therefore differential mortality may not identify patients with different forms of ALI or identify patients who should be enrolled in clinical trials.
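The logic of a sensitivity analysis can be illustrated with a small sketch. It assumes a hypothetical two-arm trial in which the outcome has been adjudicated under three alternative definitions; the definition names and all counts are invented for illustration and do not reproduce the data or the analytic method of Cook and colleagues [8]. The effect estimate is recomputed under each definition and the results are checked for a consistent direction.

```python
def risk_ratio(events_treated, n_treated, events_control, n_control):
    """Risk of the outcome in the treated arm relative to the control arm."""
    if n_treated <= 0 or n_control <= 0:
        raise ValueError("arm sizes must be positive")
    risk_control = events_control / n_control
    if risk_control == 0:
        raise ValueError("risk ratio is undefined when the control risk is zero")
    return (events_treated / n_treated) / risk_control


# Hypothetical counts: (events treated, n treated, events control, n control)
# for the same trial, with the outcome scored under three definitions.
counts_by_definition = {
    "clinical criteria": (30, 200, 45, 200),
    "clinical + radiographic": (22, 200, 34, 200),
    "culture-confirmed": (12, 200, 20, 200),
}

ratios = {name: risk_ratio(*counts) for name, counts in counts_by_definition.items()}

# The finding is considered robust here only if every definition
# points in the same direction (all below 1 or all above 1).
consistent = all(rr < 1 for rr in ratios.values()) or all(rr > 1 for rr in ratios.values())

for name, rr in ratios.items():
    print(f"{name}: risk ratio = {rr:.2f}")
print("direction consistent across definitions:", consistent)
```

A direction-only check is a deliberately crude criterion. A fuller analysis would also compare confidence intervals across definitions.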

In their careful analysis the ALIVE investigators found that a “mild-ALI” group had significantly lower mortality than patients with ARDS and suggested that this group be excluded from certain research studies. Unfortunately, their definition of mild-ALI (PaO2/FIO2 between 200 and 300 that does not fall to below 200 in 3 days), which confirms the hypothesis that patients whose condition worsens are more likely to die, cannot be applied prospectively. An accurate way to predict which of the mild-ALI patients will progress was not provided, and therefore there is no way to tell whether a given patient in the ICU has “mild-ALI” or ARDS. Mild-ALI is a subset of ALI that can only be identified after patients either become better or worse. In fact, their observation that 55% of patients who present with a PaO2/FIO2 between 200 and 300 progress to ARDS supports one of the justifications behind broadening the ARDS definition [12]. A less severe oxygenation cutoff identifies some patients with ARDS earlier in their course. It may do this by including some patients who do not progress and have a lower mortality. However, if the goal is to identify patients with ALI whose risk of death lies within a specified range, the degree of hypoxemia should not be the only variable used [13, 14]. In fact, despite their limitations, the current definitions for severe sepsis and ALI have already passed one important test: they effectively identify patients who respond to life-saving interventions [15, 16].
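Why the mild-ALI definition cannot be applied prospectively becomes clear when it is written as a classification rule. The sketch below is an illustrative reading of the definition quoted above, not the ALIVE investigators' own algorithm: the function name, the "indeterminate" label, the use of one PaO2/FIO2 value per day, and the handling of values exactly at 200 or 300 are all assumptions.

```python
def classify_ali(pf_ratios_by_day, window_days=3):
    """Classify a patient from daily PaO2/FIO2 values, starting at presentation.

    Returns 'not ALI', 'ARDS', 'mild-ALI', or 'indeterminate'.
    Thresholds follow the text (200 and 300); boundary handling is an assumption.
    """
    if not pf_ratios_by_day:
        raise ValueError("at least one PaO2/FIO2 value is required")
    if pf_ratios_by_day[0] > 300:
        return "not ALI"
    observed = pf_ratios_by_day[:window_days]
    # Any value below 200 within the window settles the question immediately.
    if any(pf < 200 for pf in observed):
        return "ARDS"
    # Without a full window of observations, mild-ALI cannot yet be assigned.
    if len(observed) < window_days:
        return "indeterminate"
    return "mild-ALI"


print(classify_ali([250]))            # at presentation
print(classify_ali([250, 180]))       # worsens on day 2
print(classify_ali([250, 240, 260]))  # full 3-day window observed
```

Under this reading, a patient presenting with a PaO2/FIO2 of 250 remains "indeterminate" until either the ratio falls below 200 or three days have passed, which is exactly the period during which an enrollment decision would need to be made.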

The Future

The ALIVE investigators have made an important addition to a growing body of studies based in selected ICUs that examine the effect of various factors on mortality in ALI [14, 17, 18]. Important issues remain for the epidemiologist in the ICU, including gaining a better understanding of the burden of illness of ALI in the population and incorporating genetic data into our theories of risk factors. Reliable definitions are crucial to advancing the study of the genetic and molecular epidemiology of critical illness syndromes. One review article on genetic epidemiology makes this point emphatically [19]: “Use of standardized, reproducible [phenotype descriptions] with strict requirements for training, certification, and quality control is a fundamental principle of population-based research that needs to be translated to genetic epidemiologic studies.” Studying the genetic factors of complex diseases is challenging enough without introducing the confusion of different investigators studying different complex diseases.