Background

Influenza and other respiratory viruses cause an enormous healthcare burden in the U.S. Each year, influenza alone is responsible for 226,000 (54,000–431,000) hospitalizations [1] and 31.4 million outpatient visits [2]. Appropriate diagnosis and cost-effective treatment depend on timely presentation for care and, when appropriate, on accurate and reasonably priced testing with a short turnaround time. The U.S. Centers for Disease Control and Prevention (CDC) recommends antiviral treatment with a neuraminidase inhibitor for all outpatients with suspected or confirmed influenza who are at higher risk for influenza complications because of age or underlying medical conditions, when treatment can be started within 48 h of illness onset. Treatment can also be considered on the basis of clinical judgment for outpatients with uncomplicated, suspected influenza who are not at increased risk for developing complicated illness, if antiviral treatment can be initiated within 48 h of illness onset.

Use of antiviral medications, specifically neuraminidase inhibitors, within 2 days of illness onset has been shown to reduce the time to alleviation of symptoms by about half a day to a day, depending on the authority cited [3–5], and the risk of secondary complications of influenza such as clinically diagnosed pneumonia in some meta-analyses [6, 7], but not in others [3, 4]. Prescribing of antiviral medications among primary care clinicians for treatment of outpatient acute respiratory infections (ARI) is infrequent [8, 9], perhaps because of cost and/or side effects associated with neuraminidase inhibitors [3, 4, 6].

Because these medications are effective only against influenza, they should only be prescribed to patients with suspected or confirmed influenza. Without respiratory viral testing, differentiating influenza from other viral respiratory infections can be difficult. Although molecular multi-viral respiratory panel testing can accurately distinguish possible causes of ARI, these tests are relatively new and expensive. Rapid influenza tests, in contrast, are inexpensive but lack sensitivity [10]. Moreover, clinical experience suggests that the use of respiratory viral testing increases as influenza circulation increases in the community, but is ordered less frequently at the beginning and end of the influenza season.

Clinicians use clinical decision algorithms, based on a series of decision rules, to determine risk for a range of medical conditions [11]. The availability of a clinical decision algorithm for determining influenza would benefit patients and potentially minimize costs to the healthcare system by providing the opportunity to diagnose influenza with reasonable accuracy and potentially treat patients with antiviral medication.

This study was designed to determine if an algorithm developed using recursive partitioning resulted in reasonable estimates of the likelihood of ARI being due to influenza infection. Classification and regression trees have been used to analyze mortality in persons with influenza A (H5N1) virus [12] and the likelihood of seasonal influenza [13, 14]. In one study of seasonal influenza, the presence of influenza was most accurately predicted by a model that assigned scores to symptoms; it included fever plus cough, myalgia, duration <48 h and chills or sweats [13]. Using the same data set, but grouping specific symptoms into dichotomous variables (present/absent), a model was developed in which the presence of both temperature >37.3 °C and chills or sweats predicted high risk of influenza [14]. Absence of both symptoms predicted low risk, and presence of only temperature >37.3 °C predicted moderate risk. In the present study, the Classification and Regression Trees [15] (CART) methodology was used to estimate the likelihood of influenza among individuals presenting for outpatient care for ARI in 2011–2012 who were enrolled in the U.S. Influenza Vaccine Effectiveness Network (US Flu VE Network) study.

Methods

Enrollment

Details of the US Flu VE Network design, sites, and enrollment procedures have been described previously [16]. Briefly, during the 2011–2012 influenza season, patients aged ≥6 months seeking outpatient medical care for an ARI with fever or cough were recruited at outpatient clinics in Marshfield, Wisconsin; southeastern Michigan (Ann Arbor and Detroit); Temple-Belton, Texas; Seattle, Washington; and Pittsburgh, Pennsylvania. Patients meeting the symptom criteria were eligible if duration of illness was ≤7 days and they had not received antiviral medication prior to enrollment. Recruitment and sample collection were performed by study personnel at each site and not influenced by the diagnosis of the treating physician. Consenting patients or their parents/guardians completed an enrollment interview to ascertain patient demographic characteristics, symptoms (fever, cough, fatigue, sore throat, nasal congestion, shortness of breath, wheezing), onset date, and subjective assessments of general health, current health status and self-reported influenza vaccination status.

Nasal and throat swabs (nasal only for children age <2 years) were collected and combined for influenza testing at network laboratories. This technique was selected because it is easier to collect and less uncomfortable for the patient and has been found to be as effective as nasopharyngeal swabs [17]. Presence of influenza was tested using real-time reverse transcription polymerase chain reaction (PCR) as previously described [16]. The parent study used a test-negative case-control design [18–20].

Selection of study sample

Individuals enrolled in 2011–2012 from all 5 sites during periods of influenza circulation at each site were included in the analyses. That is, influenza circulation at each site was determined to be the time between the date on which the first influenza-positive case was enrolled and the date on which the last influenza-positive case was enrolled. Although participants may have reported onset of symptoms before or after this period, they were excluded from analysis. The total sample for all sites was 5,147. Some individuals were enrolled multiple times; all of those visits except the first enrollment were excluded (N = 71), reducing the sample to 5,076. Because symptoms of influenza vary between young children and older individuals, the primary analysis sample was restricted to enrollees ≥5 years of age, resulting in a final sample size of 4,173. Secondary analyses included children <5 years of age.

CART analysis

Classification and Regression Trees (CART) [15] software was used to develop models that can classify subjects into various risk categories. Recursive partitioning, a non-parametric statistical method for multivariable data, uses a series of dichotomous splits, e.g., presence or absence of symptoms and other demographic variables, to create a decision tree, with the goal of correctly classifying members of the population, in this case, laboratory-confirmed influenza cases. Each independent variable is examined and a split is made to maximize the sensitivity and specificity of the classification, resulting in a decision tree. The objective of pruning is to develop a tree with the best size and lowest misclassification rate [15].

The CART method is able to determine the complex interactions among variables in the final tree, in contrast to a multivariable logistic regression model, in which the interactions must be identified and defined in advance.

To begin the CART analysis, simple random sampling without replacement was used to split the sample into equal-sized (50 %–50 %) developmental and validation samples. CART was applied first on the developmental sample and then on the validation sample to assess the model’s generalizability and to evaluate overfitting of the model to the developmental sample.
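The split described above can be illustrated with a minimal sketch. It assumes subjects are represented by integer identifiers; the function name `split_half` and the seed are hypothetical and are not taken from the study, which performed this step in its own software.

```python
import random


def split_half(ids, seed=0):
    """Randomly split ids into two near-equal halves without replacement.

    Hypothetical helper for illustration; not the study's actual procedure.
    """
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)              # each subject appears exactly once
    half = (len(shuffled) + 1) // 2    # larger half goes to the developmental sample
    return shuffled[:half], shuffled[half:]


# 4,173 is the primary analysis sample size reported in the text.
development, validation = split_half(range(4173), seed=2012)
```

Because 4,173 is odd, the two halves differ in size by one subject; every subject lands in exactly one of the two samples.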

Several sets of candidate predictors were used to build the classification trees. Using several iterations, CART models were used to determine a clinically logical fit, based on sensitivity and specificity; the variables included those that were potentially related to risk of influenza, such as symptoms, self-reported vaccination status, personal and demographic variables and presence of chronic disease. The primary developmental and validation models were constructed for all participants ≥5 years, using self-reported vaccination status, household smoking status, and symptoms reported at enrollment: cough, fever, fatigue, wheezing, sore throat, nasal congestion, shortness of breath. The variables smoking status, age and presence of other high risk conditions were not included in this model.

The Gini Index method was used to split off the largest category into a separate group, with the default split size set to enable growing the tree. When the final tree was built, the tree was pruned, deleting the variables that did not further classify subjects, based on the variable importance score and the sensitivity, into an influenza group or no influenza group. Once a clinically meaningful tree structure emerged, pruning was discontinued. The Hosmer-Lemeshow goodness-of-fit test confirmed the suitability of the trees.
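As an illustration of a Gini-based split on dichotomous (present/absent) predictors, the following sketch uses the textbook Gini impurity and impurity-decrease criterion. It is an assumption that this matches the commercial CART software in detail; the function names and the toy records are hypothetical and are not study data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcome labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_gain(rows, feature, outcome="flu"):
    """Decrease in impurity from splitting rows on a present/absent feature."""
    parent = [r[outcome] for r in rows]
    present = [r[outcome] for r in rows if r[feature]]
    absent = [r[outcome] for r in rows if not r[feature]]
    n = len(rows)
    weighted = len(present) / n * gini(present) + len(absent) / n * gini(absent)
    return gini(parent) - weighted


def best_split(rows, features, outcome="flu"):
    """Return the feature whose split gives the largest impurity decrease."""
    return max(features, key=lambda f: gini_gain(rows, f, outcome))


# Hypothetical toy records for illustration only (not study data).
toy = [
    {"fever": 1, "cough": 1, "flu": 1},
    {"fever": 1, "cough": 0, "flu": 1},
    {"fever": 0, "cough": 1, "flu": 0},
    {"fever": 0, "cough": 1, "flu": 0},
]
chosen = best_split(toy, ["fever", "cough"])
```

In this toy example fever separates the outcome perfectly, so it is chosen as the first split; a full tree would apply the same step recursively within each resulting group and then prune.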

Secondary analyses were constructed that included: 1) children 6–59 months of age, presenting within 2 days of onset of symptoms and included PCR-confirmed influenza status, self-reported vaccination status, household smoking status, and symptoms reported at enrollment: cough, fever, fatigue, wheezing, sore throat, nasal congestion, shortness of breath; and 2) adults ≥65 years old and individuals 5–64 years old with a high risk condition, presenting within 2 days of onset of symptoms and included PCR-confirmed influenza status, self-reported vaccination status, household smoking status, symptoms reported at enrollment: cough, fever, fatigue, wheezing, sore throat, nasal congestion, shortness of breath, and asthma diagnosis.

Receiver operating characteristic (ROC) curves, the area under the curve (AUC), sensitivity, specificity, and positive and negative predictive values, all estimated using the CART software, were used to assess the performance of the CART model for the developmental and validation samples. The sensitivity from the CART model was determined using the final influenza-positive terminal node, and specificity was determined using the previous influenza-negative terminal nodes.
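For readers less familiar with the AUC, one common way to compute it is as the proportion of case/non-case pairs in which the case receives the higher predicted risk, with ties counted as one half. The sketch below assumes each subject's score is the influenza proportion of the terminal node they fall into; the scores shown are hypothetical, and this is not necessarily how the CART software computes the AUC internally.

```python
def auc_concordance(scores_cases, scores_noncases):
    """AUC as the share of (case, non-case) pairs ranked correctly.

    Tied scores, which are common when scores come from a small number of
    terminal nodes, contribute one half.
    """
    concordant = 0.0
    for sc in scores_cases:
        for sn in scores_noncases:
            if sc > sn:
                concordant += 1.0
            elif sc == sn:
                concordant += 0.5
    return concordant / (len(scores_cases) * len(scores_noncases))


# Hypothetical node-level risk scores for illustration only (not study data).
cases = [0.4, 0.4, 0.1]
noncases = [0.4, 0.1, 0.1, 0.05]
auc = auc_concordance(cases, noncases)
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of cases from non-cases.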

Additional analyses

In addition to the CART analyses, descriptive statistics were calculated as percentages for discrete variables and as means and standard deviations for continuous variables. Chi-square statistics were used to compare the distribution of symptoms and other discrete measures and Student’s t-tests were used to compare the continuous measures (i.e., age) between those with and without laboratory-confirmed influenza.

To support the CART findings, sensitivity analyses were conducted using multivariable regression analyses with a full model method, using the same set of variables used in the CART analysis for both developmental and validation samples, and for the full sample with all individuals ≥5 years of age. Positive and negative predictive values were calculated using sensitivity and specificity values from the CART model across a hypothetical range of influenza prevalence values (1–40 %) to reflect influenza seasons of varying severity (Table 3). The sensitivity and specificity, calculated by comparing the predicted probability from the multivariable logistic regression with the true classification of influenza for the developmental, validation and full samples, were obtained and are presented.

Statistical significance was defined as a two-sided p value <0.05. Data were analyzed using SAS v9.2 (SAS Institute, Inc., Cary, NC) and CART for the decision trees (Predictive Modeler) Software version 7.0.0.470 (Salford Systems, San Diego, CA).

Results

Distributions of demographic variables for all enrollees with PCR-confirmed influenza or no influenza are shown in Table 1 and for the final developmental and validation samples are shown in Appendix 1: Table 4. All variables used in the tested models are shown. Individuals with PCR-confirmed influenza were more likely to report fever, cough and fatigue at enrollment, were less likely to report household smoking or an asthma diagnosis, and were younger than those without influenza. In this cohort, antiviral use was low. Only 185 of 4,173 enrollees (4.4 %) were prescribed an antiviral medication (15 % of cases vs. 6 % of non-cases were prescribed antivirals; Chi-square P < 0.001). The use of antiviral medication among those ≥5 years old with a positive PCR test was 22 % (n = 40).

Table 1 Sociodemographic characteristics and symptoms of enrollees ≥ 5 years of age reported at enrollment, by Polymerase Chain Reaction (PCR)-confirmed Influenza status

Primary CART analyses

Figures 1 and 2 show the CART decision trees for the developmental and validation samples, respectively, showing the conditions that would need to be present to predict influenza with maximum certainty for this sample. For the developmental sample, the sensitivity was 84 % and the specificity was 48 %. Positive predictive value (PPV) was 23 % and negative predictive value (NPV) was 94 % (Fig. 1). For the validation sample, which comprised the other half of the sample, the sensitivity was 84 % and the specificity was 49 % with a PPV of 23 % and NPV of 95 % (Fig. 2). The receiver operating characteristic (ROC) curves for the developmental and validation decision trees are shown in Fig. 3a, b, respectively, with area under the curve (AUC) = 0.68 for the developmental sample and AUC = 0.69 for the validation sample. The misclassification rates for developmental and validation CART models were 16 % and 15 %, respectively. When the subjects were restricted to those who were enrolled within 2 days of illness onset, the model included fever and cough with a sensitivity of 89 % and a specificity of 50 %.

Fig. 1 CART decision tree for the developmental sample for all enrollees ≥5 years for the outcome RT-PCR-confirmed influenza

Fig. 2 CART decision tree for the validation sample for all enrollees ≥5 years for the outcome RT-PCR-confirmed influenza

Fig. 3 Receiver operating curve for CART algorithm on developmental sample for the outcome RT-PCR-confirmed influenza. a Sensitivity = 278/330 = 84.2 %; Specificity = sum of all non-influenza subjects with negative symptoms in the terminal nodes/total non-influenza subjects, i.e. (93 + 160 + 594)/1757 = 48.2 %; Area under the Curve = 0.68. b Sensitivity = 84 %; Specificity = 49 %; Area under the Curve = 0.69
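The arithmetic in the Fig. 3 caption can be reproduced with a short sketch. The counts are those given in the caption for the developmental sample; the false-negative and false-positive counts are derived here under the assumption that every subject is classified as either algorithm-positive or algorithm-negative, and the helper name `two_by_two` is hypothetical.

```python
def two_by_two(tp, fn, tn, fp):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts from the Fig. 3 caption (developmental sample).
cases, non_cases = 330, 1757
tp = 278                      # influenza cases in the influenza-positive terminal node
tn = 93 + 160 + 594           # non-cases in the influenza-negative terminal nodes

# Assumption: the remaining subjects are the misclassified ones.
fn = cases - tp
fp = non_cases - tn

measures = two_by_two(tp, fn, tn, fp)
```

Under that assumption, the derived PPV and NPV round to the 23 % and 94 % reported for the developmental sample in the text.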

Antiviral candidate analyses

For the model including only children <5 years old, the pruned CART decision tree (Appendix 2: Figure 4) contained fever, cough and fatigue with a sensitivity of 84 %, specificity of 48 %, PPV of 11 %, NPV of 97 % and an AUC = 0.69. For the high risk model including individuals 5–64 years with a high risk condition and those ≥65 years, the CART decision tree (Appendix 3: Figure 5) contained fever and cough with a sensitivity of 86 %, specificity of 47 %, PPV of 27 %, NPV of 95 % and an AUC = 0.67. The average log-likelihoods to test the goodness of fit are shown in Appendix 4: Table 5.

Comparison of CART with multivariable logistic regression

For comparison of CART with traditional multivariable logistic regression analyses, Table 2 shows the odds ratios (ORs) and 95 % confidence intervals (CIs); fever, cough and fatigue significantly increased the likelihood of PCR-confirmed influenza while exposure to household smoking decreased the likelihood of influenza. Shortness of breath, wheezing, sore throat and nasal congestion were not related to influenza. For the full nine-variable logistic regression equation at the observed influenza prevalence of 15.4 %, the resultant values from the predicted probability were 82 % sensitivity, 52 % specificity, 24 % PPV, 94 % NPV and c-statistic (AUC) = 0.70. In the multivariable stepwise logistic regression model, the same four variables were significant as in the full model. The c-statistic for the stepwise selection model was 0.69.

Table 2 Likelihood of Polymerase Chain Reaction (PCR)-confirmed Influenza by logistic regression among 4,173 enrollees ≥5 years of age

Impact of varying prevalence

The prevalence of influenza during the 2011–2012 influenza season among enrollees ranged from <3 % in January to a peak of 23 % in March, indicating a late and relatively light season. Therefore, we compared PPV across a range of influenza prevalence values to determine the value of the decision aid in milder or more severe seasons. Using consistent values of 84 % sensitivity and 48 % specificity, PPV ranged from 1.6 to 51.9 % and NPV ranged from 99.7 to 81.8 % when prevalence was varied from 1 to 40 % (Table 3).

Table 3 Positive Predictive Values (PPV) and Negative Predictive Values (NPV) across a range of influenza prevalence using the Sensitivity (84 %) and Specificity (48 %) of the diagnostic model based on CART analyses
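The dependence of PPV and NPV on prevalence follows from Bayes' theorem and can be sketched as below, holding sensitivity and specificity fixed at the 84 % and 48 % used for Table 3. The function name and the intermediate prevalence values in the loop are illustrative choices, not taken from the table.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity of the CART model as used for Table 3.
SENS, SPEC = 0.84, 0.48

for prevalence in (0.01, 0.10, 0.25, 0.40):
    ppv, npv = predictive_values(SENS, SPEC, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At the two ends of the range, 1 % and 40 % prevalence, this calculation gives the PPV of 1.6 % and 51.9 % and the NPV of 99.7 % and 81.8 % reported in the text.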

Discussion

Eighty-five percent of influenza-positive cases in this study were not prescribed an antiviral medication, whereas 94 % of influenza-negative cases were not prescribed an antiviral medication. While empiric treatment of certain patients is recommended, influenza prediction tools may be a useful adjunctive approach to improving appropriate use of antiviral medication.

A range of techniques, including clinical judgment [21], clinical decision rules [13, 22, 23], and CART [14], has been used to predict the likelihood of seasonal influenza among individuals presenting for outpatient treatment of an ARI. In multivariable regression analyses, Monto et al. found that presence of fever and cough in adults and adolescents in ambulatory settings best predicted influenza confirmed by cultures, 4-fold increase in antibody titer, immunofluorescence, or PCR [21]. These findings were confirmed among children ≥5 years of age [23]. Subsequent work by Stein et al. [22], also using multivariable regression analyses and PCR to confirm influenza, found no improvement over clinical judgment of either rapid testing or a clinical prediction rule in which specific symptoms were assigned scores.

The advantages of CART are that recursive partitioning does not make any distributional assumptions about the modeled variables, and that it accounts for multi-level interactions among variables. We found a similar area under the curve with CART and logistic regression, but CART used fewer variables.

Afonso et al. used CART to develop a model predicting low and high risk of influenza among 459 patients combining data from two studies from different countries [14]. Their model included temperature >37.3 °C and presence of chills or sweating and produced an AUC of 0.75–0.76. In comparison, we report a decision tree using four symptoms (fever, cough, fatigue and shortness of breath) and presence of household smoke with an AUC of 0.68–0.69. We did not ask about chills/sweating. Our model had a PPV of 23 % and NPV of 95 %. Testing the sensitivity of 81 % and specificity of 52 % of the CART model across a range of prevalence values indicated that, in a more typical influenza season with a hypothetical prevalence of 25 %, the PPV increased to 36 % with a 91 % NPV.

In a review of clinical decision rules for diagnosis of influenza, Ebell and Afonso [24] presented suggestions for future studies of this topic, of which the present study addresses several. For example, the authors suggested using PCR as the reference standard for the detection of influenza, an adequate sample size, and a broad range of patients. PCR was used exclusively for confirmation of influenza in this study of 4,173 patients ≥5 years of age from across the U.S., thus addressing some of these conditions. The value of this analysis is the ability to identify, from among a group of patients with ARI and onset of fever or cough within 7 days, those patients with fever, cough and fatigue as having influenza with 84 % sensitivity. Thus, the algorithm is good at identifying a group who do not need testing or antivirals (i.e., algorithm negative); however, among those “positive” by the algorithm, laboratory testing or treatment based on clinical judgment is indicated. Our additional analyses that limited the population to those whose onset of symptoms was 2 or fewer days before presentation, with a high risk condition or high risk age group, could be used to develop guidance for prescribing antivirals based on symptoms alone. CART analyses support previous studies using multivariable regression to predict influenza from symptoms [21–23], but CART offers the advantage of requiring fewer variables for input into a clinical decision tool.

Determining a low-risk group in whom neither testing nor antivirals is warranted could save resources and avoid inappropriate antiviral prescribing and concomitant concerns about the development of viral resistance. A threshold approach to clinical decision-making was advocated by Pauker and Kassirer [25], with one threshold for testing and another for treating. For influenza, these testing and treatment thresholds vary by country and are lower for US compared to Swiss physicians [26]. We consider the 95 % NPV sufficiently good to rule-out influenza and avoid testing in this group. In light of the findings that use of antiviral medication among patients with influenza is low [27], and the cost of PCR testing is high [28, 29], a clinical decision aid based on symptoms offers the opportunity to promote appropriate testing during the times when it is most cost-effective, and increase antiviral medication use among those most likely to benefit, while avoiding a large increase in use in those who will not benefit.

Limitations and strengths

The limitations of this study are that the data are from a single year, in which influenza circulated later than usual and was less prevalent than normal. Moreover, influenza vaccine uptake and effectiveness may affect the model’s predictive validity. CART does not provide a p-value to test significance. Conversely, an advantage of CART is its ability to examine complex higher order interactions among variables. The analysis is strengthened by the inclusion of data from a large sample of outpatients from five sites spanning the U.S. However, it did not include developing countries or tropical regions with differing etiologies of infectious diseases.

Conclusions

Although CDC recommends empiric use of antivirals, their use remains low. Recursive partitioning using CART analyses to establish a clinical decision algorithm for influenza has good sensitivity and NPV, but limited PPV. Thus, it is good at identifying a group who do not need testing or antivirals; however, among those “positive” by the algorithm, laboratory testing or treatment based on clinical judgment is indicated. Further testing during additional influenza seasons may help to determine how this algorithm could be used to optimize decisions on laboratory testing and antiviral use in patients with ARI.