Introduction

Postoperative infectious complications (PICs) occur frequently after major abdominal surgery. PICs have a major influence on patient outcomes and hospital costs [18]. A timely diagnosis of infectious complications is associated with a lower morbidity and mortality rate [7, 9]. However, early clinical features of postoperative infections are often nonspecific and difficult to distinguish from the normal postoperative inflammatory response related to surgical trauma [9]. The median time to diagnosis of infectious complications has been reported to be as long as 12 days after surgery, and in retrospect the diagnosis is commonly delayed by several days [8].

A biological marker that can predict infectious complications before clinical signs and symptoms develop could be of clinical value. The value of such a marker is two-sided; it could identify patients with a high probability of infectious complications for early additional investigations, such as an abdominal CT scan, or it could identify patients with a low probability of infectious complications.

C-reactive protein (CRP) is a biological marker that might be of value in detecting infectious complications. CRP is widely available, fast, and cheap. CRP levels are known to increase in the postoperative period because of surgical tissue damage. In patients with an uncomplicated postoperative course, CRP levels tend to normalize rapidly because of the short plasma half-life of 19 h [10, 11].

CRP has been extensively studied for its value in predicting PIC after major abdominal surgery [12–16]. Several studies have concluded that CRP is a useful predictor of PIC, but low positive predictive values have been reported [7, 16–23], making CRP a suboptimal marker for ruling in an infectious complication. A recent meta-analysis of CRP after gastroesophageal cancer surgery confirms that CRP values are insufficient to predict postoperative inflammatory conditions [24].

The value of CRP to rule out the presence of infectious complications has not yet been studied. In an era of minimally invasive surgery and enhanced recovery programs, patients are often discharged early, possibly before clinical signs of deterioration have become evident. A marker that accurately predicts the absence of postoperative complications could aid patient selection for safe and early hospital discharge and prevent overuse of imaging.

The present systematic review and meta-analysis aims to determine the value of CRP to rule out the presence of infectious complications, allowing for safe and early discharge of patients after major abdominal surgery.

Material and methods

Search strategy

Embase, PubMed, and the Cochrane Library were searched up to 26 January 2014. The search strategy consisted of the MeSH terms and free text words indexed for CRP and major abdominal surgery. The detailed search strategy is available in Appendix A. This review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.

Study selection

Inclusion and exclusion criteria were set before the search. Articles were considered eligible if the diagnostic accuracy of CRP for PIC following abdominal or gastrointestinal surgery was assessed in a prospective study design. If the following criteria were all met, articles were included: (1) CRP was evaluated in the postoperative setting, (2) CRP was evaluated after major abdominal or gastrointestinal surgical procedures (including pancreatic, colorectal, hepatobiliary, esophageal, and gastric surgery), (3) outcome of interest was the association between CRP and PIC, and (4) the study design was prospective. Designs other than prospective design were excluded to minimize the risk of bias. Studies presenting insufficient data for extracting 2 × 2 contingency tables of CRP versus PIC were also excluded. Original articles in the English, French, German, Dutch, or Spanish language were considered for inclusion.

Two independent reviewers (SLG and JJA) screened the titles and abstracts of all papers identified by the search for eligibility. The full text of potentially eligible papers was obtained for further evaluation. Reference lists of key articles and reviews were manually searched to identify additional articles. In case of disagreement, consensus was reached through discussion. The inclusion and exclusion of articles were recorded in a PRISMA flow chart (Fig. 1). Two reviewers (SLG and JJA) independently extracted data from the included studies using a standardized form.

Fig. 1 PRISMA flow chart

Test outcome

CRP levels were compared for patients with and without PIC. PIC was defined as reported in the studies. If provided, outcomes were registered for in-hospital stay and 30-day period. CRP data were recorded whenever mentioned in text, graphs, or figures of the article. Data regarding measures of diagnostic accuracy such as sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) were recorded as reported in the included articles. The cutoff value of CRP with presumed highest discriminatory value was recorded.

Reference standard

The outcome of interest was PIC. PICs were counted per event and defined as reported in the individual studies. The true disease status or reference standard, i.e., whether patients actually developed a PIC, could be determined in multiple ways. Follow-up, surgery, and radiological imaging were all accepted as reference standard for the diagnosis of PIC. Duration of clinical follow-up was recorded.

Study design, patient characteristics, and quality

The following data were extracted from included studies: study period; department of the first author; inclusion period; study design; country of origin; and patient characteristics such as number of included patients, the mean or median age (and range), male to female ratio, time of follow-up, and the number of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The methodological quality of the studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [25, 26]. Two by two contingency tables were extracted or reconstructed for CRP versus PIC for every included study.
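The accuracy measures recorded for each study follow directly from the four cells of the 2 × 2 table. The sketch below is illustrative only: the class `TwoByTwo`, the helper `reconstruct`, and the example counts are assumptions made for this example and are not taken from any included study. Rounding expected counts from a reported sensitivity, specificity, and number of patients with PIC is one plausible way to reconstruct a table; it is not necessarily the procedure used in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2 x 2 table of CRP result (index test) versus PIC (reference standard)."""
    tp: int  # CRP above cutoff, PIC present
    fp: int  # CRP above cutoff, PIC absent
    fn: int  # CRP below cutoff, PIC present
    tn: int  # CRP below cutoff, PIC absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def reconstruct(n_total: int, n_pic: int,
                sensitivity: float, specificity: float) -> TwoByTwo:
    """Hypothetical helper: rebuild cell counts from reported summary measures."""
    n_no_pic = n_total - n_pic
    tp = round(sensitivity * n_pic)
    tn = round(specificity * n_no_pic)
    return TwoByTwo(tp=tp, fp=n_no_pic - tn, fn=n_pic - tp, tn=tn)


# Made-up study: 200 patients, 40 of whom developed a PIC.
table = reconstruct(n_total=200, n_pic=40, sensitivity=0.75, specificity=0.875)
print(table)
print(f"PPV = {table.ppv:.2f}, NPV = {table.npv:.2f}")
```

Because the cell counts are rounded to whole patients, a reconstructed table may reproduce the reported sensitivity and specificity only approximately in small studies.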

Meta-analysis

Meta-analysis was performed with studies that provided sufficient quantitative data to calculate a contingency table for a specified cutoff value of CRP at a specified postoperative day (POD). A nonlinear mixed model was used to obtain summary estimates of sensitivity and specificity with 95 % confidence intervals (CIs) [27]. To compare sensitivity and specificity for each POD, we used the Wald test for unpaired data. Pooled likelihood ratios (LRs), positive predictive values (PPVs), and negative predictive values (NPVs) were calculated for each POD using the pooled sensitivity and specificity. The geometric mean with 95 % CI was used to calculate the pooled CRP value per POD to compare patients with and without PIC.
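The step from a pooled sensitivity and specificity to predictive values can be sketched with Bayes' theorem. This is a minimal illustration, not the mixed-model pooling itself, and it does not reproduce the confidence intervals. The function name `predictive_values` and the input values are assumptions chosen for the example; the prevalence is passed explicitly because predictive values depend on it.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a given sensitivity, specificity, and prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    # Expected proportions of the four cells in the whole population.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative inputs only (not pooled estimates from this review).
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.80, prevalence=0.25)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With a fixed sensitivity and specificity, a lower prevalence raises the NPV and lowers the PPV, which is why predictive values cannot be compared across PODs without also considering the incidence of PIC in the studies pooled for each day.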

The pooled cutoff value was calculated using the cutoff values provided in the individual studies per POD weighted by their sample size. The pooled area under the curve was calculated using the individual AUCs weighted by their sample size. The pooled incidence of PIC was calculated using the incidence of the individual studies weighted by their sample size. The pooled incidence was not necessarily the same for each POD, because for each POD, different studies were available for pooled analyses.
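The sample-size weighting described above amounts to a weighted arithmetic mean. The sketch below assumes exactly that; the function name `weighted_mean` and the three study values are hypothetical and serve only to show the arithmetic.

```python
def weighted_mean(values: list[float], weights: list[float]) -> float:
    """Weighted arithmetic mean; weights are study sample sizes here."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical cutoffs (mg/L) and sample sizes for three studies on one POD.
cutoffs = [140, 150, 200]
sample_sizes = [100, 50, 50]
pooled_cutoff = weighted_mean(cutoffs, sample_sizes)
print(f"Pooled cutoff: {pooled_cutoff:.1f} mg/L")
```

The same function applies to the pooled AUC and the pooled incidence by substituting the per-study AUCs or incidences for the cutoffs.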

The pretest probability of developing a PIC can be thought of as the probability that a patient will develop a PIC based on bedside evaluation. The posttest probability is the probability that a patient will develop a PIC based on both bedside evaluation and the CRP value. The posttest odds were calculated by multiplying the pretest odds by the positive or negative LR. Posttest probabilities for a high and a low CRP were calculated and presented in a graph for pretest probabilities across the range of 0 to 100 %. The underlying assumption of this graph is that the positive and negative LRs are constant across all pretest probabilities. Using the pooled incidence, representing an average patient, as the pretest probability resulted in posttest probabilities for a positive and a negative index test. This incidence can differ across PODs, depending on the studies available for pooled analysis for each POD.
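The conversion from pretest to posttest probability can be sketched as follows, under the same assumption of LRs that are constant across pretest probabilities. The function names and the two LR values are placeholders for illustration and are not the pooled estimates of this review.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posttest probability via odds."""
    if not 0.0 <= pretest <= 1.0:
        raise ValueError("pretest probability must lie between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    if pretest == 1.0:
        return 1.0  # odds are infinite; the probability stays at 1
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


def posttest_curve(likelihood_ratio: float, n_points: int = 101):
    """(pretest, posttest) pairs over an evenly spaced grid from 0 to 1."""
    grid = [i / (n_points - 1) for i in range(n_points)]
    return [(p, posttest_probability(p, likelihood_ratio)) for p in grid]


# Placeholder likelihood ratios, chosen only to illustrate the shape of the curves.
LR_POSITIVE, LR_NEGATIVE = 4.0, 0.25
high_crp_curve = posttest_curve(LR_POSITIVE)  # lies on or above the diagonal
low_crp_curve = posttest_curve(LR_NEGATIVE)   # lies on or below the diagonal
print(high_crp_curve[50], low_crp_curve[50])  # values at a pretest probability of 0.5
```

A test with an LR of 1 returns the pretest probability unchanged, which corresponds to the diagonal of a noninformative test.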

All statistical analyses were performed using SPSS (version 20.0, IBM, Armonk, New York, USA) and SAS (version 9.3, SAS Institute Inc., Cary, North Carolina, USA). P values of <0.05 were considered to indicate statistical significance.

Results

The search identified 3440 articles after excluding duplicate articles. Articles not meeting the inclusion criteria based on assessment of the title and abstract (3388) were excluded. Full text of the 52 potentially eligible articles was retrieved for detailed examination. Inclusion criteria were not met in 30 studies. Most of the excluded studies either had a retrospective design or assessed the value of CRP in settings other than major abdominal surgery [3, 10, 14, 17–19, 21, 28–38]. The remaining 22 studies were included for qualitative analysis, of which 16 studies could also be used for meta-analysis (Fig. 1).

Study and patient characteristics

The included articles were published between 1997 and 2013 (Table 1). CRP levels of 2215 patients were examined for their value in predicting PIC in patients undergoing major abdominal surgery. Postoperative follow-up duration varied from 7 to 60 days. Further study characteristics are summarized in Table 1. Most studies analyzed the value of CRP in colorectal surgery (eight studies), gastrectomy (three studies), and esophagectomy (three studies).

Table 1 Characteristics of included studies

Quality of included studies/risk of bias

The quality of the included studies was fairly good (Figs. 2 and 3). All studies had a representative spectrum of patients and used an acceptable reference standard. The time between index and reference test was acceptable in all studies. The preferred reference standard differed across studies. In most studies, only patients with elevated values of inflammatory markers or a clinical suspicion of complications underwent imaging as diagnostic reference standard (partial verification). The preferred reference test differed between patients with a positive index test (e.g., patients with elevated inflammatory markers) and negative index test. The preferred reference standard in patients with a positive index test was diagnostic imaging (predominantly computed tomography or conventional radiography with water-soluble contrast), whereas in patients with a negative index test, clinical follow-up was the reference standard (differential verification). Only one study avoided partial and differential verification by performing imaging in all patients [1]. In only one study, incorporation of the index test in the reference test was avoided [33]. None of the studies blinded the outcome assessors for the reference standard. In only one study, the index test results were blinded [33]. All studies provided information on uninterpretable results except for one study [22]. One study failed to provide information on withdrawals [51].

Fig. 2 Methodological quality summary of the included studies

Fig. 3 Methodological quality across studies

Overall, the risk of bias in the studies is low due to the nature of the index test (CRP). The outcome of the index test was independent of the reference standard. In all studies, CRP was measured in a standardized manner for study purposes and independent of clinical suspicion of infection; clinical suspicion of infection was not documented.

Predictive value of CRP for infectious complications

The incidence of PIC ranged from 5 to 60 % across studies. The average incidence was 27 % (95 % CI; 26–29 %).

The cutoff level for CRP, which was used to calculate sensitivity, specificity, NPV, and PPV, varied across studies from 48 to 200 mg/L. In most studies, CRP levels were significantly higher in patients with infectious complications compared to patients without complications. This difference increased each POD (Tables 2 and S2).

Table 2 Mean CRP levels per POD in relation to complications

Four studies provided sufficient data for meta-analysis on POD 1, nine studies provided data for POD 3, and six studies provided data for POD 2, POD 4, and POD 5. Sensitivity, specificity, PPV, and NPV from the studies are provided in Table S1 (Appendix). Values of sensitivity and specificity of the individual studies are plotted for each POD in Fig. 4a–e. Sensitivity and specificity increased up to POD 3. The pooled AUC ranged from 0.72 on POD 2 to 0.87 on POD 3 and 0.83 on POD 5. Up to POD 2, the pooled cutoff value of CRP increased. From POD 3 onward, the pooled cutoff value decreased (Table 3). The pooled cutoff value for the CRP was 190 mg/L (range 140–240) on POD 2, 159 mg/L (range 92–200) on POD 3, and 114 mg/L (range 48–150) on POD 5.

Fig. 4 Bubble plot of sensitivity and specificity of the individual studies including the pooled values (the circle with the dashed line is the pooled AUC; the circles with black lines represent the individual studies weighted by their sample size). a On postoperative day 1. Pooled AUC = 0.73. b On postoperative day 2. Pooled AUC = 0.72. c On postoperative day 3. Pooled AUC = 0.87. d On postoperative day 4. Pooled AUC = 0.82. e On postoperative day 5. Pooled AUC = 0.83

Table 3 Pooled diagnostic accuracy of included studies

Pooled diagnostic accuracy variables are listed in Table 3. The pooled sensitivity and specificity increased per POD. The lowest pooled sensitivity and specificity were reached on POD 1: 60 % (95 % CI; 47–71 %) and 60 % (95 % CI; 43–75 %), respectively. Pooled sensitivity and specificity were highest on POD 5, at 86 % (95 % CI; 79–91 %) and 86 % (95 % CI; 75–92 %), and were significantly higher on POD 5 compared to all other PODs (p < 0.001). Using the pooled cutoff values would lead to 23 % of missed cases (1 − sensitivity) of PIC on POD 3, 20 % on POD 4, and 14 % on POD 5.

The pooled NPV increased each day after surgery at a decreasing cutoff of the CRP values (Table 3). The NPV ranged from 82 % (95 % CI; 68–90 %) on POD 1 to 92 % (95 % CI; 85–96 %) on POD 5. The pooled PPV was low, ranging from 41 % (95 % CI; 27–56 %) on POD 1 to 64 % (95 % CI; 49–77 %) on POD 5.

The negative likelihood ratio (LR−) decreased each POD. The highest LR− was 0.67 (95 % CI; 0.33–1.02) on POD 1, and the lowest LR− was 0.17 (0.09–0.25) on POD 5. The positive likelihood ratio (LR+) of CRP increased each POD. The lowest pooled LR+ was 1.48 (95 % CI; 0.66–2.30) on POD 1, and the highest LR+ was 6.07 (95 % CI; 2.26–9.89) on POD 5.
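As a rough plausibility check, likelihood ratios can be recomputed from the pooled sensitivity and specificity reported above using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below does this for POD 1 and POD 5; the function name is an assumption, and the inputs are the rounded percentages from the text, so small deviations from the pooled LRs in Table 3 are to be expected.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled sensitivity and specificity as reported in the text (rounded values).
reported = {"POD 1": (0.60, 0.60), "POD 5": (0.86, 0.86)}
ratios = {pod: likelihood_ratios(se, sp) for pod, (se, sp) in reported.items()}

for pod, (lr_pos, lr_neg) in ratios.items():
    print(f"{pod}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

The recomputed values (about 1.5 and 0.67 on POD 1, about 6.1 and 0.16 on POD 5) are close to the pooled LRs reported above.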

Figure 5 presents the (posttest) probability of a PIC for a patient with a high CRP (green line) and a low CRP (red line) on POD 3 (Fig. 5a) and POD 5 (Fig. 5b). The cutoff value between a low and a high CRP was 159 mg/L on POD 3 and 114 mg/L on POD 5. The posttest probability of a PIC in an average patient with a high CRP on POD 3 was 61 % versus 12 % in an average patient with a low CRP. On POD 5, an average patient with a high CRP had a posttest probability of 55 % versus 3 % in an average patient with a low CRP.
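The POD 5 figures can be approximately retraced by hand. The sketch below takes the pooled incidence of 16 % given in the legend of Fig. 5 as the pretest probability and applies the pooled LR+ of 6.07 and LR− of 0.17 reported for POD 5; the function and constant names are assumptions for this example.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Pretest probability -> odds -> multiply by LR -> back to probability."""
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Values reported for POD 5 in the text and in the legend of Fig. 5.
PRETEST_POD5 = 0.16   # pooled incidence of PIC
LR_POS_POD5 = 6.07    # pooled positive likelihood ratio
LR_NEG_POD5 = 0.17    # pooled negative likelihood ratio

high_crp = posttest_probability(PRETEST_POD5, LR_POS_POD5)
low_crp = posttest_probability(PRETEST_POD5, LR_NEG_POD5)
print(f"high CRP: {high_crp:.1%}, low CRP: {low_crp:.1%}")
```

This gives roughly 54 % for a high CRP and 3 % for a low CRP, in line with the reported 55 % and 3 %; the small difference for the high CRP presumably reflects rounding of the published inputs.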

Fig. 5 a The (posttest) probability of a PIC is presented for a patient with a high CRP (green line) and a low CRP (red line) on POD 3 (a) and POD 5 (b). The cutoff value between a low and a high CRP was 159 mg/L on POD 3 and 114 mg/L on POD 5. The arrows show that the posttest probability of a PIC for an average patient (incidence of PIC 32 %) with a high CRP on POD 3 was 61 % versus 12 % in an average patient with a low CRP. The length of the arrows represents the absolute change in probability of a PIC in case of a high or low CRP. On POD 5, an average patient (incidence of PIC 16 %) with a high CRP had a posttest probability of 55 % versus 3 % in an average patient with a low CRP. The black diagonal line at 45° with the x-axis represents the line of a hypothetical noninformative test in which the pretest and posttest probabilities are equal. The posttest probability of a PIC can be read from the two panels for any pretest probability (i.e., based on bedside evaluation) and CRP value. Pretest probability (incidence) = 0.32; posttest probability for a high CRP (>159 mg/L) = 0.61 and for a low CRP (<159 mg/L) = 0.12. b Posttest probability as a function of pretest probability for the positive and negative likelihood ratio on POD 5

Discussion

The objective of this meta-analysis was to evaluate the value of CRP to rule out PICs after abdominal and gastrointestinal surgery. In the era of fast-track surgery where patients are discharged early after surgery and mostly within the first five PODs, there is a need for a reliable, inexpensive, and widely available marker that permits safe and early discharge of patients. A marker reliable enough to rule out the presence of infectious complications would have a high NPV and a low negative LR. The NPV of CRP is higher than 90 % from day 3 onward. This suggests that patients with a CRP below 159 mg/L on POD 3 have a low probability of developing a PIC and could safely be discharged early.

A recent meta-analysis that focused on the diagnostic value for the presence of anastomotic leakage also found a high NPV justifying early discharge of patients after colorectal surgery [52]. However, this meta-analysis is limited by the methodological design of the included studies. The majority of included studies have a retrospective design resulting in significant heterogeneity. A retrospective study design leads to selective measurement of CRP in patients who are clinically suspected of having infectious complications (incorporation bias), which may lead to an overestimation of diagnostic accuracy.

In many European centers, CRP is used in daily practice in combination with clinical judgment based on history and physical examination. Combining CRP with clinical judgment might increase the (negative) predictive value of CRP even more. Bedside evaluation during the postoperative course broadly classifies patients into three categories. Firstly, there is a group of patients without a clinical suspicion of infectious complications and, thus, a very low pretest probability. In these patients, even an elevated CRP value on POD 5 may not increase the posttest probability enough to warrant a change in management (e.g., imaging or antibiotics). CRP is of limited value for these patients. The second group includes patients in whom the suspicion of infectious complications based on clinical evaluation is very high. In these patients, the (posttest) probability will remain sufficiently high, even for a low CRP value, to justify a change in management. CRP again has limited value in decision making for these patients. Finally, predominantly in patients with an intermediate pretest probability of infectious complications, CRP values are most likely to determine the need for a change in management. For high CRP values, the (posttest) probability of an infectious complication might be sufficiently high to justify a change in management, while a low CRP value might justify no change in management. Figure 5 can be used as a decision aid, in which the physician still needs to determine the (posttest) probability of a PIC above which he or she feels that a change in management is warranted.

Studies have also demonstrated that CRP values initially increase postoperatively and then tend to normalize in patients without infectious complications around POD 3 [52]. Other studies have confirmed this finding [1, 16, 23, 28, 53, 54]. The results of the present meta-analysis are consistent with these studies, demonstrating that average values of CRP differ between patients with and without infectious complications. These findings suggest that prolonged elevated CRP values are predictive of infectious complications. CRP might be clinically useful to aid selection of patients for additional imaging. In this review, the highest PPV of CRP was 64 % on POD 5. This PPV would lead to a false-positive diagnosis in 36 % of patients with a high CRP, resulting in unnecessary additional imaging in these patients.

Diagnostic test research can be subject to several limitations. Firstly, knowledge of the CRP value might influence the interpretation of the reference test. For example, the same intraabdominal fluid collection on a CT scan may be classified as an abscess if the CRP is high, but as ascites if the CRP is low. In only one study were the assessors blinded to the CRP value when determining the presence or absence of PIC [33]. Another difficulty in diagnostic test research is the heterogeneity across studies in the selected optimal CRP cutoff value. The optimal cutoff value is selected by the authors of the individual studies at the level at which they feel patients with and without infectious complications are best distinguished. At a higher cutoff value for CRP, the sensitivity is lower, but the specificity is higher. To prevent bias, ideally, the analysis should have been performed using the same cutoff value of CRP in each study. Unfortunately, insufficient data were reported to use a single optimal cutoff value of CRP. Pooling study results, even with small differences in the cutoff value, has inevitably biased the results. Diagnostic test results can also be influenced by variation in timing of the reference test: for example, whether a CT scan to detect an intraabdominal abscess was performed early or late during follow-up. Another limitation is the difference in the definition of infectious complications between studies. This might influence the incidence of infectious complications. We aimed to minimize the risk of bias by including studies that used radiological and/or clinical evidence to define infectious complications. Finally, the pooled analysis included patients who underwent different types of surgery. A bias may have been introduced because the prognostic value of CRP may depend on the type of surgery. It has been demonstrated that the postoperative increase in CRP is dependent on the extent of operative trauma. Different types of surgery could lead to different absolute values of CRP. Nevertheless, this discrepancy between types of abdominal surgery can only exist in the first 2 days due to the short half-life of CRP (19 h). The trend where CRP values tend to normalize in patients without infectious complications around POD 3 has been demonstrated for various types of surgery [1, 16, 23, 28, 53, 54]. The major advantage of combining different types of abdominal surgery is that a much larger sample size was reached, resulting in more precise estimates.
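The half-life argument can be made concrete with a simple decay calculation. The sketch below assumes pure first-order elimination with a half-life of 19 h and no further CRP synthesis after the peak; this is a simplifying assumption for illustration only, since production continues for as long as the inflammatory stimulus persists. The function name is hypothetical.

```python
def fraction_remaining(hours: float, half_life_hours: float = 19.0) -> float:
    """Fraction of a peak concentration left after `hours`, assuming
    first-order elimination and no ongoing production (simplification)."""
    if hours < 0 or half_life_hours <= 0:
        raise ValueError("hours must be >= 0 and half-life must be > 0")
    return 0.5 ** (hours / half_life_hours)


for hours in (19, 48, 72):
    print(f"{hours:>2} h after the peak: {fraction_remaining(hours):.0%} remaining")
```

Under these assumptions, less than one fifth of a peak value remains after 48 h, so differences in peak CRP between types of surgery shrink quickly once the stimulus from surgical trauma has subsided.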

This meta-analysis evaluated the value of CRP as a predictor to rule out infectious complications. However, when CRP is combined with bedside clinical evaluation, as used in daily practice, the NPV of CRP might further increase as illustrated in Fig. 5. In the literature, no studies were found that assessed the added value of CRP on top of bedside judgment. Future studies should aim to determine this added value of CRP. These studies should also evaluate the diagnostic value of the change in CRP in the postoperative period instead of focusing on the absolute value. The change in CRP (e.g., between POD 2 and 5) may be a stronger predictor than the absolute CRP value on POD 5. Also, the actual benefit of early detection and management is assumed but needs to be determined more definitively. In conclusion, CRP values seem clinically useful to aid patient selection for safe and early hospital discharge and prevent overuse of imaging.