Background

Patients with persistent lower abdominal complaints are common in primary care [1]. At presentation, the general practitioner (GP) has to differentiate between potentially life-threatening significant colorectal diseases (SCD), such as colorectal cancer (CRC) and inflammatory bowel disease (IBD), and functional bowel disorders such as irritable bowel syndrome. As symptoms and signs alone have insufficient specificity, GPs refer many patients for endoscopy so as not to miss an SCD diagnosis. Consequently, 60–80 % of referred patients do not have SCD at endoscopy [2–6], unnecessarily straining healthcare budgets and endoscopy schedules, and exposing many non-SCD patients to a small but real risk of severe endoscopy-associated complications.

Thus, an improved diagnostic strategy that can safely rule out SCD is needed. Previous – largely non-primary care – studies have shown that diagnostic strategies solely based on symptoms and signs are unlikely to suffice [7, 8]. Adding faecal biomarkers to such diagnostic strategies may, however, improve their performance. One promising faecal biomarker is calprotectin, which indicates the presence of intestinal inflammation [9]. Calprotectin has been recommended by the National Institute for Health and Care Excellence (NICE) to help distinguish between IBD and non-IBD [10]. However, calprotectin has only been evaluated as a single test without accounting for other diagnostic information [11–13]. Furthermore, the presence of faecal haemoglobin (Hb) may indicate neoplastic disease [14]. Faecal occult blood tests have previously been included in diagnostic strategies for CRC with limited success [15, 16]. Over the past decade these tests have improved substantially, mainly because of specific immunochemical detection of human Hb, resulting in so-called faecal immunochemical tests for Hb (FITs) [14].

We designed the large-scale prospective CEDAR study (Cost-Effectiveness of a Decision rule for Abdominal complaints in primary caRe), to develop a new diagnostic strategy to safely rule out SCD in primary care patients with lower abdominal complaints, thus reducing the number of unnecessary endoscopy referrals. To meet this aim, we specifically quantified the incremental diagnostic accuracy of a point-of-care (POC) calprotectin test and a POC FIT above routine diagnostic information, both individually and in combination. We specifically focused on POC tests as these can be easily executed at the time and place of patient care.

Methods

Study design

The prospective diagnostic CEDAR study enrolled patients from 266 Dutch primary care practices referred for endoscopy from July 2009 through January 2012 [11]. Patients were eligible if suspected of SCD, defined by lower abdominal complaints for at least 2 weeks, combined with rectal bleeding, change in bowel habit, abdominal pain, fever, diarrhoea, weight loss, and/or a sudden onset of abdominal complaints at > 50 years of age. Patients were excluded if they were younger than 18 years, had known SCD, or had a confirmed parasitic bowel infection. Recruitment was at the GP’s office (19.0 %) or directly following endoscopy scheduling (81.0 %). Eligible patients who were not directly recruited by their GP were contacted by our research staff. If at any time during the study patient referral outpaced our study resources, every nth scheduled patient was screened and contacted to guarantee representativeness of the study population. The University Medical Center Utrecht ethics committee approved the study (protocol number 08-462E), and all patients gave written informed consent.

History taking and physical examination

Patient and GP questionnaires facilitated structured history taking. Abdominal pain, rectal blood loss or mucus, weight loss, and fever were considered present upon patient or GP report; duration of abdominal pain, abdominal bloating, and family history of CRC upon patient report; and change in bowel habit upon GP report. We defined constipation as at least two of the following symptoms: fewer than three bowel movements/week, difficult/incomplete defecation, hard/lumpy faeces, sensation of anorectal obstruction, or laxative use. We based diarrhoea on frequently loose/liquid faeces, or anti-diarrhoea medication use. GPs reported the presence of a palpable abdominal mass or an abnormal digital rectal examination.
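As an illustration only (it is not part of the study protocol), the constipation and diarrhoea definitions above could be encoded as follows. The record type and its field names are hypothetical and merely mirror the questionnaire items named in the text.

```python
from dataclasses import dataclass


@dataclass
class BowelSymptoms:
    """Hypothetical questionnaire record; field names are illustrative."""
    bowel_movements_per_week: float
    difficult_or_incomplete_defecation: bool
    hard_or_lumpy_faeces: bool
    anorectal_obstruction_sensation: bool
    laxative_use: bool
    frequent_loose_or_liquid_faeces: bool
    antidiarrhoeal_medication_use: bool


def has_constipation(s: BowelSymptoms) -> bool:
    """Constipation: at least two of the five listed symptoms are present."""
    criteria = [
        s.bowel_movements_per_week < 3,
        s.difficult_or_incomplete_defecation,
        s.hard_or_lumpy_faeces,
        s.anorectal_obstruction_sensation,
        s.laxative_use,
    ]
    return sum(criteria) >= 2


def has_diarrhoea(s: BowelSymptoms) -> bool:
    """Diarrhoea: frequently loose/liquid faeces or anti-diarrhoea medication use."""
    return s.frequent_loose_or_liquid_faeces or s.antidiarrhoeal_medication_use
```

Under these definitions a patient can meet both criteria at once, which the text does not exclude.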

Blood and faecal SCD biomarkers

A pre-endoscopy venous blood sample was drawn to estimate Hb and C-reactive protein (CRP) concentrations according to routine clinical practice. Directly following study inclusion, patients provided faeces samples collected before bowel preparation for endoscopy in a plain blue-capped faecal container, and kept refrigerated (4 °C) for a maximum of 2 days before handing in. Study protocol allowed freezing (–20 °C) of faecal samples before processing (this occurred in 67.9 % of samples; median days between collection and processing: 10; 10th–90th percentile: 4–21). If not frozen, the refrigerated faecal samples needed to be processed for calprotectin testing within 6 days (adherence 96.3 %; median days: 2; 10th–90th percentile: 0–3), and needed to be tested for Hb within 3 days of collection (adherence 94.5 %; median days: 2; 10th–90th percentile: 0–3).

We analysed the faecal samples for calprotectin concentration by a quantitative POC test (Quantum Blue®; dynamic range 30–300 μg/g) and by an enzyme-linked immunosorbent assay (ELISA; EK-CAL Calprotectin ELISA, both from Bühlmann Laboratories), both yielding estimates of μg calprotectin/g faeces, and for faecal Hb by a qualitative POC FIT (Clearview® iFOBT One Step Faecal Occult Blood Test Device, Alere Health), yielding either a positive or negative test result (lower detection limit of 6 μg/g). Laboratory technicians performed the ELISA, and trained research nurses the POC tests, blinded for clinical information and according to the manufacturers’ instructions. Briefly, for the calprotectin assays, 80 mg homogenized faeces was centrifuged and the supernatant was tested for calprotectin (1:16 diluted for the POC test and undiluted for the ELISA; supernatant for the ELISA was stored at –20 °C for maximally 4 months before analysis); for the POC FIT three separate random areas of the faecal sample were stabbed by the specimen collection stick and transferred to the collection tube, and two drops of extracted specimen were then applied to the test device. For more details see Kok et al. [11].

Diagnostic outcome

Experienced gastroenterologists from three high-volume centres (i.e. > 1000 endoscopies annually) performed endoscopy in all patients, i.e. colonoscopy or sigmoidoscopy. A final diagnosis was established according to routine clinical practice, including histopathology of biopsies if required, and 3 months follow-up after negative endoscopy. We defined SCD as CRC, IBD, diverticulitis, or advanced adenoma (AA; > 1 cm). Outcome assessment was blinded for the biomarker test results and other diagnostic information.

Statistical analysis

In view of the number of SCD diagnoses [17], we first developed a basic diagnostic model for SCD considering 15 patient history and physical examination predictors (listed in Table 1) and simple blood analyses (Hb and CRP concentrations). We started by selecting patient history and physical examination predictors using Akaike’s Information Criterion (AIC)-based stepwise-backward logistic regression; first considering and selecting only the patient history predictors, and then considering and selecting the physical examination predictors while keeping the selected patient history predictors fixed. Subsequently, Hb and/or CRP were only selected if they significantly improved the patient history/physical examination model. We deliberately used a more stringent selection criterion for the blood analyses (P < 0.05 instead of AIC-based) in view of the patient burden associated with obtaining this information. Blood Hb and CRP were modelled continuously instead of using a threshold for abnormal values (e.g. defining anaemia), to preserve as much diagnostic information as possible.
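The two-stage AIC-based backward selection described above can be sketched as follows. This is a minimal illustration, not the authors' R code: `fit_loglik` is a hypothetical caller-supplied function that would fit a logistic model and return its log-likelihood, and the toy `gain` table and predictor names are invented so that the example runs without a dataset.

```python
from typing import Callable, FrozenSet, Iterable


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


def backward_select(
    candidates: Iterable[str],
    fit_loglik: Callable[[FrozenSet[str]], float],
    fixed: Iterable[str] = (),
) -> FrozenSet[str]:
    """Drop candidates one at a time for as long as doing so lowers the AIC.

    Predictors in `fixed` always stay in the model (used for the second stage).
    """
    fixed = frozenset(fixed)
    current = frozenset(candidates)

    def model_aic(selected: FrozenSet[str]) -> float:
        predictors = selected | fixed
        # +1 for the intercept; assumes one coefficient per predictor.
        return aic(fit_loglik(predictors), len(predictors) + 1)

    best = model_aic(current)
    while current:
        # Evaluate every single-predictor removal and pick the lowest AIC.
        trials = {p: model_aic(current - {p}) for p in sorted(current)}
        drop = min(trials, key=trials.get)
        if trials[drop] >= best:
            break  # no removal improves the AIC
        current, best = current - {drop}, trials[drop]
    return current


# Toy stand-in for model fitting (HYPOTHETICAL numbers): each predictor
# adds a fixed amount to the log-likelihood.
gain = {"rectal_bleeding": 5.0, "bloating": 0.4, "age": 2.0, "abdominal_mass": 0.2}


def toy_fit(predictors: FrozenSet[str]) -> float:
    return -100.0 + sum(gain[p] for p in predictors)


# Stage 1: patient history only. Stage 2: physical examination, history fixed.
history = backward_select(["rectal_bleeding", "bloating", "age"], toy_fit)
exam = backward_select(["abdominal_mass"], toy_fit, fixed=history)
```

In the toy data a predictor is retained only when it raises the log-likelihood by more than 1, which is exactly the point at which its AIC penalty of 2 is repaid.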

Table 1 Distribution and accuracy of individual predictors for diagnosing SCD in primary care as observed in 810 Dutch patients with lower abdominal complaints referred for endoscopy in the CEDAR study

We then added the faecal biomarker tests to this basic diagnostic model (the calprotectin tests continuously and the POC FIT dichotomously), resulting in five extended models: three separate extensions (calprotectin POC or ELISA, or POC FIT), and two combined extensions (calprotectin POC or ELISA with POC FIT). As faecal testing may also be burdensome, we used the same stringent selection criterion for each faecal biomarker test as for the blood analyses (i.e. P < 0.05 for model improvement). Any blood analysis included in these extended models was subsequently removed if non-significant. For those models extended with the FIT, we also considered whether the FIT diagnostic odds ratio for SCD was lower in patients with overt rectal blood loss compared to those without (implying less diagnostic information), by testing a [FIT*blood loss] interaction term. All predictor selection tests were based on the log likelihood ratio. In all modelling, continuous predictors were included as such, using transformations if necessary to maintain linearity, while truncating outliers. Transformations were necessary for blood Hb (U-shape relation with SCD risk), and for duration of abdominal pain and CRP (logarithmic relations). See Additional file 1 for further model development details.
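The log likelihood ratio test used for predictor selection can be illustrated for the simplest case of adding a single term (for example the dichotomous POC FIT, or one interaction term) to a nested model. The sketch below assumes exactly one added parameter, so the reference distribution is chi-square with 1 degree of freedom; the log-likelihood values in the usage line are invented.

```python
import math
from typing import Tuple


def likelihood_ratio_test_1df(loglik_reduced: float, loglik_full: float) -> Tuple[float, float]:
    """Return (LR statistic, P value) for adding one parameter to a nested model.

    Under the null hypothesis, 2 * (lnL_full - lnL_reduced) follows a chi-square
    distribution with 1 degree of freedom, whose upper tail is erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    statistic = max(statistic, 0.0)  # guard against tiny negative rounding errors
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods of a basic and an extended model.
statistic, p_value = likelihood_ratio_test_1df(-100.0, -98.0)
keep_added_term = p_value < 0.05  # the stringent criterion described in the text
```

A predictor that enters with more than one coefficient would need a chi-square distribution with correspondingly more degrees of freedom, which this sketch does not cover.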

The final six diagnostic models were assessed for discrimination (area under the receiver operating characteristic curve; AUC), calibration, explained variation (Nagelkerke R²), accuracy (i.e. sensitivity, specificity, negative and positive predictive values (NPV and PPV) at different SCD probability thresholds: 2.5 %, 5.0 % and 7.5 %), and net benefit (decision curve analysis) [18–20]. All faecal biomarker extended models were compared to the basic model and the combined biomarker extended models to the individual biomarker extended models, in terms of discrimination, explained variation, and reclassification (net reclassification improvement (NRI) at 5.0 % and 50.0 % probability threshold for low and high risk, and (relative) integrated discrimination improvement (IDI)) [21].
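To make the accuracy and net benefit measures concrete, the sketch below computes them for a referral rule of the form "refer if predicted SCD probability is at or above the threshold". The net benefit formula is the standard decision curve definition (true positives minus false positives weighted by the threshold odds, per patient); the six patients in the usage lines are invented.

```python
from typing import Dict, Sequence


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division; returns NaN when a measure is undefined."""
    return numerator / denominator if denominator else float("nan")


def referral_accuracy(
    probabilities: Sequence[float],
    has_scd: Sequence[bool],
    threshold: float,
) -> Dict[str, float]:
    """Accuracy measures when every patient with risk >= threshold is referred."""
    tp = fp = tn = fn = 0
    for p, diseased in zip(probabilities, has_scd):
        referred = p >= threshold
        if referred and diseased:
            tp += 1
        elif referred:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    odds = threshold / (1.0 - threshold)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "not_referred": _ratio(tn + fn, n),
        # Decision-curve net benefit of the "refer" decision.
        "net_benefit": _ratio(tp, n) - _ratio(fp, n) * odds,
    }


# Invented example: six patients evaluated at the 5.0 % threshold.
example = referral_accuracy(
    [0.02, 0.04, 0.10, 0.30, 0.60, 0.03],
    [False, False, False, True, True, True],
    threshold=0.05,
)
```

Evaluating the same function at 0.025, 0.05 and 0.075 would give the kind of threshold comparison reported in the paper's accuracy table.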

We used 500-fold bootstrap resampling, including predictor selection, to derive optimism-corrected AUCs, Nagelkerke R² values, and regression coefficients [22]. We multiply imputed the 5.2 % missing data points [23–25], and pooled the results from the 10 imputed datasets [26, 27]. Analyses were performed in R version 3.1.3. All P values are two-sided. This publication adheres to the TRIPOD statement [28].
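Results from multiply imputed datasets are commonly pooled with Rubin's rules. The sketch below shows that rule for a single parameter; that the authors used exactly this procedure is an assumption, as the text only states that results from the 10 imputed datasets were pooled. The numbers in the usage line are invented.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one parameter across m imputed datasets with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average squared standard error
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Invented example with three imputations of one regression coefficient.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The square root of the total variance gives the pooled standard error, which is wider than any single-dataset standard error whenever the imputations disagree.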

Results

Study population

Of 843 enrolled patients, 810 could be evaluated (96.1 %; Fig. 1). Their median age was 61 years (range 19–92), and 54.9 % were female. SCD was diagnosed in 17.4 % of patients (n = 141; 37 had CRC, 37 IBD, 18 diverticulitis, and 49 AA). The most frequent presenting symptoms were abdominal pain (80.7 %), change in bowel habit (65.5 %), constipation (57.9 %) and abdominal bloating (55.0 %; Table 1). CRP was elevated in 9.4 % and 48.7 % tested positive for calprotectin (POC, threshold at > 50 μg/g). Rectal blood loss was present in 43.6 % and 25.1 % tested POC FIT positive. Half the patients provided a faecal sample within 19 days of the GP visit (25th–75th percentile: 13–26), median waiting time for endoscopy was 28 days (25th–75th percentile: 17–39), and median time between faecal sample collection and endoscopy was 5 days (25th–75th percentile: 1–15). Of all considered predictors, the faecal biomarkers yielded the highest NPVs for SCD if evaluated individually.

Fig. 1

Flowchart of Dutch primary care patients with lower abdominal complaints for at least 2 weeks and referred for endoscopy, and their enrolment in the CEDAR study from July 2009 through January 2012. CEDAR Cost-Effectiveness of a Decision rule for Abdominal complaints in primary care; GP general practitioner; SCD significant colorectal disease. 1 Non-SCD was established by other bowel tests for six patients (abdominal ultrasound in five and barium enema in one patient) and by the gastroenterologist based on bowel investigations performed before recruitment in the study for four patients. 2 SCD was established by the gastroenterologist for one patient on the basis of bowel investigations performed before recruitment in the study

Basic and extended diagnostic models

Nine of the 15 candidate predictors from patient history and physical examination were selected for the basic diagnostic model, to which blood Hb did not significantly contribute (P = 0.23) but CRP did (P = 0.03; see Table 2 for specification of the basic diagnostic model). This basic model significantly improved upon individual or combined extension with the calprotectin POC or ELISA and the POC FIT tests. Although CRP significantly contributed to the basic diagnostic model, it did not contribute to any of the five faecal biomarker extended models and was thus excluded from these. In none of the models with POC FIT did the odds ratio for SCD significantly differ in patients with and without rectal blood loss (Additional file 1), so we did not stratify the FIT results for overt rectal bleeding subgroups in the final models.

Table 2 Improvement in discrimination, reclassification, and explained variation upon various extensions of the basic diagnostic model and individual faecal biomarker extended models for SCD, as observed in 810 Dutch patients with lower abdominal complaints referred for endoscopy in the CEDAR study

Model performance and comparison

The basic model’s AUC increased from 0.741 (95 % CI, 0.694–0.789) to 0.763 (95 % CI, 0.718–0.809; P = 0.078) and 0.831 (95 % CI, 0.791–0.872; P < 0.001) upon extension with POC calprotectin and FIT, respectively, and to 0.837 (95 % CI, 0.798–0.876; P < 0.001) upon combined extension (Fig. 2 and Table 2). All three POC test extended models showed significant net reclassification improvement compared to the basic model. The FIT-only extended model and the combined POC extended model both yielded the highest NRI (both 0.38; see Additional file 1 for the corresponding reclassification tables). When adding FIT to the calprotectin POC extended model, both the AUC and NRI significantly increased, which was not true for adding calprotectin to the FIT extended model (Table 2). The basic model explained 19.0 % of the variation in SCD, which increased to 23.5, 34.5, and 35.8 % for the calprotectin, the FIT, and the combined POC extended models, respectively. All diagnostic models showed excellent calibration (Additional file 1).
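For readers unfamiliar with the NRI, the following sketch computes a categorical NRI with the low-risk and high-risk cut-offs named in the Methods (5.0 % and 50.0 %). It is a simplified illustration on invented data and ignores the bootstrap and imputation steps used in the study.

```python
from bisect import bisect_right
from typing import Sequence


def risk_category(probability: float, cutoffs: Sequence[float] = (0.05, 0.50)) -> int:
    """0 = low (< 5 %), 1 = intermediate, 2 = high (>= 50 %)."""
    return bisect_right(cutoffs, probability)


def net_reclassification_improvement(
    old_probs: Sequence[float],
    new_probs: Sequence[float],
    has_scd: Sequence[bool],
    cutoffs: Sequence[float] = (0.05, 0.50),
) -> float:
    """Categorical NRI: net upward moves in SCD plus net downward moves in non-SCD."""
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, diseased in zip(old_probs, new_probs, has_scd):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if diseased:
            n_event += 1
            up_event += int(move > 0)
            down_event += int(move < 0)
        else:
            n_nonevent += 1
            up_nonevent += int(move > 0)
            down_nonevent += int(move < 0)
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one SCD and one non-SCD patient")
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent


# Invented example: two SCD patients move up, one of two non-SCD patients moves down.
nri = net_reclassification_improvement(
    old_probs=[0.04, 0.30, 0.10, 0.10],
    new_probs=[0.20, 0.60, 0.02, 0.10],
    has_scd=[True, True, False, False],
)
```

The NRI is a sum of two proportions and therefore ranges from -2 to 2; it is not itself a proportion of patients.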

Fig. 2

Receiver operating characteristic curves for diagnosing SCD for the basic diagnostic model, and the POC FIT and the calprotectin POC test extended models. FIT faecal immunochemical test for haemoglobin; POC point-of-care; SCD significant colorectal disease. Areas under the curve (before optimism-correction): basic model 0.741 (95 % CI, 0.694–0.789); calprotectin POC test extended 0.763 (95 % CI, 0.718–0.809); POC FIT extended 0.831 (95 % CI, 0.791–0.872); Both faecal POC tests extended 0.837 (95 % CI, 0.798–0.876). Dashed line is reference line
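The AUC can be read as the probability that a randomly chosen SCD patient receives a higher predicted risk than a randomly chosen non-SCD patient. A minimal sketch of that pairwise definition, with ties counted as one half and applied to invented data, is:

```python
from typing import Sequence


def auc(probabilities: Sequence[float], has_scd: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    cases = [p for p, d in zip(probabilities, has_scd) if d]
    controls = [p for p, d in zip(probabilities, has_scd) if not d]
    if not cases or not controls:
        raise ValueError("need at least one SCD and one non-SCD patient")
    wins = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(controls))


# Invented example: three of the four case-control pairs are concordant.
example_auc = auc([0.10, 0.40, 0.35, 0.80], [False, False, True, True])
```

This quadratic-time version is adequate for a cohort of the size studied here; rank-based formulas give the same value more efficiently.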

Ruling out SCD

Using the combined POC extended model at the ≥ 5.0 % SCD probability threshold for referral would rule out SCD (i.e. prevent referral) in 30.4 % of all patients in our study, with 96.4 % NPV and 93.7 % sensitivity (inappropriately not referring one CRC [stage 1], four diverticulitis, and four AA patients; Table 3). At the same threshold, the FIT-only extended model would rule out SCD in 30.1 % of patients with 96.0 % NPV, but would miss one additional AA (resulting in 93.0 % sensitivity). At a ≥ 2.5 % referral threshold, the considered diagnostic models would prevent referral in 2.0–7.2 % of patients with 98.0–100.0 % NPV and 99.4–100.0 % sensitivity, and a threshold of ≥ 7.5 % would prevent referral in 27.5–46.7 % of patients with 93.4–95.7 % NPV and 87.9–90.0 % sensitivity.

Table 3 Diagnostic accuracy when basing endoscopy referral on varying SCD probability thresholds for the basic and the five faecal biomarker extended models, as observed in 810 Dutch patients with lower abdominal complaints referred for endoscopy in the CEDAR study

Regarding the net benefit at the ≥ 5.0 % SCD probability threshold for referral when compared to the basic model, the combined POC extended model resulted in 60 more correctly non-referred patients without increasing the number of non-referred SCD patients, and three more correctly referred SCD patients without increasing unnecessary referrals (all per 1000 tested patients). These numbers were 34 and two, respectively, for the FIT extended model (Additional file 1).

Calprotectin POC versus ELISA test

Substituting the calprotectin POC with an ELISA test yielded similar results with regard to discrimination, explained variation, reclassification, and diagnostic accuracy (Tables 2 and 3; see Additional file 1 for ROC curves).

Towards use in new patients

To improve valid estimation of SCD risk in future patients, Table 4 shows the optimism-corrected regression coefficients of the combined POC and the FIT-only extended models (see Additional file 1 for the other models); the optimism-corrected AUC and explained variation of these models were 0.818 (95 % CI, 0.779–0.857) and 0.813 (95 % CI, 0.772–0.853), and 30.6 % and 29.5 %, respectively. See Additional file 1 for nomograms.
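Applying such a model to a new patient amounts to converting the linear predictor of the logistic model into a probability and comparing it with the referral threshold. The sketch below shows the mechanics only: the intercept, the coefficients, and the predictor names are HYPOTHETICAL placeholders and are not the Table 4 values.

```python
import math
from typing import Mapping

# HYPOTHETICAL values for illustration only; NOT the published coefficients.
EXAMPLE_INTERCEPT = -4.5
EXAMPLE_COEFFICIENTS = {
    "age_per_10_years": 0.30,
    "rectal_blood_loss": 0.50,
    "poc_fit_positive": 2.00,
}


def scd_risk(
    predictors: Mapping[str, float],
    intercept: float,
    coefficients: Mapping[str, float],
) -> float:
    """Predicted probability from a logistic model: 1 / (1 + exp(-linear predictor))."""
    linear_predictor = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def refer_for_endoscopy(risk: float, threshold: float = 0.05) -> bool:
    """Refer when the predicted SCD risk is at or above the chosen threshold."""
    return risk >= threshold


# Two invented 40-year-old patients that differ only in their POC FIT result.
fit_negative = {"age_per_10_years": 4.0, "rectal_blood_loss": 0.0, "poc_fit_positive": 0.0}
fit_positive = {"age_per_10_years": 4.0, "rectal_blood_loss": 0.0, "poc_fit_positive": 1.0}
risk_negative = scd_risk(fit_negative, EXAMPLE_INTERCEPT, EXAMPLE_COEFFICIENTS)
risk_positive = scd_risk(fit_positive, EXAMPLE_INTERCEPT, EXAMPLE_COEFFICIENTS)
```

With the real optimism-corrected coefficients substituted, the same two functions would reproduce what a nomogram does graphically.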

Table 4 Risk of SCD in relation to routine diagnostic predictors and faecal biomarkers as based on the optimism-corrected combined POC and the POC FIT extended diagnostic models, developed in 810 Dutch primary care patients with lower abdominal complaints referred for endoscopy in the CEDAR study

Discussion

We are the first to develop a diagnostic strategy in primary care patients suspected of SCD, considering signs, symptoms, simple blood analyses, and both faecal calprotectin and Hb levels. This study showed that especially a POC FIT, and to a much lesser extent calprotectin tests, have incremental value beyond patient history, physical examination, and CRP in ruling out SCD in primary care patients with persistent lower abdominal complaints. Use of a simple diagnostic model including calprotectin POC and POC FIT test results could safely rule out SCD and prevent endoscopy referral in about 30 % of patients with 96.4 % NPV (at a 5.0 % SCD probability referral threshold). Excluding the calprotectin test from this model yielded similar results, missing one additional AA patient (of the 49 present in our study). Substituting the calprotectin POC test by an ELISA did not substantially change these results.

A perfect strategy would not miss any SCD patients. A substantial reduction of the number of unnecessary endoscopy referrals – as we show is feasible – will, however, inevitably result in a small risk of missing serious SCD. In our study, one patient with stage 1 CRC was not selected for referral by any of the POC FIT extended models at the ≥ 5.0 % SCD probability threshold (this patient tested negative on both the calprotectin POC test and the POC FIT). Provided that patients who are not referred at first consultation are monitored closely for symptoms persisting over a time frame of 2–3 weeks, we think this will result in delaying, but not missing, such diagnoses. Such a limited delay is also unlikely to advance the disease stage substantially for CRC patients who were initially non-referred [29].

Notwithstanding the 2013 NICE recommendation for use in diagnosing IBD [10], calprotectin has so far only been studied in the absence of other diagnostic information [11–13]. One retrospective study investigating the use of calprotectin in irritable bowel syndrome-suspected primary care patients from the United Kingdom reported an AUC for SCD of 0.89 (95 % CI, 0.85–0.93), much higher than we report here (0.68; 95 % CI, 0.63–0.73 [POC], 0.66; 95 % CI, 0.61–0.72 [ELISA]) [12]. Besides the different patient populations, adenomas were not considered SCD in that study, as they were in ours. As calprotectin levels are low in (advanced) adenoma patients [11], this partly explains the observed difference between the studies (AUCs for SCD without adenomas in our data: 0.74; 95 % CI, 0.69–0.80 [POC], 0.73; 95 % CI, 0.67–0.80 [ELISA]). Related to this, the prevalence of AAs in our study almost doubled from February 2011 onwards (from 4.2 to 7.7 %, comprising 25.8 % versus 41.8 % of SCD cases – an increase that could not be explained by changes in patient mix throughout the study period, nor by differences in detection rates between endoscopy centres, but may have been introduced by increased awareness of gastroenterologists who around that time started preparing for the introduction of the CRC screening program in 2014). This increase in AA prevalence likely explains why our current results are less favourable compared to our previous (interim) analysis of patients enrolled through January 2011 (AUC: 0.75; 95 % CI, 0.67–0.82 [POC], 0.73; 95 % CI, 0.66–0.81 [ELISA]) [11]. Still, calprotectin did not show as much incremental diagnostic value as expected. This observation remained when analysing the data for IBD instead of SCD, and when considering adenomas non-SCD (data not shown).

Faecal Hb testing for CRC screening is widely accepted. Here, we showed that a qualitative POC FIT also has large incremental value for ruling out SCD in primary care. Our data further suggest that the POC FIT has value even in patients with overt rectal bleeding, equally so as in those without (Additional file 1). Additional analysis showed that the POC FIT was negative in 65.6 % of our patients with overt rectal bleeding. It may be more specific for blood mixed with faeces, thereby better reflecting the generally higher gastrointestinal location of SCD compared to other causes of rectal bleeding (e.g. haemorrhoids).

In a recent United Kingdom-based primary care study that ran from 2013 to 2014, 755 patients referred for bowel examination had available data on both faecal calprotectin (same ELISA as in our study) and Hb levels (using the quantitative EIKEN OC-Sensor assay) [16]. The authors concluded that undetectable faecal Hb may be sufficient to exclude CRC/IBD/higher-risk adenomas with 41.7 % test negatives, 96.2 % NPV and 88.2 % sensitivity – thereby questioning the added value of calprotectin, as in our study. Other studies have also advocated quantitative faecal Hb testing for ruling out SCD [30, 31], or advanced neoplasia [32–34], in symptomatic patients. We could not confirm these promising results of faecal Hb by itself (Table 1), which is possibly because of the higher threshold of our POC FIT (with a detection limit of 6 μg/g), and it being a qualitative and not a quantitative test. Previous results suggest that using a single test could, in fact, be sufficient in deciding whom to refer for endoscopy. Indeed, our results also underscore that a positive POC FIT already implies the need for referral by itself (at the ≥ 5.0 % SCD probability threshold; see nomogram in Additional file 1). Here, the clinical data do not add much, but they do when the POC FIT returns negative. Also, in daily clinical practice, and certainly in primary care, it is rare that – except in a screening situation – physicians would immediately apply such a test in suspected patients presenting with symptoms and signs of SCD without even considering any other pre-test diagnostic information from history taking and physical examination. The diagnostic process in primary care is sequential, starting with history taking and physical examination, with follow-up testing only in cases where the former provide indications that justify additional testing. To adhere as much as possible to primary care practice, we therefore explicitly first evaluated the diagnostic value of history taking, physical examination, and simple blood analysis, and subsequently the added value of the POC FIT, rather than the other way around. Obviously, in unsuspected people, in the realm of screening, a single-test approach using first and foremost the POC FIT seems very reasonable, but in our view not for diagnostic work-up of clinically suspected patients, which was the focus of this paper.

A major strength of our study is its prospective conduct in a primary care setting, where results from secondary care studies may not be applicable [8]. We also took care to enrol representative patients from 266 general practices, while measuring all potentially relevant diagnostic information, including blood and faecal biomarkers, under routine conditions, enhancing the generalizability of our results. Moreover, patients underwent reference testing by the same standard, including 3 months follow-up after inconclusive endoscopy to identify any initially missed SCD, and index and reference tests were interpreted independently in each patient. Finally, we purposely developed diagnostic models for SCD, and not solely for CRC (or IBD) as commonly done. This resulted in a diagnostic strategy applicable to primary care patients with persistent lower abdominal complaints that is optimally aligned with the diagnostic challenge at hand: ruling out SCD.

When defining SCD, we only included adenomas > 1 cm as AA, without taking histologic high-risk features such as the presence of high-grade dysplasia or villous components in smaller adenomas into account. However, such high-risk features are seldom present in small adenomas [35], and we estimate that about 2 to 3 of the small adenomas we have considered non-SCD are actually high-risk lesions. This amount of misclassification (i.e. only ~2 % of all SCD cases in CEDAR) is unlikely to have materially influenced the results. Some other limitations of our study also need discussion. For instance, we did not enrol primary care patients urgently referred for endoscopy (e.g. for on-going bleeding or imminent obstruction) or at very low SCD-suspicion (not necessitating endoscopy). Our study population thus reflects patients at intermediate risk of SCD. These patients, however, pose the largest diagnostic dilemma, where an improved diagnostic work-up is especially urgent. Further, most diagnostic predictors had missing data despite systematic data collection, and we had to use state-of-the-art multiple imputation of the 5.2 % missing data points to prevent selection bias and loss of information [23–25]. Furthermore, as we used all available data to optimally develop the best diagnostic strategy, and despite using bootstrapping techniques for internal validation to correct for over-optimism, formal external validation of our findings is still warranted.

Finally, the use of a qualitative POC FIT in the way that we did in this study, although easily implemented in primary care, also has limitations. First, as the qualitative POC FIT yields a positive or a negative test result (with a detection limit of 6 μg Hb/g faeces), the diagnostic information that would be available by quantitatively assessing the amount of Hb present in faeces is lost. Second, patients collected faecal samples in regular blue-capped containers without Hb stabilizing buffer (so each patient needed to fill only one faecal container for both calprotectin and Hb analysis). Samples were kept refrigerated, and – if not frozen before further processing – 90 % were tested within 3 days of collection. Additional data analysis showed that the chance of a positive POC FIT slightly decreased with increasing time between collection and testing (0.3 % absolute decrease per day; P = 0.19), and that frozen samples were more likely to be POC FIT negative than non-frozen samples (absolute 8.6 % decrease in POC FIT positivity; P = 0.017; calprotectin results seemed not to be affected). Some patients in our study have thus likely tested falsely negative on the POC FIT because of Hb degradation. However, in none of the models with POC FIT did its odds ratio for SCD significantly differ between patients whose faecal samples were and were not frozen. Furthermore, the POC FIT performed well in our study despite these limitations, and the sensitivity and discriminatory performance of faecal Hb testing in primary care will thus likely be even better when using Hb stabilizing buffers in faecal sample collection devices and using a quantitative FIT.

Conclusions

A simple model including information from history taking, physical examination, and a POC FIT may safely rule out SCD and prevent unnecessary endoscopy referral in approximately one-third of SCD-suspected primary care patients. Adding a calprotectin test to such a strategy has limited value.