Background

Colorectal cancer (CRC) screening using tests for the presence of blood in faeces, commonly known as faecal occult blood tests (FOBT), has been shown to be an effective intervention for reducing CRC-related mortality in controlled studies conducted both in Europe [1,2,3] and in the USA [4]. The mortality reduction varied between 14 and 18%, with colonoscopy being used as the second stage investigation in those with a positive faecal test result. Thus, screening reduces the burden of CRC, which is the most common cancer in industrialized countries and has a high mortality rate of approximately 25.4 expected deaths per 100,000 in the overall population. The standardized incidence-based mortality ratio is 0.47 (95% confidence interval [CI]: 0.26–0.80) with colonoscopic polypectomy, suggesting a 53% reduction in mortality [5, 6].

FOBT has been widely implemented for CRC screening and, in 2003, the European Union (EU) published an official recommendation for its members to carry out FOBT screening for the average-risk population aged between 50 and 74 years [7]. In this regard, faecal testing has improved markedly since the aforementioned studies were carried out, with the original guaiac test (gFOBT) being superseded by faecal immunochemical tests for haemoglobin (FIT), which are potentially much better at detecting advanced adenomas (AA) and CRC and are also much better accepted by potential participants because of ease of use and the lack of a need for special dietary requirements [8, 9]. The EU guidelines recommend use of FIT in population-based programmes [10, 11] and, indeed, an impact on cancer incidence has been found in recent studies [12, 13], although further investigation is needed to assess the longer-term impact. A recent meta-analysis shows an average sensitivity of 79% and a specificity of 94% of FIT for CRC in asymptomatic subjects [14].

Current main concerns are centered on quality-assurance practices and the possible negative consequences of such programmes. Quality assurance throughout the screening process is based on criteria and indicators recommended by the European guidelines [10], whereas the negative effects concern the main side effects of CRC programmes, in particular, colonoscopy-related complications and false-negative and false-positive results. In the case of false positive results, three studies found differences between the sexes [15, 16] and noted that this situation was unsatisfactory, especially for women [17].

Some models have been designed to include faecal haemoglobin concentration (f-Hb) as a predictor for colorectal neoplasia and have suggested that adjustments must be made to take into account sex, family history or morbidities when implementing programmes [18]. In this regard, the Scottish Bowel Screening Programme evaluation using FIT showed important differences in the results for men and women, with greater participation with FIT than with gFOBT, a higher positivity rate in men than in women in all groups, and a higher detection rate in men for advanced neoplasia and CRC. In contrast, the proportion of false-positive results among the colonoscopies performed was lower in men (49.1% versus 58.9% in women) [19]. A similar pattern was reported by the Basque Country for lesions detected in the period 2009–2011 [20].

Adjusted incidence rates for CRC in the Basque Country have increased significantly, by 2.3% per year in men (from 60.3 per 100,000 in 2000 to 87.6 in 2011) and by 6.5% per year in women (from 56.6 in 2007 to 71.8 in 2011). The age-standardized incidence rates for 2007 (prior to implementation of the Basque Country Colorectal Cancer Screening Programme) showed a high men-to-women ratio for different locations [21].

A recent review [22] concluded that the influence of sex on the comparative performance of tests for detecting advanced colorectal neoplasia (AN) has not been investigated with sufficient power in any of the diagnostic cohort studies conducted to date. In a prospective cross-sectional study, van Turenhout et al. [23] concluded that FIT has a higher sensitivity and lower specificity for CRC in men and that different f-Hb cut-offs should be used in screening programmes. These data are consistent with those published by Fraser et al. [24], who concluded that f-Hb distributions vary by sex and age, which supports the view that setting and using a single f-Hb cut-off in any CRC screening programme is far from ideal. Alvarez-Urturi et al. [25] recently concluded, in the ColonPrev randomized controlled trial, that FIT cut-offs could be individualized by sex and age to improve the performance of FIT in CRC screening programmes. On the other hand, Kapidzic et al. [26], in a prospective cohort of invited people from the Dutch population-based screening programme, do not recommend different f-Hb cut-offs in men and women, based on the consideration that positive predictive values for the sexes should be the same. Establishing different f-Hb cut-offs between men and women and between age groups could influence the effectiveness of screening. Looking ahead, the cut-offs could differ in order to achieve consistent detection rates among regions. However, any decrease in the f-Hb cut-off selected to define positivity, while increasing sensitivity for AN, can increase the rate of false positives [27].

Colonoscopy demand increases when FIT is used with the widely applied low f-Hb cut-offs, since the expected number of positive test results is more than three times higher than with gFOBT. This poses an economic challenge for many regions implementing population-based screening programmes, since additional investment and resources are needed, at least in the early screening rounds. As such, estimating the clinical outcomes, including the number needed to screen (NNS) to detect one case, and choosing the f-Hb cut-offs to be used present a difficult dilemma for epidemiologists and decision-makers. With quantitative FIT, the choice of f-Hb cut-off(s) becomes a crucial decision, since the positivity rate determines the number of colonoscopies required. In this regard, some f-Hb cut-offs have been suggested and simulated outcomes created to answer these questions [28,29,30].

The main question, however, is how to determine the best f-Hb cut-off(s) for a specific target population in order to detect the true positive results without increasing the number of interval cancers (ICs), a serious consideration in any screening programme [31, 32]. In this study, we aimed to answer these questions using data from a population-based screening programme with a high participation rate, and to determine whether strategies using f-Hb cut-offs stratified by sex and age group may be useful.

Methods

Study population and interventions

The Basque Country CRC Screening Programme is population-based; it started in 2009 as a pilot and was extended in 2010 after evaluation and optimisation of the processes involved. The main strategy was based on: A) a Coordinating Office, including clinical epidemiologists and statisticians, to plan, organize and manage the programme; B) invitation of all residents aged 50 to 69 years, taking into account the Health Centers and referral Hospitals in order to balance the expected positivity against colonoscopy capacity; C) prior to the invitation, selection of the target population by the Coordinating Office and linkage of the database to the Basque Population Cancer and Medical Procedures Registries to exclude people with a previously diagnosed CRC, a terminal illness or a colonoscopy reported in the last 5 years; D) training and involvement of Basque Health Service Primary Care staff; E) individualized posted invitations providing information about the programme. Four to six weeks after the initial invitation, the kit was sent along with instructions and an individualized bar code, which allows the sample and the person to be identified when processing the result. Samples were collected at Primary Health Centers of the Basque Public Health Service and processed in centralized public laboratories under strict total quality management systems; F) automatic entry of the result by the software system into the ad hoc CRC database, with primary care physicians reviewing all results of their patients (electronic clinical records are implemented in community care in the Basque Country). Letters were posted with the results: a) if negative, the invitation was repeated 2 years later if the person was younger than 70 years; b) if positive, participants were recommended to visit their General Practitioner, who indicated the need for a colonoscopy; and c) in case of error, another kit and instructions were sent; G) colonoscopies performed in referral public hospitals under sedation by expert specialists; H) follow-up of all cases with close coordination between Primary Care and Specialized Units; J) coding of every case by the Coordinating Office staff following standard EU guidelines and Spanish Network consensus recommendations [10, 33]. This study was approved by the Basque Country’s Ethics Committee (Reference: PI2014059). All participants provided written informed consent.

Detection of ICs: prior to a subsequent invitation, all negative cases from a previous round were linked to the register of hospital discharges with ICD-9 1530–1548 in primary or secondary diagnosis, to ICDO-10 C18-C21 codes in hospital registers and population-based cancer registries, and to pathology codes. For every match, qualified staff from the Programme’s Coordinating Centre checked the clinical history and recorded as ICs those cases that met the criterion of a negative FIT result in the previous invitation (0–24 months, or longer where the invitation to the screening programme was delayed). To guard against possible losses, this process was repeated on an annual basis.

Definitions

The FITs used from early 2009 to early 2010 (during the pilot study) were OC-Sensor Micro (Eiken Chemical Co, Tokyo, Japan) and FOB-Gold (Sentinel CH. SpA, Milan, Italy), both with a f-Hb cut-off of 20 μg Hb/g faeces. After comparison of the results obtained with both devices [34], OC-Sensor was selected and has been used since. OC-Sensor is a quantitative FIT, with chemistry based on human haemoglobin antibody-mediated latex agglutination. Bar-coded specimen collection devices were analysed for f-Hb. The data in the current analysis relate only to this FIT. The result was considered positive when f-Hb was ≥20 μg Hb/g faeces.

The histology of all lesions detected was evaluated by expert pathologists specializing in gastrointestinal oncology according to the quality standards of the European guidelines [10]. The maximum reach of the endoscope, adequacy of bowel preparation, as well as the characteristics and location of any polyps were recorded. Adenomas ≥10 mm, adenoma with a villous component (i.e., tubulovillous or villous adenoma) or adenomas with severe/high-grade dysplasia were classified as AA [10].

AN was defined as CRC plus AA. Tumour staging was established according to the TNM classification system in agreement with the AJCC Cancer Staging Manual [35]. Finally, participants were classified and then assigned according to the most advanced lesion found.
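
The classification rules above can be written down compactly. The following is a minimal sketch, not the programme's actual coding software: the `Lesion` fields, the label strings and the function names are all hypothetical, and only the thresholds (size ≥10 mm, villous component, high-grade dysplasia, AN = CRC plus AA, most advanced lesion per participant) come from the text.

```python
from dataclasses import dataclass

# Hypothetical severity ranking used to pick the most advanced lesion.
SEVERITY = {"none": 0, "non_advanced_adenoma": 1, "AA": 2, "CRC": 3}


@dataclass
class Lesion:
    """Illustrative lesion record; the field names are assumptions."""
    is_cancer: bool = False
    is_adenoma: bool = False
    size_mm: float = 0.0
    villous_component: bool = False
    high_grade_dysplasia: bool = False


def classify_lesion(lesion: Lesion) -> str:
    """Apply the advanced adenoma (AA) criteria described in the text."""
    if lesion.is_cancer:
        return "CRC"
    if lesion.is_adenoma:
        advanced = (
            lesion.size_mm >= 10
            or lesion.villous_component
            or lesion.high_grade_dysplasia
        )
        return "AA" if advanced else "non_advanced_adenoma"
    return "none"


def classify_participant(lesions) -> str:
    """Assign the participant to the most advanced lesion found."""
    labels = [classify_lesion(lesion) for lesion in lesions]
    return max(labels, key=SEVERITY.__getitem__, default="none")


def is_advanced_neoplasia(label: str) -> bool:
    """AN is defined as CRC plus AA."""
    return label in ("AA", "CRC")
```

A participant with a 4 mm tubular adenoma and a carcinoma would be assigned to "CRC" under this sketch, and a participant with no lesions to "none".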

Statistical analysis

CRC screening performance measures were assessed following the European guidelines [10]. Variables were calculated and described as percentages with 95% confidence intervals.
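
As an illustration of how a percentage with a 95% confidence interval can be obtained, the sketch below uses the Wilson score interval. The text does not state which interval method was applied, so the choice of Wilson is an assumption, and the function name and the counts in the usage note are hypothetical.

```python
import math


def wilson_ci(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a proportion (assumed method, 95% by default)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width
```

For example, `wilson_ci(50, 100)` gives an interval of roughly 40.4% to 59.6% around the observed 50%. With the very large denominators of a population programme, the Wilson and simple normal-approximation intervals are practically identical.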

The number needed to screen (NNS) was calculated as the number of completed screening tests required to find one AN. All test characteristics were calculated separately for f-Hb cut-offs of 20, 25, 30, 35, 40, 50 and 60 μg Hb/g faeces.
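
A minimal sketch of how such test characteristics could be recomputed at each cut-off is given below. It assumes, for simplicity, that every participant at or above the cut-off underwent colonoscopy, and it takes NNS as the reciprocal of the detection rate (screened participants divided by AN found); the exact computation used in the study may differ. The function name, the record layout and the example data are hypothetical.

```python
# Cut-offs analysed in the text, in µg Hb/g faeces.
CUT_OFFS = (20, 25, 30, 35, 40, 50, 60)


def characteristics_at(cut_off, n_screened, records):
    """Positivity, PPV, detection rate and NNS at one f-Hb cut-off.

    records: iterable of (f_hb, has_an) pairs, one per participant with a
    colonoscopy result; has_an is True when advanced neoplasia was found.
    """
    positives = [has_an for f_hb, has_an in records if f_hb >= cut_off]
    n_pos = len(positives)
    n_an = sum(positives)
    return {
        "positivity_pct": 100 * n_pos / n_screened,
        "ppv_pct": 100 * n_an / n_pos if n_pos else None,
        "detection_per_1000": 1000 * n_an / n_screened,
        "nns": n_screened / n_an if n_an else None,
    }


if __name__ == "__main__":
    # Invented toy data: 1000 screened, five colonoscopy results.
    toy = [(22, False), (28, True), (45, True), (70, False), (150, True)]
    for cut_off in CUT_OFFS:
        print(cut_off, characteristics_at(cut_off, 1000, toy))
```

In this toy data, raising the cut-off from 20 to 60 μg Hb/g faeces leaves only two positives and one AN, so the NNS rises from about 333 to 1000 while positivity falls.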

Differences in the test characteristics between men and women and between age ranges were assessed using the chi-squared test or Fisher’s exact test. The normality of the distribution of continuous variables was assessed using a normal Q-Q plot. Since the data on f-Hb did not follow a normal distribution, the Mann-Whitney U test was used to compare continuous variables between the groups. A two-sided p-value of less than 0.05 was considered to be statistically significant.
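
For a comparison of one proportion between two groups, the chi-squared test reduces to a 2 × 2 table. The sketch below computes the Pearson statistic without continuity correction and its two-sided p-value for one degree of freedom; whether a continuity correction was applied in the study is not stated, and the function name and counts are hypothetical.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided p-value), df = 1."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom, chi2 is the square of a standard normal
    # deviate, so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For instance, `chi_squared_2x2(30, 70, 10, 90)` compares 30% with 10% in two groups of 100 and yields a statistic of 12.5 with p < 0.001.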

A logistic regression was performed to analyze the risk of loss in the detection of AN by sex and age stratified group.
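
When a logistic regression has a single binary predictor (for example, men versus women), its odds ratio equals the cross-product ratio of the corresponding 2 × 2 table, and a Wald confidence interval can be built on the log scale. The sketch below illustrates that special case only; it is not the SPSS model fitted in the study, which may have included further terms, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Odds ratio and Wald 95% CI for the table [[a, b], [c, d]], where
    a, b = events and non-events in group 1; c, d = the same in group 2."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to a statistically significant difference in odds between the two groups at the 5% level.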

The statistical analysis was conducted using SPSS version 23.0 (IBM Corp. Released 2015. IBM SPSS Statistics for Windows, Version 23.0. Armonk, NY: IBM Corp.).

Results

Between 2009 and 2012, 444,582 subjects were invited to the Basque Country CRC Screening Programme. The flow diagram is summarized in Fig. 1. The study population comprised 17,387 participants with a positive test result who underwent complete colonoscopy.

Fig. 1 Study flow diagram

The overall participation was high (66.5%; 95% CI: 66.4–66.7), as was the colonoscopy compliance (95.1%; 95% CI: 94.8–95.5). The characteristics of the participants in the study population are summarized by sex and age group in Tables 1 and 2, respectively.

Table 1 Characteristics of participants studied
Table 2 Characteristics of participants stratified by sex and age

The proportion of false negative results was 7.6% (95% CI: 6.5–8.8). We identified 136 interval cancers (IC). Table 3 summarizes the differences in characteristics between IC and screen-detected cancers (SD-C), with screen-detected cancers divided into two groups: those detected in participants attending for the first time (prevalent screening cancers) and those detected in participants attending in subsequent rounds (incidence screening cancers).

Table 3 Characteristics of interval cancers and screen-detected colorectal cancer

Programme performance indicators and test characteristics

The positive predictive values (PPV) for AN, both for the study group as a whole and for each sex- and age-stratified group of participants, are shown in Tables 4 and 5. Significant differences by sex were observed at a f-Hb cut-off of 20 μg Hb/g faeces, and this pattern was maintained across all the f-Hb cut-offs analysed, with the PPV significantly higher in men at every cut-off. There were also significant differences between age-specific groups in men and women, with the PPV being higher in the older population for both sexes.

Table 4 Test characteristics at different faecal haemoglobin concentration cut-offs by sex
Table 5 Test characteristics at different faecal haemoglobin concentration cut-offs by sex and age group

The positivity rate for the range of f-Hb cut-offs assessed was also higher in men and the difference with women was also significant, with the positivity decreasing with increasing f-Hb cut-off. The positivity was lower for all age groups in both sexes as the f-Hb cut-off increased, being higher in older men and women, and with significant differences by sex (Tables 4 and 5).

The CRC detection rate (CDR) was higher in men than in women and in older subjects, with significant differences for all f-Hb cut-offs (Tables 4 and 5). In men, the CDR decreased from 5.2‰ (95% CI: 4.8–5.6) to 4.1‰ (95% CI: 3.8–4.4) and in women from 2.2‰ (95% CI: 2.0–2.4) to 1.7‰ (95% CI: 1.5–1.9). The advanced neoplasia detection rate (ANDR) was also higher in men at a f-Hb cut-off of 20 μg Hb/g faeces (44.0‰ [95% CI: 42.9–45.1]), with a significant difference with respect to women, for whom the ANDR was lower (15.9‰ [95% CI: 15.2–16.5]). This significant difference was also maintained at different f-Hb cut-offs. The ANDR was higher in older groups in both sexes, with significant differences by sex for all f-Hb cut-offs (Tables 4 and 5). In any case, the ANDR in men over 60 years remained higher than that of women.

Colonoscopy savings and the risk of losses in the detection of advanced colorectal neoplasia

A lower NNS to detect one AN was seen in men (59; 95% CI: 56–63) than in women (92; 95% CI: 83–100) at a f-Hb cut-off of 20 μg Hb/g faeces. On increasing the f-Hb cut-off, NNS increased to 230 for women at a f-Hb cut-off of 60 μg Hb/g faeces. The differences between men and women were significant at f-Hb cut-offs of 20 and 25 μg Hb/g faeces but not at higher cut-offs (30 and 35 μg Hb/g faeces), as shown in Fig. 2a.

Fig. 2 Number Needed to Screen to detect Advanced Neoplasia (AN) (a) and the Odds Ratio for the loss in detection of AN (b), men versus women, on increasing the faecal haemoglobin cut-off. (*p < 0.001; p < 0.05; no significance). (¥Cut-off 50 μg Hb/g faeces in men = 509 [95% CI: 333–1000])

A logistic regression analysis was performed to determine the risk of loss in the detection of AN on increasing the f-Hb cut-off (Fig. 2b). The risk was higher in men than in women, and the odds ratio increased significantly with the f-Hb cut-off, from 1.49 (95% CI: 1.30–1.71) to 1.69 (95% CI: 1.56–1.83).

In women, the proportion of colonoscopies saved by increasing the f-Hb cut-off rose to 55.5% (N = 4273). However, these savings are offset by the loss in detection of CRC and AA (Fig. 3): the loss in women can be as high as 43.3% for AA (N = 962) and 22.9% for CRC (N = 81). Around 19.1% of the colonoscopies saved upon increasing the f-Hb cut-off to 25 μg Hb/g faeces would have found an AN, and this percentage rises to 24.4% on increasing the f-Hb cut-off to 60 μg Hb/g faeces. The CRC missed were mostly early-stage (Stage I-II: 70.2% in men and 66.3% in women).
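
The trade-off described here can be made explicit with a small calculation. The sketch below takes the colonoscopy results of participants who were positive at the 20 μg Hb/g faeces baseline and reports, for a higher cut-off, the share of colonoscopies saved, the share of CRC and AA that would be lost, and the share of saved colonoscopies that would have found an AN. The function name, the record layout and the example data are hypothetical and do not reproduce the study's figures.

```python
def savings_and_losses(records, new_cut_off, baseline=20):
    """records: iterable of (f_hb, lesion) pairs for participants with a
    colonoscopy; lesion is "CRC", "AA" or None."""
    at_baseline = [(f, lesion) for f, lesion in records if f >= baseline]
    # Participants who would no longer be referred at the higher cut-off.
    dropped = [(f, lesion) for f, lesion in at_baseline if f < new_cut_off]

    def count(rows, *lesions):
        return sum(1 for _, lesion in rows if lesion in lesions)

    def pct(part, whole):
        return 100 * part / whole if whole else 0.0

    return {
        "colonoscopies_saved_pct": pct(len(dropped), len(at_baseline)),
        "crc_lost_pct": pct(count(dropped, "CRC"), count(at_baseline, "CRC")),
        "aa_lost_pct": pct(count(dropped, "AA"), count(at_baseline, "AA")),
        "an_among_saved_pct": pct(count(dropped, "CRC", "AA"), len(dropped)),
    }


if __name__ == "__main__":
    # Invented toy data for ten participants positive at the baseline cut-off.
    toy = [
        (21, None), (22, None), (24, "AA"), (33, None), (48, "AA"),
        (55, "CRC"), (65, None), (80, "AA"), (120, "CRC"), (300, "CRC"),
    ]
    print(savings_and_losses(toy, new_cut_off=60))
```

In the toy data, moving to a cut-off of 60 saves 60% of colonoscopies but loses one of three CRC and two of three AA, and half of the saved colonoscopies would have found an AN.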

Fig. 3 Relation between saving colonoscopies (SC) and lesion loss upon increasing the faecal haemoglobin concentration cut-off by sex. Dotted lines represent lesion detection rates (for colorectal cancer (CRC) and advanced adenoma (AA)) and solid lines saved colonoscopies. The left Y axis represents lesion detection rate and the right Y axis the percentage of colonoscopies saved. Saving Colonoscopies: the percentage of colonoscopies that will not be performed in the programme by increasing the f-Hb cut-off, due to the reduction of positivity rate

Colonoscopy savings increased in all age groups and in both sexes on increasing the f-Hb cut-off. As can be seen from Fig. 4, this saving did not differ substantially by age group (from 48.6 to 51.9% in men and 54.3 to 57.0% in women). In contrast, the decreases in CDR and ANDR differed considerably between age groups in both sexes. Thus, in men, the advanced adenoma detection rate (AADR) decreased by 24.1‰ in the oldest group and by 10.9‰ in the youngest group, whereas in women it decreased by 9.0‰ in the oldest group and by 4.9‰ in the youngest. A similar pattern was observed for CDR and, depending on the age group analysed, the proportion of undetected CRC that were early-stage could be as high as 86.4% in men and 80.0% in women.

Fig. 4 Relation between saving colonoscopies (SC) and lesion losses upon increasing the cut-off level of the FIT by sex and age group. Dotted lines express lesion detection rates (colorectal cancer (CRC) and advanced adenoma (AA)) and solid lines saved colonoscopies. The left Y-axis represents lesion detection rate and the right axis the percentage of colonoscopies saved. Saving Colonoscopies: the percentage of colonoscopies that will not be performed in the programme by increasing the f-Hb cut-off, due to the reduction of positivity rate

Discussion

We have compared CRC screening with FIT at different f-Hb cut-offs in a large population aged between 50 and 69 years. To our knowledge, there have been few previous studies of sex- and age-related differences in population-based FIT screening programmes.

In our study, a total of 444,582 persons were invited to participate in the Basque Country CRC Screening Programme. This large population allowed a robust statistical analysis of whether a single f-Hb cut-off should be used for different populations without increasing the interval cancer rate, and so provides insight for others running similar programmes.

CRC screening programmes in a number of countries have encountered higher than expected positivity [36], leading to overwhelming demand for scarce colonoscopy resources and a need to increase the f-Hb cut-off to lower the number of referrals. In consequence, data on the performance of FIT in men and women are of key importance given the current widespread and growing use of FIT in population-based CRC screening programmes.

We observed a higher PPV for AN and higher detection rates for CRC and AN than other programmes. These results could be due to the high rate of compliance with colonoscopy assessment, which kept the loss of neoplasm detection to a minimum. As reported in recently published studies [26, 37], higher positivity was found in men across the full range of f-Hb cut-offs. This pattern is also consistent when comparing older men and women against younger ones, with these variables being higher in older groups. A decision on whether to adjust the age at which screening begins also requires consideration of whether the recommended age should be younger for men or older for women. In this regard, Sung et al. [38], in the Asia Pacific consensus recommendations for CRC screening, suggested that women may start screening at later ages due to the relatively low incidence of CRC at 50–55 years. Similarly, Brenner suggested that the optimal age for screening initiation should be five years younger for men than for women. Despite this, European guidelines recommend that screening programmes for CRC should start at age 50 years for both men and women of average risk [10]. However, the question of using different f-Hb cut-offs for men and women and/or younger and older participants remains unresolved. Differences in the epidemiological pattern of CRC between the sexes have been identified in recent years [39]. Hence, it remains a matter of discussion whether screening should be implemented with the same starting age and f-Hb cut-off for both sexes.

Recent studies [22, 27] have concluded that FIT has a higher sensitivity and a lower specificity for CRC in men than in women and therefore that equal test characteristics can be achieved by allowing different f-Hb cut-offs for the sexes. However, Kapidzic et al. [26] observed that there were no significant differences between men and women in PPV at a f-Hb cut-off of 10 μg Hb/g faeces, meaning that the chance that a colonoscopy is unnecessary after a positive test result is the same. It was suggested that, if the same differences were to persist between men and women in a larger sample, the differences in PPV would become significant, and this is exactly what we have observed in our study, in which the differences between men and women remained statistically significant. Can we therefore argue that it would be better to increase the f-Hb cut-off for women? According to the results of Kapidzic et al. [26], the PPV could be improved using a higher f-Hb cut-off in women; however, this would be at the expense of a higher NNS, which increases with the f-Hb cut-off.

It may take approximately 10 years from the appearance of the first lesion with abnormal histopathology to develop a possible malignant lesion. In 2007, Brenner et al. [39] showed that the risk of transition from AA to CRC was similar for men and women, but increased with age. Some studies [40, 41] have reported significantly higher detection rates for AN and CRC with colonoscopy for men than for women in all age groups, thus suggesting that male sex constitutes an independent risk factor for colorectal neoplasia. Such studies recommended sex-specific ages for screening. These differences are similar to those observed in our study.

Colonoscopy resources can be key to defining the strategies and characteristics adopted in screening programmes. Indeed, the additional number of colonoscopies that need to be performed may become an important factor when deciding whether to establish any such programme. We observed that the saving in colonoscopies increased consistently in both sexes and in all age groups as the f-Hb cut-off was increased. It might therefore seem appropriate to increase the f-Hb cut-off, since this would lower the number of colonoscopies required. However, when increasing the f-Hb cut-off, the risk of lowering the ANDR increases significantly in both sexes and in all age groups. The proportion of IC could be higher in men than in women and in older groups. Thus, an increase in the f-Hb cut-off could increase the loss of CRC from 7.9 to 28.1% in men and from 5.1 to 22.9% in women. This loss in the detection of CRC is consistent over all age groups. Moreover, given that most of those with CRC would be diagnosed in the early stages, this would go against the principles of preventive screening programmes. These results are consistent with those published recently by Digby et al. [42], who concluded that CRC screening programmes would benefit from using a low f-Hb cut-off to gain lower IC proportions as well as higher sensitivity and detection of earlier-stage disease, but at the cost of increased demand for colonoscopy.

Recent studies have suggested the potential benefits of using a risk prediction model including f-Hb in CRC screening [18, 29, 31, 43] to improve the effectiveness of screening strategies. Future studies should therefore be designed to evaluate the benefits of implementing models that reflect the different risks of groups defined by sex and age. Some studies have suggested that other factors could be used to determine the optimal cut-off values for men and women, and that the combination of these data with microsimulation models could improve the implementation of screening programmes [28, 44].

One of the main strengths of the current study was the large number of participants evaluated, all of whom were recruited in an organized, population-based screening programme, coordinated and systematically evaluated at a single centre. The lack of studies published to date with real data from such a FIT-based programme and with a participation rate of more than 65% (the level recommended in the European guidelines [10]) is also worth noting.

However, several limitations have to be acknowledged. The study included assessment of the effects of sex and age but no other possible confounding factors, such as socio-economic status, which has been shown to affect f-Hb [36, 45], although these could be explored retrospectively in a nested case-control analysis. Furthermore, Brenner [38] suggested that appropriate differentiation of age at initiation of CRC screening by sex might be equally or more relevant from a public health point of view than the widely used differentiation by family history.

Conclusions

In conclusion, this population-based study provides relevant information on the performance of a realistic FIT-based colorectal screening programme in men and women at different f-Hb cut-offs. Men have higher PPV, CDR and ANDR, which results in a lower NNS when compared to women, and this pattern is consistent when comparing younger and older groups. However, given the assessed loss in detection of AN and CRC, most of them in their early stages, the f-Hb cut-off to be implemented should perhaps not be changed on the basis of sex or age alone, at least initially, in accordance with the recommendations of the European guidelines, so as not to increase the proportion of interval cancers, which is another important variable to examine.