Introduction

Colorectal cancer (CRC) is currently the third most commonly diagnosed type of cancer and the second leading cause of cancer death worldwide, with an estimated 1.8 million new cases and 880 thousand deaths in 2018 and a greater burden among males than among females [1]. CRC can typically be considered a disease related to wealth: national levels of both CRC incidence and mortality are closely related to the income and development level of the country, with a cumulative risk of CRC or CRC death three times higher in countries with a high Human Development Index (HDI) than in countries with a medium or low HDI [1].

Over the last decade, the majority of countries in Europe, Oceania and North America witnessed a decrease in CRC mortality [2]. One of the main reasons for this reduction in mortality rates in Western or developed countries is likely the adoption of screening programs for CRC. Different CRC screening methods and strategies are effective at reducing mortality and have been implemented in countries worldwide, the most common being fecal occult blood testing and the fecal immunochemical test [3,4,5,6]. In recent years, however, researchers have explored the possibilities of stratified screening, through the use of prediction models that could guide CRC risk assessment for asymptomatic patients [7]. In particular, the most recent research in this field has focused on the inclusion of genetic factors in prediction models, particularly through the use of a genetic risk score (GRS) or a polygenic risk score (PRS) [8]. Furthermore, the increasing number of genome-wide association studies (GWASs) being conducted, with more than 70 GWASs currently published for CRC [9], is progressively improving our knowledge of the impact of common genetic variants or single nucleotide polymorphisms (SNPs) on the risk of CRC. In this sense, it should be noted that up to 35% of inter-individual variability in CRC risk has been attributed to genetic factors [10, 11], which makes the importance of this field for public health evident. Genetic factors could guide CRC risk assessment, thus improving the effectiveness of currently available screening strategies.

However, the methods currently used by researchers to incorporate genetic factors into prediction models for CRC, and the characteristics of those models, are highly heterogeneous [8]. In addition, the potential improvement in discriminatory accuracy yielded by the addition of genetic factors to CRC prediction models including only traditional risk factors is still unclear, as it is not certain whether the number of genetic variants included in the models is related to such improvement.

For these reasons, the primary aim of the present study is to perform a systematic review of polygenic risk prediction models for CRC in order to identify which prediction models including genetic risk variants for CRC have been reported in the scientific literature.

The secondary aim is to assess the impact, in terms of improvement in discriminatory accuracy, of the addition of SNPs into prediction models with only traditional risk factors, and to test whether there is any relation between the number of SNPs included in the models and the improvement of their discriminatory accuracy. In addition, we aimed to evaluate which factors, besides the number of SNPs, influence the improvement of discriminatory accuracy.

Methods and materials

We registered a protocol for this review on PROSPERO (Record ID: CRD42019135304), the international prospective register of systematic reviews. Prior to completing data extraction, we uploaded to the PROSPERO register the review title, timescale, team details, methods, and general information.

Search strategy and study selection

We queried Pubmed, Web of Knowledge, Embase and CINAHL Complete electronic databases up to February 2020 using the elements of the Population, Intervention, Comparator, Outcome (PICO) model (P, population/patient; I, intervention/indicator; C, comparator/control; and O, outcome) [12]. In detail, our study population was represented by colorectal cancer; the intervention by SNPs; the comparator was none, and outcome was represented by risk prediction models. For this reason the following search string was built: (“Colorectal Neoplasms”[Mesh] OR “colorectal cancer” OR “colon cancer”) AND (“genetic variant” OR “genetic variants” OR “genetic variation” OR “genetic data” OR polymorphism OR SNP OR SNPs OR polygenic) AND (“risk stratification” OR “risk model” OR “risk profile” OR “risk profiling” OR “risk prediction” OR “risk determination” OR “risk discrimination” OR “risk score” OR “predictive model” OR “prediction model” OR “prediction models” OR “stratified screening”). The search was refined by hand searching and analysis of bibliographic citations in order to identify missing articles. No publication time limits were applied.

The manuscript was written following the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Supplementary material) [13].

We systematically searched databases to retrieve all eligible scientific studies that developed, compared or validated a prediction model (or clinical prediction rule based on a model) using multiple (at least two) SNPs to predict the risk of CRC.

Two independent investigators (M.M. and M.S.) screened titles and abstracts of all potentially pertinent articles to identify eligible studies. Following the same procedure, we obtained and read the full papers and included them if relevant. At all levels, any discrepancies and disagreements were resolved by consensus or by involving a third investigator (R.P.).

We included English-written peer-reviewed papers focusing on sporadic CRC reporting primary data and that evaluated the combined effect of two or more genes on CRC risk (e.g. GRS or PRS) or that reported a formal prediction model using genetic factors.

We excluded all studies that tested a model on simulated or pediatric populations, or that dealt with inherited forms of colorectal cancer (e.g. Lynch syndrome). Furthermore, we did not include commentaries, editorials, review papers, case reports, case series, book chapters, or articles with no primary data. Lastly, for articles updating previous ones, we included only the most recently updated study.

Data extraction

Data extraction was conducted independently by two researchers (M.M. and M.S.) for articles deemed relevant, using an in-depth piloted data extraction form and following an adapted version of the “CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies” (CHARMS) checklist [14]. Disagreements were resolved through discussion or referral to a third reviewer (R.P.).

Extracted data include information regarding: author details; year of publication; study design; study population; sample size; genetic factors analyzed; GRS and related methods used to calculate it; factors other than genetic included in the model; internal and external validation; Area Under Curve (AUC) of non-SNP-enhanced models; AUC of SNP-enhanced models; Integrated discrimination improvement (IDI); and net reclassification improvement (NRI). In particular, NRI and IDI are measures used to compare the performances of two models, specifically an old model and a new model resulting from the addition of one or more predictors to the old one. The AUC is a measure of discriminatory accuracy and quantifies the ability of the model to discriminate between individuals with and without the outcome of interest [15], while NRI quantifies the ability of the new model to reclassify individuals compared to the previous one [16, 17], and IDI represents the difference in discrimination slopes of the new and the previous models, with the discrimination slope being the absolute difference in the averages of estimated probabilities of the event between those who experienced the event and those who did not [17,18,19].
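
The three measures described above can be illustrated with a minimal Python sketch. It follows the verbal definitions given in the text: the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, the IDI as the difference in discrimination slopes, and the NRI in its category-free (continuous) form. The function names and the toy probabilities are hypothetical and are not taken from any included study.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def split_by_outcome(values: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Separate values for individuals with (label 1) and without (label 0) the event."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    return cases, controls


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of case/non-case pairs in which the case has the higher score (ties count 0.5)."""
    cases, controls = split_by_outcome(scores, labels)
    concordant = sum(
        1.0 if c > k else (0.5 if c == k else 0.0) for c in cases for k in controls
    )
    return concordant / (len(cases) * len(controls))


def discrimination_slope(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean predicted probability among cases minus the mean among non-cases."""
    cases, controls = split_by_outcome(probs, labels)
    return mean(cases) - mean(controls)


def idi(old_probs: Sequence[float], new_probs: Sequence[float], labels: Sequence[int]) -> float:
    """Integrated discrimination improvement: new slope minus old slope."""
    return discrimination_slope(new_probs, labels) - discrimination_slope(old_probs, labels)


def continuous_nri(old_probs: Sequence[float], new_probs: Sequence[float], labels: Sequence[int]) -> float:
    """Category-free NRI: net upward movement in cases plus net downward movement in non-cases."""
    changes = [new - old for old, new in zip(old_probs, new_probs)]
    case_changes, control_changes = split_by_outcome(changes, labels)
    net_cases = (sum(d > 0 for d in case_changes) - sum(d < 0 for d in case_changes)) / len(case_changes)
    net_controls = (sum(d < 0 for d in control_changes) - sum(d > 0 for d in control_changes)) / len(control_changes)
    return net_cases + net_controls


# Hypothetical toy data: 3 cases and 3 non-cases, old model vs. new (SNP-enhanced) model.
labels = [1, 1, 1, 0, 0, 0]
old_probs = [0.6, 0.4, 0.5, 0.5, 0.3, 0.2]
new_probs = [0.7, 0.5, 0.4, 0.4, 0.3, 0.1]

print("AUC old/new:", auc(old_probs, labels), auc(new_probs, labels))
print("IDI:", idi(old_probs, new_probs, labels))
print("continuous NRI:", continuous_nri(old_probs, new_probs, labels))
```

Categorical NRI, as used by some of the included studies, would additionally require pre-specified risk cut-offs and is not covered by this sketch.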

For studies including both individuals with adenomas and CRC, we only extracted information about results related to CRC.

Quality assessment

The risk of bias of included studies was assessed by two investigators (M.M. and M.S.) using the Prediction model Risk Of Bias ASsessment Tool (PROBAST) [20]. PROBAST is a tool developed to assess the risk of bias and applicability of prediction model studies and contains a total of 20 signaling questions divided into 4 key domains: participants, predictors, outcome, and analysis. Each domain is rated for risk of bias (low, high or unclear). The signaling questions can be rated as “yes”, “probably yes”, “probably no”, “no” or “no information”. Every signaling question is phrased so that “yes” or “probably yes” indicates absence of bias, while “no” or “probably no” flags a potential risk of bias. The first three domains (participants, predictors and outcome) are also assessed for concerns regarding applicability (high, low, or unclear) to the defined review question.
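
As an illustration of how domain-level ratings can be combined into an overall judgement, the following Python sketch applies a simplified aggregation rule. The text above does not spell out the rule used, so the logic here is an assumption: overall risk is high if any domain is high, unclear if no domain is high but at least one is unclear, and low only if all domains are low. The function name is hypothetical.

```python
from typing import Sequence

ALLOWED_RATINGS = {"low", "high", "unclear"}


def overall_risk_of_bias(domain_ratings: Sequence[str]) -> str:
    """Combine per-domain ratings into one overall rating (simplified, assumed rule)."""
    ratings = [rating.strip().lower() for rating in domain_ratings]
    if not ratings or any(rating not in ALLOWED_RATINGS for rating in ratings):
        raise ValueError("each domain must be rated 'low', 'high' or 'unclear'")
    if "high" in ratings:
        return "high"      # a single high-risk domain is enough
    if "unclear" in ratings:
        return "unclear"   # no high-risk domain, but at least one unclear
    return "low"           # every domain rated low


# Hypothetical study: participants, predictors, outcome, analysis.
print(overall_risk_of_bias(["low", "unclear", "low", "high"]))
```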

Statistical analysis

Statistical analysis included only studies that reported both a model with only traditional risk factors and one also incorporating genetic factors. For studies that calculated the AUCs of the same model constructed in different ways (e.g. counted GRS and weighted GRS), only the model showing the best performance or, for those showing the same AUC values, the simplest one was included in the analysis. Stratification according to the number of SNPs was conducted using tertiles based on the distribution of the number of SNPs included in the models across included studies, with the lowest, mid, and highest tertiles being represented by ≤22, 23–47, and ≥48 SNPs, respectively. We calculated standard errors of AUCs using the Hanley and McNeil method [15].
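
For readers unfamiliar with the Hanley and McNeil approximation, a minimal Python sketch is given below. It computes the standard error of an AUC from the AUC itself and the numbers of cases and non-cases. The function name and the example inputs are hypothetical, and the normal-approximation confidence interval is an illustrative addition rather than a step reported in this section.

```python
import math
from typing import Tuple


def hanley_mcneil_se(auc: float, n_cases: int, n_controls: int) -> float:
    """Standard error of an AUC using the Hanley and McNeil (1982) approximation."""
    q1 = auc / (2.0 - auc)                # P(two random cases both outrank one non-case)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)     # P(one random case outranks two non-cases)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)


def auc_confidence_interval(auc: float, n_cases: int, n_controls: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval around the AUC."""
    se = hanley_mcneil_se(auc, n_cases, n_controls)
    return auc - z * se, auc + z * se


# Hypothetical example: AUC of 0.60 estimated in 500 cases and 500 non-cases.
print(hanley_mcneil_se(0.60, 500, 500))
print(auc_confidence_interval(0.60, 500, 500))
```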

First, we tested whether a significant trend in the increase of the AUC of the SNP-enhanced models according to the number of SNPs included in the models could be observed. Secondly, we estimated Pearson’s correlation coefficient between AUC improvement and number of SNPs. Finally, we investigated whether the increasing number of SNPs added to the baseline models determined an observable trend in the improvement of the AUC by drawing a forest plot. In order to calculate a pooled AUC improvement for SNP-enhanced models compared with non-SNP-enhanced models, we conducted a meta-analysis using the random effects model, based on the assumption that clinical and methodological heterogeneity was very likely to occur and to have an effect on the results. We quantified statistical inconsistency using the I² statistic. Moreover, we assessed whether specific factors (number of cases, number of SNPs, publication year, AUC of the non-SNP-enhanced model, ethnicity of study participants, number of traditional risk factors in the model, and inclusion of gender in the model either as a covariate or by stratification) were significantly associated with AUC improvement and explained statistical heterogeneity by conducting meta-regression, with p-values adjusted for multiple testing computed using 1000 Monte-Carlo permutations.
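
The pooling step can be sketched in Python as follows. The text specifies a random effects model but not the estimator of between-study variance; the sketch assumes the DerSimonian-Laird estimator, which is a common default, and takes each model's AUC improvement and its standard error as given. The function name and input values are hypothetical.

```python
import math
from typing import Dict, Sequence


def random_effects_pool(effects: Sequence[float], std_errors: Sequence[float]) -> Dict[str, float]:
    """Pool effect sizes with a DerSimonian-Laird random-effects model (assumed estimator)."""
    weights = [1.0 / se ** 2 for se in std_errors]          # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures the dispersion of study effects around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0          # between-study variance

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))

    # I² expresses the share of total variability attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * pooled_se,
        "ci_high": pooled + 1.96 * pooled_se,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical AUC improvements and standard errors for three models.
print(random_effects_pool([0.010, 0.045, 0.080], [0.004, 0.006, 0.010]))
```

The sketch does not address how the standard error of an AUC difference is obtained from two correlated AUCs, which this section does not describe.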

All statistical analyses were conducted using the Stata software version 13.0 [21].

Results

Study selection

The results of abstract and full-text screening, with reasons for exclusion, are shown in the PRISMA flow diagram [13] in Fig. 1. The database search yielded 749 records, and a further 6 articles were retrieved through hand searching. After removal of duplicates, 566 articles were assessed for eligibility and 472 were excluded after title and abstract screening. The remaining 94 articles underwent full-text review, resulting in 33 articles included in the qualitative synthesis, 10 of which were included in the meta-analysis. The main reasons for exclusion were: articles with no primary data or with simulated populations (35%); non-pertinent articles (30%); articles whose population consisted of individuals with inherited forms of colorectal cancer (20%); studies later superseded by an updated publication (10%); and studies that pooled CRC and benign colorectal polyps without distinguishing between the two populations (5%).

Fig. 1

PRISMA flow-chart of the study selection process

Study and population characteristics

The main characteristics of the articles included in the systematic review are summarized in Table 1. Studies included in this review were published between 2008 and 2019. Most of them were case-control studies (78.79%) [22, 23, 25, 27,28,29,30,31,32,33,34,35,36, 39, 41,42,43, 45,46,47, 49,50,51,52,53,54], followed by 5 cohort studies (15.15%) [24, 38, 40, 44, 48] and 2 case-cohort studies (6.06%) [26, 37]. No sample overlap was identified across studies. Twenty-one studies (63.64%) evaluated risk prediction models among individuals of European ancestry [23, 24, 26,27,28, 30,31,32, 34, 35, 38,39,40,41,42,43,44,45,46, 49, 50] and 12 (36.36%) among populations of Asian ancestry [22, 25, 29, 33, 36, 37, 47, 48, 51,52,53,54]. Population sizes ranged from 603 [47] to 361,543 [44] individuals.

Table 1 Main characteristics of the included studies in the systematic review

Risk prediction models characteristics

The number of genetic variants evaluated in the risk prediction model ranged from 4 [54] to 696 SNPs [45]. A complete list of SNPs included in each study is provided in Table S1.

Different methodologies were used across the included studies to incorporate genetic factors into prediction models. In particular, 26 studies (78.79%) used a GRS. Of these, 11 (42.31%) used a weighted GRS [31, 33,34,35, 40, 42,43,44,45,46, 52], 6 (23.08%) used an unweighted GRS [22, 24, 26,27,28,29], and 9 (34.62%) used both unweighted and weighted methods to develop risk scores [23, 25, 30, 32, 36, 37, 49,50,51].
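
The difference between the two scoring approaches can be shown with a short Python sketch. It assumes the usual construction, which the individual studies may have varied: an unweighted (counted) GRS sums the number of risk alleles carried at each SNP (0, 1 or 2), whereas a weighted GRS multiplies each count by the natural logarithm of the per-allele odds ratio before summing. The function names, genotype and odds ratios below are hypothetical.

```python
import math
from typing import Sequence


def unweighted_grs(risk_allele_counts: Sequence[int]) -> int:
    """Counted GRS: total number of risk alleles across all SNPs."""
    return sum(risk_allele_counts)


def weighted_grs(risk_allele_counts: Sequence[int], odds_ratios: Sequence[float]) -> float:
    """Weighted GRS: risk-allele counts weighted by log per-allele odds ratios (assumed weights)."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(count * math.log(odds) for count, odds in zip(risk_allele_counts, odds_ratios))


# Hypothetical individual genotyped at three SNPs.
counts = [0, 1, 2]                 # risk alleles carried at each SNP
odds_ratios = [1.10, 1.20, 1.05]   # hypothetical per-allele odds ratios

print(unweighted_grs(counts))
print(weighted_grs(counts, odds_ratios))
```

Under these assumptions the unweighted score treats every risk allele as equally important, while the weighted score lets SNPs with larger effect sizes contribute more.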

Of the remaining 7 studies (21.21%) that did not use a GRS, one [39] derived 7 genes from a larger set: after gene profiling and cluster analysis, specific genes were selected, further validated and evaluated for predictive performance. A second study performed a Mendelian randomization analysis to assess the association between hyperlipidemia and CRC using Burgess statistics [55] and a fixed-effects meta-analysis to derive final odds ratios [41], while another [47] applied logistic regression, Jackknife feature selection and ANOVA testing to construct the prediction model. Other authors [53] applied a stepwise selection procedure to determine the inclusion or exclusion of the putative risk factors from the models, and assessed the combined effect of genes on colorectal cancer risk by multivariate unconditional logistic regression. Two studies used machine learning approaches [38, 54], and the last one evaluated the predictive accuracy of genetically corrected serum levels of specific biomarkers compared with uncorrected ones [48].

Difference in discriminatory accuracy between SNP-enhanced and traditional risk factor models

Using the Swets classification [56], i.e. low accuracy when the AUC is between 0.5 and 0.7 and moderate accuracy when it is between 0.7 and 0.9, only two of the studies that included both a model with only traditional risk factors and one also incorporating genetic factors found a moderate discriminatory accuracy. The first study [36] showed that, only among males, AUC values for models including counted GRS and weighted GRS reached 0.729 (95% CI: 0.682, 0.767) and 0.719 (95% CI: 0.677, 0.761), respectively, while models without SNPs showed low accuracy (i.e. AUC lower than 0.7). The second study [37] found moderate discriminatory accuracy for both SNP-enhanced and non-SNP-enhanced models. In particular, when overall colon and rectal cancer risk, colon cancer risk only, and rectal cancer risk only were separately considered, SNP-enhanced models yielded AUC values of 0.74 (95% CI: 0.70, 0.78), 0.75 (95% CI: 0.69, 0.81), and 0.74 (95% CI: 0.68, 0.79), respectively, while non-SNP-enhanced models yielded AUC values of 0.73 (95% CI: 0.69, 0.78), 0.76 (95% CI: 0.70, 0.83), and 0.71 (95% CI: 0.65, 0.77), respectively.

A total of 4 articles [33, 37, 49, 51] used the NRI and/or the IDI to compare the performances of two models (traditional only vs genetic enhanced model). In the first article [37], the NRI for a prediction model with GRS compared with the traditional risk score model was 0.17 (95% CI: −0.05, 0.37) for CRC, −0.17 (95% CI: −0.33, 0.21) for colon cancer only, and 0.41 (95% CI: 0.10, 0.68) for rectal cancer only. The second one [33] found an increase in the inclusive model compared to the non-genetic model for the mean IDI (0.015) and the mean continuous NRI (0.39). After defining risk categories for the NRI by arbitrary cut-off values of 1.5 and 3% of 10-year absolute risk of developing colorectal cancer, the mean NRI value was equal to 0.12 when the non-genetic and inclusive models were compared. The third [49] showed an increase in the NRI in all the models when different variables were included in the model (Table 1). Finally, the last one [51] found that the traditional model with smoking status performed worse than the combined model that included genetic (simple count GRS) and smoking factors: NRI of 0.317 (95% CI: 0.225, 0.408) and IDI of 0.031 (95% CI: 0.023, 0.039).

AUC analysis

A total of 14 risk prediction models from 10 studies were included in the AUC analysis [23, 30, 32, 33, 35,36,37, 44, 49, 51]. We found no significant trend in the AUC of the SNP-enhanced risk prediction models according to the number of SNPs included in the models (p for trend = 0.774). Pearson’s correlation coefficient between AUC improvement and number of SNPs was r = −0.0993 (95% CI: −0.541, 0.385; p = 0.6951), indicating no correlation between the number of SNPs and AUC increase.
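
A confidence interval for a Pearson correlation coefficient is commonly obtained through the Fisher z-transformation. The Python sketch below illustrates that calculation under this assumption; the text does not state which method was used to derive the reported interval, and the function name and inputs are hypothetical.

```python
import math
from typing import Tuple


def pearson_confidence_interval(r: float, n: int, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate confidence interval for Pearson's r via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("at least 4 paired observations are required")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical example: r = 0.30 estimated from 40 paired observations.
print(pearson_confidence_interval(0.30, 40))
```

With small numbers of paired observations the interval is wide, so a correlation close to zero is compatible with both moderate negative and moderate positive associations.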

The meta-analysis resulted in a pooled estimate of AUC improvement for SNP-enhanced prediction models compared with non-SNP-enhanced models of 0.040 (95% CI: 0.035, 0.045) for all 14 models (Fig. 2). Heterogeneity was high (I² = 98.5%, p < 0.001).

Fig. 2

Overall improvement in AUC for SNP-enhanced prediction models compared with non-SNP-enhanced models

A stratified analysis by number of SNPs included across models was performed (Fig. 3). For the lowest tertile of SNPs added to the model (22 SNPs or fewer), the AUC of the SNP-enhanced models improved by 0.044 (95% CI: 0.022, 0.067) compared with non-SNP-enhanced models. For the mid (23–47 SNPs) and highest (48 SNPs or more) tertiles, the estimated improvements in the AUC were 0.018 (95% CI: 0.014, 0.022) and 0.045 (95% CI: 0.031, 0.058), respectively.

Fig. 3

Improvement in AUC for SNP-enhanced prediction models compared with non-SNP-enhanced models stratified by the tertile of number of SNPs included in the model

The results of the meta-regression (Table 2) showed that the factor most strongly, and inversely, associated with AUC improvement after the addition of SNPs to a model with only traditional risk factors was the AUC of the non-SNP-enhanced model (p < 0.001). Furthermore, a significant inverse association was also found between the number of cases included in the study and AUC improvement (p = 0.002). Finally, ethnicity was also associated with AUC improvement (p = 0.023), with larger AUC improvements achieved by models constructed among Asians compared with individuals of European ancestry. No significant associations were found for the other investigated factors. Overall, the factors included in the meta-regression explained almost half of the statistical heterogeneity, with a residual I² equal to 54.18%.

Table 2 Results of the meta-regression assessing which factors are associated with AUC improvement of SNP-enhanced models compared with non-SNP enhanced models

Quality assessment

Results of the overall risk of bias and applicability assessment can be found in Table 3.

Table 3 Results of the risk of bias for each domain of the PROBAST tool

The majority of the studies (93.94%) were scored as having a high risk of bias [22,23,24,25,26,27,28,29,30, 32,33,34,35,36,37,38,39,40,41,42, 44,45,46,47,48,49,50,51,52,53,54, 57], while 2 studies (6.06%) were rated as having an overall unclear risk of bias [31, 43].

A total of 22 studies (66.67%) were assessed only for the development of the model, 8 (24.24%) for both model development and validation, and 3 (9.09%) only for model validation.

As to model development, 66.67, 36.67, 20.00 and 70.00% of the studies were assessed as having a high risk of bias with respect to participants, predictors, outcome and statistical analysis, respectively; 33.33, 20.00, 63.33 and 3.33% were deemed as having a low risk of bias; and 0.00, 43.33, 16.67 and 26.67% were assessed as having an unclear risk of bias for the same domains.

As to validation models, 27.27, 36.36, 45.45, 9.09% of the included studies were assessed as having low risk of bias for participants, predictors, outcome and statistical analysis, respectively; while 72.73, 63.64, 54.55 and 90.91% were rated as high or unclear risk of bias.

Regarding the applicability of prediction models, 30.00, 3.33 and 0.00% of model development studies and 18.18, 0.00 and 9.09% of validation studies raised high or unclear concerns with respect to participants, predictors and outcome, respectively.

Discussion

Overall, from the 33 studies included in our systematic review we identified prediction models for CRC incorporating genetic factors, with extreme heterogeneity regarding the number of genetic factors included. As for the methods used to include genetic factors in the prediction model, most studies used a weighted GRS, with a minority of them using either the count model or both the weighted and count methods.

As for studies reporting the AUC value of the model, most of them did not achieve a satisfactory discriminatory accuracy (i.e. AUC > 0.7 [56]), even though the addition of genetic factors to traditional risk factors improved it, with an improvement in the AUC ranging from 0.010 [37, 44] to 0.084 [51]. Nonetheless, similarly to what was previously reported for breast cancer [58], we found no evidence of association or correlation between the number of SNPs included in the model and the improvement in the AUC value. Moreover, among studies comparing two or more models, only a minority reported data on NRI or IDI, which highlights the need to better quantify and report the improvement in accuracy of a model when adding new biomarkers or genetic data [59]. According to the interpretation suggested by Pencina et al. for NRI values, all four of these studies showed a weak or intermediate strength of SNPs (in all cases in the form of a GRS), in terms of discriminatory potential, when added to models with only traditional risk factors [17].

Regarding the pooled improvement in AUC, a clear trend in the improvement of AUC related to the number of SNPs could not be found. The best results were achieved in the lowest (≤22 SNPs) and highest (≥48 SNPs) tertiles of SNPs incorporated into the models, which led to a larger improvement in AUC compared with the mid tertile (23–47 SNPs). As expected, given the extremely high heterogeneity in the SNPs and environmental factors included in the retrieved prediction models and in the statistical methods used to incorporate such variables, our meta-analysis results show significant statistical heterogeneity, as shown by the high I² values obtained. For this reason, the results of our study should be interpreted cautiously and cannot be considered conclusive.

Similarly to our results, Fung et al. reported that the addition of genetic information improved discriminatory accuracy of the identified prediction models for breast cancer, even though AUC improvement was found to be not correlated or associated with the number of SNPs that were included in the model [58].

It should be noted that the improvement of AUC values with the addition of biomarkers, such as SNPs, to a model depends on the starting AUC value, which means the higher the AUC value of the model including only traditional risk factors, the smaller the improvement in AUC after adding genetic information into the model [17, 60, 61]. This was further confirmed by the results of our meta-regression. In addition, an inverse relation with AUC improvement was found also for the number of cases included in the study, which could actually be linked to the AUC of the non-SNP enhanced model. Likely, the higher the number of cases in the study, the larger the AUC of the non-SNP enhanced model and, hence, the smaller the AUC improvement.

Furthermore, the ethnicity of study participants was found to significantly affect AUC improvement, suggesting possible differences in the role of genetic factors between populations and highlighting the need to foster research on genetic prediction models for all ethnicities [62]. The distribution of genetic factors associated with a specific cancer may vary between ethnicities even more than that of traditional risk factors; ethnicity-specific genome-wide association studies (GWASs) are therefore crucial to inform the development of specific prediction models for different ethnicities [22, 63]. In addition, the importance of the chosen population in the construction of predictive models should be properly taken into account, as a model is applicable only to the specific population it was designed for [60].

Finally, the results of the meta-regression showed that the number of SNPs, publication year, the number of traditional risk factors in the model, and inclusion of gender in the model were not associated with AUC improvement. However, they largely explained statistical heterogeneity between included studies.

As far as we know, previous systematic reviews on prediction models for CRC including genetic factors were limited to a qualitative synthesis [8]. Hence, to our knowledge, our study is the first to investigate, through a quantitative approach, the improvement in discriminatory accuracy that can be obtained through the incorporation of SNPs into prediction models for CRC in addition to traditional risk factors. We also assessed which factors affect such improvement.

However, our study has some limitations. As previously mentioned, we identified extremely different prediction models, both in terms of the genetic factors included in the models and in the methods used to include them, which ranged from weighted and unweighted GRS to machine learning methods. The accuracy of a model, in terms of AUC values, depends not only on the predictors used, but also on the method used for its construction [64]. Hence, as expected, this led to high heterogeneity in the results of our meta-analysis, which parallels what was previously described by Fung et al. regarding breast cancer [58]. Even though we showed that some factors partially explain such heterogeneity, our results should be considered exploratory and not conclusive due to the differences shown by included studies regarding chosen SNPs and traditional risk factors, as well as GRS computation methods.

Moreover, we found very limited high-quality evidence, with only one study having an overall low risk of bias [65], while the majority had a high risk of bias. This not only limits the strength of our results, but also strongly suggests the need for better reporting, using as guidance the GRIPS Statement [66] or its updates, such as the Polygenic Risk Score Reporting Standards (PRS-RS) [67], and for higher quality research in the field of prediction models, which applies to CRC and to other chronic conditions (e.g. cardiovascular diseases) [68]. Notably, all these factors affecting heterogeneity might also have had an impact on other estimates we reported in the analysis. Indeed, the discriminatory accuracy of prediction models is expected to improve with the addition of newly discovered SNPs [60], which is partially in contrast with our results. However, Khera et al. recently constructed 30 PRSs using millions of SNPs for five common diseases, obtaining PRSs with lower AUC values than those based on genome-wide significant SNPs only [69, 70]. This underlines the importance of an appropriate choice of SNPs to include in the models [58]. In addition, it should be noted that some SNPs used for risk prediction models by studies included in our analysis might not have been confirmed as risk loci by subsequent larger GWASs.

Furthermore, while recent research efforts in the field of PRS modelling are moving towards the inclusion of thousands or even millions of SNPs in prediction models through the use of sophisticated methods [70], such as LDpred2, lassosum, PRS-CS, and others [71,72,73], the highest number of SNPs in the models included in our analyses was less than one hundred, thus limiting the applicability of our findings.

To further advance knowledge in the field in the near future, the adequate application of existing guidelines to improve the quality of prediction model studies, especially regarding study design and/or standardization of the methodology used to conduct these types of study, will be essential [20]. We showed that the addition of genetic factors to a prediction model with only traditional risk factors improves its performance, even if slightly. However, it is debatable whether such improvement could really have an impact on population health. In particular, in the field of disease prediction, great attention should be paid not only to prediction performance, but also to the clinical utility of the models [60]. As for CRC, disease prediction might play a key role in the personalization of screening programs, which could start earlier for individuals proven to be at higher risk compared with the average population. Hence, the use of a prediction model, especially one also incorporating genetic factors, might greatly impact the starting age of screening [35, 74]. In addition, knowing one's personal risk of cancer could also be a useful trigger for individuals to improve their adherence to screening programs, which is known to be far from target levels [75].

The addition of genetic information may offer greater benefit when the models are used for risk prediction among specific subgroups of the population [8, 58]. This might imply that, in the future, these kinds of screening interventions could be implemented as a multi-step process: first, stratification of individuals according to their level of risk, followed by personalization of the interventions to carry out [58].

Finally, as recently reported by Naber et al. [76], if a prediction model having an AUC of at least 0.65 is adopted, stratified screening for CRC becomes cost-effective compared with the current uniform screening [77]. This further underlines the importance of carrying out further research in this field to improve the performance of developed prediction models.

Conclusions

The integration of genetic information into traditional risk prediction models improves discriminatory accuracy for CRC. However, we could not find any association or correlation between the number of SNPs added to the model and AUC improvement. High heterogeneity in the choice of baseline model, method of incorporating genetic information, and studied population suggests that standardization in the conduct of this kind of study is needed. Further research is needed in order to improve knowledge and to target the people who would benefit most from this intervention. It is also crucial to consider how to apply the studied models in clinical and real-life settings. In fact, the implementation of prediction models in practice will require a better understanding of potential economic benefits and organizational effects, as well as patient safety, ethical, social, and legal implications, which will make the impact of polygenic prediction models on health systems clearer.