Introduction

C-reactive protein (CRP) is one of the most widely used biomarkers in clinical practice. First identified in 1930 [1], this acute phase reactant was initially used as a biomarker for infection [2]. The advent of high-sensitivity CRP measurement in the 1990s, alongside experimental and clinical evidence suggesting a potential role of inflammation in cardiovascular disease a few years later [3, 4], increased research interest in CRP. It has since been examined as a potential risk factor for an ever-expanding list of diseases including different cardiovascular outcomes, cancers, metabolic and skeletal diseases and autoimmune diseases [5,6,7,8,9]. Today, despite intensive research efforts, the role of CRP in the etiology of common diseases remains unclear.

An umbrella review is a systematic overview of systematic reviews and meta-analyses that assesses the evidence from the current literature in a field of research [10]. We aimed to systematically summarize and evaluate the breadth and validity of associations between CRP and health outcomes using the umbrella review methodology. We summarized meta-analyses of observational studies, examined the extent of phenotypic associations with CRP, and evaluated the strength of, and bias in, the identified associations. In parallel, we performed a systematic review of Mendelian randomization (MR) studies that considered CRP levels as the exposure, to assess the evidence for causality stemming from this literature.

Methods

Data sources and searches of observational studies

We systematically searched PubMed, Scopus, and Cochrane Database of Systematic Reviews, from inception to 31 March 2019, for meta-analyses of observational studies examining the association of CRP with any health outcome (see search algorithms in Additional file 1: Appendix Table 1). All identified publications went through a three-step parallel review of title, abstract, and full text (performed by CK, GMa, SC, NK) based on predefined inclusion and exclusion criteria.

Study selection and data extraction of observational studies

We included systematic reviews and meta-analyses of observational studies that examined associations between CRP levels and health outcomes that had identified at least three studies per outcome examined, keeping only articles that were full publications and in the English language. We excluded studies without systematic literature searches (for meta-analyses of observational studies), without quantitative synthesis of effect sizes, and studies where CRP concentrations were the outcome. Also, due to the well-known role of CRP in infectious disease diagnosis, articles which investigated infections as the outcome of interest were excluded. We also excluded meta-analyses using only cross-sectional assessments, meta-analyses of only crude (unadjusted) estimates, and associations reported as correlation coefficients. Where more than one article with overlapping outcomes was retrieved, the article with the meta-analysis of only prospective studies, the most comprehensive meta-analysis (the one including the largest number of studies), or the more recently published one was included in the final analysis (in order of preference).

Three investigators (CK, GMa and SC) independently extracted the data, which were then checked by a further investigator (IT, ET); discrepancies were resolved by consensus. From each eligible meta-analysis, we extracted information on the first author, journal and year of publication, examined risk factors and the number of studies considered, type of metric reported (hazard ratio, risk ratio, odds ratio [OR], in order of preference), maximally adjusted effect sizes and 95% confidence intervals (CIs), number of total studies included, design of the original studies, unit of comparison, number of cases and population. When the number of cases or controls for individual studies was not reported, we abstracted them from the original studies when possible. When CRP was examined at more than one level of comparison (e.g. as a continuous biomarker and by tertiles), we extracted the data for the comparison with the largest number of component studies.

Data synthesis and analysis of meta-analyses of observational studies

For meta-analyses of observational studies, we estimated the summary effects using the random-effects method [11, 12]; we also estimated 95% prediction intervals, which indicate the interval likely to include the effect size of a new study examining the same association and describe the uncertainty of the summary effect size [13]. Heterogeneity between studies was assessed using the I² metric, which ranges between 0 and 100% and is calculated as the ratio of the between-study variance to the sum of the between-study and within-study variances [14]. Values exceeding 50% or 75% are considered to represent large or very large heterogeneity, respectively. Small-study effects were assessed with Egger’s regression asymmetry test [15]. A P ≤ 0.10 combined with a more conservative effect in the largest study than in the random-effects meta-analysis was judged to provide evidence for small-study effects.
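The pooling steps described above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator for the between-study variance and the common Q-based form of I², which may differ in detail from the exact procedure in [11, 12, 14]. The function names are hypothetical, effects are assumed to be on the log scale, and the t quantile for the prediction interval (k − 2 degrees of freedom) is supplied by the caller because the Python standard library does not provide it.

```python
import math
from statistics import NormalDist


def random_effects_summary(effects, std_errors):
    """Pool study effects (e.g. log hazard ratios) with DerSimonian-Laird weights."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are needed")
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    # I^2 as the share of total variation attributed to heterogeneity (percent)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    z = NormalDist().inv_cdf(0.975)
    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }


def prediction_interval(pooled, se_pooled, tau2, t_crit):
    """95% prediction interval; t_crit is the 0.975 t quantile with k - 2 df."""
    half_width = t_crit * math.sqrt(tau2 + se_pooled ** 2)
    return pooled - half_width, pooled + half_width
```

A prediction interval that spans the null value, even when the confidence interval does not, signals that a new study of the same association could plausibly find no effect.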

We further applied the excess statistical significance test, which evaluates whether there is a relative excess of formally significant findings in the published literature due to any reason (e.g., publication bias, selective reporting of outcomes or analyses) [16]. It is a chi-square-based test that assesses whether the observed number of studies with nominally significant results is larger than the expected number. We took the effect size of the largest study (the one with the smallest standard error) in each meta-analysis as the plausible true effect and calculated the power of each study using a non-central t distribution. Excess statistical significance was claimed at two-sided P ≤ 0.10 with observed > expected, as previously proposed [16, 17].
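As an illustration of the test just described, the sketch below compares the observed and expected numbers of nominally significant studies. It is a simplified version under stated assumptions: power is computed with a normal approximation rather than the non-central t distribution used in the paper, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()


def study_power(true_effect, se, alpha=0.05):
    """Two-sided power to detect true_effect at level alpha (normal approximation)."""
    z_crit = _NORM.inv_cdf(1 - alpha / 2)
    shift = abs(true_effect) / se
    return (1 - _NORM.cdf(z_crit - shift)) + _NORM.cdf(-z_crit - shift)


def excess_significance(effects, std_errors, alpha=0.05):
    """Chi-square test of observed vs expected number of significant studies."""
    n = len(effects)
    # Plausible true effect: the estimate of the most precise (largest) study
    largest = min(range(n), key=lambda i: std_errors[i])
    true_effect = effects[largest]
    expected = sum(study_power(true_effect, se, alpha) for se in std_errors)
    z_crit = _NORM.inv_cdf(1 - alpha / 2)
    observed = sum(
        1 for y, se in zip(effects, std_errors) if abs(y / se) > z_crit
    )
    diff2 = (observed - expected) ** 2
    chi2 = diff2 / expected + diff2 / (n - expected)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 df
    return {
        "observed": observed,
        "expected": expected,
        "chi2": chi2,
        "p_value": p_value,
        # Threshold from the text: two-sided P <= 0.10 with observed > expected
        "excess": p_value <= 0.10 and observed > expected,
    }
```

Because every study has power of at least alpha and below 1, the expected count always lies strictly between 0 and the number of studies, so both denominators are positive.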

Quality assessment and evidence grading of observational studies

We classified the evidence for associations that had P < 0.05 as strong, highly suggestive, suggestive, or weak based on a set of previously used criteria whose rationale has been described in detail elsewhere [10, 18,19,20]. In brief, these criteria consider the level of statistical significance, amount of evidence, consistency, and lack of signals of bias. Thus, we classified as strong evidence those associations that had P < 1 × 10⁻⁶ based on the random-effects model, more than 1000 cases, an I² of less than 50%, no evidence of small-study effects, a prediction interval that did not include the null value, and no evidence of excess significance bias. Associations were classified as highly suggestive when they had P < 1 × 10⁻⁶ based on the random-effects model, more than 1000 cases, and a P value < 0.05 for the largest study in the meta-analysis. Associations with P < 0.001 and more than 1000 cases were considered suggestive. Finally, associations were considered weak when P < 0.05 under the random-effects model.
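The grading rules above can be written as a short decision function. This is only a sketch of the criteria as stated in this section; the class, field, and function names are hypothetical, and the checks are applied from the most to the least demanding grade.

```python
from dataclasses import dataclass


@dataclass
class Association:
    p_random_effects: float       # P value of the random-effects summary estimate
    n_cases: int                  # total number of cases
    i2: float                     # heterogeneity, in percent
    small_study_effects: bool     # evidence of small-study effects
    pi_excludes_null: bool        # 95% prediction interval excludes the null
    excess_significance: bool     # evidence of excess significance bias
    p_largest_study: float        # P value of the largest study


def grade_evidence(a: Association) -> str:
    """Return the evidence grade for one association."""
    if a.p_random_effects >= 0.05:
        return "not significant"
    highly_significant = a.p_random_effects < 1e-6 and a.n_cases > 1000
    if (
        highly_significant
        and a.i2 < 50
        and not a.small_study_effects
        and a.pi_excludes_null
        and not a.excess_significance
    ):
        return "strong"
    if highly_significant and a.p_largest_study < 0.05:
        return "highly suggestive"
    if a.p_random_effects < 1e-3 and a.n_cases > 1000:
        return "suggestive"
    return "weak"
```

Note that an association can reach P < 1 × 10⁻⁶ and still be graded only suggestive if the largest study is not itself significant and any of the bias checks fails.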

Some meta-analyses used estimates from studies with different study designs. Due to the inherent limitations of cross-sectional and case–control studies to examine temporal associations, we performed a sensitivity analysis by excluding cross-sectional and case–control studies.

Finally, for each association in the strong and highly suggestive category, we reassessed the evidence after examining each meta-analysis in depth by assessing the eligibility of the included studies as well as verifying the data used in the meta-analysis using AMSTAR (A MeaSurement Tool to Assess systematic Reviews) [21].

Data sources and searches, study selection and data extraction of Mendelian randomization studies

We used the search algorithm (see Additional file 1: Appendix Table 1) to identify MR studies evaluating potential causal association between CRP levels and health outcomes, excluding infections. The titles, abstracts, and full texts of the resulting papers were examined in detail by two authors (GMa and IT), and discrepancies were resolved by consensus. From each eligible MR study, two authors (GMo and GMa) extracted data in relation to first author, journal and year of publication, the study cohort/s, sample size, number of cases (as applicable), type of data used (individual participant or summary level), the instrumental variables (single-nucleotide polymorphisms [SNPs]), the instrument selection approach, population ancestry, SNP exclusion criteria, % variance explained by the instruments, the outcome phenotypes, the MR effect estimate and the corresponding CIs. When we observed a nominally significant association (P < 0.05) in the main MR analysis, we further extracted and evaluated all information on sensitivity MR analyses.

Evidence grading of Mendelian randomization studies

We stratified MR analyses into those using instrumental variables that included only variants located in the CRP gene and those using instrumental variables with SNPs that were significantly associated with CRP levels from throughout the genome (i.e., not restricted to the CRP gene). The latter approach for selecting instruments is more likely to incorporate invalid instruments that have pleiotropic effects [22]. Indeed, a genome-wide association study (GWAS) of CRP has revealed a large number of genetic variants that are not specific to CRP but influence other inflammatory cytokines, including interleukin-6 receptor (IL-6R) and interleukin 1 family member 10 (IL1F10) [23]. For MR analyses restricted to variants located in the CRP gene, we considered MR evidence as ‘potentially supportive’ when the main analysis reported P < 0.01 [20] and there was consistent evidence from sensitivity analyses; as ‘limited/inconsistent evidence’ when 0.01 < P < 0.05, or when P < 0.01 without further support from sensitivity analyses; and as ‘not present’ when P > 0.05. For MR analyses with variants from throughout the genome, we considered the evidence as ‘limited/inconsistent’ when P < 0.05 with further support from sensitivity analyses, and as ‘not present’ otherwise.
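The MR grading scheme above reduces to a small set of rules, sketched below. The function and argument names are hypothetical. The text does not say how P values exactly equal to 0.01 or 0.05 are handled, so the boundary treatment here (0.01 falls in the limited/inconsistent band, 0.05 counts as not present) is an assumption.

```python
def grade_mr_evidence(p_value, crp_gene_only, sensitivity_consistent):
    """Grade one MR analysis.

    p_value: P value of the main MR analysis
    crp_gene_only: True if instruments are restricted to the CRP gene
    sensitivity_consistent: True if sensitivity analyses support the result
    """
    if crp_gene_only:
        if p_value < 0.01 and sensitivity_consistent:
            return "potentially supportive"
        # Covers 0.01 < P < 0.05, and P < 0.01 without sensitivity support.
        # Assumption: P exactly 0.01 is also placed here.
        if p_value < 0.05:
            return "limited/inconsistent evidence"
        return "not present"
    # Genome-wide instruments are capped at 'limited/inconsistent evidence'
    # because they are more likely to include pleiotropic variants.
    if p_value < 0.05 and sensitivity_consistent:
        return "limited/inconsistent evidence"
    return "not present"
```

The asymmetry is deliberate in the scheme: a genome-wide instrument can never be graded ‘potentially supportive’, however small its P value.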

Results

CRP levels and health outcomes reported in meta-analyses of observational studies

Our literature search yielded 4100 articles. Following title review, 863 articles were considered eligible (Fig. 1), and after abstract screening, 552 articles were potentially eligible for full text review. Finally, 55 studies [5, 24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77] including 113 comparisons of different outcomes were included in the umbrella review of observational studies, consisting of 952 primary estimates. To facilitate interpretation, the different outcomes were classified into the following groups: cancer-related (52 outcomes), cardiovascular-related (31 outcomes), kidney-related (7 outcomes), skeletal (6 outcomes), neurological (3 outcomes), pregnancy-related (2 outcomes), respiratory-related (2 outcomes), and other (10 outcomes).

Fig. 1 Flowchart of study selection for (a) the umbrella review and (b) the Mendelian randomization review

The majority of the primary studies were cohorts (N = 823; 86.5%, of which 497 were prospective, 264 retrospective, and 62 of unclear design), followed by case–control studies (N = 115; 12.1%). Other study designs consisted of cross-sectional studies (N = 6; 0.6%), case-cohorts (N = 7; 0.7%), and one case-crossover study (0.1%).

Ninety-five out of 113 associations (84.1%) presented a statistically significant effect at P < 0.05 under the random-effects model, 67 remained significant at P < 0.001, whereas 34 associations had a statistically significant effect at P < 1 × 10⁻⁶ (Table 1). However, only 24 (21.2%) associations had a 95% prediction interval that excluded the null. The largest study was statistically significant in 71 of the 113 comparisons (62.8%) and was more conservative than the meta-analysis estimate in 87 of 113 comparisons (77%) (Table 1). Twenty-three associations (20.4%) presented very large between-study heterogeneity (I² > 75%), and 31 (27.4%) associations had large heterogeneity estimates (I² > 50% and I² < 75%). In 45 (39.8%) of the 113 associations, Egger’s test was statistically significant (P < 0.1) and the random-effects estimate was inflated compared to the largest study (Table 1). Forty-seven associations (41.6%) showed evidence of excess significance, meaning that the number of observed studies with statistically significant results exceeded the number of expected ones (Table 1).

Table 1 Health outcomes and assessment of evidence in meta-analyses of observational studies

Assessment of epidemiological credibility

Of the 113 associations, only two cardiovascular outcomes (cardiovascular mortality and venous thromboembolism) fulfilled the necessary criteria to be categorized in the strong level of evidence (Table 1). Ten (8.9%) associations were supported by highly suggestive evidence, 6 of which were on cardiometabolic outcomes. The highly suggestive associations were all-cause mortality in general population, all-cause mortality in patients with chronic kidney disease, long-term mortality in chronic obstructive pulmonary disease (COPD) patients, long-term mortality or CVD in acute coronary syndrome (ACS)/unstable coronary heart disease (CHD)/angina patients, mortality or CVD in stable coronary artery disease (CAD) patients, CHD in general population, overall survival in hepatocellular carcinoma patients, Bath Ankylosing Spondylitis Disease Activity Index-50 (BASDAI50) in ankylosing spondylitis patients, ovarian cancer in general population, and type 2 diabetes in general population. There were 16 comparisons that were categorized in the suggestive level of evidence and 67 in the weak level. Finally, 18 comparisons did not present a statistically significant association. When we excluded the case–control or cross-sectional studies, only seven comparisons were affected. Only six of those comparisons had at least 3 remaining studies in order to be re-evaluated and for all six the evidence categorization remained the same (Additional file 1: Appendix Table 2).

When we assessed the meta-analyses in either the strong or the highly suggestive evidence category, we observed that the majority of the meta-analysis papers were of moderate quality (9 of the 11 papers), with an AMSTAR score between 4 and 7, and only one had a score of 8. Finally, one study [41] was a pooled analysis and therefore could not be evaluated with the AMSTAR tool (Additional file 1: Appendix Table 3).

CRP levels and health outcomes reported in Mendelian randomization studies

A total of 196 primary MR analyses were identified from 37 studies [79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,115] covering 82 distinct phenotypes (Table 2 and Additional file 1: Appendix Tables 4, 5). The majority of associations were investigated through two-sample MR methodologies (130 out of 196; 66%). The median number of participants included in MR studies was 26,405 (range 134 to 184,305). The most frequently examined phenotypes included cardiovascular diseases (coronary heart disease and stroke) (n = 19; 9.7%), type 2 diabetes (n = 8; 4.1%), schizophrenia (n = 8; 4.1%), and body mass index (BMI) (n = 6; 3.1%). Eighty-four MR analyses (60 unique outcomes, Table 2) used instrument variants at the CRP gene locus, and 112 used instruments from throughout the genome. The SNPs used as instruments varied widely among studies. The four most commonly used SNPs among the 196 MR associations were rs1130864 (n = 78; 39.8% of the comparisons), rs1205 (n = 74; 37.8%), rs2794520 (n = 74; 37.8%), and rs3093077 (n = 65; 33.2%); all these variants fall within or close to the CRP gene region.

Table 2 Health outcome and characteristics of Mendelian randomization studies. Only studies with instruments from the CRP gene are presented. One study is selected per outcome based on the largest sample size

Overall, 12 distinct phenotypes presented significant associations at P < 0.01, of which four (Crohn’s disease, ischemic heart disease, systolic and diastolic blood pressure) presented significant associations (P < 0.01) when the instruments were restricted to the CRP gene locus (Appendix Tables 4 and 5). However, independent MR analyses did not show consistent evidence for Crohn’s disease and ischemic heart disease, and none of the aforementioned phenotypes had support from sensitivity analyses.

Nine phenotypes presented significant (P < 0.01) causal effect estimates when instruments from throughout the genome were considered. Of those, only schizophrenia and bipolar disorder presented consistent evidence in sensitivity analyses and in analyses restricted to SNPs within the CRP locus, but only at P < 0.05. Nonetheless, the result on bipolar disorder [113] was not confirmed by an earlier study [107], in which MR using only CRP gene SNPs did not reach statistical significance at P < 0.05. Schizophrenia had evidence from independent studies and sensitivity analyses (weighted median and inverse variance weighted estimates), but this was not supported by MR Egger analysis or by the sensitivity analysis using only CRP gene SNPs (P = 0.04).

Overall, only 14 outcomes had evidence available from both MR analyses and meta-analyses of observational studies (Table 3). The evidence from the observational studies and MR analyses was concordant for three outcomes, for which both meta-analyses of observational studies and MR analyses were not statistically significant (P ≥ 0.05). The remaining outcomes showed varying degrees of evidence (weak, suggestive, highly suggestive) in meta-analyses of observational studies and no evidence or limited/inconsistent evidence from MR. Finally, MR did not support causality for venous thromboembolism, for which the evidence from observational meta-analyses was graded as strong.

Table 3 Comparison of evidence from observational studies meta-analysis and Mendelian randomization (MR) studies taking into account both CRP gene-only and genome-wide significant instruments

Conclusions

Our umbrella review identified an impressive body of literature on CRP, including 113 comparisons from 55 studies for separate phenotypes and 196 MR analyses assessing the causality of epidemiologic associations. Only 14 phenotypes had evidence from both meta-analyses of observational studies and MR analyses. Most summary meta-analytic estimates of observational studies yielded nominally statistically significant results for a direct association between CRP and different phenotypes. Nonetheless, only two of these associations had strong results with no suggestions of biases (cardiovascular mortality and venous thromboembolism in the general population), and neither had supporting evidence of a causal role for CRP in MR investigations.

Low-grade inflammation has been suggested to be involved in many chronic diseases, which may explain the breadth and depth of phenotypes examined in relation to CRP, a general marker of inflammation that can be inexpensively measured in epidemiological and clinical settings. A search of “C-reactive protein or CRP” yields 74,622 items as of March 05, 2019, and the vast number of meta-analyses that we identified are efforts to summarize this huge, expanding literature.

A large proportion of studies examined CRP as a prognostic marker of cancer incidence and also of cancer survival. Out of those 52 comparisons, there was highly suggestive evidence for only two associations (ovarian cancer incidence and overall survival in hepatocellular carcinoma). The evidence from the remaining literature was classified as suggestive or weak. MR efforts, including one on lung cancer, did not highlight any evidence of causality either, although their sample sizes were modest for less common cancers. Chronic inflammation may still be linked to cancer development and progression, as other lines of evidence suggest a higher risk of cancer amongst individuals with inflammatory conditions (e.g., inflammatory bowel diseases and risk of colon cancer), or a higher risk of cancer in relation to infections (e.g. human papillomaviruses and cervical cancer) [115,116,117,119]. However, CRP, as a general marker of inflammation, is unlikely to capture the specific inflammatory mediating pathways linking inflammation to cancer development and progression.

CRP and cardiovascular diseases have been the subject of an increasing body of research and debate. Our review found that the associations of CRP with cardiovascular mortality and venous thromboembolism were supported by strong evidence. Furthermore, we found highly suggestive evidence for associations between higher CRP and the risk of CHD, type 2 diabetes, and mortality or CVD in stable CAD patients and in unstable CHD/ACS/angina patients. Nonetheless, MR studies have repeatedly failed to provide evidence for a causal association with CHD, an observation further supported by randomized controlled trials [120]. The observational literature on CRP is likely to suffer from diverse biases, and the effect sizes of the associations may be inflated [121, 122]. Beyond causality, even efforts to show that CRP could at least be used in risk prediction have not demonstrated convincing results [123, 124]. Accordingly, the relative risks that we noted for cardiovascular mortality (2.05, in fact just 1.49 in the largest study) and venous thromboembolism (only 1.14) do not suggest a substantial predictive potential. The role of inflammation in atherosclerotic plaque initiation, progression and rupture has been supported by various other lines of evidence [125], but this does not necessarily mean that CRP has clinical utility.

COPD is associated with an abnormal inflammatory response beyond the lungs with evidence of low-grade systemic inflammation which causes systemic manifestations such as weight loss, skeletal muscle dysfunction, an increased risk of cardiovascular disease, osteoporosis and depression [125,126,128]. We found highly suggestive evidence that CRP is associated with late (but not with early) mortality in COPD patients. However, MR studies did not support a causal association. CRP might be elevated in COPD patients due to reverse causality as the disease is associated with triggering an inflammatory response. Reverse causality is likely to explain other associations such as mortality in patients with chronic kidney disease or overall survival in hepatocellular carcinoma patients. In these instances CRP could serve as a predictive factor for disease severity, but studies assessing its value over and above validated existing risk prediction algorithms are essential to support any prediction claim [123].

Particular mention should be made of schizophrenia: among the tentative MR findings described in this review, we found several studies of CRP and schizophrenia onset. Yet there is a distinct lack of observational data on this association, and the data that exist [129, 130] mainly focus on the reverse pathway (how schizophrenia affects CRP levels) rather than on the direction that is the focus of this review.

In our MR review, we found that multiple studies and sensitivity analyses showed evidence for a causal effect, but with very modest P values, when only CRP SNPs were used in the genetic instruments. One recent analysis (published after the search date of our review [131]) found even lower P values with inverse variance weights and generalized summary MR modeling. The putative causal association with schizophrenia is even more interesting because it suggests a protective effect of CRP on schizophrenia, while observational data had suggested an association of CRP with higher schizophrenia risk [130].

Overall, the overwhelming majority of the meta-analyses of observational studies reported a nominally statistically significant result (84%), in contrast to MR studies, where only 37 of the 196 (19%) analyses presented nominally statistically significant results. These two study designs may be subject to different biases in the biomedical field. A large proportion (48.2%) of the examined observational meta-analyses displayed substantial heterogeneity (I² > 50%), small-study effects (39.5%), and excess significance bias (41.2%), which, in addition to the small effect estimates, increase the probability of false-positive findings. MR approaches use genetic variants as instrumental variables to establish whether an exposure is causally related to a disease or trait. The genetic variants are assumed to be unrelated to confounding factors, and therefore this approach is not as prone to confounding and reverse causation bias. At the same time, genetic association estimates in MR represent the average lifetime association of the variants with the outcome for all those in the considered population, and are therefore less vulnerable to measurement error [132]. Nonetheless, MR shares some of the limitations of the observational epidemiology literature, including small sample sizes, instrument bias, low power, and poor reporting, and has further limitations of its own [22]. For example, we observed that at least half of the MR studies on CRP used instruments derived from genome-wide association studies, including genetic variants in genes of other inflammatory cytokines such as IL-6. These approaches may introduce pleiotropy and can thus bias MR estimates. There are several methodologies to account for violation of the no-pleiotropy assumption of MR, but these cannot always identify pleiotropic effects and can therefore only partly disentangle the complex pleiotropy previously shown between CRP and lipid and metabolic pathways [133].

Limitations of our approach need to be acknowledged. Our review focused on existing meta-analyses, and therefore outcomes that were not assessed in a meta-analysis are not included in this review. Furthermore, we appraised the quality of the meta-analyses themselves, not the quality of the individual studies. We refer interested readers to the quality assessments already made by the authors of each original meta-analysis; we did not wish to change the eligibility criteria based on quality, since this would add our own subjectivity to study selection. We did not include evidence from meta-analyses of randomized controlled trials, as these examine a wide range of anti-inflammatory treatments that are not specific to CRP-lowering effects. Statistical tests for small-study effects and excess significance bias should also be interpreted with caution in case of large between-study heterogeneity, and both tests have limited power in the presence of few studies or sparse studies with significant results. Finally, we adopted credibility assessment criteria that were based on established tools for observational evidence; however, none of the components of these criteria provides firm proof of credibility of evidence, but they cumulatively describe the possibility that the results are susceptible to bias and uncertainty.

In this extensive systematic review of meta-analyses of observational studies on CRP and disease outcomes and of the evidence stemming from MR studies, we could not find strong evidence supported by both study designs in relation to CRP and the most frequently studied non-infection phenotypes in the literature. Observational studies presented robust evidence of association between higher CRP levels and cardiovascular mortality and venous thromboembolism, but without support for causality from MR studies. Following claims that CRP may be a novel CVD risk factor [134], it has been extensively studied in relation to an ever-increasing list of phenotypes and diseases, but it does not seem to be crucially relevant to any of them. Despite intensive research efforts, our study shows that there is little evidence that CRP may constitute a priority interventional target for any disease.