Introduction

Inflammation is a complex and necessary response of the immune system to harmful stimuli such as tissue injury, infection, or exposure to toxins [1]. During the acute phase that is characterized by blood flow changes and increased blood vessel permeability, plasma proteins and leukocytes migrate from the circulation to the site of inflammation [2]. This immediate protective response usually enables the elimination of the initial cause of the cell injury and the restoration of homeostasis. However, when the acute response fails to clear tissue damage, for example, because of prolonged exposure to stimuli, inflammation can become a chronic process [3]. A number of common diseases are at least partly caused by chronic inflammation, including coronary artery disease, type 2 diabetes, and some cancers [4]. Thus, although inflammation plays an important role in human defense against aggression, it also contributes to the pathophysiology of multiple diseases of major public health importance.

Diagnostic tests are capable of detecting the presence and intensity of systemic inflammation [5]. The most commonly used inflammatory biomarker is the acute-phase reactant C-reactive protein (CRP). This ring-shaped protein is produced by hepatocytes upon stimulation by pro-inflammatory cytokines such as interleukin (IL)-1β, IL-6, and TNF-α. Although CRP is commonly used as a sensitive indicator of inflammation, the factors influencing its baseline plasma levels are only partially understood. Circulating amounts of CRP are positively associated with age, body mass index (BMI), and smoking and inversely with male sex and physical activity [6,7,8]. In addition, large-scale genomic analyses have found multiple associations with high-sensitivity CRP (hs-CRP) levels, mainly at loci enriched in hepatic, immune, and metabolic pathways, such as CRP, LEPR, IL6R, GCKR, APOE, and HNF1A-AS1 [9,10,11,12,13,14]. Altogether, genetic variation explains up to 16% of the variance in plasma CRP levels [14].

To get a more comprehensive view of the factors influencing chronic inflammation in the general population, we used samples and data from the UK Biobank and the CoLaus|PsyCoLaus study to search for associations between baseline CRP levels and chronic infection by persistent/latent pathogens, after careful adjustment for all known demographic, clinical, and genomic influences. Indeed, some infectious agents causing long-term infections in humans have been shown to trigger some degree of local or systemic immune response, resulting in a chronic state of low-grade inflammation that may lead to deleterious health outcomes [15, 16].

Methods

Study cohorts

The UK Biobank is a population-based exploratory study whose enrollment procedure has been outlined previously [17]. In brief, half a million men and women between the ages of 40 and 69 (45.6% male, mean age ± SD: 56.5 ± 8.1) visited one of 22 UK Biobank screening centers in England, Scotland, and Wales between 2006 and 2010. The evaluation included a survey, a personal interview, and a number of physical measurements. Blood, urine, and saliva samples were also collected for long-term storage. This research was undertaken with approved access to UK Biobank data under application number 50085 (PI: Fellay). All UK Biobank study participants gave informed consent at the time of recruitment. Ethical approval for the UK Biobank study was obtained from the North West Centre for Research Ethics Committee (11/NW/0382).

The CoLaus|PsyCoLaus study is a prospective population-based study initiated in 2003 in Lausanne, Switzerland (www.colaus-psycolaus.ch) [18]. It involves more than 6000 participants of European ancestry (47.5% male) initially aged 35 to 75 years (mean ± SD: 51.1 ± 10.9), thus representing a sample of approximately 10% of the inhabitants of Lausanne. Individuals were randomly recruited from the general population and are monitored every 5 years regarding their lifestyle and health status. Detailed phenotypic information was obtained from each study participant through questionnaires, physical assessment, and biological measurements of blood and urine markers. The institutional Ethics Committee of the University of Lausanne, which afterward became the Ethics Commission of Canton Vaud (www.cer-vd.ch), approved the baseline CoLaus|PsyCoLaus study (reference 16/03, decisions of 13 January and 10 February 2003), and written consent was obtained from all participants.

DNA genotyping and quality checks

Genotyping and imputation of UK Biobank individuals have been fully described by Bycroft et al. [19]. Briefly, samples were genotyped on either the UK BiLEVE Axiom array (Affymetrix) or the UK Biobank Axiom array (Applied Biosystems). Genotypes were phased using SHAPEIT3 with the 1000 Genomes phase 3 dataset as a reference, then imputed with IMPUTE4 using the Haplotype Reference Consortium, 1000 Genomes phase 3, and UK10K data as references [20,21,22]. Post-imputation quality checks resulted in a total of 9,349,624 single nucleotide polymorphisms (SNPs) available for analyses. DNA samples from 5399 CoLaus|PsyCoLaus participants were genotyped for 799,653 SNPs using the BB2 GSK-customized Affymetrix Axiom Biobank array. Quality control procedures and imputation of genotypes have been previously described in Hodel et al. [23]. A total of 9,031,263 SNPs from the CoLaus|PsyCoLaus dataset were included for further analyses (a flowchart of the inclusion/exclusion criteria is in Additional file 1: Fig. S1).

Measurement of inflammatory biomarkers

For the UK Biobank, non-fasting venous blood samples (~50 mL) were collected at recruitment. Blood samples were shipped at 4 °C to the central processing and archiving facility in Stockport. Serum high-sensitivity CRP (hs-CRP) concentrations were measured in participants by immunoturbidimetric assay on a Beckman Coulter AU5800. The manufacturer’s analytical range was 0.08 to 80 mg/L. Ninety-five individuals with an hs-CRP level above 20 mg/L were removed from the analysis. For CoLaus|PsyCoLaus, venous blood samples (≥ 50 mL) were drawn in the fasting state and allowed to clot. Serum samples were kept at −80 °C before the assessment of cytokines and sent on dry ice to the laboratory. hs-CRP was assessed by immunoassay and latex HS (IMMULITE 1000–High, Diagnostic Products Corporation, LA, CA, USA). For quality control, repeated measurements were conducted on 80 subjects randomly drawn from the initial sample. Forty-seven individuals with hs-CRP levels above 20 mg/L were assigned a value of 20 by the manufacturer and were therefore removed from the hs-CRP analyses, as such levels are indicative of acute inflammation.

Serological analyses

To assess the humoral responses to a total of 56 antigens derived from 24 persistent infectious agents (45 antigens from 20 pathogens in UK Biobank, and 38 antigens from 18 pathogens in CoLaus|PsyCoLaus), serum samples were independently analyzed by the Infections and Cancer Epidemiology Division at the German Cancer Research Center (Deutsches Krebsforschungszentrum, DKFZ) in Heidelberg [24, 25]. Seroreactivity was measured at serum dilution 1:1000 using multiplex serology based on glutathione-S-transferase (GST) fusion capture immunosorbent assays combined with fluorescent bead technology. For each infectious agent tested, antibody responses were measured for one to six antigens and then expressed as a binary result (IgG positive or negative) based on predefined median fluorescence intensity (MFI) thresholds [26]. For our analysis, only antigens shared between the two cohorts were retained, resulting in a final combination of 27 antigens from 13 pathogens. To define the overall seropositivity against infectious agents when more than one antigen was used, we applied the pathogen-specific algorithms suggested by the manufacturer. Details of the methods on how the antigens were combined have been described previously [26].
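The two-step serostatus call described above (per-antigen binarization against an MFI threshold, then a pathogen-level combination rule) can be sketched as follows. This is only an illustration: the antigen names, the cutoff values, and the "at least two positive antigens" rule are hypothetical placeholders, not the predefined thresholds or the pathogen-specific algorithms of [26].

```python
from typing import Dict

# Hypothetical MFI cutoffs per antigen (assumed values, not those of [26]).
MFI_CUTOFFS: Dict[str, float] = {
    "antigen_a": 100.0,
    "antigen_b": 150.0,
    "antigen_c": 80.0,
}


def antigen_calls(mfi: Dict[str, float], cutoffs: Dict[str, float]) -> Dict[str, bool]:
    """Binarize each antigen: positive when its MFI exceeds the cutoff."""
    return {name: mfi[name] > cutoff for name, cutoff in cutoffs.items()}


def pathogen_seropositive(
    mfi: Dict[str, float], cutoffs: Dict[str, float], min_positive: int = 2
) -> bool:
    """Illustrative combination rule (an assumption): the pathogen is called
    seropositive when at least `min_positive` of its antigens are positive."""
    return sum(antigen_calls(mfi, cutoffs).values()) >= min_positive


# Two of three antigens exceed their cutoffs, so the pathogen is called positive.
sample = {"antigen_a": 120.0, "antigen_b": 200.0, "antigen_c": 10.0}
is_positive = pathogen_seropositive(sample, MFI_CUTOFFS)
```

In practice the combination rule differs by pathogen, so `min_positive` would be replaced by the relevant pathogen-specific algorithm.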

Combining study cohorts

Upon completion of the genotyping and quality control (QC) analyses for each cohort, imputed datasets were matched on the strand, SNP ID, and genomic coordinates. Additional analyses and QC checkpoints were performed to ensure proper merging. This resulted in a dataset of 12,055 unique individuals of European ancestry and a total of 6,899,629 markers.

Polygenic risk score calculation for hs-CRP level

We carried out a polygenic risk score (PRS) analysis to investigate the relationship between human genetic variation and hs-CRP levels. A CRP-PRS was calculated for each study participant based on the risk effects of common SNPs derived from GWAS summary statistics of hs-CRP. As a base dataset, we used the GWAS summary statistics of the CHARGE cohort (N = 204,402, heritability h² = 6.5%) [10, 27]. These summary statistics were used to construct the CRP-PRS in our target cohort, consisting of the merged UK Biobank and CoLaus|PsyCoLaus data, using the clumping and thresholding method of the PRSice-2 v2.2.7 software [28]. We used a standard method to obtain the PRS: the dosage of risk alleles for each variant was multiplied by its effect size in the GWAS, and the products were summed across all selected variants. SNPs were clumped based on linkage disequilibrium (LD) (r² ≥ 0.1) within a 250-kb window. Model estimates of the PRS effect were adjusted for sex, age, BMI, and the top 10 PCs. As an additional quality control, the distribution of the PRS was checked in each cohort separately to ensure that it followed a normal distribution.
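The score itself is a weighted sum of risk-allele dosages, which can be sketched in a few lines. The dosage matrix and effect sizes below are made-up toy values, and the sketch assumes that clumping and P-value thresholding have already been applied; it does not reproduce the PRSice-2 implementation.

```python
import numpy as np


def polygenic_score(dosages: np.ndarray, betas: np.ndarray) -> np.ndarray:
    """Return one score per individual: sum over variants of
    (risk-allele dosage x GWAS effect size)."""
    return dosages @ betas


def standardize(scores: np.ndarray) -> np.ndarray:
    """Scale scores to mean 0 and SD 1, so that effects can be read
    per standard deviation of the PRS."""
    return (scores - scores.mean()) / scores.std()


# Toy data (assumed): 3 individuals x 3 variants, dosages in [0, 2].
dosages = np.array([
    [0.0, 1.0, 2.0],
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])
betas = np.array([0.10, 0.05, 0.20])  # hypothetical per-allele effect sizes

raw_prs = polygenic_score(dosages, betas)
std_prs = standardize(raw_prs)
```

Standardizing the score is a convenience for interpretation only; it does not change the strength of the association with the trait.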

Analyses of the determinants of hs-CRP levels

We used linear regression with backward selection to identify the factors significantly associated with hs-CRP plasma levels. Tested covariates included serostatus for each pathogen, polygenic risk score for CRP, age, sex, BMI, and the first 10 PCs of the genotyping data. A P-value < 0.05 was considered statistically significant. The analysis was performed using the stepAIC function in R version 4.0.5 [29].
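For readers unfamiliar with AIC-based backward elimination, the idea can be sketched as follows. This is a simplified Python analogue written for illustration, not the R code used in the study; the function names and the simulated data are assumptions.

```python
import numpy as np


def ols_aic(X: np.ndarray, y: np.ndarray, cols: list) -> float:
    """AIC (up to an additive constant) of an ordinary least squares fit
    using an intercept plus the columns of X listed in `cols`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def backward_select(X: np.ndarray, y: np.ndarray, names: list) -> list:
    """Start from the full model and repeatedly drop the covariate whose
    removal lowers the AIC the most; stop when no removal lowers it."""
    keep = list(range(X.shape[1]))
    best = ols_aic(X, y, keep)
    while keep:
        trials = [(ols_aic(X, y, [j for j in keep if j != d]), d) for d in keep]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break
        best = trial_aic
        keep.remove(drop)
    return [names[j] for j in keep]


# Simulated example (assumed): only x0 truly influences the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(size=200)
selected = backward_select(X, y, ["x0", "x1", "x2"])
```

A strong predictor such as `x0` is always retained, whereas pure-noise covariates are usually, but not always, dropped, which is one reason to check the significance of the retained terms afterwards.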

Results

Baseline characteristics of study participants

We studied a total of 12,055 individuals with available hs-CRP level, serological results, and genome-wide genotyping data from two independent population-based studies: the UK Biobank (N = 8371) and the CoLaus|PsyCoLaus study (N = 3684) (Additional file 1: Fig. S1). Participants ranged in age from 35 to 75 years (mean age ± SD: 55.68 ± 9.07), with a majority of women (55.4%) and a mean BMI of 26.80 (± SD: 4.73). The hs-CRP level was measured in all participants. The median hs-CRP level was 1.30 mg/L (10th, 90th percentiles: 0.35 mg/L, 5.10 mg/L, respectively). Figure 1 shows the distributions of age, sex, BMI, and log10-transformed hs-CRP in both cohorts. We observed a very comparable distribution of all relevant variables in the two cohorts, which were merged for downstream analyses. Additional file 2: Fig. S2 shows the associations of hs-CRP with demographic and clinical factors. Higher levels of hs-CRP were associated with female sex, older age, and increased BMI (P-values = 1.5e−3, 3.4e−69, and ≈ 0, respectively).

Fig. 1

Baseline characteristics of the study cohort. Distribution of A age, B gender, C body mass index (BMI), and D hs-CRP for participants by subcohort

The impact of genetic variation on hs-CRP levels

The filtered genetic variants from the two cohorts were combined (see the “Methods” section) to increase the sample size. To estimate the sample variation, and to control for potential population structure and genotyping bias, principal component analysis (PCA) was performed using the correlation matrix of the genotyping data. PCA plots for the first ten principal components (PC1–PC10) are shown in Additional file 3: Fig. S3A, annotated by the original cohort from which the sample was drawn. We observed that samples from both subgroups (UK Biobank and CoLaus|PsyCoLaus) were segregated on the first PC (PC1) and eighth PC (PC8), but not on the other PCs. The top 10 PCs explained 61% of the total variance and were used throughout the study to correct for stratification (Additional file 3: Fig. S3B).
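A PCA on the correlation matrix is equivalent to a singular value decomposition of the column-standardized genotype matrix. The sketch below illustrates this on simulated dosages; the matrix size, the allele frequency, and the function name are assumptions, and real genome-wide data would normally be processed with dedicated genetics software.

```python
import numpy as np


def pca_correlation(genotypes: np.ndarray, n_components: int):
    """PCA on the correlation matrix, computed via SVD of the standardized
    genotype matrix (individuals x SNPs). Assumes no monomorphic SNPs,
    because these would have zero variance."""
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # proportion of variance per PC
    scores = u[:, :n_components] * s[:n_components]  # per-individual PC scores
    return scores, explained[:n_components]


# Simulated dosages (assumed): 50 individuals x 20 SNPs, allele frequency 0.3.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(50, 20)).astype(float)
scores, explained = pca_correlation(genotypes, n_components=5)
```

The columns of `scores` play the role of PC1, PC2, and so on, and can be added as covariates to a regression model to correct for stratification.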

We computed a CRP-PRS to investigate the effect of multiple gene variants on hs-CRP levels. A total of 1809 SNPs were included at the best P-value threshold (P-value = 3.65e−3). The PRS followed a normal distribution in the merged cohort, as well as in each subcohort separately (Additional file 4: Fig. S4). To describe the influence of common human genetic variation on plasma hs-CRP levels, we quantified the trait variance (R²) explained by the derived PRS and covariates across individuals. We observed that the variance explained by the full model was 25.8%, with 21.5% attributed to the demographic and clinical covariates and 4.3% to the CRP-PRS. The association between the CRP-PRS and hs-CRP levels was very strong (P-value = 6.58e−123; Additional file 5: Fig. S5), with hs-CRP levels increasing by 0.48 [standard error (SE) 0.02] for each standard deviation increment in CRP-PRS.
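One common way to attribute variance to a PRS is to compare the R² of a covariate-only model with that of the full model, in the same spirit as 25.8% − 21.5% = 4.3% above. The sketch below illustrates this on simulated data; the effect sizes and variable names are assumptions, and the study's exact partitioning procedure may differ.

```python
import numpy as np


def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R² of an ordinary least squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))


# Simulated data (assumed): three covariates standing in for age, sex, and BMI,
# plus a standardized PRS with a modest effect on the outcome.
rng = np.random.default_rng(0)
n = 2000
covariates = rng.normal(size=(n, 3))
prs = rng.normal(size=n)
y = covariates @ np.array([0.2, 0.1, 0.5]) + 0.25 * prs + rng.normal(size=n)

r2_covariates = r_squared(covariates, y)
r2_full = r_squared(np.column_stack([covariates, prs]), y)
r2_prs = r2_full - r2_covariates  # incremental variance attributed to the PRS
```

Because the models are nested, `r2_full` can never be smaller than `r2_covariates`; the difference is the share of variance that the PRS adds on top of the covariates.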

Associations between persistent/latent infections and hs-CRP levels

We searched for associations between hs-CRP levels and serostatus for the following persistent or frequently recurring human pathogens: 10 viruses (BK virus (BKV), Cytomegalovirus (CMV), Epstein–Barr virus (EBV), Human Herpes Virus (HHV)-6, HHV-7, Herpes Simplex Virus (HSV)-1, HSV-2, JC virus (JCV), Kaposi’s sarcoma-associated herpesvirus (KSHV), and Varicella zoster virus (VZV)); two bacteria (Chlamydia trachomatis (C. trachomatis) and Helicobacter pylori (H. pylori)); and one parasite (Toxoplasma gondii (T. gondii)) (Fig. 2). The overall seropositivity ranged from 6.57% (KSHV) to 95.25% (EBV). Cohort-separated seroprevalences are shown in Additional files 6 and 7: Figs. S6 and S7.

Fig. 2

Overall pathogen seropositivity and seroprevalence of tested antigens. List of the 13 pathogens and 27 antigens available from the combined study. The gray boxes indicate the pathogen on which the antigen protein is found, and the family to which the pathogen belongs. Percentages in parentheses after pathogen names indicate the overall seropositivity for the specified pathogen. The percentages on the right indicate the seroprevalence of antibodies against infectious disease antigens tested using the Multiplex Serology platform. For cohort-specific figures, see Additional files 6 and 7: Figs. S6 and S7

Using backward stepwise regression including serostatus for all tested persistent or frequently recurring human pathogens, adjusted for CRP-PRS, sex, age, BMI, and the top 10 PCs, we observed significant associations of hs-CRP levels with seropositivity for H. pylori (P-value = 8.63e−4) and C. trachomatis (P-value = 5.04e−3) (Table 1). The final regression model including all significant factors explained 25.9% of the variance of hs-CRP levels. This explained fraction of the variance is similar to the value obtained without including the serological results (25.8%, see above), indicating that the impact of H. pylori and C. trachomatis seropositivity on chronic inflammation, even if statistically significant, is likely to be minimal at the population level. We also investigated the interaction effect between the two identified pathogens on the hs-CRP level. No significant interaction was observed, suggesting that H. pylori and C. trachomatis have independent effects.

Table 1 Linear regression analysis results for hs-CRP

Pathogen burden associates with higher hs-CRP levels

We then checked whether the overall burden of chronic infections contributes to increased hs-CRP levels. Study participants were stratified according to their overall seropositivity index, calculated as the number of pathogens for which they were seropositive (range: 0–13). The number of individuals in each serological stratum ranged from 5 (index = 0) to 2717 (index = 7) and is presented in Fig. 3. We used a linear model to search for an association between pathogen burden and hs-CRP levels. hs-CRP levels were significantly and positively associated with increasing pathogen burden (P-value = 4.12e−4) (Fig. 3).
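The seropositivity index and its linear relationship with hs-CRP can be sketched as follows. The serostatus matrix and the log-transformed hs-CRP values are made-up toy data, and the unadjusted single-predictor fit is an assumption; the text above does not specify which covariates the linear model included.

```python
import numpy as np


def burden_index(serostatus: np.ndarray) -> np.ndarray:
    """Number of pathogens for which each individual is seropositive.
    `serostatus` is an individuals x pathogens matrix of 0/1 values."""
    return serostatus.sum(axis=1)


# Toy data (assumed): 4 individuals x 3 pathogens.
serostatus = np.array([
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
])
log_crp = np.array([0.20, 0.35, 0.00, 0.10])  # hypothetical log10 hs-CRP values

burden = burden_index(serostatus)

# Simple linear model: log_crp = intercept + slope * burden.
slope, intercept = np.polyfit(burden, log_crp, deg=1)
```

A positive `slope` corresponds to higher hs-CRP with increasing pathogen burden; a formal test of the slope would additionally require its standard error.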

Fig. 3

Levels of hs-CRP by infectious burden. Boxplots showing the hs-CRP value for each pathogen load group. The black bold line within the boxplot indicates the median of the hs-CRP measurement. The boxes are colored by sample size. The sample size and median for each group are shown above the box

Discussion

Mounting evidence suggests that exposure to multiple pathogens, even when they do not cause obvious disease, can affect the immune system and human health [18, 30, 31]. In an effort to better understand the variability of humoral immune response and inflammation patterns in response to pathogen exposure, we selected 27 antigens from 13 persistent infectious agents, which we evaluated using multiplex serology to detect specific immunoglobulin G levels in two well-characterized population-based cohorts.

We first investigated the relationship between common genetic variation and hs-CRP levels by calculating a PRS for all study participants. The PRS explained about 4% of the variation in hs-CRP levels, in agreement with previously published results [9]. We also found that BMI was the major non-genetic predictor of hs-CRP, with approximately 19% of the variance explained.

Next, we studied the impact of persistent infections on chronic inflammation after adjustment for known influencing factors, including age, sex, BMI, and human genetic variability, as explored above. We observed an association between increased levels of hs-CRP and seropositivity for C. trachomatis and H. pylori. The two gram-negative bacteria C. trachomatis and H. pylori do not cause life-long, latent infections. Nevertheless, they are responsible for some of the most frequent chronic infections in humans.

H. pylori can colonize the gastric epithelium for long periods of time, leading to chronic inflammation of the gastric mucosa. Even if the majority of individuals infected with H. pylori have no symptoms, the bacterium has been causally linked with gastritis, gastric ulcer, and an increased risk of gastric cancer [32, 33]. Our results suggest a systemic impact of chronic H. pylori infection beyond the known local inflammatory effect on the gastric mucosa, confirming an observation made previously in a cross-sectional population-based study [34].

C. trachomatis causes genital and ocular infections. The ocular manifestation of the infection, trachoma, is the world’s leading cause of preventable blindness and is endemic in many developing countries. This clinical presentation is however highly unlikely to contribute to the 25% seroprevalence of anti-chlamydia antibodies detected in the Swiss and UK cohorts included in our study. More relevant here, C. trachomatis is the etiological agent of human chlamydia urogenital tract infection, which is the most common bacterial sexually transmitted disease. Chronic or recurrent forms of the disease are frequently observed. To our knowledge, no study has examined the direct association between C. trachomatis infection and hs-CRP levels at the population level. However, studies conducted in the context of associations between C. trachomatis and tubal factor-related subfertility and preterm delivery have also shown elevated hs-CRP levels [30, 31, 35]. Altogether, these results confirm the role of chronic or recurrent bacterial infections in low-grade inflammation, reflected by a small but consistent increase in hs-CRP levels in seropositive individuals. In addition, we found an association between increased pathogen burden and hs-CRP levels by stratifying individuals according to their cumulative number of positive serological results. This indicates that latent infections might play an enhancing role in chronic low-grade inflammation, even if that effect is too small to be detected at the individual pathogen level.

Previous studies have shown that pro-inflammatory cytokines and chronic inflammation are associated with cellular aging (“inflammaging”) and a number of non-communicable diseases, including certain cancers, type 2 diabetes, and cardiovascular disease [3, 4, 36, 37]. It would therefore not be surprising to find that infections also play a key role in these diseases and that the reactivation of these pathogens can contribute to the deterioration of the overall health of older individuals. Finally, CRP-PRS was also found to be significantly associated in the analysis including both genetics and serological results, confirming that human genetic variation plays a modulating role in systemic inflammation.

Our study has some limitations. First, we cannot rule out the effects of other non-measured infections at the time of hs-CRP measurement that may have influenced the level of inflammatory biomarkers. Also, we did not adjust our models for all known influencing factors (e.g., smoking, anti-inflammatory or anti-infective drugs, or possible inflammatory diseases). However, participants in both studies were assumed to be in good overall health at the time of data collection, and the data were filtered before analysis to remove hs-CRP levels indicative of acute infection. Second, some pathogens had relatively low or high seroprevalences and should be reexamined in a larger study. In particular, it will be interesting to repeat the analysis once serological data for all individuals in the UK Biobank are available, which will provide greater statistical power. Third, hs-CRP was the only inflammatory biomarker studied. Other pro-inflammatory cytokines such as IL-1β, IL-6, and TNF-α are regulators of host responses to infection and positive mediators of inflammation. Consideration of these other biomarkers would give insight into more specific inflammatory pathways and provide a more comprehensive picture of the overall inflammatory status. Fourth, we only observed associations with the presence of chronic inflammation, and our study design does not allow us to infer any kind of causality. In particular, we cannot exclude the possibility that higher levels of inflammation are responsible for the reactivation of a pathogen, resulting in detectable seropositivity [38, 39].

In conclusion, we found that seropositivity for C. trachomatis and H. pylori antigens is associated with increased levels of hs-CRP. Together with demographic, clinical, and genetic factors, persistent infections contribute to chronic low-grade inflammation, which can have deleterious long-term consequences on health.