Introduction

Life expectancy in low- and middle-income countries (LMICs) is increasing. This is attributed to reductions in early childhood mortality and higher life expectancy in older age [1, 2]. By 2050, the global population of people aged 65 years and older is expected to triple to 1.5 billion, with most of the increase occurring in LMICs [2]. Africa, where LMICs predominate, is undergoing a rapid epidemiological transition towards non-communicable diseases (NCDs) against a backdrop of infectious diseases, environmental stressors and weak health systems. These contextual factors could contribute to a low-grade chronic inflammatory state and “inflammaging”, resulting in higher incidence of NCDs and mortality at relatively younger ages [3, 4].

DNA methylation (DNAm), an epigenetic modulator of gene expression, can be modified by environmental exposures and is recognized as a molecular hallmark of ageing [5, 6]. Hence, DNAm captures acute and chronic exposures, as well as disease risks [7, 8]. Indeed, age-dependent DNAm patterns have been used to derive different types of biomarkers of chronological and biological age, termed epigenetic clocks [9]. Two groups of epigenetic clock biomarkers have proven their utility in NCD risk prediction: the first-generation leukocyte-based HannumAge [10] and pan-tissue HorvathAge [9], both trained exclusively on chronological age, and the second-generation leukocyte-based PhenoAge and GrimAge, both trained on a combined measure of chronological age and several subclinical NCD-related phenotypes and risk factors [11, 12].

The standardized measure of epigenetic age acceleration (EAA) has been linked with several NCD-related lifestyle risk factors in a clock-specific manner [13,14,15], highlighting the ability of each clock to capture distinct aspects of aging. However, the socioeconomic and lifestyle determinants of the second-generation clocks have been less well characterized than those of the first-generation clocks [14]. In contrast, EAA was associated with clinical outcomes (including cardiovascular disease, diabetes, cancer and mortality) in a more consistent manner, with varying magnitude and precision across clocks [14].

Most epigenetic clock studies evidencing the validity of DNAm-based biomarkers as predictors of NCD risk were performed in high-income countries. There is only sparse evidence from studies in LMICs, and specifically from Africa. An exploratory study of 256 Central African hunter-gatherers reported a positive correlation between chronological age and HorvathAge, with lower extrinsic EAA (EEAA) in urban dwellers compared to their rural counterparts [16]. Recent investigations of different epigenetic clock biomarkers in 120 black South African men (prospective urban and rural epidemiology cohort [PURE-SA-NW]) and 712 Ghanaians (research on obesity and diabetes among migrants [RODAM]) also reported positive correlations across clocks, with GrimAge exhibiting the highest correlation with chronological age in both studies [17, 18]. Validating epigenetic clocks and understanding their applicability as risk markers in the African context therefore requires more evidence from diverse age groups, socioeconomic strata and settings. Specifically, differences by sex, socioeconomic status and urbanization are relevant aspects of EAA but remain understudied. Furthermore, a parallel investigation of EAA with cardio-metabolic outcomes would evidence their potential utility in linking environmental (and lifestyle) exposures to cardio-metabolic disease in transitioning societies.

In this study, we aimed to test the validity of different DNAm-based epigenetic clock biomarkers, and their cardio-metabolic risk profiles, in adults from urban and rural settings of south-central Côte d’Ivoire. Figure 1 presents the study’s overarching framework. First, we extend principles of causality to investigate the cross-sectional associations of four blocks of risk factors (i.e., biological, socioeconomic, lifestyle and body mass index [BMI]) with EAA, where the preceding block(s) is a potential confounder of the succeeding block(s), and the latter block(s), a potential mediator of the former. Second, we investigate the association of each EAA measure with metabolic syndrome (MetS) and its severity, for improved population-level understanding of EAA as cardiovascular risk markers in this setting.

Fig. 1

Analytical framework for the present study. DNAm, DNA methylation. Numbers represent sequence of hierarchical regression models. Models 1-4 were multivariable linear regressions of age acceleration on the groups of risk factors. Model 5 was multivariable binomial and ordinal logistic regressions of metabolic syndrome and severity on age acceleration and all risk factors. Model 5 was tested with and without inclusion of body mass index as a potential confounder. Metabolic syndrome is a binary variable where the presence of any three of high blood pressure, impaired fasting glucose, low high-density lipoprotein, high triglycerides, and central obesity was considered positive, and otherwise, negative. Metabolic syndrome severity is an ordinal variable with a range of 0–5 capturing the number of metabolic syndrome components in each participant

Results

Descriptive statistics

Details of the study population, setting and sampling are presented in the “Study sample” and “Measurement of DNAm” sections. We included 393 participants aged 18–79 years of the Côte d’Ivoire dual burden of disease (CoDuBu) study who had complete data on epigenetic clocks and relevant covariates. Participants comprised 48% females, 55% rural dwellers and 52% with at least primary-level education. Most of the participants were lifetime non-smokers (85%), low-risk alcohol drinkers (65%) and attained recommended levels of physical activity (98%). Mean (standard deviation [SD]) BMI was 23.3 (4.0) kg/m², and a majority of the sample (65%) was of normal weight. Prevalence of underweight, overweight and general obesity was 7%, 18% and 8%, respectively (Table 1).

Table 1 Phenotypic profile of included participants of the CoDuBu study in south-central Côte d’Ivoire

We identified components of MetS in the study sample: 25% had central obesity (based on waist circumference), 31% had high blood pressure, 56% had high fasting glucose, and 50% and 8% had low high-density lipoprotein (HDL) and high triglycerides, respectively. One hundred and thirty-two participants (34%) had one MetS component, whereas 87 participants (22%) had three or more MetS components, and were defined as cases of metabolic syndrome (Table 1).
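The case definition used here (three or more of the five components) and the 0–5 severity score can be illustrated with a minimal sketch. This is not the study's code: the function and field names are hypothetical, and each component is assumed to have been dichotomised beforehand using the study's own cut-offs, which are not reproduced here.

```python
# Hypothetical component names; each flag is assumed to be already dichotomised
# (True = component present) using the study's own clinical cut-offs.
COMPONENTS = (
    "central_obesity",
    "high_blood_pressure",
    "high_fasting_glucose",
    "low_hdl",
    "high_triglycerides",
)


def mets_severity(flags):
    """Return the number of MetS components present (ordinal score, 0-5)."""
    return sum(bool(flags[name]) for name in COMPONENTS)


def has_mets(flags):
    """Binary MetS: positive when at least three components are present."""
    return mets_severity(flags) >= 3


participant = {
    "central_obesity": False,
    "high_blood_pressure": True,
    "high_fasting_glucose": True,
    "low_hdl": True,
    "high_triglycerides": False,
}
print(mets_severity(participant), has_mets(participant))
```

Keeping the severity count and the binary case definition as two functions of the same flags mirrors how the two outcomes are later modelled (ordinal and binomial logistic regression, respectively).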

Accuracy of epigenetic age as biomarkers of chronological age

Table 2 presents the overall performance of the four epigenetic clock biomarkers in deriving epigenetic age estimates in the present study. Mean (SD) chronological age was 41.5 (13.0) years. Pearson’s pairwise correlations between chronological age and epigenetic age were high (0.83 ≤ Pearson’s r ≤ 0.93) for all four epigenetic clocks. However, Lin’s concordance between epigenetic and chronological age was comparatively weaker (0.73 ≤ Lin’s rC ≤ 0.85). Prediction accuracy of HannumAge was higher than that of HorvathAge, with mean (SD) epigenetic age difference [EAD] (i.e., epigenetic age − chronological age) of −1.9 (6.6) and 5.8 (7.2) years, and Lin’s coefficients of bias (CB) of 0.99 and 0.89, respectively. Prediction accuracy of both first-generation clocks was also higher than that of the second-generation clocks (Table 2). HannumAge and HorvathAge predictions were optimal at chronological ages of 32 and 62 years, respectively. GrimAge was optimal at 72 years, whereas the optimal age for PhenoAge fell below the lower age limit of our sample (Fig. 2). The differences between epigenetic clocks remained consistent among apparently healthy participants, defined as non-smoking participants without any of the five MetS features (Table 2).
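To make the distinction between correlation and concordance concrete, the sketch below computes Pearson's r, Lin's concordance coefficient (rC) and the bias correction factor (CB = rC / r) for paired ages. It is an illustration only, assuming Lin's original definitions with 1/n variances (statistical packages may use n − 1); the function name and the toy ages are hypothetical and are not study data.

```python
from math import sqrt


def concordance(chronological, epigenetic):
    """Return (Pearson r, Lin's rC, bias correction factor CB) for paired ages."""
    n = len(chronological)
    mean_x = sum(chronological) / n
    mean_y = sum(epigenetic) / n
    # Population (1/n) variances and covariance, as in Lin's original definition.
    var_x = sum((x - mean_x) ** 2 for x in chronological) / n
    var_y = sum((y - mean_y) ** 2 for y in epigenetic) / n
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(chronological, epigenetic)
    ) / n
    r = cov_xy / sqrt(var_x * var_y)
    # rC penalises both scatter and any shift away from the identity line.
    rc = 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
    return r, rc, rc / r


# Toy example: a clock that over-predicts every age by exactly five years.
ages = [25, 35, 45, 55, 65]
predicted = [a + 5 for a in ages]
r, rc, cb = concordance(ages, predicted)
print(round(r, 3), round(rc, 3), round(cb, 3))
```

In this toy case the correlation is perfect while the concordance is lower, because a constant offset leaves r unchanged but moves the points off the identity line; this is the pattern behind a high r paired with a lower rC and a CB below 1.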

Table 2 Performance of epigenetic clocks as biomarkers of chronological age in the CoDuBu study, south-central Côte d’Ivoire
Fig. 2

Linear fit of epigenetic age difference and chronological age in the CoDuBu study, south-central Côte d’Ivoire. CoDuBu, Côte d’Ivoire dual burden of disease study. Chronological age was measured at health examination, and epigenetic age measures were estimated from blood samples taken at health examination. Clock-specific epigenetic age difference (Y) is the absolute difference between clock-specific DNAm age and chronological age (X). R² is the squared value of the correlation coefficient

Between-epigenetic age correlations were high (0.82 ≤ Pearson’s r ≤ 0.90), but the between-EAA correlations were considerably lower (0.24 ≤ Pearson’s r ≤ 0.66). While EAD had weak to moderate negative correlations with chronological age, EAA did not correlate with chronological age (Table 2).

Associations between sociodemographic and lifestyle factors, and EAA

Multiple linear regressions of EAA on chronological age and sex (model 1) showed a consistent increase in IEAA, EEAA and GrimAA in males compared to females, independently of chronological age. Mean IEAA, EEAA and GrimAA increased in males by 1.6 (0.5, 2.7) years, 2.9 (1.5, 4.4) years and 2.7 (2.0, 3.4) years, respectively, compared to females. PhenoAA was not associated with sex, but was directionally consistent with the other clocks (Table 3).

Table 3 Sociodemographic and lifestyle determinants of epigenetic age acceleration in the CoDuBu study, south-central Côte d’Ivoire

We evaluated the association of socioeconomic factors with EAA by mutual adjustment (of model 1) for educational attainment, household wealth and urbanization (model 2). The highest household wealth tertile was independently associated with a decrease of 1.9 (−0.1, 3.9) years and 2.2 (0.2, 4.2) years in mean EEAA and PhenoAA, respectively. Associations of wealth with IEAA and GrimAA were directionally consistent, but imprecise (Table 3).

We evaluated the association of lifestyle-related factors with EAA by mutual adjustment (of model 2) for smoking, alcohol consumption and physical activity (model 3). Smoking was associated with an increase of 3.5 (2.4, 4.5) years in mean GrimAA. High-risk alcohol consumption was associated with an increase of 0.8 (0.0, 1.5) years and 1.5 (0.0, 3.1) years in mean GrimAA and PhenoAA, respectively. The association of the lowest tertile of physical activity was directionally consistent across EAA measures, with the strongest positive association observed with IEAA. The linear term of physical activity was also negatively associated with mean IEAA. We observed a positive quadratic relationship between physical activity and GrimAA (Fig. 3), indicating an increase in mean GrimAA at the lowest and highest tertiles of physical activity (Additional file 1: Table S1). These associations were independent of sociodemographic and other lifestyle covariates.

Fig. 3

Marginal associations of physical activity, BMI and GrimAge acceleration in the CoDuBu study, south-central Côte d’Ivoire. BMI, body mass index; CoDuBu, Côte d’Ivoire dual burden of disease study. Point estimates represent marginal change in mean GrimAge acceleration in relation to levels of physical activity (left) and BMI (right). Estimates are derived from multivariable linear regression models adjusted for chronological age, sex, education, household wealth index, urbanization, smoking status, alcohol consumption (for physical activity model) and physical activity (for BMI model)

Effect estimates of overweight or obesity compared to normal weight were directionally consistent (positive associations) across EAA measures. Linear terms of BMI were directionally inconsistent across EAA measures. Similar to physical activity, we observed a quadratic relationship between BMI and GrimAA (Fig. 3), indicating an increase in mean GrimAA at low (underweight) and high (obesity) BMI levels (Additional file 1: Table S1). These associations were independent of sociodemographic and lifestyle covariates.

Associations of risk factors with PhenoAA and GrimAA were robust to adjustment for leukocyte proportions (Additional file 1: Table S2). Simple and partial Spearman correlations of the risk factors with age-adjusted GrimAge components identified the plasminogen activator inhibitor-1 (DNAmPAI-I) and smoking pack-years (DNAmPKYRS) surrogates as the main drivers of these associations (Additional file 1: Table S3). We also observed positive partial correlations of age-adjusted DNAmPAI-I and DNAmPKYRS with male sex and high-risk drinking. Smoking (DNAmPKYRS) and BMI (DNAmPAI-I) were also associated with these plasma proteins (Additional file 1: Table S4).

Associations between sociodemographic and lifestyle factors, and EAA: effect modification

We tested potential effect modifications by age group (cut-off at median value of 41 years), sex and urbanization (Additional file 1: Tables S5–S7). Education was associated with a decrease in IEAA and GrimAA in females and in older participants (> 41 years), respectively. Household wealth was associated with a decrease in IEAA in younger participants (≤ 41 years), and a decrease in EEAA and GrimAA in the urban area. High-risk alcohol intake also showed consistent acceleration of all four clocks in younger participants. Smoking was associated with increased EEAA, PhenoAA and GrimAA in males. Lower physical activity was associated with increased EEAA, PhenoAA and GrimAA in rural areas. Overweight or obesity was associated with increased EEAA and PhenoAA in females, and increased PhenoAA and GrimAA in rural areas. Although interaction terms were often imprecise, we observed consistent negative associations of household wealth, and consistent positive associations of alcohol intake, with all EAA measures in younger participants.

Associations between EAA and cardio-metabolic phenotypes

EEAA and PhenoAA were associated with metabolic syndrome severity, with respective odds ratios (ORs) and 95% confidence intervals (CIs) of 1.23 (1.02, 1.49) and 1.26 (1.03, 1.55) per 1-SD increase in EAA. These associations were independent of the demographic, socioeconomic and lifestyle variables described above. Associations with MetS, though imprecise, were directionally consistent with and of similar magnitude to those with MetS severity. GrimAA had directionally consistent associations of lower magnitude with both MetS phenotypes, whereas associations with IEAA tended to be negative. EEAA and PhenoAA were also associated with high blood pressure (OR for EEAA 1.28 (1.00, 1.64); OR for PhenoAA 1.22 (0.96, 1.56)) and fasting glucose (OR for EEAA 1.19 (0.96, 1.47); OR for PhenoAA 1.19 (0.96, 1.47)). These associations were robust to adjustment for BMI. The PhenoAA-MetS severity association was also robust to adjustment for leukocyte proportions. PhenoAA was associated with high triglycerides (OR 1.61 (1.02, 2.64)) in the model that accounted for leukocyte composition (Table 4).

Table 4 Association between epigenetic age acceleration and metabolic syndrome components in the CoDuBu study, south-central Côte d’Ivoire

Although GrimAA did not associate with MetS, the DNAmPAI-I and DNAmLeptin components were associated with both MetS (OR for DNAmPAI-I 1.78 (1.33, 3.39); OR for DNAmLeptin 1.27 (0.97, 1.26)) and MetS severity (OR for DNAmPAI-I 1.58 (1.30, 1.91); OR for DNAmLeptin 1.21 (1.00, 1.46)) per 1-SD increase in DNAmPAI-I and DNAmLeptin, respectively. Among the single MetS components, DNAmPAI-I and DNAmLeptin were associated with high blood pressure, triglycerides and central obesity, whereas only DNAmPAI-I was associated with high fasting glucose. Associations of DNAmPAI-I were robust to adjustment for BMI, whereas associations of DNAmLeptin were weakened by BMI (Additional file 1: Table S8).

Associations between EAA and cardio-metabolic phenotypes: effect modification

Tests of effect modifications by age group, sex and urbanization yielded largely overlapping subgroup estimates, and imprecise interaction terms (Additional file 1: Table S9). However, subgroup estimates of association between EAA and MetS phenotypes tended to be directionally consistent within chronological clocks and biological clocks. On the one hand, we observed higher odds of MetS in relation to IEAA and EEAA in older, male or urban-dwelling participants. On the other hand, we observed higher odds of MetS in relation to PhenoAA and GrimAA in younger, female or urban-dwelling participants.

Discussion

Summary of findings

In this cross-sectional study of adults from primarily rural and some urban areas of south-central Côte d’Ivoire, DNAm clocks exhibited high correlations with chronological age, but weaker prediction accuracy. HannumAge was more accurate than HorvathAge, and both first-generation clocks performed better than the second-generation mortality clocks. Independent determinants of acceleration in one or more clocks included male sex (IEAA, EEAA and GrimAA), lower socioeconomic status (EEAA and PhenoAA), lower physical activity (IEAA), high-risk alcohol consumption (PhenoAA and GrimAA) and smoking (GrimAA). Interestingly, GrimAge was accelerated at both extremes of physical activity and BMI. EEAA and PhenoAA were positively associated with high blood pressure, fasting glucose and MetS. Consistent with EEAA and PhenoAA findings, the DNAmPAI-1 component of GrimAge was associated with both lifestyle risks and cardio-metabolic phenotypes. Together, these findings highlight potential “risk factor-EAA-clinical phenotype” paths for future longitudinal mediation studies.

Contextual relevance

Our finding of high correlations of these epigenetic clocks, including the potential uniqueness of each EAA measure (evidenced by weak inter-EAA correlations), in this setting of Côte d’Ivoire supports previous findings in high-income countries [11, 15, 19, 20]. Consistent with the findings in the RODAM and PURE-SA-NW studies, HannumAge and GrimAge had stronger correlations with chronological age than HorvathAge and PhenoAge, respectively [17, 18]. The magnitude of the correlations of chronological age with HorvathAge and HannumAge in our study was comparable to that in the RODAM study, but higher than that in the PURE-SA-NW study. The age and sex distribution of our sample was more similar to the RODAM study than the PURE-SA-NW study, which included only older males. Furthermore, under-prediction at higher chronological age was more pronounced in the PURE-SA-NW study than in our study, confirming previous findings of weaker performance of these clocks in older individuals [21]. The general lack of associations with the Horvath clock also agrees with previous studies reporting the fewest phenotypic associations with IEAA relative to the extrinsic EAAs [14]. This could be explained by IEAA’s exclusively age-based training model and its adjustment for inter-individual variability in leukocyte composition, which make it largely independent of extrinsic factors. In contrast, the acceleration of second-generation clocks is dependent on leukocyte composition and captures inter-individual differences in background inflammatory state and disease risk [6, 9].

Our findings of sex-differences in epigenetic age acceleration provide the first evidence from Africa in this regard, and validate previous findings from different contexts [14]. Male sex is the most consistently reported risk factor for EAA. Exploring EAA-sex interactions on MetS, however, revealed that females had higher MetS risk, despite higher EAA in males. Paradoxically, we observed a tendency for higher MetS risk in relation to chronological age in males and higher MetS risk in relation to epigenetic age acceleration in females. DNAm age acceleration markers might therefore provide a promising avenue to understand the unbiased sex-specific risk of disease and the role of healthcare access and poverty-related factors in understanding the higher rates of morbidity among females in this setting [22].

Socioeconomic status remains a relevant determinant of longevity [23]. Studies of EAA and dimensions of socioeconomic status (including education) have shown protective associations with extrinsic measures of EAA [24]. These align with our findings of a consistent negative association of household wealth with acceleration of all four clocks. It might be expected that urbanization in this setting captures most aspects of household wealth and education, so that a parallel consideration of these three dimensions would constitute over-adjustment. This is, however, not the case in our study, as these factors were only moderately correlated, allowing us to investigate independent associations of each dimension in relation to EAA (Additional file 1: Table S3). Interestingly, household wealth was generally more protective of EAA in the urban area and education was protective of GrimAA in older participants. These might reflect the healthcare access, affordability and use by individuals of higher socioeconomic status, contributing to their slower rate of aging. Detailed investigation into several dimensions of socioeconomic status (jointly and independently) would improve the characterization of the clocks in relation to socioeconomic status in resource-poor settings.

Evidence relating alcohol and smoking to EAA has been mixed [14]. Although alcohol frequently demonstrates a nonlinear relationship with mortality [25], directionally consistent pooled evidence of increased EAA with alcohol consumption was reported across all clocks [14]. Specifically, alcohol-use disorder, as measured in our study, was consistently associated with acceleration of mortality clocks [12]. Smoking impacts DNA methylation and has a more consistent association with mortality than alcohol consumption. Smoking was associated more frequently with the mortality clocks than with the chronological clocks [11, 19, 24]. Smoking status was correlated with DNAm surrogates of pack years, adrenomedullin and tissue inhibitor metalloproteinase-1 (TIMP-1), whereas high-risk drinking correlated with DNAm surrogates of beta-2 microglobulin (B2M), growth differentiation factor-15 (GDF-15), pack years and PAI-I, which clarifies their consistent positive association with GrimAA in the current and previous studies [13, 19]. However, the lack of multivariable associations of smoking and alcohol with the mortality clocks in the RODAM study [17] highlights the potential role of confounders on the age-related DNA methylome, and the need to investigate clock components in clarifying the presence or lack of associations in future studies.

Our observation of a borderline association between physical activity and the Horvath clock agrees with previous findings [14]. In addition, the acceleration of GrimAge at the extremes of physical activity levels supports the physical activity paradox in mortality. This is especially of interest in resource-poor settings with a predominance of non-leisure manual labour. Excessive physical activity, especially in the occupational context, might increase the risk of mortality [26, 27]. Repetitiveness of non-leisure activity and insufficient recovery time potentially promote chronic exhaustion, elevated haemodynamic parameters and therefore cardiovascular diseases [27,28,29]. We have not distinguished occupational from leisure physical activity in this study. Therefore, investigation of various physical activity domains in future studies will contribute to the evidence in this setting regarding the beneficial effect of leisure-time activity in a context of excessive occupational physical activity.

Our observation of a non-linear association of BMI with GrimAA reflects the complexity of the BMI measure and the need to consider polynomial terms in BMI associations. The linear BMI term seemed to support the recently reported negative association between BMI and EAA [17]. However, further investigation showed accelerated GrimAge with both underweight and obesity. Interestingly, the association with underweight had higher magnitude and precision than that with obesity. BMI in this setting captures not only NCD-related lifestyle risks, but also undernutrition secondary to access, infections and psychosomatic stress [14]. Given that BMI might not entirely capture adiposity, future studies incorporating body fat measures are warranted to deepen the understanding of these findings in the African context.

Our findings of EEAA associations with metabolic syndrome severity agree with previous findings [13, 30]. The association of PhenoAA with MetS is unsurprising given that glucose and C-reactive protein are part of the PhenoAge training model [11]. Although GrimAA was associated with several risk factors, it only tended to associate with MetS in younger participants, confirming a previous finding in Korean adults [31]. Furthermore, PAI-I was the major driver of GrimAge associations with multiple risk factors and MetS. These findings align with previous reports of PAI-I as a major predictor of several NCDs and mortality, further highlighting the relevance of parallel consideration of GrimAge components in future studies [12, 31, 32]. PAI-I may therefore be a more reliable marker of metabolic dysregulation (resulting from oxidative stress and cell damage) and aging, and needs further exploration for its public health and clinical utility in this setting [33, 34].

Strengths and limitations

To our knowledge, this is the first study that investigated age-related DNAm as a link between sociodemographic and lifestyle risk factors and MetS severity in an African setting. We applied a comprehensive approach in estimating the effect of each risk factor, with minimal bias, in a cross-sectional context. The inclusion of a broad range of age groups, several dimensions of socioeconomic status (education, wealth and urbanization), polynomial and categorical lifestyle associations, and effect modification by age, sex and urbanization allowed a broad view of the patterns of association in this setting. Further investigations into GrimAge components, both with risk factors and MetS, contributed to the comprehensiveness of our approach. Our findings of higher MetS risk (in younger participants) in relation to morbidity and mortality clocks agree with recent hypotheses of younger age of onset of NCDs and related mortality in settings of epidemiological transition [3, 4, 35]. The combination of DNA methylome, NCDs and related external exposome in the CoDuBu study provides a unique resource for further investigations into much-needed evidence from Africa linking DNAm and the burden of NCDs.

Our study is limited by its cross-sectional design, which hinders causal inference of the results. The cross-sectional design also limits the exploration and interpretation of directionality. Although the investigated risk factors precede biological ageing and disease, it remains unclear if biological ageing precedes or succeeds disease development [6, 30], highlighting the relevance of longitudinal data in deepening the understanding of the causal paths linking biological ageing and disease. However, our findings are still relevant given the general lack of evidence on age acceleration from Africa in particular, and the limited investigation of the mortality clocks in terms of their socioeconomic and lifestyle determinants. Our sample size might be considered small relative to previous studies. However, this sample was originally selected to investigate the DNA methylome profile of asymptomatic malaria parasitaemia, and includes all asymptomatic malaria cases and randomly selected malaria negative participants in a ratio of 1:3 [36]. Apart from oversampling of malaria cases, the remaining characteristics were comparable between this sample and the original CoDuBu cohort (Additional file 1: Table S10). In addition, the correlation matrices and primary EAA and MetS models were stable to the exclusion of the 93 malaria cases in the sample (Additional file 1: Tables S11–S13).

Conclusions

We have demonstrated parameters of the DNAm-based epigenetic clocks to be valid and useful chronological or biological age estimators in this African setting of epidemiologic transition. As more DNAm data from Africa become available and more accessible, future discovery studies on DNAm epigenetic clocks incorporating African DNAm data are warranted. Our results highlight the utility of DNAm patterns, as defined by different epigenetic clocks, as analytical tools to investigate known socio-demographic and lifestyle risk factors, and cardio-metabolic health in the specific settings of LMICs. A broader understanding of EAA in LMICs would involve the incorporation of other aspects of the external exposome, including environmental exposures to pollutants, stress, sleep quality and psychological factors, in explanatory EAA models to specifically identify public health and clinical translational utility of EEAA, PhenoAA and PAI-I in resource-poor settings.

Materials and methods

Study sample

This investigation was done within the CoDuBu study, which aimed to understand the co-occurrence of common infections and NCDs in the Taabo health and demographic surveillance system (HDSS) in south-central Côte d'Ivoire [37]. The detailed CoDuBu protocol is published elsewhere [38]. Concisely, the study began in 2017 and included 1019 adults selected at random from three purposively selected HDSS areas of varying urbanization. Participants underwent detailed health examinations including anthropometry (body height, weight and waist circumference) and subclinical cardio-metabolic phenotyping, including blood pressure, fasting glucose level and blood lipid profile. Dried blood spots were prepared on Whatman FTA cards, air-dried at room temperature, sealed in desiccant-containing pouches according to the manufacturer’s instructions, and stored at −80 °C in a biobank. Participants underwent interviews assessing their sociodemographic, behavioural, lifestyle and environmental characteristics, as well as health status and healthcare use, among others. The CoDuBu study was approved by the Côte d’Ivoire National Ethics Committee for Life and Health Sciences (ref. no. 032/IMSHP/CNER-kp; date of approval: March 24, 2017) and the Ethics Committee of North-West and Central Switzerland (ref. no. 2016-00143; date of approval: May 2, 2016). All participants provided written informed consent before participation in the study [38].

Measurement of DNAm

DNA was extracted from ~280 mm² (10 × 6 mm punch) of blood spot per subject for 400 FTA cards using a standard salting-out method [39]. The EZ 96-DNA methylation kit (Zymo Research; Irvine, CA, USA) was used for bisulfite conversion of DNA. Epigenome-wide DNA methylation was measured using the Illumina Infinium MethylationEPIC BeadChip (Illumina, Inc.; San Diego, CA, USA), which covers 866,091 probes. Samples were randomly distributed on arrays to minimize batch effects. Each batch had multiple identical control samples to assess assay variability. Dye-bias correction [40] was applied, and absolute methylation levels (β values, defined as the ratio of methylation intensity over total intensity, with an offset of 100) were computed using the minfi R-package (R Development Core Team) [41]. Quality control excluded 24,138 probes (detection p value > 10⁻¹⁶) and seven samples (call rate < 95% [n = 5] and sex mismatch [n = 2]). We applied beta mixture quantile normalization of the β values to correct for the Illumina probe design bias [42]. We included 841,953 CpG sites from 393 participants in subsequent determination of epigenetic age.
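As an illustration of the β value definition, the following minimal sketch computes the ratio for a single probe. It assumes that total intensity is the sum of the methylated and unmethylated signals, the usual convention for Illumina arrays; the function name and the example intensities are hypothetical, and the sketch is not the minfi implementation used in the study.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Methylation level of one probe: methylated signal over total signal.

    Assumption: total intensity = methylated + unmethylated. The offset keeps
    the ratio stable when both signals are close to zero.
    """
    return methylated / (methylated + unmethylated + offset)


# Hypothetical intensities for a largely methylated CpG site.
print(beta_value(4000.0, 1000.0))
```

The offset matters mostly for low-intensity probes: with no signal at all the ratio is 0 rather than undefined, and for bright probes its influence on the result is small.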

Estimation of epigenetic age

We estimated the parameters of participants’ epigenetic age (and corresponding measures of age acceleration) using the new DNAm age calculator [9], with the participants’ DNAm beta values and annotation files as input files. HannumAge, HorvathAge, PhenoAge and GrimAge were estimated using the 71, 353, 513 and 1030 CpGs reported in their discovery studies, respectively [9,10,11,12]. Leukocyte proportions (B cells, CD4+ T, CD8+ T, granulocytes, monocytes and natural killer cells) were estimated using Houseman’s method implemented in the online calculator [41, 43].

EAA was calculated for each clock as the residuals of a linear regression of epigenetic age on chronological age [9, 16]. Unlike absolute difference measures, these residuals are robust to measurement platforms and normalization methods, which improves comparability across studies [44]. Specifically, IEAA is derived from regressing HorvathAge on chronological age and leukocyte composition, and is therefore independent of leukocyte composition (hence the term “intrinsic”). EEAA is derived from regressing a weighted average of HannumAge and age-varying leukocyte components (naïve cytotoxic T cells, exhausted cytotoxic T cells and plasmablasts) on chronological age [45]. This weighted average measure (and therefore EEAA) captures both age-related changes in leukocyte composition and intrinsic aging. PhenoAA and GrimAA are derived from regressing PhenoAge and GrimAge on chronological age, respectively [11, 12]. Given that GrimAge comprises seven DNAm surrogates of plasma proteins (adrenomedullin, B2M, GDF-15, cystatin C, leptin, PAI-1 and TIMP-1) and a DNAm surrogate of smoking pack-years, we derived their chronological age-adjusted equivalents for further investigation of GrimAA. The epigenetic clocks of our sample were based on the EPIC array, which does not cover six CpGs of HannumAge and 19 CpGs of HorvathAge. Nevertheless, recent comparative studies showed that EPIC-based clocks perform comparably to those based on their discovery arrays [44, 46].
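The residual-based definition of EAA can be sketched in a few lines of Python. This is an illustrative example under simplifying assumptions, not the online calculator's code: it covers only the one-predictor case (epigenetic age regressed on chronological age), whereas IEAA additionally adjusts for leukocyte composition. The function name and the toy ages are hypothetical.

```python
def age_acceleration(epigenetic_age, chronological_age):
    """Residuals of an ordinary least-squares fit of epigenetic age
    on chronological age (one value per participant)."""
    n = len(chronological_age)
    mean_x = sum(chronological_age) / n
    mean_y = sum(epigenetic_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological_age)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_age, epigenetic_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Positive residual: epigenetically older than expected for that age.
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, epigenetic_age)]


# Toy values only (not study data).
chrono = [30, 40, 50, 60]
epi = [33, 41, 55, 59]
eaa = age_acceleration(epi, chrono)
```

By construction, these residuals have a mean of zero and are uncorrelated with chronological age in the sample used to fit the regression.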

Measurement of risk factors

We used a questionnaire to determine participants’ chronological age (years), sex (male or female), formal educational attainment (none, primary, secondary or tertiary), and residence (Amani-Ménou, Taabo-Cité or Tokohiri). Taabo-Cité is an urban area relative to Amani-Ménou and Tokohiri, which are more rural. A wealth (asset) index was estimated for each participant, in the context of the HDSS, by applying principal component analysis to the property, possessions and housing characteristics of the participants’ households [37]. The wealth index is a reliable and stable proxy for consumption and, therefore, for economic status in general terms [47, 48].

Lifestyle factors included smoking status, alcohol consumption and physical activity. Smoking status was categorized as never smoker (lifetime of nonsmoking), former smoker (smoked in the past but quit) and current smoker (presently smoking tobacco products). Alcohol consumption was determined using the AUDIT-C (Alcohol Use Disorder Identification Test-Consumption) questionnaire, which scores alcohol consumption and frequency (0–12) to identify risky consumption. Participants were classified as low risk (AUDIT-C score 0–3 for males and 0–2 for females), medium risk (4–9 for males and 3–9 for females), and high risk (10–12) [49]. Physical activity was measured as the self-reported number of minutes per week spent in bouts of at least 10 min of moderate and vigorous activity, covering transport, leisure and occupational activities [50]. Body weight (kg), height (cm) and waist circumference (cm) were measured to the nearest 0.1 unit. BMI was calculated as body weight divided by height squared (kg·m⁻²), and participants were classified as underweight (< 18.5), normal weight (18.5–24.9), overweight (25–29.9) or obese (≥ 30).
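The AUDIT-C and BMI categories described above can be expressed as two small Python functions. This is a sketch for illustration only: the function names and category labels are hypothetical, and the BMI bands are implemented as half-open intervals (for example, 18.5 to below 25) so that values such as 24.95 are not left unclassified, which is an assumption on our part.

```python
def audit_c_risk(score, sex):
    """Classify an AUDIT-C score (0-12) as 'low', 'medium' or 'high' risk."""
    if not 0 <= score <= 12:
        raise ValueError("AUDIT-C score must be between 0 and 12")
    if score >= 10:
        return "high"
    low_max = 3 if sex == "male" else 2  # sex-specific low-risk ceiling
    return "low" if score <= low_max else "medium"


def bmi_category(weight_kg, height_cm):
    """Classify BMI (kg/m^2) from weight in kg and height in cm."""
    bmi = weight_kg / (height_cm / 100.0) ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"
```

For example, an AUDIT-C score of 3 is low risk for a male but medium risk for a female participant.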

Measurement and definition of metabolic syndrome phenotypes

Central obesity was defined as waist circumference ≥ 94 cm in males and ≥ 80 cm in females. Blood pressure was measured three times, on the left arm, in a sitting position, and the mean of the last two measurements was recorded. Raised blood pressure was defined as mean blood pressure ≥ 130/85 mmHg or use of blood pressure-lowering medication. Fasting glucose and lipid profile were measured using the point-of-care Alere AS100 system (and corresponding cartridges from the same production batch), which has shown good performance in tropical settings [36, 51, 52]. Impaired fasting glucose was defined as fasting glucose ≥ 5.6 mmol·L⁻¹ or use of glucose-lowering medication. Low HDL was defined as < 1.0 mmol·L⁻¹ in males and < 1.3 mmol·L⁻¹ in females, whereas high triglycerides were defined as ≥ 1.7 mmol·L⁻¹. We defined MetS severity as an additive score of the five components, where a score of zero or five indicates absence or presence of all five components, respectively. We also defined presence and absence of MetS as scoring ≥ 3 and < 3 on the MetS severity scale, respectively [53].
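The additive MetS severity score can be illustrated with a short Python sketch. All names are hypothetical. To keep the example compact, raised blood pressure (including medication use) is passed in as a precomputed flag, and glucose and lipid values are assumed to be in mmol/L.

```python
def mets_components(sex, waist_cm, raised_bp, glucose_mmol_l,
                    on_glucose_drugs, hdl_mmol_l, tg_mmol_l):
    """Return the five MetS component flags for one participant."""
    male = sex == "male"
    return {
        "central_obesity": waist_cm >= (94 if male else 80),
        "raised_blood_pressure": bool(raised_bp),
        "impaired_fasting_glucose": bool(glucose_mmol_l >= 5.6
                                         or on_glucose_drugs),
        "low_hdl": hdl_mmol_l < (1.0 if male else 1.3),
        "high_triglycerides": tg_mmol_l >= 1.7,
    }


def mets_severity(components):
    """Additive score: number of components present (0 to 5)."""
    return sum(components.values())


def has_mets(components):
    """MetS is present when at least three components are present."""
    return mets_severity(components) >= 3
```

A female participant with a waist circumference of 85 cm, raised blood pressure and triglycerides of 1.8 mmol/L, but normal glucose and HDL, would score 3 and be classified as having MetS.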

Statistical analysis

Descriptive statistics

Statistical analyses were performed in Stata version 16 (Stata Corporation; College Station, TX, USA). We examined the EAA parameters for extreme outliers, defined as observations lying more than three interquartile ranges below the first quartile or above the third quartile of each parameter. We summarized the characteristics of included participants using means and SDs for continuous variables, and frequencies for categorical variables.
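One way to read the outlier rule is as a pair of fences placed three interquartile ranges (IQR) outside the first and third quartiles. The Python sketch below implements that reading. The function name is hypothetical, and the quartile convention used here (Python's `statistics.quantiles` default) may differ slightly from the one used by Stata.

```python
from statistics import quantiles


def extreme_outliers(values, k=3.0):
    """Return values lying more than k IQRs below Q1 or above Q3."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Toy values only (not study data).
print(extreme_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With k = 3 the rule flags only observations far in the tails, which is more conservative than the common 1.5 × IQR convention.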

To evaluate the performance of the epigenetic clocks, we calculated the EAD (absolute difference between epigenetic and chronological age) and the Bland–Altman 95% limits of agreement [54]. We also estimated Lin’s concordance correlation between epigenetic and chronological age. Lin’s concordance correlation coefficient combines the tightness of the observations around the line of best fit with the nearness of the line of best fit to the identity line (where Y = X) of perfect concordance [55, 56]. We tested Pearson’s correlations between chronological age and (i) epigenetic age; (ii) EAD; and (iii) EAA. Finally, we tested the robustness of these performance indicators by restricting the analysis to apparently healthy participants (i.e., non-smokers with no MetS feature).
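The two agreement measures can be sketched in Python as follows. This is an illustrative example with hypothetical function names: the concordance coefficient uses the usual formulation with 1/n variances and covariance, and the limits of agreement are taken as the mean difference ± 1.96 sample SDs of the differences.

```python
from statistics import mean, stdev


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between x and y."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The (mx - my)^2 term penalises systematic shifts away from Y = X.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def bland_altman_limits(epigenetic_age, chronological_age):
    """Lower and upper 95% limits of agreement for the paired differences."""
    diffs = [e - c for e, c in zip(epigenetic_age, chronological_age)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias - half_width, bias + half_width
```

A clock that overestimates every participant's age by exactly five years would still have a Pearson correlation of 1 with chronological age, but its concordance coefficient would fall below 1 because the line of best fit is shifted away from the identity line.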

Associations between sociodemographic and lifestyle factors, and EAA

We performed hierarchical multivariable regressions to assess the association between sociodemographic and lifestyle factors and each of the four age acceleration parameters. Here, we estimated the minimally biased effect size for each block of covariates by excluding, as far as possible, the potential mediators of the variables in that block. As shown in Fig. 1, we estimated the effects of four cumulative blocks of covariates:

  i. Model 1: chronological age and sex;
  ii. Model 2: model 1 + educational level, household wealth index and urbanization (socioeconomic);
  iii. Model 3: model 2 + smoking, alcohol consumption and physical activity (lifestyle); and
  iv. Model 4: model 3 + BMI (secondary lifestyle).

We report the effect estimate of each covariate from the model in which its block is the highest in the hierarchy, i.e., the model in which the block is first introduced. For instance, the effect estimates of sex, wealth, alcohol and BMI are derived from models 1, 2, 3 and 4, respectively.
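The block structure and the reporting rule can be made concrete with a small Python sketch. The data structure and function names are hypothetical and serve only to show which covariates enter each cumulative model and from which model each estimate is reported.

```python
BLOCKS = [
    ("Model 1", ["chronological age", "sex"]),
    ("Model 2", ["educational level", "household wealth index",
                 "urbanization"]),
    ("Model 3", ["smoking", "alcohol consumption", "physical activity"]),
    ("Model 4", ["BMI"]),
]


def cumulative_covariates(blocks):
    """Covariates included in each model (each model adds one block)."""
    models, included = {}, []
    for name, block in blocks:
        included = included + block
        models[name] = list(included)
    return models


def reporting_model(blocks):
    """Model from which each covariate's effect estimate is reported:
    the model in which its block is first introduced."""
    return {cov: name for name, block in blocks for cov in block}


for cov, model in reporting_model(BLOCKS).items():
    print(f"{cov}: reported from {model}")
```

Reporting each estimate from the model where its block first enters means that it is adjusted for upstream covariates but not for later blocks, which may act as mediators.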

We primarily modelled covariates as binary variables: sex (male vs. female), educational level (secondary and tertiary vs. others), wealth index (richest third vs. lower two-thirds), urbanization (urban vs. rural), smoking (former and current smokers vs. never smokers), alcohol consumption (high risk vs. low risk), physical activity (lowest third vs. upper two-thirds) and BMI (≥ 25 vs. < 25 kg·m⁻²). We additionally tested polynomials of physical activity (model 3) and BMI (model 4) for potential non-linearity in relation to EAA. We performed collinearity tests for the fully adjusted EAA models and observed minimal evidence of multi-collinearity (variance inflation factor, VIF range 1.1–1.9).
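For reference, the variance inflation factor of covariate j is conventionally defined from R_j², the coefficient of determination obtained when covariate j is regressed on all other covariates in the model:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

A VIF of 1 indicates no linear dependence on the other covariates. Under this definition, a VIF of 1.9 corresponds to R_j² ≈ 0.47.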

We further performed subgroup analyses by age group (cut-off at the median value; > 41 vs. ≤ 41 years), sex and urbanization as the major potential modifiers, and tested between-group differences via multiplicative interaction terms between each risk factor and each modifier in the model. In sensitivity analyses, we tested the robustness of the PhenoAA and GrimAA effect estimates to adjustment for leukocyte proportions, and performed partial Spearman correlations of the covariates with each DNAm surrogate component of GrimAA. Results of regression analyses are presented as changes in mean EAA and their 95% CIs in relation to each risk factor.

Associations between EAA and cardio-metabolic phenotypes

We regressed each of the cardiometabolic phenotypes—MetS, single components and severity score—on the standardized values of each EAA. We used binomial logistic regression for MetS and each component, and ordinal logistic regression for MetS severity. In these models, we had two levels of main covariate adjustments. First, we adjusted for chronological age. Second, we additionally adjusted for sex, socioeconomic and lifestyle factors corresponding to model 3 in the EAA outcome model. In sensitivity analyses, we additionally adjusted for BMI (all EAA) and leukocyte proportions (PhenoAA and GrimAA). We also regressed MetS outcomes on each DNAm surrogate of plasma protein components of GrimAA independently of demographic, socioeconomic and lifestyle factors, to identify specific associations with GrimAA components. We also performed collinearity tests for the fully adjusted MetS models and observed minimal evidence for multi-collinearity (VIF range 1.1–1.8).

We also tested effect modification by age group, sex and urbanization using the primary MetS logistic models, and tested between-group differences via multiplicative interaction terms between the EAA and each modifier in the model. Results of regression analyses are presented as ORs of MetS (or components) and 95% CIs in relation to each EAA measure.
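To illustrate how a logistic regression coefficient for a standardized EAA translates into the reported ORs and 95% CIs, the following Python sketch standardizes a variable and converts a log-odds coefficient with its standard error into an odds ratio per SD. The function names are hypothetical, and the Wald-type interval (estimate ± 1.96 standard errors on the log-odds scale) is an assumption about how such intervals are typically computed, not a statement of the exact Stata routine used.

```python
import math
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def odds_ratio_ci(log_odds, std_error, z=1.96):
    """Odds ratio and Wald 95% CI from a logistic coefficient and its SE."""
    return (math.exp(log_odds),
            math.exp(log_odds - z * std_error),
            math.exp(log_odds + z * std_error))


# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(0.25, 0.10)
print(f"OR per SD: {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the EAA is standardized before modelling, the resulting OR is interpreted per one SD increase in age acceleration, which makes estimates comparable across clocks.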