Background

The genomic composition of Latin American (LA) populations is the result of a tri-ethnic genetic admixture that occurred during the colonization of the American continent. This process has been documented with historical data and initially corroborated via lineage-based tests, which target mitochondrial DNA and the non-recombining Y chromosome [1, 2]. Subsequently, with the discovery of an autosomal marker-based test to detect sequences with large differences in their allelic frequencies among continental populations (δ), called ancestry-informative markers (AIMs), it has been possible to infer the proportions of each of the three ancestral components in an admixed LA genome [3, 4]. Using AIMs, the ancestral genetic compositions of several LA populations, including the Colombian population, have been determined [3, 5]. The results show that the majority of LA populations were founded by a complex admixture process among three continental populations (Amerindian, African and European), although the proportions vary among – and even within – countries [6].

Knowledge of the ancestry of Latin American populations has different applications, ranging from forensic and anthropological uses to informing biomedical sciences, and can be especially useful in the identification of gene variants associated with both infectious and complex diseases [7, 8].

Using genetic admixture as a tool to identify gene variants involved in complex diseases requires two conditions: differences in allelic frequencies among the parental populations, and recent admixture among those parental populations. Genetic admixture influences the allelic frequencies in a population, and this contributes, in part, to explaining the differences observed in the epidemiology of certain diseases in admixed populations relative to their parental populations. These two conditions are fulfilled in the Colombian population because the diseases involved in cardio-metabolic disorders, such as obesity, type 2 diabetes mellitus (T2D), hypertension and dyslipidemia, have different prevalences in European, African and Amerindian populations [9] and because the demographic history of the genetic admixture is very recent [5]. Globally, the prevalence of overweight and obesity varies across regions. In adults, it is highest in the Americas (61.3 %) and lowest in Africa (30.8 %). The prevalence of diabetes is highest in the American/Caribbean region and lowest in Africa (12.9 % and 3.2 %, respectively), whereas the prevalence of high blood pressure in adults is highest in Africa (30 %) and lower in the Americas (18 %). Finally, the prevalence of elevated total cholesterol is highest in Europe (54 %) and lowest in Africa (22.6 %) [9, 10] (see Additional file 1: Table S1). Because of this heterogeneity, genetic admixture studies provide a unique opportunity to account for population-based differences and to understand the role of genetic factors that contribute to disease risks in admixed populations. However, no studies have evaluated the effect of ancestry on cardio-metabolic parameters in people under 18 in the Colombian population.
Finally, identifying the effect of genetic variants on the prevalence of such diseases requires controlling for variables that interfere with this association, such as socio-economic status, parental education, physical activity and diet. Based on the above information, we aimed to evaluate the association between individual and average estimates of genetic ancestry and the anthropometric, biochemical and clinical measurements used to assess cardiovascular risk factors in a Colombian population of admixed youth while adjusting for environmental factors.

Methods

Population of the study

A cross-sectional study of 853 unrelated, self-identified healthy volunteers of both sexes between 10 and 18 years old was performed. The population base for subject recruitment was the population examined in the following study: “Metabolic Syndrome in overweight youth: Identification of risk factors and evaluation of an intervention.” The participants were affiliated with a company that provides health services in the city of Medellín in northwest Colombia. The exclusion criteria were defined as follows: young people who used corticosteroids or thyroid hormones; those who were hypoglycemic; those diagnosed with diabetes, chronic renal failure, or innate metabolism-related genetic diseases; those with physical or mental disabilities; those who were highly competitive athletes; and those who were pregnant or nursing. This study complied with the Declaration of Helsinki ethical principles for medical research involving human subjects. The researchers complied in all cases with Resolution 8430 of 1993, from the Colombian Ministry of Social Protection.

General socioeconomic and health information

The participants and their guardians answered a questionnaire and provided general information, including sex, date of birth and socioeconomic status, according to the National Administrative Department of Statistics (Departamento Administrativo Nacional de Estadística, DANE), which has established six socioeconomic categories based on housing location, with stratum 1 being the lowest and 6 the highest [11]. For the purpose of this analysis, the data obtained for the socioeconomic strata were classified into three categories: low (1 and 2), middle (3 and 4) and high (5 and 6). Pubertal maturation was based on a self-evaluation performed by the subjects according to the Tanner Sexual Development Stages using pictures or schematic drawings [12, 13]. In addition, information on birth weight and the duration of breastfeeding was obtained.

Phenotypes

For this study, cardio-metabolic risk factors were defined as follows: waist circumference (WC) ≥ the 90th percentile for the individual’s age and sex (according to values reported in the III National Health and Nutrition Examination Survey (NHANES) for Mexican-American youth) [14]; triglycerides (TGs) ≥110 mg/dL; HDL cholesterol ≤40 mg/dL; systolic or diastolic blood pressure ≥ the 90th percentile for the individual’s age, sex and height; and fasting blood glucose ≥100 mg/dL. Additionally, overweight (body mass index (BMI) >85th percentile), high insulin (>23.0 μU/mL) and insulin resistance (IR) (HOMA >3.1) were determined. IR was estimated with the Homeostasis Model Assessment (HOMA-IR) using the HOMA Calculator Version 2.2.2 software from the ©Diabetes Trials Unit from Oxford University. Each of the measures was obtained using standardized protocols and calibrated equipment as referenced by previous studies [15].
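
As an illustration only, the fixed cutoffs listed above can be written as a small dichotomization routine. This is a sketch in Python, not the software used in the study. The function name `code_risk_factors`, the dictionary keys and the example values are hypothetical. The percentile-based criteria (WC, blood pressure, BMI) are omitted because they depend on external reference tables.

```python
def code_risk_factors(tg_mg_dl, hdl_mg_dl, glucose_mg_dl, insulin_uU_ml, homa_ir):
    """Return binary flags (True = risk factor present) for the fixed cutoffs in the text.

    Hypothetical helper: names and structure are illustrative assumptions.
    """
    return {
        "high_tg": tg_mg_dl >= 110,            # triglycerides >= 110 mg/dL
        "low_hdl": hdl_mg_dl <= 40,            # HDL cholesterol <= 40 mg/dL
        "high_glucose": glucose_mg_dl >= 100,  # fasting glucose >= 100 mg/dL
        "high_insulin": insulin_uU_ml > 23.0,  # insulin > 23.0 uU/mL (strict inequality)
        "insulin_resistance": homa_ir > 3.1,   # HOMA > 3.1 (strict inequality)
    }


# Made-up measurements for a single participant (not study data).
example = code_risk_factors(
    tg_mg_dl=125, hdl_mg_dl=45, glucose_mg_dl=92, insulin_uU_ml=23.0, homa_ir=3.4
)
print(example)
```

Note that the insulin and HOMA criteria use strict inequalities, so a value exactly at the cutoff (insulin of 23.0 μU/mL in the example) is not flagged.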

Environmental covariates

Food intake was assessed using a 24-hour dietary recall, which was randomly distributed across different days of the week. A second recall was administered to a random subsample on non-consecutive days to estimate and adjust for individual variability. The level of physical activity was determined using the University of South Carolina Arnold School of Public Health’s 3-Day Physical Activity Recall (3DPAR) questionnaire, in which participants were asked questions about their physical activity (PA) for the three previous days (two weekdays and one weekend day). For the purposes of this analysis, the data obtained were classified into three final levels: low, moderate, and high [16].

Genetic ancestry estimation

To infer the individual and average ancestries of the participants, we selected 40 AIMs that are widely distributed across the genome and show differences in allele frequencies (higher than 40 % between at least two populations) among the European, Amerindian and African parental populations. AIMs were selected using the previously available population data included in the Marshfield Clinic diallelic insertion/deletion polymorphism database [17], a database of retrotransposon insertion polymorphisms in humans (dbRIP) [18], and from studies that have also characterized markers for Latin American populations [4, 19]. Detailed information about the 40 AIMs is shown in Additional file 1: Table S2 and Table S3. DNA was extracted from peripheral blood using a standard method [20]. The concentration and quality of the samples were determined via spectrophotometry. The AIMs were genotyped using PCR-RFLP (polymerase chain reaction-restriction fragment length polymorphism) analysis or, in the case of insertion/deletion (In/Del) markers, based on the differences in the amplified fragment sizes using capillary electrophoresis in an ABI-PRISM 310 genetic analyzer (Applied Biosystems). Genotype determination was performed using the GeneMapper v 4.0 program.
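
The selection criterion above (an allele-frequency difference, δ, greater than 40 % between at least two parental populations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' selection pipeline. The function names and the allele frequencies are hypothetical, and δ is taken here as the absolute difference in the frequency of one reference allele between two populations.

```python
from itertools import combinations


def pairwise_deltas(freqs):
    """Absolute allele-frequency difference (delta) for every pair of populations.

    `freqs` maps population name -> frequency of one reference allele (0 to 1).
    """
    return {
        (a, b): abs(freqs[a] - freqs[b])
        for a, b in combinations(sorted(freqs), 2)
    }


def is_informative(freqs, threshold=0.40):
    """True if delta exceeds the threshold for at least one pair of populations."""
    return any(delta > threshold for delta in pairwise_deltas(freqs).values())


# Hypothetical frequencies for two markers (illustrative values, not study data).
marker_a = {"European": 0.10, "African": 0.85, "Amerindian": 0.15}
marker_b = {"European": 0.30, "African": 0.45, "Amerindian": 0.50}

print(is_informative(marker_a))  # separates African from the other two populations
print(is_informative(marker_b))  # no pair differs by more than 0.40
```

A marker such as `marker_a` distinguishes only one population from the other two, which is why a panel needs markers with large δ values for each pair of parental populations.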

Statistical analysis

The assumption of normality of the quantitative variables was evaluated using the Kolmogorov-Smirnov test. The qualitative variables were presented with their frequency distributions; the quantitative variables were summarized according to their median and interquartile ranges. Food consumption data were analyzed based on the Program Evaluation of Dietary Intake (Evaluación de la Ingesta Dietética, EVINDI v4) [21]. Reports of nutrient intake were processed using the program PC SIDE (Personal Computer Version of Software for Intake Distribution Estimation) v 1.0. The allele and genotypic frequencies were calculated with the PLINK v. 1.07 program [22]. Ancestry proportions were calculated with the ADMIXMAP v 3.2 program [23], which uses a frequentist-Bayesian method. The analysis of median differences was performed using Mann-Whitney U or Kruskal-Wallis tests. Individual cardio-metabolic parameters (body mass index, waist circumference, HDL cholesterol, systolic or diastolic blood pressure, fasting blood glucose, and insulin resistance) were coded as binary variables based on the cutoff values described above. Logistic regression analysis was performed with each dichotomous variable as the outcome in a separate model to assess the effect of each of the genetic ancestries on each cardio-metabolic parameter. Because of the known effects of socioeconomic status, parental education, physical activity and diet on these components, each of these covariates was included in the model as a control variable. Analyses were conducted with SPSS (Statistical Package for the Social Sciences) v. 19.0 statistical software and PLINK v. 1.07 [22]. Because all associations assessed were based on a priori hypotheses, both unadjusted and adjusted probability values are reported.
Significance tests were adjusted using the Bonferroni method with two thresholds to limit multiple-testing error: p <0.05/3 = 0.0167 (correction for three ancestries) and p <0.05/27 = 0.0018 (global correction for three ancestries × nine cardio-metabolic parameters). The threshold of significance for all other comparisons was p <0.05.
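
The Bonferroni arithmetic above can be reproduced with a short sketch. The constant and function names are hypothetical, and the code only reports which threshold a given p-value meets; it makes no claim about which threshold was applied to any particular test in the study. The two example p-values are those quoted in the Results for the logistic regression models.

```python
ALPHA = 0.05
N_ANCESTRIES = 3   # European, African, Amerindian
N_PARAMETERS = 9   # cardio-metabolic parameters

# Bonferroni-corrected thresholds: nominal alpha divided by the number of tests.
per_ancestry_threshold = ALPHA / N_ANCESTRIES               # about 0.0167
global_threshold = ALPHA / (N_ANCESTRIES * N_PARAMETERS)    # about 0.0019 (given as 0.0018 in the text)


def strictest_threshold_met(p_value):
    """Name the most stringent threshold that a p-value falls below."""
    if p_value < global_threshold:
        return "global"
    if p_value < per_ancestry_threshold:
        return "per-ancestry"
    if p_value < ALPHA:
        return "nominal"
    return "none"


for p in (0.002, 0.008):
    print(p, strictest_threshold_met(p))
```

Under this arithmetic, both p-values fall below the three-ancestry threshold but not below the global threshold.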

Results

General characteristics of the study population

A total of 853 young volunteers (10 to 18 years old) from a region of Colombia (Medellín) were included in the study. The average age was 14 ± 2.4 years for both males and females. Of the volunteers, 52 % were female, 42 % belonged to the middle socioeconomic stratum, and 54 % were classified as post-pubertal. As for the perinatal and environmental variables, 86 % had an adequate birth weight and 9 % were breastfed for less than a month; 29 % of the youth watched TV for >4 h per day; and 33 % engaged in low levels of physical activity. The median energy intake was 2253 kcal/day. Diets consisted of 13 % protein, 55 % carbohydrates and 32 % fat. Table 1 shows the demographic, perinatal, physical activity and food consumption values in relation to each of the cardio-metabolic parameters.

Table 1 Characteristics of the study population, stratified according to cardio-metabolic parameters

Estimation of ancestry and its effects in relation to each of the cardio-metabolic parameters

The range of European ancestry was 41–82 %; African, 4–48 %; and Amerindian, 10–35 %. The average ancestral percentages of the 853 young people were 66.6 ± 5.7 % European, 14.0 ± 4.7 % African, and 19.4 ± 4.1 % Amerindian. The individual admixtures are shown in Additional file 1: Table S4 and Figure S1. Dirichlet distributions of the proportions of European, African, and Amerindian ancestral components in the study population are shown in Fig. 1. Median differences and logistic regression model analyses were performed after recoding each of the cardio-metabolic parameters to convert them into binary variables. Our bivariate analyses revealed that the subjects with high TGs had a higher percentage of the Amerindian ancestry (P = 0.001) and less European ancestry (P = 0.007) than those with normal values. The subjects who had a high SBP had a lower European ancestry component (P = 0.002) and a higher African component (P = 0.013) in comparison with those who had normal SBP values. Subjects who had high insulin levels and HOMA-IR measurements had a higher percentage of African ancestry (P = 0.029 and P = 0.036, respectively) than those who had normal values in these parameters (Table 2). The values for BMI, WC, glucose levels, HDL and DBP did not show significant differences in relation to any of the ancestries.

Fig. 1

Dirichlet distribution of the ancestral components of the study population. The figure shows the Dirichlet distribution of the ancestral components of the study population using 40 ancestry-informative markers (AIMs)

Table 2 Comparison of medians of European, African and Amerindian ancestry according to components of metabolic syndrome

The logistic regression model included ancestry as the principal independent variable and was adjusted for age, sex, pubertal maturation, socioeconomic stratum, physical activity, percent of calories from simple carbohydrates, daily consumption of fruits and BMI. Even though some of the covariates were not significantly associated with the outcome variables in Table 1, some of them were included, as they have been previously reported to be important factors. Amerindian ancestry was associated with an increased risk for elevated TGs; specifically, for each percentage unit increase in Amerindian ancestry, the odds of having high TGs increased by 6 % (OR = 1.06, 98.3 % CI = 1.01–1.11, P = 0.002). After the adjustments described above, no effect of European ancestry was found on TG levels, but a protective effect was found for high SBP; for each percentage unit increase in European ancestry, the odds of having high SBP decreased by 7 % (OR = 0.93, 98.3 % CI = 0.87–0.99, P = 0.008). In contrast, African ancestry was positively associated with a risk for high SBP; for each percentage unit increase in African ancestry, the odds of having high SBP increased by 7 % (OR = 1.07, 98.3 % CI = 1.01–1.14, P = 0.008) (Table 3).
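
To make the interpretation of these odds ratios concrete, the sketch below converts an OR into a percent change in the odds and shows how a 98.3 % confidence interval (that is, 1 − 0.05/3) would be obtained from a logistic regression coefficient. It is a hedged illustration and not the SPSS procedure used in the study. The function names are hypothetical, and the standard error of 0.02 is an assumed value chosen for illustration, since the text does not report it. A Wald-type interval on the log-odds scale is assumed.

```python
import math
from statistics import NormalDist


def percent_change_in_odds(odds_ratio, units=1):
    """Percent change in the odds for a `units`-point increase in ancestry.

    Under a logistic model the per-unit OR compounds multiplicatively.
    """
    return (odds_ratio ** units - 1.0) * 100.0


def odds_ratio_with_ci(beta, se, alpha=0.05 / 3):
    """OR and Wald confidence interval from a log-odds coefficient and its SE.

    alpha = 0.05 / 3 gives a 98.3 % interval.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Odds ratios as reported in the text (per percentage unit of ancestry).
reported = {
    "Amerindian / high TGs": 1.06,
    "European / high SBP": 0.93,
    "African / high SBP": 1.07,
}
changes = {label: round(percent_change_in_odds(or_), 1) for label, or_ in reported.items()}
print(changes)

# Illustrative only: beta matches OR = 1.06; the SE of 0.02 is an assumption.
or_, lower, upper = odds_ratio_with_ci(beta=math.log(1.06), se=0.02)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

Because the effect is expressed per percentage unit, differences of several points in ancestry compound: `percent_change_in_odds(1.06, units=10)` gives the change in odds implied by a ten-point difference under the same model.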

Table 3 Multiple logistic regression results for the association of genetic ancestry with components of metabolic syndrome

Discussion

This study evaluated the association of ancestry with the following cardio-metabolic risk factors in a young admixed population: WC, BMI, glucose, insulin, HOMA-IR, TGs, HDL, DBP and SBP. This analysis was based on 40 AIMs whose deltas (δs) among the Amerindian, European and African populations indicate that they are good discriminators (Additional file 1: Table S1). The average and individual results obtained for the ancestries of the sample population match the data obtained by other researchers in the same population using different types and numbers of AIMs [6, 8, 24]. In interpreting the following results, we must note that we set two significance thresholds, p <0.0167 (three ancestries) and p <0.0018 (three ancestries × nine cardio-metabolic parameters), to provide a conservative Bonferroni correction for multiple testing; however, this correction may be unnecessarily rigorous when there is a priori evidence of the associations being tested. Using a multivariate analysis, this study showed a positive association between African ancestral components and SBP (Table 3), which has been widely reported in genetic studies of blood pressure (BP) conducted in African populations and in populations that are the product of genetic admixture with African populations, such as Latin American populations, including one study on an adult population from the same region where the present study was conducted [25–27]. However, despite the consistently demonstrated effect of an African ancestry component on blood pressure, other studies in the United States have found no differences in the blood pressure of children with different ethnic origins [28]. In this study, the inverse relationship found between SBP and a European ancestral component in young people from 10 to 18 years of age agrees with the findings in adults.

TG levels were positively associated with an Amerindian ancestral component, which agrees with some studies conducted on adult populations in which Amerindian women were reported to have higher TG levels than African American (AA) or European-American (EA) women. Additionally, the prevalence of dyslipidemia (low HDL, high LDL or high TGs) is greater in Amerindian than in European populations [29]. In 2014, Ko et al. identified eight gene variants associated with dyslipidemias. In inferring the ancestral state, they found that seven of these variants were of Amerindian ancestry [30].

With regard to the HOMA-IR and insulin measurements, a significant difference was found when medians were compared with respect to the African ancestral component. Although the logistic regression showed no significant difference, it is useful to discuss these results because several studies have associated insulin resistance with African ancestry [31, 32]. Insulin and HOMA-IR are measurements related to T2D, and other studies have found a high prevalence of this disease in some AA populations [33, 34]. In addition, in a study carried out on adults from the same region as the subjects of this study, a significant association was found between the African ancestral component and the risk of T2D. This association reinforces the importance of considering the hypothesis of the “thrifty genotype” that has been described for North American populations, such as the Pima Indians [34], in whom there is a very high prevalence of T2D. This prevalence is believed to be related to the presence of gene variants adapted to the dispersion of these populations across the Bering Strait [34]. Other studies in admixed populations found no association of the Amerindian component with fasting blood glucose levels or with HOMA-IR [25, 35]. In two studies, one conducted in Medellín [17] and the other in Mexico [18], where a positive association of T2D with the Amerindian genetic component was reported, the authors suggest that this association could be explained because the low socioeconomic stratum has a higher Amerindian genetic component than that of the other strata, and the lifestyle and environmental factors of this stratum favor the development of T2D.
The etiology of this complex trait has a high epigenetic component; it has been demonstrated that early exposure to hyperglycemia predisposes individuals to changes in their chromatin and gene expression, which can be transmitted from one generation to another (metabolic memory), and this might be what was observed in the African component in the present study, which showed an association with the insulin resistance parameters [36].

No associations of any of the three ancestral components were detected with BMI, WC, HDL and glucose level despite the high heritabilities found for these traits [37]. Additionally, a relationship to ancestry has been demonstrated in several studies on populations in the United States, in which it has been found that adiposity is higher in EA and Hispanic-American (HA) children than in AA children [38, 39]. It has been reported that WC in Mexican-American children is greater than in EA and AA children [14]. Since 2000, higher levels of HDL have been reported in both children and adults in AAs and HAs than in EAs [40]. Based on this evidence, it was expected that there would be some degree of association between the genetic components of the population and these traits. However, the negative results presented here could indicate that the association between these traits and the population’s genetic components differs, not only with respect to age but also with respect to the ancestral genetic architecture, which depends on the demographic history of the population. Genome-wide association studies have shown the existence of genes associated with BMI and WC in both adults and children, while other associations are present only in adults or only in children [41]. This shows the complexity of the inheritance of these traits. For example, a recent study found an inverse relationship between 31 loci associated with the risk of obesity, as defined by BMI, and changes in weight in relation to age [42].

The results of this study seem to agree with the ideas of Joshi et al. [42], which suggest directional dominance. They assessed whether homozygosis in certain regions of the genome had any directional relationship with quantitative traits such as height, weight, cognitive measures and cardio-metabolic parameters, and they found that only height and cognition clearly presented any directional dominance in continental populations. However, the cardio-metabolic measurements (like those evaluated in the present study) did not present it, and it was shown to have a strong variation with age [43].

Conclusion

Here we have shown that although the effect of ancestry was modest, it was significant, and based on these findings, we suggest that an Amerindian ancestry may act as a risk factor for high triglycerides, while an African ancestral component confers a risk for high systolic blood pressure. A European ancestry serves as a protective factor for this condition in the young people of the population studied here. In addition, our results show that ancestry is not associated with WC, BMI, HDL or glucose levels. It is necessary to extend this study to include more AIMs and a greater sample size in the hopes of replicating these results and, thus, enabling the use of predictions based on ancestral components, either as risk or protection factors, to determine prevention strategies or to identify candidate genes for cardio-metabolic disorders in children.

Abbreviations

MetS, metabolic syndrome; DNA, deoxyribonucleic acid; AIMs, ancestry-informative markers; LA, Latin American; DANE, National Administrative Department of Statistics; WC, waist circumference; TGs, triglycerides; HDL, high-density lipoprotein; DBP, diastolic blood pressure; SBP, systolic blood pressure; T2D, type 2 diabetes mellitus; BMI, body mass index; HOMA, homeostasis model assessment; IR, insulin resistance; PCR-RFLP, polymerase chain reaction-restriction fragment length polymorphism; IN/DEL, insertion-deletion; OR, odds ratio; CIs, confidence intervals; EAs, European-Americans; HAs, Hispanic-Americans; AAs, African Americans