Background

Birthweight is a complex multifactorial trait consistently associated with infant mortality and morbidity, with childhood obesity [1, 2], and with adult outcomes including type 2 diabetes, cardiometabolic diseases, and cognitive function [3,4,5,6]. There is growing interest in understanding the roles of gene–environment interactions in population differences in fetal growth following two recent studies led by the World Health Organization (WHO) [7] and the National Institute of Child Health and Human Development (NICHD) at the National Institutes of Health (NIH) [8]. The studies found regional and population differences in fetal growth, even under similar unconstrained maternal socioeconomic and nutritional conditions [7, 8]. The WHO study found significant variations in fetal growth among ten countries from different parts of the world, and the NICHD Fetal Growth Studies found significant differences in fetal growth among four U.S. racial and ethnic populations [7, 8]. The findings in both studies corroborated earlier studies that found population differences in birthweight [9,10,11] and regional differences in low birthweight incidence, ranging from 6.4–7.7% in Europe and North America to 14.3% in Africa and 18.3% in Asia [12]. The WHO study also found that maternal and fetal characteristics only partially explained these differences [7], consistent with earlier observations in which established non-genetic determinants of birthweight, including socio-demographic and lifestyle-related factors, parental anthropometry, and gestational age, did not fully explain the observed birthweight differences among populations [13]. An important next step is to investigate whether and to what extent population genetic differences at key birthweight-associated loci and their interactions with environmental factors account for the residual fetal growth disparities not explained by other determinants.

To date, genome-wide association studies (GWASs) have discovered a total of 60 loci (of which 59 were autosomal) associated with birthweight [14,15,16]. Polymorphic autosomal single-nucleotide polymorphisms (SNPs) on the genotyping array in a recent multi-ancestry GWAS explained 15.1% of the variance in birthweight [15], reinforcing earlier heritability estimates for birthweight ranging from 25 to 31% [17, 18]. It has previously been shown that the combined effect of seven genetic loci on birthweight was similar to the effect of maternal smoking during pregnancy [16], and that the combined effect of 59 loci on birthweight variance was similar to that of maternal body mass index [15], indicating a considerable cumulative effect of the genetic loci on fetal growth. In some instances, genetic variants associated with reduced birthweight display substantial allele frequency differences among populations. For example, the rs11765649 IGF2BP3 variant associated with lower birthweight is carried by nearly all East Asians compared to three-fourths of Europeans (99% in Han Chinese in Beijing and 74% in Utah residents with Northern and Western European Ancestry from the 1000 Genomes Project, http://www.internationalgenome.org/).

Although recent studies have illuminated the role of genetic variation in birthweight, much remains to be understood about the cumulative burden of birthweight-associated loci in different populations and the extent to which they contribute to birthweight differences across populations with different ancestries. Here we tested the hypothesis that the cumulative burden of genetic variants with birthweight-lowering effects differs among ancestrally diverse human populations. Using genotype data from 26 global populations grouped into five super-populations, (1) we compared the genetic risk burden and frequency distributions of birthweight-lowering variants identified by multi-ancestry GWASs between Europeans and non-Europeans and (2) we determined whether population differences in genetic risk burden to lower birthweight vary depending on whether a variant is ancestral or derived and whether a variant is relatively benign or deleterious. Furthermore, several studies have indicated that birthweight is a strong predictor of neonatal and infant mortality [19], and optimal fetal growth and development is an important goal of pregnancy to enhance perinatal survival [20, 21]. In global regions where low birthweight, infant mortality, and socioeconomic disadvantage are prevalent and the proportion of rare birthweight-lowering variants is high, it is therefore possible that negative genetic selection, which tends to remove deleterious birthweight-reducing variants, has acted more strongly to enhance the survival of offspring. Accordingly, we also evaluated whether population-restricted negative selection influenced observed differences in the proportion of rare birthweight-lowering variants among populations.

Methods

Study population and data sets

This study included participants in phase 3 of the 1000 Genomes Project (www.1000genomes.org), which consists of 2504 individual samples from 26 global populations grouped into five super-populations: Africans (AFR, n = 661), admixed Americans (AMR, n = 347), East Asians (EAS, n = 504), Europeans (EUR, n = 503), and South Asians (SAS, n = 489). All participants declared themselves to be healthy at the time the samples were collected. Hence, they were very unlikely to have had severe genetic diseases during recruitment. In addition to genotype data, each participant’s sex, ethnicity, and place of origin were collected as part of the project [22].

Selection and annotation of SNPs

We selected all 59 autosomal SNPs found to be associated with birthweight at the genome-wide level of significance in multi-ancestry GWASs involving offspring genotypes [14,15,16]. Given the modest effect sizes of the birthweight-associated loci, the association tests in non-Europeans did not surpass the genome-wide threshold, potentially because of the small sample sizes of the non-Europeans in the discovery study [15]. Therefore, we examined several metrics to evaluate the validity of using the loci in polygenic risk scores among diverse ancestries. The following lines of evidence supported a trans-ancestral effect of the loci on birthweight:

  1. A trans-ethnic meta-analysis resulted in lower p values compared with a European-only meta-analysis in the vast majority of loci.

  2. Pooled analyses of non-Europeans and Europeans discovered seven loci (DTL, HIST1H2BE, TRIB1, APOLD1, GPR139, ACTL9, and PEPD) associated with birthweight that were not detected in the European-only cohorts.

  3. The effect sizes of the SNPs were similar in Europeans and non-Europeans, as evidenced by the strong correlation in effect sizes (r = 0.88; 95% confidence interval (CI): 0.81–0.93, p < 2.2 × 10^−16) across the 59 SNPs.

  4. Heterogeneity between the trans-ancestry datasets, tested with the Q statistic, was not significant (p > 0.05) in 57 out of the 59 SNPs tested (the two exceptions were rs854037 in the 5q11.2 locus and rs28510415 in PTCH1).

  5. Altogether, 50 out of 59 SNPs (85%) had directionally consistent effects in Europeans and non-Europeans.

In all, these metrics indicate that the loci have trans-ancestral effects on fetal growth.

Genotype data for the 59 SNPs were extracted from the 2504 individual samples. The SNPs included in this analysis, their birthweight-lowering alleles, nearby genes, effect size, and other annotations are reported in Additional file 1. To determine the functional and pathogenic relevance of the genetic loci, SNPs were assigned deleteriousness scores using the Combined Annotation Dependent Depletion (CADD) framework as implemented in CADD v1.2 (http://cadd.gs.washington.edu). CADD integrates functional and evolutionary importance from multiple annotation sources to generate a deleteriousness score for each genetic variant [23]. In the present analysis, the median phred-like CADD score (−10 × log10 (rank/total)) [23] was found to be 2.8. SNPs with CADD score >2.8 (n = 29) were considered to be relatively deleterious and SNPs with CADD score ≤2.8 (n = 30) were considered to be relatively benign. The ancestral or derived state of alleles for each SNP was assigned based on the Ensembl Compara 59 pipeline (six primate Enredo-Pecan-Ortheus) (http://useast.ensembl.org/).
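The scaling and median split described above can be illustrated with a minimal sketch. It is not the CADD implementation: the function names and the five SNP labels and scores are hypothetical, and only the formula −10 × log10(rank/total) and the "> median" versus "≤ median" rule are taken from the text.

```python
import math
from statistics import median


def phred_scaled(rank, total):
    """Phred-like score, -10 * log10(rank / total); rank 1 is the most deleterious."""
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return -10.0 * math.log10(rank / total)


def split_by_median(scores):
    """Split SNPs into relatively deleterious (> median) and relatively benign (<= median).

    scores: dict mapping a SNP label to its scaled score.
    Returns (cutoff, deleterious_labels, benign_labels).
    """
    cutoff = median(scores.values())
    deleterious = sorted(snp for snp, s in scores.items() if s > cutoff)
    benign = sorted(snp for snp, s in scores.items() if s <= cutoff)
    return cutoff, deleterious, benign


# Hypothetical labels and scores, for illustration only.
example_scores = {"snp_a": 0.5, "snp_b": 1.2, "snp_c": 2.8, "snp_d": 7.4, "snp_e": 25.9}
cutoff, deleterious, benign = split_by_median(example_scores)
```

With an odd number of SNPs the median is itself one of the scores, so the "≤ median" group is one larger than the "> median" group, which is consistent with the 30 versus 29 split reported here for 59 SNPs.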

Statistical analyses

For each individual, the genetic risk burden for low birthweight (GRB) was calculated as the sum of the number of birthweight-lowering alleles (0, 1, or 2) per SNP multiplied by the effect size derived from the largest GWAS meta-analysis [15], followed by rescaling by the sum of the effect sizes [24]. We also generated a GRB not weighted by effect size, and no substantial differences were detected between the two metrics (Additional file 2). The mean frequencies of birthweight-lowering alleles and mean GRB were compared between Europeans and each of the four non-European populations (AFR, AMR, EAS, and SAS) with the t-test. The proportions of rare birthweight-lowering alleles were compared between Europeans and non-Europeans with the chi-squared test. To detect negative natural selection (purifying selection), we tested for any deviation of the allelic frequencies from the distribution expected under the neutrality model towards lower values [25]. All analyses were performed using the software program PLINK 1.9 (https://www.cog-genomics.org/plink2) [26] or R (http://www.R-project.org/).
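As a minimal sketch of the GRB calculation for one individual, the weighted sum of allele counts can be divided by the sum of the effect sizes. Multiplying the result by the number of SNPs, so that the weighted score is on the same scale as the unweighted allele count, is an assumption made here for illustration and is not stated in the text; the function name, dosages, and weights are likewise hypothetical.

```python
def genetic_risk_burden(dosages, weights=None):
    """Genetic risk burden for one individual.

    dosages: number of birthweight-lowering alleles (0, 1, or 2) at each SNP.
    weights: per-SNP effect sizes; if None, the unweighted allele count is returned.
    """
    if any(d not in (0, 1, 2) for d in dosages):
        raise ValueError("each dosage must be 0, 1, or 2")
    if weights is None:
        return float(sum(dosages))
    if len(weights) != len(dosages):
        raise ValueError("dosages and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("effect sizes must sum to a positive value")
    weighted_sum = sum(d * w for d, w in zip(dosages, weights))
    # Assumption: rescale to the allele-count scale (number of SNPs / sum of weights).
    return len(dosages) * weighted_sum / total_weight


# Hypothetical three-SNP example.
example_dosages = [2, 1, 0]
example_weights = [0.03, 0.02, 0.05]
weighted_grb = genetic_risk_burden(example_dosages, example_weights)
unweighted_grb = genetic_risk_burden(example_dosages)
```

Under this scaling, equal effect sizes reproduce the unweighted count exactly, which makes the weighted and unweighted metrics directly comparable.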

Results

GRB was significantly higher in Africans [mean ± standard deviation (s.d.): 64.53 ± 4.21], admixed Americans (64.41 ± 5.33), East Asians (64.23 ± 4.34), and South Asians (62.45 ± 4.59) compared to Europeans (61.38 ± 4.66) (p < 0.001). The direction of GRB differences between Europeans and non-Europeans varied depending on the evolutionary status of the polymorphic site (ancestral vs. derived birthweight-lowering alleles). For birthweight-lowering alleles with ancestral state (n = 33 SNPs), GRB was significantly higher in Africans (mean ± s.d.: 48.85 ± 3.20), admixed Americans (45.58 ± 4.01), East Asians (45.07 ± 3.06), and South Asians (43.86 ± 3.37) compared to Europeans (41.84 ± 3.36) (p < 0.001). In contrast, for birthweight-lowering alleles with derived state (n = 26 SNPs), GRB was significantly lower in Africans (mean ± s.d.: 15.68 ± 2.90), admixed Americans (18.82 ± 3.16), and South Asians (18.59 ± 3.01) compared to Europeans (19.53 ± 3.18) (p < 0.001). Compared to Europeans, Africans displayed the largest mean GRB difference of 3.15 (95% CI: 2.64, 3.66), largely driven by SNPs with ancestral birthweight-lowering alleles (mean difference 7.01; 95% CI: 6.63, 7.39) (Table 1 and Additional file 3). Further comparisons between the individual populations forming each of the super-populations revealed significant GRB differences among admixed American populations. Specifically, GRB was significantly higher in Colombians (p = 0.048), Mexicans (p = 5.2 × 10^−5), and Peruvians (p = 2.3 × 10^−12) compared to Puerto Ricans, and in Peruvians compared to Colombians (p = 0.015) (Additional files 4, 5, and 6). For each super-population, GRB was significantly higher among relatively deleterious than relatively benign loci (p < 0.001), and within each deleteriousness stratum, non-Europeans had higher GRB than Europeans, but the differences were not statistically significant for most comparisons (Table 2). The most deleterious birthweight-lowering variant (rs2229742 NRIP1) (CADD = 25.9; Additional file 1) is nearly fixed (i.e., frequency of ~100%) in Africans and East Asians, but is polymorphic in other super-populations (90.1% in EUR, 94.4% in AMR, and 94.8% in SAS).

Table 1 Genetic risk burden of birthweight-reducing alleles in diverse populations
Table 2 Genetic risk burden of birthweight-reducing alleles by deleteriousness score

Next, we examined population differences in allele frequency of the birthweight loci. The frequency density of the birthweight-lowering alleles approximated a bell shape in Europeans, consistent with the expectation of the common disease-common variant hypothesis [27], but showed the greatest deviation from a bell shape in Africans and East Asians. In Europeans, the density curve peaks at a birthweight-lowering allele frequency of 30–40%, compared to 10–20% in Africans and East Asians (Fig. 1). The proportion of rare SNPs (minor allele frequency <0.05) associated with birthweight was significantly higher in Africans (26.67%) and East Asians (15%) compared to Europeans (1.67%) (Fisher’s exact test p = 0.0001 and 0.0085, respectively) (Fig. 2). Moreover, of the 59 autosomal loci analyzed, five were polymorphic in Europeans but had a fixed birthweight-lowering allele (risk allele frequency, RAF ≥ 0.99) in non-Europeans, primarily in Africans and East Asians (rs138715366 (YKT6-GCK), rs11765649 (IGF2BP3), rs144843919 (SUZ12P1-CRLF3), rs2229742 (NRIP1), rs62240962 (SREBF2)). Notably, the YKT6-GCK locus, which had the largest birthweight-lowering effect size among the 59 loci, was fixed in each of the non-European populations (Table 3).
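The comparison of rare-SNP proportions between two populations can be sketched as a two-sided Fisher's exact test on a 2 × 2 table, written here with the Python standard library only. The counts below are illustrative assumptions (they would correspond to the reported percentages only if the denominator were 60 loci per population, which the text does not state), and the sidedness of the reported test is not given, so the resulting p values need not match the reported ones.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "rare" loci fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, total)


# Illustrative (assumed) counts: rare vs. non-rare loci in two populations.
p_afr_vs_eur = fisher_exact_two_sided(16, 44, 1, 59)
p_eas_vs_eur = fisher_exact_two_sided(9, 51, 1, 59)
```

Fisher's exact test is preferred over the chi-squared approximation in this setting because at least one cell count is very small.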

Fig. 1 Frequency density of risk alleles associated with reduced birthweight. AFR Africans, AMR admixed Americans, EAS East Asians, EUR Europeans, SAS South Asians

Fig. 2 Proportion of rare SNPs (MAF < 0.05) associated with birthweight. AFR Africans, AMR admixed Americans, EAS East Asians, EUR Europeans, SAS South Asians, SNP single-nucleotide polymorphism, MAF minor allele frequency

Table 3 Birthweight-reducing alleles that are polymorphic in Europeans but fixed in other populations

To investigate whether birthweight-lowering variants were subjected to the effect of negative genetic selection that would increase the frequency of rare SNPs, we tested whether the proportion of rare birthweight-lowering alleles was higher than that of the reciprocal common birthweight-lowering alleles (RAF < 0.05 vs. RAF > 0.95; RAF < 0.5 vs. RAF ≥ 0.5). We did not find a significantly higher proportion of rare birthweight-lowering alleles in any population (Additional file 7). Moreover, GRB was significantly higher among deleterious than benign loci in all populations (mean difference: 4.83–7.84) (Table 2), indicating that birthweight-lowering variants are not enriched for negative selection. No significant GRB differences were found between males and females.

Finally, we attempted to validate our findings using genotypes of seven global regional populations in the Human Genome Diversity Project database [28]. The median frequency of the birthweight-reducing alleles of five SNPs retrieved from the Human Genome Diversity Project database (http://spsmart.cesga.es/) was highest in Africans and the Americas (p = 0.028 compared to Europeans), and lowest in European and Middle Eastern populations (Additional file 8).

Discussion

The current study found that the magnitude of the genetic burden imposed by birthweight-lowering variants differs among ancestrally diverse populations. In particular, Africans and Asians had a consistently higher burden of birthweight-lowering variants compared to Europeans. This finding is consistent with global data on the gradient of low birthweight. Regions with predominantly African and Asian ancestry populations have a higher incidence of low birthweight than those with predominantly European ancestry populations [12]. A recent multinational study by the WHO involving healthy women with low-risk pregnancies and an unconstrained nutritional and social background from ten countries in Africa, Asia, Europe, and South America found significant differences in fetal growth and in birthweight across countries. The median birthweight for countries in Africa and Asia was 400–500 g lower compared to European countries such as Norway [7]. The NICHD Fetal Growth Studies also found significant variations in fetal growth among Asian, black, Hispanic, and white ethnic groups. Asian fetuses were the smallest followed by African fetuses, and white fetuses had the largest size [8], largely consistent with the country-specific ethnic distributions in the WHO study [7].

The major determinants of these considerable variations in fetal growth and birthweight across populations remain unknown. Established maternal factors (such as maternal age, height, weight, and parity) and neonatal characteristics (such as sex) that influence fetal growth and birthweight explained only 1–2% of variations in fetal growth [7]. On the other hand, recent studies demonstrated a considerable contribution of genetics to birthweight. There was an array-wide heritability of 15.1% [15] and a strong cumulative effect of birthweight loci that was as large as that of maternal smoking during pregnancy [16] and maternal body mass index [15]. Together with these observations, our findings of genetic risk burden differences among populations indicate that genetic variations and their complex interactions with environmental risk factors may contribute to observed regional and ethnic disparities in birthweight. Further, understanding these interactions may help to explain the very slow change between 1990 and 2000 in the incidence of low birthweight in developing countries despite some improvements in their economies [12]. It may also help to explain recent decreases in birthweight in the U.S. [29, 30], with disproportionately higher declines in African-Americans than whites [31], and in Sweden [32], which could not be explained by maternal and neonatal characteristics.

In the present study, we observed population differences in the frequency spectrum of birthweight-lowering alleles. The proportion of rare risk loci was higher in individuals of African and Asian ancestry compared to those of European ancestry. The bell-shaped and symmetrical overall distribution of birthweight loci in our study has a bearing on the common disease-common variant hypothesis, which posits that common traits are most likely due to common variants with small to modest effects [27]. Nonetheless, we observed relatively higher deviations in Africans and Asians. These two findings showing differences between Europeans and Africans/Asians in the genetic variation landscape of common and rare birthweight loci may be because of population differences in the genetic architecture of birthweight and fetal growth. In addition, the overwhelming majority of GWASs, including those on birthweight, utilized samples of European ancestry populations [14,15,16] and most genotyping platforms are ascertained for common SNPs in European ancestry populations, limiting the power of discovery in other populations [33, 34]. These limitations may contribute to our findings of population differences in the genetic variation landscape of birthweight loci. The putative causal variants are most likely tagged by the SNPs associated with birthweight in the discovery GWAS involving European ancestry individuals; however, the extent to which those SNPs tag the causal variants in non-Europeans is not known. Therefore, we acknowledge a limitation in our study that the differences in the burden of risk alleles among populations may not represent differences in burden of causal variants. Genomic studies involving diverse population samples are warranted to discover common genetic loci associated with fetal growth and to close the gap between the estimated heritability of birthweight (25–31%) and the heritability explained by the GWAS loci discovered so far (<5%) [15, 17, 18].

In agreement with other studies [35, 36], our analysis showed a higher frequency of ancestral than derived birthweight-lowering variants in all populations, and a higher GRB of ancestral birthweight-lowering alleles in Africans and East Asians compared to Europeans. Although the well-known association of birthweight with infant mortality implied the importance of optimal birthweight to survival and reproductive fitness, our findings of (i) similar proportions of ancestral and derived birthweight-lowering alleles, (ii) higher GRB among deleterious than benign birthweight-lowering alleles, and (iii) no significantly higher proportion of rare vs. reciprocal common alleles indicate that birthweight loci were not subject to negative selection. Rather, by interrogating dbPSHP, a database of recent positive selection across human populations (http://jjwanglab.org/dbpshp), we found that 14 birthweight loci (23.3%) overlap with previously published genetic loci targeted by recent positive selection (ZBTB7B, ATAD2B, CPA3, HHIP, CDKAL1, HIST1H2BE, HMGA1, SLC45A4, HHEX, NT5C2, ITPR2, CRLF3, PEPD, and SREBF2) (Additional file 9).

Conclusions

The present study found that non-Europeans, particularly Africans and Asians, have a higher burden of birthweight-lowering variants compared to Europeans. Moreover, the allele frequency landscape of birthweight-lowering variants in Africans and Asians has a greater deviation from the bell-shaped distribution expected under the common disease-common variant hypothesis. These findings parallel global data on the gradient of low birthweight, in which regions with predominantly African and Asian ancestry populations have the highest incidence of low birthweight and smaller fetuses that were not explained by traditional non-genetic factors. Future studies are warranted to understand the extent to which this genetic risk burden difference and its interaction with environmental factors contribute to fetal growth disparities among ancestrally diverse global populations, and to investigate the ways in which these population differences in genetic burden are governed by human demographic and adaptive history.