Introduction

High blood pressure is a risk factor for cardiovascular disease and stroke. As well as environmental influences known to affect blood pressure, genetic factors play a role, with estimates of blood pressure heritability of 30–50 % [1, 2]. Large-scale genome-wide association studies (GWAS) scanning millions of single nucleotide polymorphism (SNP) variants across the genomes of tens of thousands of individuals have identified multiple genetic loci that are associated with blood pressure. These include loci that contain plausible candidate genes in biological pathways known to influence blood pressure as well as novel loci where the genes implicated have not previously been suspected to be important in blood pressure or have an unknown function. These findings provide valuable insight into blood pressure mechanisms. However, it is notable that the largest (and most successful in terms of new discovery) studies were mainly, if not exclusively, composed of individuals of European ancestry [3••, 4, 5, 6••]. There are differences in the prevalence and incidence of cardiovascular disease between individuals of different ancestries; for example, the prevalence of hypertension between 1999 and 2004 was 40 % for African Americans compared to 27 % for European Americans, and in 2004 the death rate attributable to hypertension was three times greater in African Americans than in European Americans [7–10]. Differing responses to anti-hypertensive therapy have also been observed among Europeans, Africans and East Asians [11–13]. Although a large part of these differences may be accounted for by environmental differences (for example, lifestyle) and related risk factors, genetic factors are also likely to play a role.

Although there is no a priori reason to believe that the biological mechanisms underlying blood pressure regulation differ across ancestries, the extent to which ancestry-specific genetic variation differentially affects these pathways is not well understood. Consistent with the theory of one or more migrations out of Africa early in human history, there is more genetic variation within African populations than in other populations, and all other populations share a common subset of the variation seen in Africans (as well as variation that has arisen as a result of more recent mutation and recombination events and the effects of migration and admixture) [14•]. The structure of linkage disequilibrium (LD) varies between populations, with shorter LD blocks observed in the African population, and SNP minor allele frequencies vary between ancestries (Fig. 1). It is plausible that although some SNPs may show strong association with blood pressure across all ancestries, there is also likely to be variation in the size of effect between ancestries when the LD between the SNP being measured and the underlying causal SNP varies or where there are differences in environmental interactions [15].

Fig. 1

Comparison of allele frequencies for 100,000 SNPs on chromosome 20 between the European (EUR) 1000 Genomes Project super-population (x axis) and the Admixed American (AMR), East Asian (ASN) and African (AFR) super-populations (y axes). EUR super-population comprises CEU, TSI, FIN, GBR and IBS populations. AMR comprises MXL, PUR, CLM and PEL populations. ASN comprises CHB, JPT, CHS, CDX and KHV populations. AFR comprises YRI and LWK populations

Understanding the genetic architecture of blood pressure furthers understanding of the biology of blood pressure and can also inform approaches to treatment through identification of additional drug targets or by identification of pharmacogenetic interactions, which could facilitate stratified approaches to medicine.

This review will summarise the findings to date for large GWAS of blood pressure across different ancestries and will explore how combining genetic information across ancestries can improve understanding of observed associations and accelerate progress towards a full understanding of the genetic architecture of blood pressure.

GWAS of Blood Pressure: Achieving Large Sample Sizes

It has now been shown for many quantitative traits that the allelic effects of genetic variants that are common in the population are individually modest and that the combined effects of the common variants identified to date collectively only explain a small proportion of the phenotypic variance. GWAS of quantitative traits are better powered than GWAS of binary traits to detect common variants of modest effect. Although systolic blood pressure (SBP) greater than 140 mmHg and diastolic blood pressure (DBP) greater than 90 mmHg are the basis of a clinical diagnosis of hypertension, variation of blood pressure within the normal range has an important impact on the risk of cardiovascular disease and mortality, and the analysis of blood pressure as a quantitative trait to infer the genetic risks of hypertension has been a popular approach. Pulse pressure (PP), the difference between systolic and diastolic blood pressure, which gives a measure of arterial stiffness, and mean arterial pressure (MAP) have also been analysed for genetic associations. Blood pressure measurement can be sensitive to the time of day and location of measurement (for example, the “white coat effect” where an individual experiences a rise in blood pressure in response to being in a clinical setting) as well as the instrument and protocol used. Many cohort studies strive for consistency of measurements across subjects by taking repeat readings under similar clinical settings (self-reported blood pressure has also been collected in some studies). At least for the purposes of hypothesis-free discovery of genomic loci associated with blood pressure, large sample sizes appear to compensate for the effects of non-systematic measurement error [3••]. In addition, very large sample sizes are needed to detect modest allelic effects. 
Given this, it is not surprising that early genome-wide and candidate gene studies of association with such traits in small sample sizes (a few hundred up to a few thousand) had limited success in identifying associations which were replicable. Bringing together cohorts of many thousands of samples is hugely expensive in terms of both financial resources and time. Hundreds of smaller studies ranging from a couple of hundred individuals to several thousand individuals are now available, many with existing genome-wide SNP array genotype data. Combined with the advent of imputation approaches, where individual variants not directly measured on the array can be inferred using linkage disequilibrium and haplotype information, the availability of a multitude of smaller studies with blood pressure information and genome-wide SNP genotypes has enabled researchers to undertake association testing in very large sample sizes through meta-analysis. Unlike traditional meta-analysis of published results, meta-analysis of GWAS in recent years generally involves de novo association testing within each participating cohort according to an agreed protocol (analysis plan) with association statistics (usually at a minimum the allelic effect size and standard error) being provided to a central analysis group for meta-analysis. This has circumvented the sensitive issues of shared access to individual-level data whilst enabling cohorts to contribute to large analyses (although sharing of individual-level data is increasing).
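The pooling of per-cohort summary statistics described above can be illustrated with a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis for a single SNP. This is an illustrative assumption about how the allelic effect sizes and standard errors might be combined, not the protocol of any particular consortium; the function name and the per-cohort values are hypothetical.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis of one SNP.

    betas: per-cohort allelic effect estimates (e.g. mmHg per allele)
    ses:   per-cohort standard errors of those estimates
    Returns the pooled effect, its standard error, the Z score and
    a two-sided P value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]          # precision of each cohort
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))              # two-sided tail area
    return beta, se, z, p


# Hypothetical summary statistics from three cohorts (invented for illustration).
cohort_betas = [0.50, 0.30, 0.40]
cohort_ses = [0.20, 0.25, 0.10]

beta, se, z, p = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, Z = {z:.2f}, P = {p:.2e}")
```

Only the effect size and standard error from each cohort are needed, which is why this design allows cohorts to contribute without sharing individual-level data.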

Large published meta-analyses of GWAS (of blood pressure and other traits) have mostly comprised a two-stage approach whereby a subset of SNPs that reached some nominal level of significance in the discovery stage is followed up in a second independent meta-analysis of GWAS. This second stage can comprise results from de novo genotyping (for studies without genome-wide SNP data or for SNPs not well measured using imputation approaches) and in silico resources (selection of association results for a subset of SNPs from studies that have genome-wide SNP data but did not contribute to the discovery stage). This second stage can be used for either independent replication or for meta-analysis with the discovery stage results to boost power through increased sample size [16–18].

Published GWAS of blood pressure in European populations have now reached sample sizes of up to ~75,000 in the discovery stage with up to ~200,000 individuals for some SNPs achieved through meta-analysis across discovery and second stage results [3••, 4, 5, 6••]. Forty-two genomic loci have so far been reported to be associated with blood pressure traits (SBP, DBP, PP and/or MAP) with genome-wide significance in large samples of individuals of European-ancestry since 2009 [3••, 4, 5, 6••, 19•, 20•, 21•, 22•] (Table 1). These include a number of loci for which there is already some evidence from functional studies for a role in blood pressure regulation and other loci for which there is little known about function.

Table 1 Loci associated with blood pressure traits—evidence of association across different ancestries

GWAS in other ancestries has lagged behind studies of Europeans with sample sizes only recently moving into the tens of thousands. Early SNP genotyping arrays were designed to be of optimal use in European populations leading to suboptimal capture of variation in non-European ancestries [23, 24]. Whilst discovery sample sizes of Europeans were upwards of 35,000 in 2009 (with available follow-up resources in the order of 70,000), similar sample sizes for African and Asian populations did not begin to reach this level until 2011. A number of consortia representing different broad ethnicity groups have now been formed to bring together available resources to boost the sample size across traits through meta-analysis. These include the Asian Genetic Epidemiology Network (AGEN) comprising individuals from East Asia (including Japan, China, Korea, Malaysia and the Philippines) and the Continental Origins and Genetic Epidemiology Network (COGENT) comprising primarily US African American cohorts.

GWAS in Non-European Ancestries

To date, large populations of individuals of African-, East Asian- and South Asian-ancestry have mainly been utilised to investigate blood pressure genetics in one of four ways. First, single-ancestry GWAS have been undertaken for discovery of novel blood pressure loci that have not been identified in Europeans. Second, replication of signals of association first discovered in Europeans has been sought by testing the association of a subset of SNPs. Third, transethnic meta-analysis has improved power by boosting sample sizes. Finally, differences in linkage disequilibrium structure have facilitated fine-mapping of regions of association to try to close the net on the causal variant driving the signal of association.

Discovery of Novel Loci Using Studies of Non-European-ancestry

The availability of large sample sizes of individuals of non-European ancestries has been limited compared to availability of European samples, and GWAS of non-European-ancestry populations have so far had limited success in identifying novel associations with blood pressure.

The first two large GWAS of blood pressure in African Americans, with discovery sample sizes of 1,017 and 8,591, respectively [25, 26], were unable to detect any novel genome-wide significant associations. More recently, a large GWAS of blood pressure in African-ancestry individuals comprising 29,378 individuals from 19 cohorts (COGENT consortium; 18 African American cohorts and one African cohort) identified one genome-wide significant association with SBP in CYB5R2. However, this failed to replicate in an independent sample of 10,386 African-ancestry individuals or in a transethnic meta-analysis of these additional African-, European- and East Asian-ancestry individuals.

GWAS of blood pressure traits in East Asians have, to date, reached discovery sample sizes of up to ~26,000 with follow-up in additional sample sizes of ~30,000 [27••, 28••]. Three novel loci have been identified in East Asian populations (ENPEP, RPL6-PTPN11-ALDH2, CASZ1). Of these, at least one (CASZ1, associated with DBP in East Asians) has shown genome-wide significant evidence of association with SBP in European populations [4, 20•] (the intronic SNP in CASZ1 has very similar MAF in East Asians and Europeans).

Replication of Association Signals from European Populations

To date, 6 out of 43 loci showing genome-wide significant association with blood pressure have also shown some evidence (after correction for multiple testing) of association in populations of African-ancestry (Table 1) [29••] [NB: to date, only the 28 loci confirmed by the International Consortium for Blood Pressure GWAS (ICBP-GWAS) [3••] have been consistently evaluated in non-European ancestries]. ICBP-GWAS [3••] evaluated the effects of the 29 SNPs associated with blood pressure in Europeans in a sample of 19,775 African-ancestry individuals, and although 22 showed a consistent direction of effect between the two ethnicities, none were significant after adjustment for multiple testing. Fox et al. [26], although unsuccessful in detecting any novel blood pressure loci, provided nominal evidence of replication (P < 0.05) for three loci (SH2B3/ATXN, TBX3-TBX5, CYP1A1-ULK3/CSK) previously identified in Europeans (with the signal at CYP1A1-ULK3/CSK remaining significant after adjustment for multiple testing of 18 SNPs) [4, 5]. Franceschini et al. [29••] showed nominal evidence of replication (P < 0.05) in 29,378 African-ancestry individuals of the association signal for 13 of the 29 SNPs reported by [3••, 4, 5], although only four of these were significant after correction for multiple testing (29 SNPs, P < 1.7 × 10⁻³). A comparison of the effect sizes in Europeans for 21 of the 29 SNPs identified in large studies of Europeans (described above, 8 of the 29 are monomorphic in the HapMap CHB and JPT samples) with their effect sizes in Africans and East Asians showed very high correlation (Pearson correlation coefficients 0.6–0.8).
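The multiple-testing threshold quoted above for 29 SNPs is what a Bonferroni correction of a 0.05 significance level would give, and the counts of SNPs with a consistent direction of effect can be examined with a simple sign test. The sketch below illustrates both calculations. The sign test is an illustrative addition rather than an analysis reported by the studies cited, and it assumes that the SNPs are independent and that, under no shared effect, each direction is equally likely.

```python
from math import comb


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def sign_test_p(n_consistent, n_total):
    """One-sided binomial P value for observing at least n_consistent
    concordant directions out of n_total when each is a 50:50 event."""
    tail = sum(comb(n_total, k) for k in range(n_consistent, n_total + 1))
    return tail / 2 ** n_total


# 29 SNPs tested at an overall significance level of 0.05.
threshold = bonferroni_threshold(0.05, 29)

# 22 of 29 SNPs with a direction of effect consistent with Europeans.
p_direction = sign_test_p(22, 29)

print(f"Bonferroni threshold for 29 tests: {threshold:.2e}")
print(f"Sign test P for 22/29 concordant:  {p_direction:.4f}")
```

A small sign-test P value would suggest that shared effects exist across ancestries even when no individual SNP passes the corrected threshold, which is consistent with limited power being one explanation for non-replication.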

Studies in Asian populations have provided evidence of association for more of the loci identified in Europeans and with greater confidence [3••, 4, 5, 27••, 28••, 30–33]. Newton-Cheh et al. [5] sought replication of findings from 105,658 European samples, in Indian Asians (n ≤ 12,889) from the LOLIPOP study. Although likely underpowered to detect the individually modest effects of most of the variants identified in the large European samples, replication of association was observed for FGF5 and CYP17A1. More recent large GWAS of East Asians have provided genome-wide significant (P < 5 × 10⁻⁸) evidence of association for seven previously reported loci in ~50,000 East Asians (ST7L-CAPZA1-MOV10, NPR3-C5orf23, FGF5, CYP17A1-NT5C2, ATP2B1, TBX5-TBX3, FIGN-GRB14) [27••, 28••]. ICBP-GWAS [3••] explored the effects of 29 blood pressure-associated SNPs in East Asians (n ≈ 29,000) and demonstrated that genetic variants identified in Europeans mostly showed a consistent direction of effect in East Asians (21 out of 29 SNPs) and that some associations reached statistical significance after correction for multiple testing (9 SNPs) (Table 1). SNPs at four loci were genome-wide significant (P < 5 × 10⁻⁸) for association with one or more blood pressure traits (FGF5, CYP17A1-NT5C2, ATP2B1 and CYP1A1-ULK3/CSK).

At the time of writing this review, there had been no published GWAS of blood pressure in more than ~3,000 South Asians [34]. ICBP-GWAS [3••] demonstrated a direction of effect consistent with that observed for Europeans for 22 of the 29 SNPs assessed in a sample of 23,977 individuals of South Asian-ancestry (primarily from India and Pakistan) and six loci (ST7L-CAPZA1-MOV10, JAG1, GNAS-EDN3, FGF5, CYP17A1-NT5C2, ATP2B1) reached statistical significance after correction for multiple testing (none were genome-wide significant).

No locus has shown independent genome-wide significant evidence of association with blood pressure in European, African, East Asian and South Asian ancestries although two loci (ST7L-CAPZA1-MOV10 and JAG1) have shown evidence of replication (significant P values after correction for multiple testing) across all four ancestries.

The most likely reasons for non-replication are lack of power or genuine lack of association with the trait of the SNPs tested, either because the genomic locus itself does not have an influence on blood pressure across all ancestries, or, more plausibly, because of differences in allele frequencies and linkage disequilibrium structure between populations.
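To illustrate how sample size, allele frequency and linkage disequilibrium each bear on power to replicate, the sketch below approximates the power of a single-SNP test for a quantitative trait. It rests on simplifying assumptions that are not taken from the studies reviewed: an additive model, a standardised phenotype, a normal approximation to the test statistic, and a tested SNP whose squared correlation with the causal variant is r2. The function name and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def gwas_power(n, maf, beta_sd, r2=1.0, alpha=5e-8):
    """Approximate power of a 1-df additive association test.

    n:       sample size
    maf:     minor allele frequency of the causal variant
    beta_sd: per-allele effect in phenotype standard deviation units
    r2:      squared LD correlation between tested SNP and causal variant
    alpha:   two-sided significance threshold
    """
    q2_causal = 2.0 * maf * (1.0 - maf) * beta_sd ** 2   # variance explained
    q2_tag = r2 * q2_causal                              # attenuated by LD
    ncp = n * q2_tag / (1.0 - q2_tag)                    # non-centrality
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)
    # Probability that |Z + shift| exceeds the critical value.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


# Hypothetical scenarios with the same causal effect (0.03 SD per allele).
scenarios = {
    "n=200,000, MAF 0.30, r2 1.0": gwas_power(200_000, 0.30, 0.03),
    "n=30,000,  MAF 0.30, r2 1.0": gwas_power(30_000, 0.30, 0.03),
    "n=200,000, MAF 0.05, r2 1.0": gwas_power(200_000, 0.05, 0.03),
    "n=200,000, MAF 0.30, r2 0.5": gwas_power(200_000, 0.30, 0.03, r2=0.5),
}
for label, power in scenarios.items():
    print(f"{label}: power = {power:.3f}")
```

Under these assumptions, a variant that is readily detected in a large sample can be missed in a smaller sample, in a population where the allele is rarer, or where the tested SNP tags the causal variant less well, without any difference in the underlying biology.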

Composite risk scores, whereby information from a set of defined SNPs (for example, those identified as being strongly associated with blood pressure in Europeans) is combined and tested for association with blood pressure in a population of different ancestry, can give insight as to whether there is an effect of these variants across ancestries. Franceschini et al. [29••] observed significant associations for a composite risk score based on the 29 SNPs showing the strongest associations with SBP or DBP in Europeans, with both traits in African Americans. Highly significant associations with risk scores based on subsets of the same 29 SNPs have also been observed for East Asian and South Asian populations [3••]. This suggests that, in general, the SNPs identified as showing an association with blood pressure in European populations are also associated with blood pressure in these other populations.
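A composite risk score of the kind described above can be sketched as a weighted sum of trait-raising allele counts, which is then tested for association with blood pressure in the target population. The sketch below is a minimal illustration only; the weights, genotypes and blood pressure values are invented, and the published analyses may have constructed and tested their scores differently (for example, with covariate adjustment).

```python
def risk_score(dosages, weights):
    """Weighted sum of trait-raising allele dosages (0 to 2) for one person.

    weights would typically be effect sizes estimated in the discovery
    ancestry (here assumed to be per-allele effects in mmHg).
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight is required per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical data: three SNPs, four individuals.
weights = [0.5, 1.0, 0.3]                 # invented discovery effect sizes
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [2, 2, 2]]
sbp = [118.0, 121.0, 120.0, 131.0]        # invented systolic blood pressures

scores = [risk_score(g, weights) for g in genotypes]
slope = ols_slope(scores, sbp)
print("scores:", scores)
print(f"change in SBP per unit of score: {slope:.2f}")
```

A positive slope in the target population indicates that, in aggregate, the alleles that raise blood pressure in the discovery ancestry also do so in the target ancestry, even where individual SNPs are not significant on their own.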

Differences in effect sizes between European and non-European populations have been described for some loci [3••, 32]. Of nine SNPs that were associated with blood pressure with genome-wide significance in Europeans and with statistical significance (after adjustment for multiple testing) in East Asians, six had larger effects on both SBP and DBP in East Asians [3••]. In some cases, the effect in East Asians was almost twice as large as that in Europeans. Differences in effect size could be due to differences in LD with the true causal SNP or genuinely different effect sizes of the causal SNP due to environmental interactions.

Transethnic Meta-analysis to Boost Sample Size and Power

Sample size is the key determinant of power in GWAS and combining data from across ancestries has been used to boost sample size and make novel discoveries. Franceschini et al. [29••] undertook a meta-analysis of GWAS of 29,378 African-ancestry individuals and followed up findings in a transethnic meta-analysis of an additional 10,386 African Americans, 69,395 Europeans and 19,601 East Asians. Six SNPs achieved nominal significance in the COGENT discovery sample and five of these were genome-wide significant after transethnic meta-analysis. Two of the SNPs were in loci that had previously been identified as being associated with blood pressure (ULK4 [4] and SOX6 [22•]) and three (RSPO3, PLEKHG1, EVX1-HOXA) were novel. SNPs at these three loci showed evidence of association with blood pressure in the 69,395 Europeans with P values from 1.5 × 10⁻⁴ (PLEKHG1) to 8 × 10⁻⁶ (EVX1-HOXA). The P values for association of the EVX1-HOXA SNP in the 19,601 East Asians were 0.34 and 0.44 for SBP and DBP, respectively. These findings suggest that these loci are influencing blood pressure in Europeans and African-ancestry individuals. These results do not rule out an association of these loci with blood pressure in East Asians; differences in LD structure at these loci in East Asians may result in the true underlying causal SNPs not being tagged by the SNPs tested in this analysis.

Fixed effect meta-analysis models (where the same underlying allelic effect is assumed for all populations being meta-analysed) are more powerful than random effects meta-analysis models (where different allelic effects are assumed for each population) and have been the most commonly used approaches in large published meta-analyses of GWAS. However, given that there is known variation in the linkage disequilibrium structure, allele frequencies (Fig. 1) and environmental exposure between different ancestry groups, the assumptions of the fixed effect models may not hold for all loci. Although the random effect models may seem more appropriate, these models fail to adequately account for homogeneity in effects within each ancestry group whilst allowing for varying degrees of heterogeneity of effect sizes between ancestry groups. At least two approaches to improve the power of transethnic meta-analyses by allowing for heterogeneity of effect sizes have been proposed. Han and Eskin [35•] proposed an approach (METASOFT) that is based on a random effect model but is less conservative, because it assumes that the true value of the effect size is zero for all studies under the null hypothesis of no association (compared to heterogeneity of effect sizes under the null hypothesis in the traditional model). Morris [36•] proposed a Bayesian approach (MANTRA) that allows closely related populations to have a common effect size while at the same time allowing the effect size to vary between more distantly related populations. Both of these approaches have been independently shown to improve power compared to traditional fixed and random effect model approaches [37•] and have been utilised to identify novel variants associated with other traits in transethnic meta-analyses (examples [38, 39]). These approaches have yet to be applied to the study of blood pressure.
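The contrast between the traditional fixed and random effect models can be made concrete with a short sketch that computes Cochran's Q, the I² heterogeneity statistic and a DerSimonian-Laird random effects estimate alongside the fixed effect estimate. This illustrates the traditional models only; it is not an implementation of the Han and Eskin or Morris approaches, and the ancestry-specific effect sizes used are hypothetical.

```python
import math


def fixed_and_random_effects(betas, ses):
    """Fixed effect and DerSimonian-Laird random effects meta-analysis.

    betas, ses: per-population allelic effects and their standard errors.
    Returns a dict with both pooled estimates and heterogeneity statistics.
    """
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    se_fe = math.sqrt(1.0 / sum_w)

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Method-of-moments estimate of the between-population variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_re = sum(w_re)
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sum_w_re
    se_re = math.sqrt(1.0 / sum_w_re)

    return {"beta_fe": beta_fe, "se_fe": se_fe, "beta_re": beta_re,
            "se_re": se_re, "q": q, "i2": i2, "tau2": tau2}


# Hypothetical effects for one SNP in three ancestry groups.
result = fixed_and_random_effects([0.5, 0.9, 0.2], [0.05, 0.10, 0.12])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

When the effects differ between populations, the random effects standard error is larger than the fixed effect standard error, which reflects the loss of power noted above; when the effects are homogeneous, the two estimates coincide.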

Transethnic Meta-analysis for Fine-mapping

As well as boosting power for the discovery of individual SNP associations, transethnic meta-analysis can improve resolution of regions of association by fine-mapping the signal. A signal of association from one or more SNPs at a given locus does not determine whether the associated SNPs are also causal or are tagging the causal variant. Often, signals of association can spread over many kilobases and include multiple genes. Localising the signal to a causal SNP or region within which the causal variant is likely to lie is of interest to inform follow-up strategies. Differences in the linkage disequilibrium structure between different populations can be exploited to refine the region of association and this has been demonstrated for other traits [38, 40].

Admixture Mapping

Admixture mapping, which rests on the assumption that different disease rates between populations are partly due to differences in the frequency of the causal variant in those populations, has also been used to identify novel loci showing association with blood pressure. Zhu et al. [41•] undertook admixture mapping followed by SNP association testing in up to 18,185 African and African American individuals and identified a novel signal upstream of SUB1 on chromosome 5 (~240 Kb from the previously reported NPR3-C5orf23 SNP). CXADR on chromosome 21 has also been implicated in blood pressure regulation through admixture mapping [42•, 43].

Conclusions

As increasing sample sizes become available for ancestries other than European, there is increasing support for a shared set of loci showing association with blood pressure across two or more ancestries. Of the seven loci showing genome-wide significant evidence of association with blood pressure in Europeans and East Asians, a number contain very plausible blood pressure modifying candidate genes. For example, NPR3 encodes the natriuretic peptide clearance receptor and has a known role in blood volume [44], missense mutations in CYP17A1 (cytochrome P450 enzyme CYP17A1) are known to cause a form of adrenal hyperplasia with hypertension, hypokalemia and reduced plasma renin activity [45, 46], and expression of ATP2B1 (which encodes a plasma membrane calcium/calmodulin-dependent ATPase) in rats provides supportive evidence for a role in blood pressure regulation [47].

It is clear that with appropriate methodology, combining GWAS from multiple ancestries can both aid discovery of novel loci and also refine known signals of association. The key challenge in developing the use of data from multiple ancestry resources is obtaining sufficient sample sizes to be able to detect associations of variants with smaller effects.

All of the meta-analyses of GWAS described above have been based on genotyped SNPs or SNPs imputed to the HapMap reference panel. The 1000 Genomes Project has built on the success of HapMap by generating a larger reference panel of genetic variation based on a broader collection of subpopulations from across the European, African and Asian populations [48]. The next generation of GWAS will be based on imputation to this larger reference panel and many consortia have taken the approach of advocating imputation to reference panels comprising haplotypes from all ancestries in order to attempt to capture more of the global genetic variation. Along with an increase in content of low frequency (minor allele frequency 1–5 %) variants compared to previous imputation panels, this latest development has potential to leverage more from the data available from each ancestry.

Although there is a growing list of loci showing an association with blood pressure, these still only explain a modest proportion of the underlying genetic variation [3••], and it is hypothesised that rare variants with larger effects than are seen for common variants may help to explain more of this variation. Re-sequencing approaches and genotyping on arrays that target very rare functional SNPs may yield such findings but, as for the study of common variation, very large sample sizes are necessary for sufficient statistical power particularly as these variants are more likely to be subject to population stratification than common variants.