Population pharmacogenomics: an update on ethnogeographic differences and opportunities for precision public health

Both safety and efficacy of medical treatment can vary depending on the ethnogeographic background of the patient. One of the reasons underlying this variability is differences in pharmacogenetic polymorphisms in genes involved in drug disposition, as well as in drug targets. Knowledge and appreciation of these differences is thus essential to optimize population-stratified care. Here, we provide an extensive updated analysis of population pharmacogenomics in ten genes, comprising pharmacokinetic genes (CYP2D6, CYP2C19, DPYD, TPMT, NUDT15 and SLC22A1), a drug target (CFTR) and genes involved in drug hypersensitivity (HLA-A, HLA-B) or drug-induced acute hemolytic anemia (G6PD). Combined, polymorphisms in the analyzed genes affect the pharmacology, efficacy or safety of 141 different drugs and therapeutic regimens. The data reveal pronounced differences in the genetic landscape, complexity and variant frequencies between ethnogeographic groups. Reduced function alleles of CYP2D6, SLC22A1 and CFTR were most prevalent in individuals of European descent, whereas DPYD and TPMT deficiencies were most common in Sub-Saharan Africa. Oceanian populations showed the highest frequencies of CYP2C19 loss-of-function alleles while their inferred CYP2D6 activity was among the highest worldwide. Frequencies of HLA-B*15:02 and HLA-B*58:01 were highest across Asia, which has important implications for the risk of severe cutaneous adverse reactions upon treatment with carbamazepine and allopurinol. G6PD deficiencies were most frequent in Africa, the Middle East and Southeast Asia with pronounced differences in variant composition. These variability data provide an important resource to inform cost-effectiveness modeling and guide population-specific genotyping strategies with the goal of optimizing the implementation of precision public health.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00439-021-02385-x.


Introduction
Interindividual differences in drug response are a common phenomenon in pharmacological therapy. While some patients respond appropriately to a given treatment, in others, it can result in lack of efficacy, which affects an estimated 10-45% of patients (Salvà Lacombe et al. 1996; Trivedi et al. 2006). Furthermore, interindividual differences can give rise to sometimes severe adverse drug reactions (ADRs) in a subset of patients, which overall account for approximately 7% of all hospitalizations and for deaths in approximately 0.3% of all hospitalized patients (Lazarou et al. 1998; Pirmohamed et al. 2004). Among the factors causing interindividual differences, genetic germline variations in genes that are involved in pharmacokinetics and pharmacodynamics are estimated to explain 20-30% of drug response variability.
Notably, many of these pharmacogenes are among the most polymorphic genes in the human genome and harbor thousands of genetic variants, which can change enzyme activity or disrupt drug-target interactions, thereby eventually altering drug effects (Zhou et al. 2021a). Much effort has been made to identify actionable associations between genetic variants and differential drug response. As of 2021, > 310 drugs have received pharmacogenomic information in their labels or have received guidelines by pharmacogenomic expert working groups, such as the Clinical Pharmacogenetics Implementation Consortium (CPIC) and the Dutch Pharmacogenetics Working Group (DPWG), that can guide drug selection or posology (Shekhani et al. 2020).
Nevertheless, only a fraction of these established pharmacogenomic biomarkers is implemented in routine clinical care and the only preemptive tests that are mandated are for HLA-B*57:01 and DPYD variants to inform abacavir and fluoropyrimidine therapy, respectively. While the underlying reasons are complex and multifaceted, prevalence of the variants in question constitutes one of the factors that impacts the clinical utility of genetic testing (Lauschke and Ingelman-Sundberg 2016; Russell et al. 2021). Thus, mapping variant frequencies in different ethnogeographic groups can provide important information to inform cost-effectiveness modeling and guide population-specific genotyping strategies. Here, we provide an updated overview of population pharmacogenomics of ten important genes, comprising pharmacokinetic genes (CYP2D6, CYP2C19, DPYD, TPMT, NUDT15 and SLC22A1), a drug target (CFTR) and genes involved in adverse event risk independent of drug pharmacokinetics or target (HLA-A, HLA-B and G6PD). We provide a detailed overview of ethnogeographic differences in allele frequencies, infer functional consequences and discuss implications and relevance for the implementation of population-specific precision public health. For other clinically relevant pharmacogenes, such as CYP2B6 (Langmia et al. 2021), UGT1A1 (Hall et al. 1999) or NAT2 (Sabbagh et al. 2011), we refer the interested reader to excellent reviews on the topic.

CYP2D6
CYP2D6 is one of the most pleiotropic drug-metabolizing enzymes and is involved in the hepatic clearance of approximately 25% of all clinically used drugs, including tricyclic antidepressants, opioids, antiemetics and antiarrhythmics (Zanger and Schwab 2013). Importantly, at least in part due to the lack of important endogenous substrates and low evolutionary constraints, CYP2D6 constitutes one of the most polymorphic genes in the cytochrome P450 (CYP) gene family, resulting in drastic functional diversity of CYP2D6 (Fujikura et al. 2015;Ingelman-Sundberg 2005). Of the more than 100 different CYP2D6 alleles that have been described to date, the loss-of-function (LOF) alleles CYP2D6*3, *4, *5 and *6, the decreased function alleles *9, *10, *17, *29 and *41 as well as the CYP2D6 duplications *1xN and *2xN are functionally most relevant and are common with minor allele frequencies (MAF) > 1% in at least one population (Tables 1 and 2). Over the past decades, substantial interethnic differences have been revealed for these alleles, which translate into substantial variability in metabolic phenotypes across populations (Gaedigk et al. 2017;Zhou et al. 2017).
The country-specific CYP2D6 allele frequency data can be aggregated to infer CYP2D6 phenotypes (Gaedigk et al. 2017; Koopmans et al. 2021). The frequency of CYP2D6 poor metabolizers (PM), defined as individuals carrying two LOF alleles, is highest in Ashkenazi Jews (6%) and European populations (5.4-11.4%) and lowest in populations from the Middle East (0.9%), East Asia (0.4%) and Oceania (0.4%; Fig. 1). In contrast, the prevalence of intermediate metabolizers (IM) that exhibit reduced but measurable CYP2D6 metabolism was found to be highest in African populations (10-60%) and Ashkenazim (10-40%), and lowest in South Asia (3.8%) and the Americas (2.8%). Ultrarapid metabolizers (UM), who carry at least one functional gene duplication, are most common in indigenous Oceanian populations (21.2%) and North Africa (up to 39%), whereas they are least common in East Asia (1.4%). These functional extrapolations can provide important information for population-specific drug selection and the posology of CYP2D6 substrates.
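The aggregation of allele frequencies into metabolizer phenotypes can be sketched as follows. This is a minimal illustration and not the procedure used in the cited studies: it assumes Hardy-Weinberg proportions, groups alleles into four function classes, and uses simplified activity values and phenotype thresholds that are assumptions made for this example. The input frequencies are hypothetical and are not taken from Tables 1 and 2.

```python
from collections import defaultdict
from itertools import product

# Assumed, simplified activity value per allele function class (illustrative only).
ACTIVITY = {"lof": 0.0, "decreased": 0.5, "normal": 1.0, "duplication": 2.0}


def phenotype_from_score(score):
    """Map a diplotype activity score to a phenotype (assumed thresholds)."""
    if score == 0:
        return "PM"   # two loss-of-function alleles
    if score <= 1.0:
        return "IM"   # reduced but measurable activity
    if score <= 2.25:
        return "NM"   # normal metabolizer
    return "UM"       # increased activity, driven by a functional duplication


def infer_phenotype_frequencies(class_freqs):
    """Sum diplotype frequencies per phenotype, assuming Hardy-Weinberg proportions.

    class_freqs maps each function class to its aggregated allele frequency;
    the values are expected to sum to 1.
    """
    result = defaultdict(float)
    # Ordered pairs are enumerated, so heterozygotes are counted twice (2pq).
    for (c1, f1), (c2, f2) in product(class_freqs.items(), repeat=2):
        result[phenotype_from_score(ACTIVITY[c1] + ACTIVITY[c2])] += f1 * f2
    return dict(result)


# Hypothetical aggregated allele frequencies for one population.
example = {"lof": 0.21, "decreased": 0.10, "duplication": 0.02, "normal": 0.67}
freqs = infer_phenotype_frequencies(example)
for name in ("PM", "IM", "NM", "UM"):
    print(f"{name}: {freqs[name]:.1%}")
```

With these hypothetical inputs, the PM frequency is simply the square of the aggregated LOF allele frequency, which is why populations with common LOF alleles show disproportionately more poor metabolizers.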

CYP2C19
CYP2C19 is a key enzyme involved in the metabolism of the antiplatelet drug clopidogrel, selective serotonin reuptake inhibitors (SSRIs) as well as proton pump inhibitors, and genetic variability in CYP2C19 contributes to the differential response to these substrates. The clinically most relevant variant alleles are CYP2C19*2 (rs4244285) and CYP2C19*3 (rs4986893) that abolish enzyme activity and the regulatory CYP2C19*17 variant (rs12248560) that results in increased gene activity (Table 3).
The functional allele frequency data have been used to predict CYP2C19 phenotypes across ethnicities (Koopmans et al. 2021). CYP2C19 PM status was most common in Oceania where around 58% of individuals are homozygous or compound heterozygous for CYP2C19 LOF alleles (Fig. 2). Considerable numbers of CYP2C19 PMs were also reported in East Asian (14.2%) and Central/South Asian (11.8%) populations, whereas their numbers are lower in Latin America (1.1%), Europe (2.7%) and Africa (3.3%). CYP2C19 UMs are most common in European, African and Latin American populations with frequencies pivoting around 20-30%, whereas only 2.1% of East Asians are UMs (Koopmans et al. 2021).
[Figure legend, beginning not recovered] ... (Tables 1 and 2; Supplementary Table 1). Countries are color-coded with the highest frequency in red, the average frequency across all populations (f) in yellow, and the lowest frequency in green. In case of missing population frequencies, averaged continent frequency data from the literature (Gaedigk et al. 2017) were used to infer metabolizer phenotypes.

DPYD
Fluoropyrimidines, including 5-fluorouracil and its prodrugs capecitabine and tegafur, are important chemotherapeutics for the treatment of various solid tumors. They are among the most prescribed anticancer drugs worldwide with more than two million patients estimated to use fluoropyrimidines each year (Ezzeldin and Diasio 2004). However, up to 40% of patients experience fluoropyrimidine-induced toxicity that is severe enough to require discontinuation of therapy, and in 0.5-1% of patients these ADRs are fatal (Hoff et al. 2001; Van Cutsem et al. 2001). The toxicity of fluoropyrimidines is strongly associated with the metabolic activity of dihydropyrimidine dehydrogenase (DPD), the enzyme catalyzing the rate-limiting step in the biotransformation of fluoropyrimidines into non-toxic metabolites. As such, reduced activity of DPD increases fluoropyrimidine exposure, resulting in increased cytotoxicity. Interindividual variation in DPD activity is strongly associated with genetic variability of the respective gene, DPYD. The most well-studied DPYD variant is DPYD*2A (rs3918290; c.1905 + 1G > A; IVS14 + 1G > A), a splicing variant that results in exon skipping and gives rise to a truncated gene product with no catalytic activity (Vreken et al. 1996). The highest frequency of DPYD*2A is found in the Finnish population (2.4%) (Zhou et al. 2020), whereas frequencies in Central, South and East Europe are > twofold lower, pivoting around 1%, 0.5% and 0.3%, respectively (Raida et al. 2001; Salgueiro et al. 2004; Sulzyc-Bielicka et al. 2008; Uzunkoy et al. 2007; van Kuilenburg et al. 2001) (Table 4). DPYD*2A is extremely rare in Asian, African [...] (Elraiyah et al. 2017; Hariprakash et al. 2018; Zhou et al. 2020).
[Table footnote] The corresponding references are provided in Supplementary Table 2. AF, allele frequency; n, number of individuals genotyped; N/A, not available.
Previous estimates for the global prevalence of partial and full DPD deficiency are 3-8% and 0.02-0.2%, respectively, with highest frequencies in Africans and Finnish and lowest in Ashkenazi Jews and East Asians (Caudle et al. 2013; Zhou et al. 2020). As frequencies of DPD deficiency differ by up to tenfold between populations, these data emphasize the importance of population-adjusted strategies for the optimization of fluoropyrimidine dosing and solid cancer therapy.

TPMT and NUDT15
Thiopurine methyltransferase (encoded by TPMT) and nudix hydrolase 15 (encoded by NUDT15) play important roles in the metabolism of the thiopurines mercaptopurine and thioguanine, which are widely used in the treatment of acute lymphoblastic leukemia, inflammatory bowel diseases and autoimmune disorders. Thiopurines are metabolized intracellularly into thioguanosine monophosphate (TGMP), which is further converted into the active thioguanine di- and triphosphates that exert their cytotoxic and antiproliferative effects by blocking purine synthesis and by causing direct damage to DNA and RNA (Bökkerink et al. 1993; Inamochi et al. 1999; Karim et al. 2013). Furthermore, they have anti-inflammatory effects by inducing T cell apoptosis via inhibition of the GTPase RAC1 (Poppe et al. 2006). TPMT plays a central role in the metabolism of thiopurines into inactive methyl-metabolites, thereby shunting TGMP away from further metabolic activation. Similarly, NUDT15 dephosphorylates thioguanine di- and triphosphates back into their monophosphate form, counteracting their incorporation into DNA and RNA. Genetic variations can cause TPMT and NUDT15 deficiency, resulting in excessive formation of thioguanine di- and triphosphates and an increased risk of severe myelosuppression. The most common and well-characterized TPMT alleles are TPMT*3A (rs1142345 and rs1800460), *3C (rs1142345) and *2 (rs1800462), which together explain more than 90% of decreased TPMT activity phenotypes (Schaeffeler et al. 2004; Zhou et al. 2020). TPMT*3A is most common in European and Latin American populations with frequencies pivoting around 2-4%. The highest TPMT*3A frequencies in Europe were observed in the UK (4.5%) and Greenland (8.1%) (Toft et al. 2006), whereas frequencies in Croatia were substantially lower (1.9%) (Ladić et al. 2016). No TPMT*3A alleles were found in 194 indigenous Saami in Norway (Loennechen et al. 2001). In Latin America, frequencies were highest in Brazil (up to 3.9%) (Ferreira et al. 2020), Colombia (3.6%) (Isaza et al. 2003) and Argentina (3.1%) (Laróvere et al. 2003).
[Figure legend, beginning not recovered] ... (Table 3; Supplementary Table 2). Countries are color-coded with the highest frequency in red, the average frequency across all populations (f) in yellow, and the lowest frequency in green. In case of missing population frequencies, averaged continent frequency data from the literature (Ionova et al. 2020; Scott et al. 2013) were used to infer metabolizer phenotypes.
Based on frequencies of TPMT*3A, *3C and *2, it is estimated that the frequency of patients harboring intermediate TPMT activity is around 3-14%, and approximately 1 in 178 to 1 in 3,736 patients are fully TPMT deficient (Relling et al. 2011). When extending these analyses using next-generation sequencing to also include other functional variations, recent studies suggested highest prevalence of intermediate and full TPMT deficiency in Africa with frequencies of 11% and 0.3%, respectively, whereas the corresponding frequencies were lowest in Asian populations (0.03-0.04% full deficiency; 3.3-3.9% intermediate activity) and Ashkenazim (0.02% full deficiency; 2.9% intermediate activity) (Zhou et al. 2020).
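The relationship between the two ranges quoted above (1 in 178 to 1 in 3,736 fully deficient; 3-14% intermediate activity) can be illustrated with a short back-calculation. This is a sketch under the assumption of Hardy-Weinberg proportions with a single aggregated deficient-allele frequency q, so that full deficiency occurs with frequency q² and intermediate activity with frequency 2q(1 - q); the function name is hypothetical, and the calculation is not claimed to be the method of the cited guideline.

```python
import math


def intermediate_from_full_deficiency(one_in_n):
    """Back-calculate allele and heterozygote frequencies from '1 in N' full deficiency.

    Assumes Hardy-Weinberg proportions: full deficiency = q**2,
    intermediate activity (one deficient allele) = 2 * q * (1 - q).
    """
    q = math.sqrt(1.0 / one_in_n)          # aggregated deficient-allele frequency
    heterozygotes = 2.0 * q * (1.0 - q)    # expected intermediate-activity fraction
    return q, heterozygotes


for n in (178, 3736):
    q, het = intermediate_from_full_deficiency(n)
    print(f"1 in {n}: q = {q:.3f}, intermediate activity = {het:.1%}")
```

Under this assumption, 1 in 178 corresponds to roughly 14% intermediate-activity individuals and 1 in 3,736 to roughly 3%, in line with the range stated in the text.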
While polymorphisms in TPMT alone explain around 40% of thiopurine-induced ADRs (Schaeffeler et al. 2019), predictions can be further improved by including the missense variant p.R139C in NUDT15 (c.415C > T; rs116855232) (Yang et al. 2014, 2015b). Mechanistically, this variant destabilizes the protein structure, thereby resulting in lower [...]. Due to the high frequency of p.R139C, NUDT15 deficiency is common in East Asian (22.6%), South Asian (13.6%) and Latin American (12.5-21.2%) populations (Moriyama et al. 2016), surpassing the prevalence of TPMT deficiency and suggesting that variations in NUDT15 rather than in TPMT are the major drivers of thiopurine-induced toxicity across Asia and Latin America. In contrast, TPMT reduced function alleles explain the majority of thiopurine toxicity in Europe and Africa.

SLC22A1 (OCT1)
The SLC22A1 gene encodes the organic cation transporter OCT1 that is highly expressed in hepatocytes, immune cells and most epithelial barriers. OCT1 partakes in the disposition of a large number of structurally diverse drugs (including metformin, tramadol, lamivudine, oxaliplatin, sorafenib and morphine), endogenous substrates (choline, acetylcholine and agmatine), vitamins (vitamin B1) and toxins (1-methyl-4-phenylpyridinium), and genetic variants in SLC22A1 have been reproducibly associated with altered substrate pharmacokinetics (Arimany-Nardi et al. 2015; Chen et al. 2014; Herraez et al. 2013; Tzvetkov et al. 2011, 2013). Importantly, SLC22A1 is highly polymorphic with more than 1,000 genetic variants of which 450 alter the amino acid sequence of the transporter (Schaller and Lauschke 2019). While most of these variations are very rare and poorly characterized, at least 15 functionally relevant alleles have been identified that are common in at least one population (Seitz et al. 2015).
The patterns of genetic SLC22A1 variability are substantially different in African populations. In Sub-Saharan Africa, SLC22A1*8 (p.R488M; rs35270274), a variant allele with slightly increased activity towards morphine and metformin, constitutes the most common allele with frequencies between 2 and 18% (Seitz et al. 2015). Furthermore, SLC22A1*7 (p.S14F; rs34447885) is common with frequencies up to 9%. Effects of this allele are substrate-specific, entailing reduced transport of metformin, tropisetron and tyramine, whereas no differences are observed for morphine, debrisoquine and tramadol. SLC22A1*2 is found across Sub-Saharan Africa albeit with lower prevalence than in Europe (0-11% compared to 10-20%). In aggregate, only around 15% of individuals in Africa harbor reduced function variants, whereas around 12% carry the African increased activity allele SLC22A1*8. In contrast to Sub-Saharan Africa, Northern Africa and the Middle East recapitulate the variant pattern observed in European populations with SLC22A1*2 and SLC22A1*3 being most common, while SLC22A1*7 and SLC22A1*8 are rare with frequencies around 1%.

Pharmacogenetically important HLA alleles
While around 80% of ADRs are consequences of excessive pharmacological actions, the remaining 20% are idiosyncratic events that are unrelated to the therapeutic effect of the drug (Uetrecht and Naisbitt 2013). Many but likely not all idiosyncratic ADRs are immunologically mediated and can affect virtually any tissue, either in isolation or in combination with systemic effects (Phillips 2016). Idiosyncratic ADRs are more often severe or life-threatening with specific manifestations, such as Stevens-Johnson syndrome (SJS) and toxic epidermal necrolysis (TEN), resulting in mortality rates of 13-60% (Schulz et al. 2000; Watanabe et al. 2021). The human leukocyte antigen (HLA) gene family encodes the major histocompatibility complex (MHC), which regulates T-cell mediated immunity. HLA genes have been strongly implicated in the etiology of immune-related adverse events caused by a multitude of drugs. The established models suggest that drugs (1) act as haptens, binding covalently to proteins and forming new antigens, (2) directly interact with the T cell receptor via non-covalent bonds or (3) bind non-covalently to the MHC, resulting in deformations of the peptide-binding groove and altered antigen presentation (Pavlos et al. 2015).
Notably, HLA genes are extremely polymorphic, but most idiosyncratic immunological ADRs are restricted to carriers of one or few specific HLA variant alleles. For instance, abacavir binds exclusively to the peptide-binding groove of HLA-B*57:01, resulting in altered presentation of self-peptides, which in turn triggers polyclonal alloreactive autoimmunity and gives rise to abacavir hypersensitivity syndrome (AHS) (Illing et al. 2012; Ostrov et al. 2012). Further prominent and clinically well-established examples are the association of allopurinol-induced cutaneous adverse events with HLA-B*58:01 and the links between carbamazepine-induced SJS/TEN and HLA-B*15:02 and HLA-A*31:01.
Abacavir is a nucleoside analog reverse transcriptase inhibitor that is used in combination with other antiretrovirals for the treatment of HIV/AIDS. In historic studies before the identification of HLA-B*57:01 as a genetic risk factor, AHS occurred in around 5% of patients treated with abacavir with a mortality rate of around 3 per 1000 (Bannister et al. 2008; Hetherington et al. 2001). Importantly, while almost half of all HLA-B*57:01 carriers develop AHS after abacavir exposure, AHS was not observed in any of the patients without HLA-B*57:01 (Mallal et al. 2008). Based on these unambiguous data, preemptive testing of HLA-B*57:01 has become mandatory across the US and Europe before the initiation of abacavir therapy. HLA-B*57:01 allele frequency is a key factor to assess AHS risk at the population scale. We recently evaluated the ethnogeographic distribution of pharmacogenetically relevant HLA alleles based on genetic information from 6.5 million individuals across 74 countries (Zhou et al. 2021b). The results showed that HLA-B*57:01 is generally rare in Africa, the Middle East and East Asia, whereas in Europe reported frequencies range from 1% in Sweden to 5.8% in Ireland (Fig. 3A). Globally, HLA-B*57:01 is most frequent in India (6.2%) and Sri Lanka (9.3%), whereas it is much less abundant in Southeast Asian countries such as Malaysia (1.1%), Thailand (2.1%) and Vietnam (2.6%).
Carbamazepine-induced severe cutaneous adverse reactions (SCAR) are associated with two alleles, HLA-B*15:02 and HLA-A*31:01, and odds ratios up to 2,504 (Chung et al. 2004) and 58 (Genin et al. 2014) have been reported, respectively. HLA-B*15:02 is exclusively found in Southeast Asian populations where allele frequencies are particularly high in the Philippines (22%), Vietnam (13.8%), Indonesia (11.6%) and Malaysia (11.5%), with the notable exception of Japan (< 0.1%; Fig. 3B). Consequently, genetic testing of HLA-B*15:02 is recommended in individuals of Asian ancestry but not for other populations. In contrast to the region-specific HLA-B*15:02, HLA-A*31:01 is common worldwide (Fig. 3C). It is most prevalent in indigenous populations in the Americas, such as in Argentina (28.8%), Mexico (10.1%), the United States (7.8%), Nicaragua (6.7%) and Chile (6.6%), whereas frequencies in Africa and Oceania seem to be lower (approximately 1%). However, frequency estimates of the latter are only based on small cohorts and further information in these populations is needed to corroborate these observations.
The xanthine oxidase inhibitor allopurinol is used for the treatment of hyperuricemia, but its utility is limited by the development of SCAR in up to 0.5% of patients (Yang et al. 2015a). HLA-B*58:01 is the predominant risk allele in Asian populations (Hung et al. 2005; Lonjou et al. 2008) where it is very common in Mongolia (8.8%), China (7.8%), Thailand (7.8%) and Singapore (7.2%; Fig. 3D). In addition, it is prevalent in several African countries, including Kenya (8.2%), Guinea Bissau (7.8%) and Senegal (6.9%). In contrast, HLA-B*58:01 frequencies are overall low across Europe and the Americas with frequencies ranging from 0.5 to 3.5%. Combined, these data provide the molecular basis for ethnogeographic differences in idiosyncratic ADR risk and suggest that preemptive testing can reduce idiosyncratic toxicity particularly in at-risk populations where the frequencies of the respective HLA alleles are high.

CFTR
The CFTR gene encodes a chloride channel that is part of the ATP-binding cassette (ABC) transporter superfamily (ABCC7). The gene product plays essential roles in ion and water secretion and absorption in epithelial tissues. Genetic variations that impact CFTR function constitute the cause of cystic fibrosis (CF), an autosomal recessive disorder most commonly observed in populations of European descent. CF manifests primarily as lung disease with symptoms that resemble pneumonia, bronchiectasis and asthma. Further non-pulmonary symptoms include pancreatic dysfunction, intestinal obstructions and elevated sweat electrolytes.
Notably, however, phenotypes, ages of onset and clinical manifestations differ considerably between patients.
By now, more than 2,100 genetic variants in CFTR have been described of which more than 400 are assumed to be pathogenic (Kounelis et al. 2020;Xiao and Lauschke 2021). Pathogenic variants are classified into five categories: variants that cause defective protein production, mostly due to premature stop codons or frameshift mutations or large insertions (class I); variants that result in defective protein trafficking (class II); variants causing defects in protein gating (class III) or dysfunctional protein conductance (class IV); and variants that cause reduced amounts of functional proteins, mostly due to splicing defects (class V).
Overall, the class II variant p.F508del (rs1801178) is most common, accounting for 70-75% of CF cases in individuals of European descent (Watson et al. 2004). In contrast, p.F508del is less common in ethnogeographic groups from Africa and Asia. Further misfolding variants include p.N1303K (rs80034486) and p.I507del (rs1490508086) that explain up to 2.8% of CF cases in Ashkenazim and up to 1.9% in Africans, respectively (Table 5). Splicing defect variants (class V) that substantially reduce the amount of functional CFTR at the plasma membrane include c.2988 + 1G > A (3120 + 1G > A), c.3717 + 12191C > T (3849 + 10kbC > T) as well as various other rare CFTR rearrangements and are of substantial relevance in Africa, where they constitute a frequent, in some groups even the most common, variant class associated with CF (Goldman et al. 2001; Macek et al. 1997; Schrijver et al. 2016; Owusu et al. 2020).
The major variant that causes the generation of correctly trafficked but dysfunctional channel proteins (class III) is p.G551D (rs75527207). While this variant makes only a minor contribution (< 1%) to cystic fibrosis risk in Hispanics and Ashkenazim, it explains between 2 and 3.5% of cases in non-Hispanic Caucasians and Asian Americans (Watson et al. 2004). Other variants resulting in CFTR dysfunction include p.R347P (rs77932196) and the Asian-specific variant p.S549N (rs121908755).
There is substantial heterogeneity within the larger populations. For instance, on average only 3-5% of European CF patients carry class III, IV or V variants; however, up to 14% of CF patients in Ireland have at least one class III variant, while more than 12% of patients in Moldova carry at least one class V mutation (De Boeck et al. 2014). Importantly, which genetic factors underlie the disease in a given patient determines the choice of pertinent therapy. Activity of reduced function CFTR proteins that have been correctly trafficked to the plasma membrane can be stimulated using "CFTR potentiators" (ivacaftor), while "CFTR correctors" (lumacaftor, tezacaftor and elexacaftor) can act as molecular chaperones to support channel folding and correct delivery of the transporter to the plasma membrane. Read-through agents (ataluren and ELX-02) have been suggested for carriers of premature termination codons that account for up to 12% of pathogenic CF alleles. However, ataluren failed to show improvement in clinical outcomes in a phase III trial and further development was hence halted (Aslam et al. 2017). ELX-02 showed promising results in vitro and phase II trials are currently ongoing (Kerem 2020). Combined, these data indicate that around 80% of CF patients in European populations carry at least one allele that renders them susceptible to treatment with currently available CFTR potentiators and CFTR correctors (p.F508del, p.G551D, p.S549N and c.3717 + 12191C > T). In contrast, the fraction of patients with suitable genotypes is considerably lower in African (∼ 60%), Hispanic (∼ 55%), Asian (∼ 45%) and Ashkenazi Jewish individuals (∼ 40%).

G6PD
G6PD encodes glucose-6-phosphate dehydrogenase, a key enzyme in the pentose phosphate pathway that regulates NADPH levels, which are essential for redox homeostasis. Importantly, G6PD is highly polymorphic, and more than 200 variants have been shown to cause reduced G6PD activity (Beutler and Vulliamy 2002). While mostly asymptomatic, G6PD deficiency can be of importance upon exposure to certain triggers of oxidative stress, particularly in erythrocytes that lack mitochondria and are thus reliant on G6PD for the synthesis of redox equivalents. Triggers can be dietary components, such as fava beans or other legumes, different bacterial or viral infections, or various chemically diverse drugs, such as primaquine, dapsone, sulfonamide antibiotics and rasburicase. Under these circumstances G6PD deficiency strongly increases the risk of sometimes life-threatening acute hemolytic anemia. Notably, G6PD is located on the X-chromosome and thus primarily impacts hemizygous males and homozygous females, whereas among heterozygous females only around 8-20% exhibit clinically relevant levels of reduced G6PD activity (Dechyotin et al. 2021; Johnson et al. 2009; Satyagraha et al. 2021). G6PD deficiency is most common in Africa, followed by Southeast Asia and the Middle East (Koromina et al. 2021; Nkhoma et al. 2009). While overall disease prevalence might be similar between these regions, the genetic basis of G6PD deficiency differs drastically (Table 6). Of note, G6PD variant alleles are commonly referred to by their trivial names, which is a convention we will also follow in this review. In Sub-Saharan Africa, the A-202A/376G allele is most common with frequencies around 10% and local peaks up to 24%, followed by A-968C/376G with frequencies around 1% (Awandu et al. 2018; May et al. 2000; Pernaute-Lau et al. 2021).
However, frequency profiles can be reversed in specific ethnogeographic groups, as demonstrated for West African populations in Senegal and Guinea where A-968C/376G is predominant (approximately 7-11% for A-968C/376G vs. 1-3% for A-202A/376G) (De Araujo et al. 2006; Howes et al. 2013). Further West African alleles include the Sierra Leone (or A-311A/376G) variant, which however has not been extensively characterized with high geographic resolution (Jalloh et al. 2008). In contrast to Sub-Saharan Africa, the different A- alleles are very rare in East African populations (Assefa et al. 2018; Hamid et al. 2019). These results have potentially important implications for public health decisions, particularly for the treatment and prevention of malaria, as they support the roll out of primaquine, a drug associated with major anemia risk in G6PD deficient individuals, as radical cure for Plasmodium vivax and as transmission interruption for Plasmodium falciparum in East Africa, whereas G6PD genotyping before the initiation of 8-aminoquinoline therapy is warranted in South and West Africa. However, the status of other deficient variants beyond A- should be evaluated in East Africa to further corroborate this conclusion. In Middle Eastern populations G6PD deficiency is primarily attributed to the Mediterranean allele (Doss et al. 2016), accounting, for instance, for 88% and 74% of G6PD deficiency among the Kurdish population in Northern Iraq and in Kuwaiti Arabs (MAF in the general population = 1-4%), respectively (Al-Allawi et al. 2010; Alfadhli et al. 2005). Further relevant G6PD deficient variants in the Middle East are A-968C/376G, Cairo and Chatham, with overall MAFs of 0.4-0.8%. The Mediterranean variant is furthermore common in Southcentral Asia with frequencies up to 8.9% in Afghani Pashtun (Jamornthanyawat et al. 2014).
While it also constitutes a relevant factor in India, explaining around 24% of G6PD deficiencies in a countrywide survey, the overall most prevalent allele was Orissa, which accounted for 57% of all deficiencies (Devendra et al. 2020). Further rare variants of relevance in specific South Asian subpopulations and tribal groups are Kalyan-Kerala and Namoru (Chalvam et al. 2007). In Southeast Asia, the predominant allele is Mahidol, which explains 38-96% of G6PD deficiencies in Thailand and Myanmar (Matsuoka et al. 2004; Phompradit et al. 2011). In contrast, G6PD deficiency in Cambodia was almost exclusively caused by the Viangchan allele (Matsuoka et al. 2005). Furthermore, specific subpopulations feature unique molecular G6PD patterns; for instance, the otherwise rare Aures allele constitutes the most common G6PD deficient variant in the Lao Theung population, the second largest ethnic group in Laos (Sanephonasa et al. 2021).
Compared to the variant profile in South and Southeast Asian populations, G6PD variability in China is distinctly different. In Han Chinese, Kaiping (MAF = 0.3%) and Canton (MAF = 0.3%) were the most common G6PD deficient alleles and showed a clear South-to-North national gradient (He et al. 2020). In other Chinese ethnic groups, such as Dai, Miao, Tibetans and Yi, variant signatures showed pronounced differences with the G6PD Gaohe, Baise, Fushan and Union alleles explaining > 10% of population-specific deficiencies (Zheng et al. 2020). In contrast to China where the country-wide prevalence of G6PD deficiency is around 1.9% among males, G6PD deficiency is a rare disorder in Japan with an overall frequency of < 0.1%. Notably, despite this low frequency, a multitude of distinct very rare Japanese deficient alleles have been described, including Fukushima, Morioka, Yamaguchi and Musashino. Combined, these results demonstrate the conspicuous differences in G6PD molecular genetics even across ethnic groups in close geographical proximity and indicate that it is essential to employ genotyping strategies that are tailored to the specific population or ethnic background of a given patient.
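Since G6PD is X-linked, a given deficient-allele frequency translates into different expected prevalences in males and females. The following sketch illustrates this under stated assumptions: Hardy-Weinberg proportions in females, a single aggregated deficient-allele frequency q, and a user-supplied fraction of heterozygous females with clinically relevant deficiency (the text above reports around 8-20%). The function name and the example value q = 0.10, chosen to resemble the order of magnitude reported for the most common Sub-Saharan African allele, are illustrative choices rather than results.

```python
def g6pd_expected_deficiency(q, het_affected_fraction):
    """Expected G6PD deficiency by sex for an X-linked deficient-allele frequency q.

    Males are hemizygous, so their deficiency frequency equals q.
    Females: homozygotes (q**2) are deficient, and only a fraction of
    heterozygotes (2 * q * (1 - q)) show clinically relevant deficiency.
    """
    homozygous_females = q ** 2
    heterozygous_females = 2.0 * q * (1.0 - q)
    return {
        "males": q,
        "homozygous_females": homozygous_females,
        "heterozygous_females": heterozygous_females,
        "females_affected": homozygous_females
        + het_affected_fraction * heterozygous_females,
    }


# Illustrative allele frequency; heterozygote fractions taken from the 8-20% range.
for fraction in (0.08, 0.20):
    result = g6pd_expected_deficiency(0.10, fraction)
    print(
        f"heterozygote fraction {fraction:.0%}: "
        f"males {result['males']:.1%}, females {result['females_affected']:.1%}"
    )
```

The example shows why male prevalence is commonly used as a direct proxy for allele frequency, while female prevalence depends strongly on how many heterozygotes are functionally deficient.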

Opportunities for precision public health
Population pharmacogenomic profiling can reveal genetic differences that predispose to differences in drug response. In Europeans, reduced function alleles of CYP2D6 are considerably more frequent than in other populations. Thus, genetic testing is particularly beneficial in these populations for identifying outlier patients, such as CYP2D6 poor metabolizers. The respective information can be utilized clinically, e.g. for prescribing alternatives to tramadol and codeine analgesics for pain relief (Crews et al. 2021) and for recommending aromatase inhibitors, such as anastrozole, instead of tamoxifen for the prevention of breast cancer recurrence (Goetz et al. 2018). Furthermore, European populations harbour the highest frequencies of CFTR trafficking mutations, suggesting that the rate of cystic fibrosis patients responding to CFTR correctors is overall higher in Europe compared to other populations.
Reduced function variants of DPYD and TPMT are most prevalent in Sub-Saharan Africa and, thus, preemptive genetic testing and genotype-guided dose adjustments of fluoropyrimidines and thiopurines are particularly beneficial in those populations. Similarly, African populations have the highest frequencies of genetic G6PD deficiency, which constitutes a contraindication to treatment with the 8-aminoquinoline antimalarials primaquine and tafenoquine, the only curative treatments for Plasmodium vivax malaria, due to drastically elevated risk of severe acute haemolytic anaemia (Watson et al. 2018). Furthermore, G6PD deficiency status is useful to guide treatment with various other drugs, including pegloticase, rasburicase, flutamide, as well as sulfonamide antibiotics.
Southeast Asia constitutes the main hotspot of the HLA-B*15:02 and HLA-B*58:01 alleles, entailing that testing for the risk of carbamazepine- and allopurinol-induced severe cutaneous adverse reactions is most important in these groups. Notably, country-specific frequency information can refine pharmacogenomic decision making at the national level. For example, while HLA-B*15:02 is generally common in Asian populations with average minor allele frequencies of 5-10%, rates are much higher in the Philippines where about half of the population are carriers, whereas frequencies in Japan are < 0.1%. With increasing availability of genotype information, genetic differences between ethnic groups are revealed with ever higher resolution and the resulting data show that pronounced genetic differences can exist even across relatively small geographic regions. However, we want to emphasize that both high resolution studies with well-defined cohorts and population-scale aggregated information should be considered to allow for an integration of information about ethnogeographic differences with modern human migration and admixture.
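The paragraph above quotes both allele frequencies and carrier rates, which are not directly interchangeable. The following minimal Python sketch illustrates one common way to convert between the two. It assumes Hardy-Weinberg proportions, an assumption that is not stated in this review and that may not hold in admixed or substructured populations. The function names are hypothetical and serve illustration only.

```python
import math


def carrier_frequency(allele_freq: float) -> float:
    """Fraction of individuals carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions (an illustrative assumption,
    not a method taken from the review).
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    # Non-carriers are homozygous for the other allele(s): (1 - q)^2
    return 1.0 - (1.0 - allele_freq) ** 2


def allele_frequency_from_carriers(carrier_freq: float) -> float:
    """Inverse conversion under the same Hardy-Weinberg assumption."""
    if not 0.0 <= carrier_freq <= 1.0:
        raise ValueError("carrier_freq must be between 0 and 1")
    return 1.0 - math.sqrt(1.0 - carrier_freq)


if __name__ == "__main__":
    # Allele frequencies at the ends of the 5-10% range quoted in the text
    for q in (0.05, 0.10):
        print(f"allele frequency {q:.2f} -> carrier frequency {carrier_frequency(q):.3f}")
```

Under this assumption, allele frequencies of 5-10% correspond to carrier frequencies of roughly 10-19%, which is why allele and carrier frequencies should be compared on the same scale before drawing conclusions about testing priorities.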

Conclusions
Interindividual differences in drug response are in part caused by genetic variants with differential ethnogeographic prevalence and information about their distribution can be important for population-stratified therapy (Mette et al. 2012; Roberts et al. 2021; Yasuda et al. 2008). In this review, we provide a current update of population differences in the genetic variability of ten different genes that are included in the labels of 141 different drugs or therapeutic regimens as warnings or as factors impacting the clinical pharmacology of the agents in question (Supplementary Table 5). The aggregated data suggest strong differences in variant distribution and gene functionality between major ethnogeographic groups. We hope that the overview provided herein can serve as a useful resource for pharmacologists, clinical geneticists and public health researchers to evaluate treatment risks and inform population-adjusted genotyping strategies.

Funding Open access funding provided by Karolinska Institute.

Declarations
Conflict of interest YZ and VML are co-founders and shareholders of PersoMedix AB. In addition, VML is CEO and shareholder of Hepa-Predict AB and discloses consultancy work for Enginzyme AB.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.