Background

Adverse drug reactions (ADRs), which can cause severe morbidity and mortality among patients, are a major concern in clinical practice and the pharmaceutical industry. Increasing evidence shows that genetic differences between individuals are an important factor in ADRs [1]. Pharmacogenomics is a discipline that studies how genetic factors affect the responses of individuals to drug therapy [2] and translates those responses into a molecular diagnosis. Therefore, it can be used for individualised drug therapy [3]. Over the past 60 years, pharmacogenomics has been used to identify the genetic determinants of drug effects, to maximize drug efficacy, and to minimize ADRs [1]. At present, it is necessary to integrate genomic data into the benefit and risk assessment of daily treatment so that treatment can be tailored to each individual [4].

PharmGKB, the Pharmacogenomics Knowledge Base (http://www.pharmgkb.org), is dedicated to disseminating information on how genetic variation causes variation in drug response. The PharmGKB database describes the connections between genes, diseases and drugs and provides various forms of knowledge, including summaries of very important pharmacogenes (VIPs), drug pathway diagrams and selected literature annotations [5]. The PharmGKB database also integrates information from the Clinical Pharmacogenetics Implementation Consortium (CPIC) to provide drug dosage guidance based on individual genotypes [6].

There are 56 ethnic groups recognized by the People's Republic of China, and different ethnic groups respond differently to drugs. The Wa people reside mainly in Yunnan Province in Southwestern China. According to the sixth nationwide population census in 2010, the total population of the Wa ethnic group in China is 429,709. Because of differences in genetics, physiology, pathology, diet, living environment, and nutritional status, the same drug regimen may not be suitable for every ethnic group [7]. For example, among the Han, Bai, Wa, and Tibetan populations of Yunnan Province, there are significant differences in MDR1 genotype distribution and haplotype spectrum [8]. Studies have shown that CYP2C9 mutant allele frequencies are relatively higher in Caucasians (*2: 12%, *3: 8.3%) and relatively lower in Chinese (CYP2C9*2:0%,*3:0%,*2:15%) [9]. Much of the observed variability in drug response has a genetic basis, caused by genetically determined differences in drug absorption, disposition, metabolism, or excretion [10].

We selected and genotyped 52 VIP variants in 27 genes in the Wa population. Next, we compared the genotype frequencies and allele distributions of these VIP variants between the Wa ethnic group and 26 populations from the 1000 Genomes Project. The results will expand the pharmacogenomic information currently available for the Wa ethnic group, add to knowledge of ethnic diversity, and help clinicians use genomic and molecular data to implement personalized medicine effectively in the future.

Results

Based on the PharmGKB database, we designed assays for 67 SNPs and obtained 52 VIP variants, which are distributed across 27 genes, mainly related to the cytochrome P450 family, dihydropyrimidine dehydrogenase, cyclooxygenase, N-acetyltransferase and others. The chromosome position, base pair, functional result, genotype-drug relationship, information about the drug related to the gene mutation, gene, level of evidence, genotyping, minor allele frequency (MAF), and other basic information are shown in Table 1. The PCR primers were designed using the Agena MassARRAY Assay Design 4.0 software (San Diego, California, USA), and the specific information is shown in Supplementary Table 1.

Table 1 Basic characteristics of the selected VIP variants from the PharmGKB database and genotype frequencies in the Wa population

We used the chi-square test to study the frequency distribution of the 52 loci and compared the Wa ethnic group with 26 populations from the 1000 Genomes Project (CDX, CHB, CHS, JPT, KHV, ACB, ASW, ESN, GWD, LWK, MSL, YRI, CLM, MXL, PEL, PUR, CEU, FIN, GBR, IBS, TSI, BEB, GIH, ITU, PJL and STU). Compared with these 26 populations, we observed 17, 21, 18, 22, 18, 33, 32, 36, 37, 33, 34, 36, 37, 33, 35, 38, 36, 40, 39, 41, 38, 32, 40, 39, 40, and 39 significantly different SNPs, respectively, without adjustment (p < 0.05) (Table 2). The table shows that the Wa ethnic group differs least from CDX, CHB, CHS, and KHV in the East Asian population and most from GIH and PJL in the South Asian population and FIN and IBS in the European population. Among these loci, CYP3A5 rs776746, ACE rs4291, CYP4F2 rs3093105, SLC19A1 rs1051298, and CYP2D6 rs1065852 had higher frequencies in the Wa population than in the other 26 populations. We also found that the significant differences between the Wa people and KHV, JPT, CDX, and LWK were in rs3093105 and rs1065852.
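The comparison described above can be illustrated with a small Python sketch of a Pearson chi-square test on a 2 × 3 table of genotype counts (two populations, three genotype classes). This is only a minimal sketch using the standard library, not the SPSS procedure used in the study; the function name `chi_square_genotypes` and all counts are hypothetical.

```python
import math

def chi_square_genotypes(counts_a, counts_b):
    """Pearson chi-square test comparing genotype counts of two populations.

    counts_a, counts_b: (hom_ref, het, hom_alt) counts, hypothetical inputs.
    Returns (statistic, p_value).
    """
    # Drop genotype classes absent from both populations (expected count 0).
    cols = [(a, b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    df = len(cols) - 1
    if df not in (1, 2):
        raise ValueError("need two or three observed genotype classes")
    row_totals = (sum(a for a, _ in cols), sum(b for _, b in cols))
    grand_total = sum(row_totals)
    statistic = 0.0
    for col in cols:
        col_total = sum(col)
        for observed, row_total in zip(col, row_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    # Closed-form chi-square survival functions for df = 2 and df = 1.
    if df == 2:
        p_value = math.exp(-statistic / 2)
    else:
        p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical genotype counts for one SNP in two populations of 100 people.
stat, p = chi_square_genotypes((10, 50, 40), (40, 50, 10))
print(f"chi2 = {stat:.2f}, p = {p:.3g}")
```

Running such a test once per locus and per reference population would yield the 52 × 26 p-values that are then compared against the significance threshold.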

Table 2 Significant VIP variants in the Wa people compared with the other 26 populations without adjustment

When the Wa ethnic group was compared with the other 26 populations, there were 6, 9, 6, 10, 7, 28, 25, 27, 32, 29, 28, 30, 23, 21, 23, 27, 27, 24, 24, 24, 26, 20, 26, 24, 26, and 27 significantly different VIP variants, respectively, after Bonferroni's multiple adjustment (p < 0.05/(52×26)) (Table 3). Relative to the Wa population in Yunnan Province, China, the differences were smallest for CDX, CHB, and CHS in the East Asian population and largest for GWD, LWK, and YRI, which are of African ancestry. CYP3A5 rs776746, ACE rs4291, CYP4F2 rs3093105, SLC19A1 rs1051298, and CYP2D6 rs1065852 still showed higher frequencies in the Wa population than in the other 26 populations after adjustment. Some variants became nonsignificant, such as NAT2 rs4646244 and CYP2A6 rs8192726. The frequencies of NAT2 rs1041983 and rs1799930 and CYP2C9 rs1057910 in the Wa population differed only from those in PEL, STU, and GIH, whereas the other loci differed between the Wa and multiple populations.
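To make the adjusted threshold concrete, the following Python sketch computes the Bonferroni cutoff 0.05/(52×26) and counts how many p-values pass it. The helper names and the example p-values are hypothetical and serve only to show why some loci that are significant at p < 0.05 are no longer significant after adjustment.

```python
N_VARIANTS = 52      # VIP variants genotyped in the study
N_POPULATIONS = 26   # 1000 Genomes populations compared with the Wa
ALPHA = 0.05

def bonferroni_threshold(alpha=ALPHA, n_tests=N_VARIANTS * N_POPULATIONS):
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests

def count_significant(p_values, threshold):
    """Number of p-values strictly below the threshold."""
    return sum(1 for p in p_values if p < threshold)

threshold = bonferroni_threshold()  # 0.05 / 1352, about 3.7e-05

# Hypothetical p-values for four loci in one population comparison.
example_p = [0.04, 1e-6, 3e-5, 4e-5]
print(count_significant(example_p, ALPHA))      # unadjusted count
print(count_significant(example_p, threshold))  # count after adjustment
```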

Table 3 Significant VIP variants in the Wa people compared with the other 26 populations after Bonferroni’s multiple adjustment

Our results show that rs776746 (CYP3A5), rs4291 (ACE), rs3093105 (CYP4F2), rs1051298 (SLC19A1) and rs1065852 (CYP2D6) are the five important VIP variants; their drug-related information is shown in Table 4. Rs776746 (CYP3A5) is mainly related to the dose and metabolism/pharmacokinetics of tacrolimus in East Asian populations. Rs4291 (ACE), which plays an important functional role in the response to captopril, is related to the toxic effects of aspirin in East Asian populations and to amlodipine, chlorthalidone, and lisinopril in mixed populations. Rs3093105 (CYP4F2) plays a metabolic/pharmacokinetic role for vitamins. Rs1051298 (SLC19A1) is related to the efficacy of bevacizumab and pemetrexed in European populations and of pemetrexed in mixed populations. In East Asian populations, rs1065852 (CYP2D6) plays a metabolic/pharmacokinetic role for alpha-hydroxymetoprolol, and in European populations it is related to citalopram and escitalopram. This gene is also closely related to iloperidone. In clinical medication, the same variant can have different effects on drug type and efficacy in different populations, which should be carefully considered.

Table 4 Significant VIP variants and drug-related information in the Wa population

We combined the calculated allele frequencies with previously published data from global populations and then conducted a comprehensive analysis of these loci. Figure 1 shows that the GA genotype of rs1065852 has its highest frequency (85%) in the Wa population, whereas the GG genotype of rs1065852 and the CT genotype of rs776746 have their lowest frequencies in the Wa population and their highest in the African population. In the Wa population, the TA genotype frequency of rs4291 is 100%, the CA genotype frequency of rs3093105 is 99.5%, and the AG genotype frequency of rs1051298 is 77.9%, all significantly higher than in the other populations, showing that the genotype frequencies of the same SNPs vary among populations. Figure 2 shows that the frequencies of rs4291-T and rs3093105-C are highest in the Wa population, ranging from 40% to 60%, while the frequency of rs1065852-G is lowest in the East Asian population, ranging from 34% to 64%. The frequency of rs776746-T is highest in the African population and lowest in the Wa population; the frequency of rs1051298-G in the East Asian population is 38%-50%, which is lower than that in the American population. In short, the distribution of alleles differs among ethnic groups, which indicates differences in genetic background.
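Genotype frequencies and allele frequencies are linked by simple counting: each homozygote contributes two copies of an allele and each heterozygote contributes one. The Python sketch below shows this conversion for a biallelic SNP; the function name `allele_frequencies` and the counts are hypothetical and are not taken from the study data.

```python
def allele_frequencies(n_hom_ref, n_het, n_hom_alt):
    """Return (ref_allele_freq, alt_allele_freq) from genotype counts."""
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Homozygotes carry two copies of the allele, heterozygotes one.
    ref_freq = (2 * n_hom_ref + n_het) / total_alleles
    return ref_freq, 1.0 - ref_freq

# Hypothetical counts for 200 individuals: 20 ref/ref, 100 ref/alt, 80 alt/alt.
ref_freq, alt_freq = allele_frequencies(20, 100, 80)
print(f"ref = {ref_freq:.2f}, alt = {alt_freq:.2f}")
```

One consequence of this arithmetic is that a population in which nearly every individual is heterozygous has allele frequencies close to 50% for both alleles.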

Fig. 1 Genotype frequency of significant VIP variants in 27 global populations

Fig. 2 Distribution of alleles with significant VIP variants in 27 global populations

Discussion

Pharmacogenomics uses gene-based testing to give the appropriate medicine to each patient at the right dose, thereby maximizing efficacy, minimizing toxicity, and advancing the goal of personalized medicine [11]. In our study, we selected 52 variants related to drug response from PharmGKB, genotyped them in the Yunnan Wa ethnic group, and compared the results with 26 other populations distributed worldwide. The results not only enrich the knowledge of Wa pharmacogenomics but also lay a theoretical foundation for individualised medication. We found that the frequencies of CYP3A5 rs776746, ACE rs4291, CYP4F2 rs3093105, SLC19A1 rs1051298, and CYP2D6 rs1065852 in the Wa population are higher than those in the other 26 populations from the 1000 Genomes Project. There are significant differences in the genotype frequencies and allele distributions of these VIP variants. To explain these differences, we should consider factors that affect allele frequency distribution, such as genetic mutation, natural selection, genetic drift, and migration of individuals between populations. The Wa people in Yunnan Province of China may have a distinctive living environment and eating habits, as well as a unique geographical location.

CYP3A5 is located on chromosome 7q21-q22.1 and encodes an enzyme of the CYP3A subfamily. The most common nonfunctional variant is CYP3A5*3. CYP3A5*3 status is determined by the derived allele of rs776746, an A-to-G change in intron 3 [12]. Tacrolimus is a calcineurin inhibitor immunosuppressant that can prevent allograft rejection in solid organ transplant recipients [13, 14]. After studying the effect of CYP3A5 (rs776746) on the concentration/dose ratios (C/Ds) of tacrolimus and the long-term prognosis of Chinese heart transplant recipients, Liu et al. [15] found that, at all time points, the tacrolimus C/Ds of CYP3A5 nonexpressors (CYP3A5*3/*3) were significantly higher than those of expressers (CYP3A5*1/*3); thus, nonexpressors have higher tacrolimus C/Ds, and expressers tend to have worse long-term prognoses. In our study, CYP3A5 rs776746 differed significantly between the Wa population and the other 26 populations. Because this variant is related to tacrolimus dose and metabolism/pharmacokinetics in the East Asian population, it should be fully considered during tacrolimus therapy to help determine the appropriate dose.

Cytochrome P450 4F2 (CYP4F2) is an omega-hydroxylase and the only enzyme currently shown to metabolize vitamin E in the human body [16]. Two common genetic variants (V433M, rs2108622 and W12G, rs3093105) can change its activity. CYP4F2 gene polymorphisms affect the ability of vitamin E to improve liver histology in children and adults with nonalcoholic fatty liver disease who participated in the Treatment of Nonalcoholic Fatty Liver Disease in Children trial and the Pioglitazone versus Vitamin E versus Placebo for the Treatment of Nondiabetic Patients with Nonalcoholic Steatohepatitis trial, and there are obvious individual differences in its efficacy [17]. Studies have shown that the W12G mutant has increased enzymatic activity on tocopherols and tocotrienols, while the V433M mutant has reduced enzymatic activity on tocopherols but not on tocotrienols. The influence of these SNPs on vitamin E status and on the response of the human body to vitamin E supplementation has important clinical significance [16]. The MAFs of the W12G variant in the European and African American populations have been reported to be 11% and 21%, respectively. In a combined Asian sample (Chinese and Japanese HapMap data sets), the MAF of the W12G variant is 6% [18]. Our results show that in the Wa population, the C allele frequency of rs3093105 is 40%-60%, which is higher than that of the other populations in China. Moreover, this gene can affect the metabolism/pharmacokinetics of vitamin E. Therefore, a full understanding of a patient's vitamin E supplementation and vitamin E status will help clinicians to better individualize treatment.

The canonical RefSeq CYP2D6 gene spans approximately 4,400 nucleotides, includes 9 exons, and is encoded on the negative strand of chromosome 22q13.2 [19]. CYP2D6 polymorphisms can affect the metabolism of alpha-hydroxymetoprolol [20], citalopram and escitalopram [21], and iloperidone [22]. Drug dosage can be recommended according to CYP2D6 metabolizer status. A previous study of atorvastatin in the treatment of ischemic stroke found that the G allele of rs1065852 (CYP2D6) was associated with a better lipid-lowering effect, and patients carrying the GG genotype responded better to atorvastatin treatment. For example, reducing atorvastatin use should be considered for patients with insulin resistance who carry the GG genotype, to avoid adverse drug reactions [23]. Li et al. [24] reported that in the Han population with lung cancer in Northwestern China, the most significant associations were with the A allele and the AA genotype of CYP2D6 rs1065852, which can increase cancer risk. Sun et al. [25] showed that the G allele of CYP2D6 rs1065852 may be related to the efficacy of labetalol in the treatment of early-onset preeclampsia. This study found that the G allele frequency of rs1065852 in the East Asian population was 34%-64%, and the frequency of the GG genotype in the Wa population was 0.5%, both much lower than in the other populations. Therefore, when clinicians use these drugs to treat related diseases, the optimal dose should be based on the specific genotype of the individual patient to maximize the therapeutic effect.

Angiotensin-converting enzyme (ACE) is encoded by the ACE gene, which is located at 17q23 and consists of 28 exons and 25 introns. ACE participates in the renin-angiotensin-aldosterone system (RAAS), which affects salt retention, water balance, and blood vessels; therefore, RAAS controls blood pressure, and drugs that inhibit this enzyme are effective in treating high blood pressure [26]. Migdalov et al. [27] demonstrated that captopril can be used to lower blood pressure by inhibiting ACE. A study of changes in fasting urea and creatinine over one year in patients with dementia caused by Alzheimer's disease (AD) found that the use of angiotensin-converting enzyme inhibitors was protective for carriers of rs1800764 CT/rs4291 AA, whereas the changes in creatinine were harmful for carriers of rs1800764 CT/rs4291 AT [28]. Our study found that the TA genotype frequency was 1.00 in the Wa population, which was higher than that in the other populations, while the AA genotype frequency was the lowest, which indicates that the optimal dose of ACE inhibitors should be based on the specific genotype of the individual Wa patient.

The SLC19A1 gene encodes a folate transporter and is involved in the regulation of intracellular folate concentration [29]. Studies have shown that folate carrier protein 1 (SLC19A1) affects the transport of pemetrexed in the body. An analysis of Han patients with non-small cell lung cancer who received only pemetrexed treatment showed that SLC19A1 rs1051298 (c.*746C>T) increases the risk of all adverse drug reactions to pemetrexed treatment in different cycles. This effect is particularly pronounced for liver injury [30]. Corrigan et al. [31] found that the SNP rs1051298 in the SLC19A1 gene can affect the overall survival and progression-free survival of patients with advanced non-small cell lung cancer receiving pemetrexed combined with platinum therapy. Our results show that SLC19A1 rs1051298 differs significantly between the Wa population and the other 26 populations; because this polymorphism affects the efficacy of pemetrexed, taking it into account may help maximize the therapeutic effect of pemetrexed in Wa patients.

Conclusions

This study analyzed the differences in genotype frequency and allele distribution between the Wa ethnic group and 26 other populations worldwide. Rs776746 (CYP3A5), rs4291 (ACE), rs3093105 (CYP4F2), rs1051298 (SLC19A1) and rs1065852 (CYP2D6) have higher frequencies in the Yunnan Wa population, which provides a theoretical basis for safe medication and improved efficacy. Our study complements the pharmacogenomic information on the Wa population of Yunnan Province and provides valuable information for future studies and better individualized treatment. This study has certain limitations. Because of the small sample size and the limitations of the genotyping technology, the study could not fully detect less common variants, some of which may be important pharmacogenomic markers, and may therefore give false-negative results; participants may carry other important DNA variants not detected by the Agena MassARRAY platform. Studies with larger samples are also needed to verify our findings.

Methods

Study participants

We randomly recruited 200 unrelated Wa adults from the Yunnan province of China. The selected subjects were judged to be in good health according to their medical history and had only Wa ethnic origins in at least the last three generations. In addition, this study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Clinical Research Ethics of Xizang Minzu University. Each participant also signed an informed consent form.

Variant selection and genotyping

We searched the PharmGKB database, and 52 VIP variants in 27 genes were ultimately selected for our study according to available data on frequency, functionality, and linkage from published research. Genomic DNA was extracted from peripheral blood using the GoldMag-Mini whole blood genome DNA Purification Kit (GoldMag Ltd. Xi'an, China). The DNA concentration was measured with a NanoDrop 2000C spectrophotometer (USA). Agena MassARRAY Assay Design 4.0 software (San Diego, California, USA) was used to design multiplexed SNP MassEXTEND assays (Gabriel et al., 2008), including the PCR primers and single base extension primers for the selected sites. The PCR primers for the selected variants are presented in Supplementary Table 1. Following the manufacturer's instructions, we used the Agena MassARRAY RS1000 (San Diego, California, USA) to determine the SNP genotypes. In brief, the Agena MassARRAY RS1000 genotyping method comprised the following steps: (1) PCR amplification, (2) SAP purification, (3) iPLEX single base extension reaction, (4) resin exchange, and (5) mass spectrometry detection. Finally, Agena Typer 4.0 software was used for data statistics and analyses (Thomas et al., 2007) [32].

1000 Genomes Project

The individual genotype data of the 26 populations were downloaded from the website of the 1000 Genomes Project (http://www.1000genomes.org/) [33]. These 26 populations were: (1) African Caribbean in Barbados (ACB); (2) African Ancestry in Southwest US (ASW); (3) Esan in Nigeria (ESN); (4) Gambian in Western Division, The Gambia – Mandinka (GWD); (5) Luhya in Webuye, Kenya (LWK); (6) Mende in Sierra Leone (MSL); (7) Colombian in Medellin, Colombia (CLM); (8) Mexican Ancestry in Los Angeles, California (MXL); (9) Peruvian in Lima, Peru (PEL); (10) Puerto Rican in Puerto Rico (PUR); (11) Chinese Dai in Xishuangbanna, China (CDX); (12) Yoruba in Ibadan, Nigeria (YRI); (13) Han Chinese South (CHS); (14) Japanese in Tokyo, Japan (JPT); (15) Kinh in Ho Chi Minh City, Vietnam (KHV); (16) Utah residents with Northern and Western European ancestry (CEU); (17) Finnish in Finland (FIN); (18) British in England and Scotland (GBR); (19) Iberian populations in Spain (IBS); (20) Toscani in Italy (TSI); (21) Bengali in Bangladesh (BEB); (22) Gujarati Indians in Houston, Texas (GIH); (23) Indian Telugu in the UK (ITU); (24) Punjabi in Lahore, Pakistan (PJL); (25) Sri Lankan Tamil in the UK (STU); and (26) Han Chinese in Beijing, China (CHB).
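Per-individual genotypes of this kind are commonly distributed as VCF-style GT strings, and they must be reduced to genotype counts before populations can be compared. The Python sketch below shows one way this could be done for a single biallelic SNP; the function name `tally_genotypes` and the sample values are hypothetical, and the processing actually used in this study is not described.

```python
import re

def tally_genotypes(gt_fields):
    """Count (hom_ref, het, hom_alt) from VCF-style GT strings.

    Alleles are separated by '|' (phased) or '/' (unphased);
    '0' is the reference allele and '1' the alternate allele.
    """
    counts = [0, 0, 0]
    for gt in gt_fields:
        alleles = re.split(r"[|/]", gt)
        if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
            continue  # missing, haploid, or multi-allelic call: skipped here
        # The number of alternate alleles (0, 1, or 2) indexes the class.
        counts[int(alleles[0]) + int(alleles[1])] += 1
    return tuple(counts)

# Hypothetical GT values for five individuals at one SNP.
print(tally_genotypes(["0|0", "0|1", "1|0", "1/1", "./."]))
```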

Statistical analyses

Microsoft Excel and SPSS 20.0 statistical software packages (SPSS, Chicago, IL, USA) were used to perform Hardy-Weinberg equilibrium (HWE) analysis and χ2 tests. The χ2 test was used to evaluate the deviation of genotype frequencies from HWE in the Wa population. All p-values were two-sided, and p-values less than 0.05 were considered statistically significant. Next, the Bonferroni multiple adjustment method was used for correction, and p < 0.05/(52×26) was considered to indicate a significant difference. Subsequently, we obtained SNP allele frequencies from the Ensembl database (https://asia.ensembl.org/index.html). Finally, the overall genetic variation pattern of specific loci was analyzed [34].
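The HWE check can be sketched in Python as a one-degree-of-freedom χ2 goodness-of-fit test that compares observed genotype counts with the counts expected from the estimated allele frequencies. This is a minimal standard-library sketch, not the SPSS workflow used in the study; the function name `hwe_chi_square` and the counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for deviation from Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes at a biallelic locus
    and returns (statistic, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic locus: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: the first set matches HWE exactly, the second
# shows a deficit of heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

The degrees of freedom equal the three genotype classes minus one, minus one more for the allele frequency estimated from the data.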