Background

Current pharmacogenetic research includes the study of genes involved in the absorption, distribution, metabolism, and excretion (ADME) of drugs. These data may help clinicians and researchers to understand individual differences in sensitivity or resistance to certain drugs, thereby avoiding adverse drug reactions (ADRs) in patients and improving the quality of therapies. Thus, pharmacogenetic research has great practical value in the development of personalised medicine. Moreover, pharmacogenetic studies contribute to our understanding of population genetics because the frequencies of certain allelic variants may differ depending on the population. The people of Central Asia are poorly understood from a population genetic standpoint. However, studies in this field are ongoing; the Kazakh population has been studied by both domestic and foreign scientists [1–4].

Kazakhs, one of the Turkic peoples of Central Asia, are the main population of Kazakhstan. According to the Committee on Statistics of the Republic of Kazakhstan, about 11 million Kazakhs live in Kazakhstan, and about 3.5 million Kazakhs live in regions neighbouring Kazakhstan and in other regions (China, Russia, Uzbekistan, Turkmenistan, Kyrgyzstan, west Mongolia, and Turkey) [5]. Kazakhs residing in the territory of Kazakhstan have an internal division into three large groups, Elder or Senior (Uly), Middle or Medium (Orta), and Lesser or Junior (Kishi) Zhuzes (or Hordes); historically, these three groups had demarcated territories. Additionally, there were several tribes in each Zhuz [6]. Every Kazakh knows to which tribe and Zhuz he or she belongs, and representatives of the same tribe are considered relatives as they have descended from a common ancestor. Therefore, according to the seven generations law, marriage between members of the same tribe is only possible after seven generations from a common ancestor [7].

Many genes associated with the ADME of drugs have been identified. A team including representatives of the pharmaceutical industry and an academic centre developed a core list of 32 ADME genes, which includes 184 markers that can be used to screen patients in clinical trials. These data are available on the PharmaADME website (http://pharmaadme.org/).

In this study, we aimed to analyse single nucleotide polymorphisms (SNPs) involved in the ADME of multiple drugs in Kazakhs from Kazakhstan using an OpenArray PGx Panel derived from the PharmaADME Core Marker List.

Results

Allele and genotype frequency analysis

Allele and genotype frequency data were obtained for 158 SNPs (Additional file 1). Seventy-five out of the 158 SNPs included in this study were not found in heterozygous or homozygous variants (Additional file 2). The allele and genotype frequencies of the remaining 83 SNPs are summarized in Table 1.

Table 1 Allele frequency and genotype distribution in the Kazakh population (a number of chromosomes; b number of alleles)

The correspondence of the distributions of genotype frequencies to the Hardy-Weinberg equilibrium was assessed using exact tests with a modified version of the Markov-chain random walk algorithm [8] (p > 0.05). Seven SNPs of the 83 (i.e., rs1799929 [p = 0.03], rs2069514 [p = 0.02], rs1801280 [p = 0.01], rs12248560 [p = 0.00], rs2032582 G > A [p = 0.00], rs2032582 G > T [p = 0.04], and rs2273697 [p = 0.04]) were not in Hardy-Weinberg equilibrium.
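The logic of an exact Hardy-Weinberg test can be illustrated for a single biallelic SNP. The sketch below is not the Markov-chain procedure used in this study: for two alleles, every possible heterozygote count can be enumerated directly, so no random walk is needed. The function name and the genotype counts are hypothetical and are not taken from Table 1.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact Hardy-Weinberg p-value for one biallelic SNP.

    Enumerates every heterozygote count compatible with the observed
    allele counts and sums the probabilities of all outcomes that are
    no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb          # genotyped individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n_bb + n_ab           # copies of allele B

    def log_prob(het):
        # Probability of `het` heterozygotes given n and the allele counts.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (
            het * log(2)
            + lgamma(n + 1) - lgamma(hom_a + 1)
            - lgamma(het + 1) - lgamma(hom_b + 1)
            + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1)
        )

    # Heterozygote counts must share the parity of the allele counts.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {het: exp(log_prob(het)) for het in candidates}
    observed = probs[n_ab]
    p_value = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical genotype counts for 320 individuals (not from Table 1).
print(hwe_exact_p(180, 120, 20))   # close to Hardy-Weinberg proportions
print(hwe_exact_p(200, 80, 40))    # marked heterozygote deficit
```

A SNP would be flagged as deviating from equilibrium when the returned value falls below the chosen significance level (0.05 in this study).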

Comparative analysis of allele frequencies

A comparative analysis of the allele frequency between the Kazakh samples analysed here and HapMap published data from 11 populations worldwide was carried out (Table 2). A comparative analysis was performed for those SNPs found in the Kazakh population in heterozygous or homozygous variants. hCV32287186, hCV32407240, rs2069514, rs17885098, rs72552763, rs35742686, rs72549350, rs5030867, rs55965422, rs11572080, rs4986893, rs72559710, rs41291556, rs55793712, rs35350960, rs55785340, rs2032582, rs3892097, rs17878459, rs28399433, rs34059508, rs41279188, rs4646438, rs17886522, rs12721655, and rs1902023 frequency data are missing in the HapMap database; therefore, these SNPs were not analysed, and comparative analysis was carried out for 56 SNPs. Exact tests of population differentiation with a significance level of 0.05 were used [9]. No statistically significant differences in the frequencies of rs8177507, rs3740066, rs4986988, rs1799930, rs4646277, or rs1801272 genotypes were found with any population (p > 0.05).

Table 2 A comparative analysis of the allele frequency between Kazakh population (our data) and world’s populations (HapMap data)

Next, we performed a comparative analysis of the differences in genotype frequencies among the Kazakh population and data for world populations collected from the HapMap database. For individuals of African ancestry living in the southwest USA (ASW), only 35 SNPs of a total of 56 were analysed. The remaining data for this population were not included in the HapMap database. Twenty of these 35 SNPs were significantly different from those in the Kazakh population. These SNPs were located in genes encoding drug transporters (ABCB1, ABCC2, ABCG2, SLC15A2, SLC22A2, SLCO1B1, SLCO1B3, and SLCO2B1) and phase I (DPYD, CYP1A1, and CYP2B6) and II (GSTP1, TPMT, UGT2B7, and UGT1A1) drug-metabolising enzymes. However, we found that there were no significant differences in SNPs within genes belonging to the acetyltransferase family (NAT2).

For Utah residents with Northern and Western European ancestry from the CEPH collection (CEU), population analysis was carried out for 50 SNPs; 21 of these SNPs showed significant differences compared with the Kazakh population. These SNPs were found in genes encoding drug transporters (ABCB1, ABCC2, SLC22A1, SLCO1B1, SLCO1B3, and SLCO2B1) and phase I (CYP1A1, CYP2C8, CYP2C9, and CYP2C19) and II (NAT2, GSTP1, and UGT1A1) drug-metabolising enzymes. SNPs in genes belonging to the solute carrier family 15 (H+/peptide transporter) did not show differences between the Kazakh and CEU populations.

Twenty-six of 50 SNPs showed significant differences between the Kazakh population and the Han Chinese population in Beijing, China (CHB). For the Chinese population in Metropolitan Denver, CO (CHD), population analysis was carried out for 34 SNPs; 14 of these SNPs showed significant differences from the Kazakh population. Significant differences were also observed for 24 of 51 SNPs in the Japanese population in Tokyo, Japan (JPT), 23 of 30 SNPs in the Luhya population in Webuye, Kenya (LWK), 14 of 37 SNPs for the population of Mexicans in Los Angeles, CA (MEX), 21 of 33 SNPs for the population of Maasai in Kinyawa, Kenya (MKK), 17 of 33 SNPs for the Tuscan population in Italy (TSI), and 36 of 50 SNPs in the Yoruban population in Ibadan, Nigeria (YRI).

For the Gujarati Indian population in Houston, TX (GIH), population analysis was carried out for 38 SNPs; 23 of these SNPs showed significant differences from the Kazakh population. Notably, comparative analyses of rs12720461, rs28371685, rs1058930, and rs3918290 were carried out only for the GIH population because frequency data in the HapMap database were only available for this population. Of these SNPs, only rs3918290 showed a significant difference from the Kazakh population.

If we compare the ratios of significantly different SNPs with the amount of data (i.e., the number of SNPs that were analysed) for each population, the YRI population showed the greatest differences compared with the Kazakh population. However, similar to the CEU population, statistically significant differences for SNPs of genes belonging to the solute carrier family 15 (H+/peptide transporter) were not found.
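The ratios referred to here are simple proportions. The minimal sketch below recomputes them from the counts quoted in the preceding paragraphs; the dictionary and function names are illustrative only. Because the number of analysed SNPs differs between populations (for example, 30 for LWK versus 50 for YRI), ranking by proportion and ranking by absolute count need not coincide, so both are computed.

```python
# (significantly different SNPs, SNPs analysed), as quoted in the text.
counts = {
    "ASW": (20, 35),
    "CEU": (21, 50),
    "CHB": (26, 50),
    "CHD": (14, 34),
    "GIH": (23, 38),
    "JPT": (24, 51),
    "LWK": (23, 30),
    "MEX": (14, 37),
    "MKK": (21, 33),
    "TSI": (17, 33),
    "YRI": (36, 50),
}


def fraction_different(table):
    """Proportion of analysed SNPs that differed significantly."""
    return {pop: sig / total for pop, (sig, total) in table.items()}


fractions = fraction_different(counts)

# Populations ordered by proportion and by absolute count, highest first.
by_fraction = sorted(fractions, key=fractions.get, reverse=True)
by_count = sorted(counts, key=lambda pop: counts[pop][0], reverse=True)

for pop in by_fraction:
    sig, total = counts[pop]
    print(f"{pop}: {sig}/{total} = {fractions[pop]:.3f}")
```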

The SNPs rs8177507, rs3740066, rs4986988, rs4986782, rs12720461, rs1799930, rs28371685, rs4646277, rs1801272, rs11572103, and rs1058930 showed no significant differences with any of the compared populations, suggesting that the power of the study (320 DNA samples) may be insufficient.

Linkage disequilibrium (LD) analysis for the Kazakh population

Using Haploview 4.2 software, LD statistics results for the Kazakh population were obtained (Fig. 1). For block generations, the Confidence Intervals default algorithm was used. We selected SNPs that were consistent with Hardy-Weinberg equilibrium and ignored those with minor allele frequencies (MAFs) of less than 0.05. As a result, four haplotype blocks were defined: two blocks consisting of two SNPs, i.e., rs7662029 and rs7668258 (block 3, chromosome 4) and rs4149117 and rs7311358 (block 2, chromosome 12); one block consisting of three SNPs, i.e., rs2293616, rs2257212, and rs1143671 (block 4, chromosome 3); and one block consisting of five SNPs, i.e., rs1041983, rs1801280, rs1799929, rs1799930, and rs1208 (block 1, chromosome 8). The strongest LDs were found for rs2293616–rs2257212, rs2293616–rs1143671, rs2257212–rs1143671, rs7662029–rs7668258, rs4986893–rs17886522, and rs10509681–rs11572080 in the Kazakh population. The haplotype frequencies in the studied population are presented in Table 3.

Fig. 1

LD SNP plot. The LD is displayed according to standard colour schemes, with bright red for very strong LD (LOD > 2, D’ = 1), light red (LOD > 2, D’ < 1) and blue (LOD < 2, D’ = 1) for intermediate LD, and white (LOD < 2, D’ < 1) for no LD

Table 3 Haplotype frequencies in the Kazakh population

The crossover percentage matrix showed that the pattern GA-AT (block 2–block 3) had the highest value (40.4 %). Additionally, 34.5 % of all samples had the pattern GA-GC (block 2–block 3), 28.5 % had the pattern AT-ATT GC (block 3–block 4), and 26.9 % had the pattern CTCGA-GA (block 1–block 2).

Tag-SNP analysis was also carried out using the aggressive tagging strategy (r² threshold: 0.8, logarithm (base 10) of odds [LOD] threshold: 3.0, minimum distance between tags: 0 kb). The analysis results are shown in Table 4. We found that rs1143672 was a tag-SNP for block 4. Therefore, it was likely that block 4 was formed by four SNPs, i.e., rs2293616, rs2257212, rs1143671, and rs1143672, rather than three SNPs.
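The idea behind tag-SNP selection with an r² threshold can be sketched with a simplified greedy pairwise procedure. This is not the aggressive (multi-marker) strategy applied in this study; it only illustrates how a threshold of 0.8 groups SNPs under a single tag. The function name and all pairwise r² values below are hypothetical.

```python
def greedy_tag_snps(snps, pair_r2, threshold=0.8):
    """Greedy pairwise tagging: repeatedly pick the SNP that captures the
    largest number of still-untagged SNPs at r2 >= threshold."""

    def r2(a, b):
        if a == b:
            return 1.0
        return pair_r2.get((a, b), pair_r2.get((b, a), 0.0))

    untagged = set(snps)
    tags = {}
    while untagged:
        def captured(candidate):
            return {s for s in untagged if r2(candidate, s) >= threshold}

        # Ties are broken alphabetically, which is an arbitrary choice.
        best = max(sorted(snps), key=lambda c: len(captured(c)))
        tags[best] = sorted(captured(best))
        untagged -= captured(best)
    return tags


# Hypothetical r2 values: four SNPs in strong mutual LD and one unlinked SNP.
block4 = ["rs2293616", "rs2257212", "rs1143671", "rs1143672"]
snps = block4 + ["rs7662029"]
pair_r2 = {(a, b): 0.95 for i, a in enumerate(block4) for b in block4[i + 1:]}
pair_r2.update({(a, "rs7662029"): 0.10 for a in block4})

result = greedy_tag_snps(snps, pair_r2)
for tag, captured_snps in result.items():
    print(tag, "->", captured_snps)
```

When several SNPs are equivalent, as in this toy input, any one of them can serve as the tag for the whole group.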

Table 4 Tag SNPs

Comparative analysis of haplotype frequency

Next, we carried out a comparative analysis of the haplotype frequencies of the samples from the Kazakh population and published data from the HapMap database, including 11 worldwide populations. All of the SNPs described in Fig. 1 were used for analysis; however, not all of these SNPs were present in the HapMap database. For block generations, the Confidence Intervals default algorithm was used (Haploview 4.2, MAF < 0.05). Block generation results for all 11 populations are presented in Additional file 3. From these data, only the CEU population formed a block in the NAT2 gene that was similar to that in the Kazakh population, consisting of rs1041983, rs1801280, rs1799929, rs1799930, and rs1208. The CEU block contained seven haplotypes, whereas that in the Kazakh population contained only six haplotypes; additionally, the frequencies were different (Table 5). The GIH, LWK, MKK, and TSI populations generated blocks consisting of only four SNPs: rs1041983, rs1799929, rs1799930, and rs1208, whereas the MEX and YRI populations generated blocks consisting of three SNPs (rs1041983, rs1799929, and rs1799930). The JPT population generated blocks consisting of two SNPs (rs1041983 and rs1799930). Blocks were not generated by ASW, CHB, or CHD populations. Additionally, CEU, CHB, JPT, and YRI populations generated blocks similar to those of the Kazakh population, consisting of two SNPs (rs4149117 and rs7311358) in the SLCO1B3 gene (Additional file 3). These populations had four haplotypes that differed in frequency (Fig. 2). The highest frequency of haplotype GA was found in the CEU population (0.852), whereas the lowest frequency of haplotype GA was found in the YRI population (0.342). The value closest to that in the Kazakh population for haplotype GA (0.758) was found in the CHB population (0.710). The highest and lowest frequencies of haplotype TG were found in the YRI (0.658) and CEU (0.148) populations. The value closest to the Kazakh population for haplotype TG (0.213) was found in the CHB population (0.265). The TA haplotype was found only in the JPT (0.038) and CHB (0.025) populations, and the GG haplotype was found only in the Kazakh population (0.030). The rest of the analysed populations did not generate blocks.
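The LD measures plotted in Fig. 1 (D' and r²) follow directly from two-locus haplotype frequencies. As a minimal sketch, the Kazakh frequencies quoted above for rs4149117 and rs7311358 (GA 0.758, TG 0.213, GG 0.030) can be used as input. The TA haplotype is set to zero on the assumption that it is absent in Kazakhs, as the text implies, and the frequencies are renormalised because the rounded values sum to 1.001. The function name is illustrative.

```python
def ld_stats(f11, f12, f21, f22):
    """Return (D, D', r2) for two biallelic loci.

    f11: haplotype with allele 1 at both loci
    f12: allele 1 at locus 1, allele 2 at locus 2
    f21: allele 2 at locus 1, allele 1 at locus 2
    f22: allele 2 at both loci
    """
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))

    p1 = f11 + f12            # allele 1 frequency at locus 1
    q1 = f11 + f21            # allele 1 frequency at locus 2
    d = f11 - p1 * q1         # raw disequilibrium coefficient

    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))

    denom = p1 * (1 - p1) * q1 * (1 - q1)
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Locus 1 = rs4149117 (G/T), locus 2 = rs7311358 (A/G).
# GA, GG and TG as quoted in the text; TA = 0 is an assumption.
d, d_prime, r2 = ld_stats(f11=0.758, f12=0.030, f21=0.0, f22=0.213)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

With one of the four haplotypes absent, D' equals 1 even though r² is below 1, which is why D' and r² can give different impressions of the same SNP pair.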

Fig. 2

Haplotype analysis results of rs4149117 and rs7311358 in the SLCO1B3 gene (chromosome 12)

Table 5 Haplotype analysis results of rs1041983, rs1801280, rs1799929, rs1799930 and rs1208 in NAT2 (chromosome 8)

The block identified in the Kazakh population, consisting of rs7662029 and rs7668258 in the UGT2B7 gene, was found in all 11 populations (Additional file 3). The highest and lowest frequencies of haplotype GC were found in the YRI (0.824) and CEU (0.490) populations, respectively, and the highest and lowest frequencies of haplotype AT were found in the CEU (0.510) and ASW (0.176) populations, respectively. The GC (0.464) and AT (0.525) haplotype frequencies in the Kazakh population were close to the respective frequencies in the CEU population (Fig. 3).

Fig. 3

Haplotype analysis results of rs7662029 and rs7668258 in the UGT2B7 gene (chromosome 4)

All 11 populations generated blocks in the SLC15A2 gene (Additional file 3). However, these blocks contained different numbers of SNPs. The CEU, CHB, JPT, and YRI populations generated blocks consisting of four SNPs: rs2293616, rs2257212, rs1143671, and rs1143672. The blocks of the other analysed populations consisted of three SNPs: rs2293616, rs2257212, and rs1143671. The highest and lowest frequencies of haplotype GCC were found in the MEX (0.728) and CEU (0.253) populations (Fig. 4). The highest and lowest frequencies of haplotype GCCG were found in the CEU (0.540) and JPT (0.233) populations. The highest frequencies of haplotypes ATT and ATTA were found in the CHD (0.747) and CHB (0.750) populations, whereas the lowest frequencies of haplotypes ATT and ATTA were found in the GIH (0.295) and CEU (0.450) populations.

Fig. 4

Haplotype analysis results of rs2293616, rs2257212, and rs1143671 in the SLC15A2 gene (chromosome 3)

If we take into account rs1143672 tagging analysis results of the Kazakh population and assume that block 4 consisted of four SNPs, the frequency of the GCCG haplotype was 0.459, and that of ATTA was 0.537. These values were nearly identical to the results of the YRI population.

Discussion

In this study, we examined the frequencies of specific SNPs in the Kazakh population and compared the results with those in the HapMap database for 11 other populations throughout the world. The results showed a fairly high percentage of population differentiation, providing insights into the different racial groups that may have contributed to the Kazakh population.

The Kazakh population is an interesting model in population genetics, and the process through which the Kazakh population formed is poorly understood. However, some scientists believe that the Kazakh population was formed by the mixing of the Asian and Caucasoid populations [6] owing to the observation that there are Kazakh individuals who have distinctive Asian and/or Caucasoid traits. Additionally, the Kazakh people are divided into three Zhuzes and further divided into distinct tribes in each Zhuz. The historical division into Zhuzes could be argued on the basis of the different origins of each Zhuz; this could explain the different frequencies of SNPs within the population. However, in our previous study, in which we had a larger sample collection, we compared the frequencies of SNPs within the three Zhuzes and found no significant differences in SNPs between Zhuzes [7]. Thus, we concluded that we could combine all samples in one sample collection.

Genotyping of 158 SNPs from 320 DNA samples showed that 75 SNPs were not found in the studied samples (Table 1, Additional file 2). The frequencies of many of these SNPs were very low in other populations as well [10]. However, we could not conclude that these SNPs did not occur (or were only present in a very low frequency) in the Kazakh population. In addition, seven of 83 SNPs identified in the Kazakh population were not in Hardy-Weinberg equilibrium. We expect that this result may have been caused by the insufficient power of the study.

In this study, we selected SNPs involved in the ADME of drugs for genotyping. Thus, 19 of 83 SNPs occurring in the Kazakh population were associated with drugs used in the treatment of cardiovascular diseases (statins, beta-blockers, anticoagulants, and antiplatelet agents). The recommended dosage for the cholesterol-lowering agent simvastatin is 80 mg (U.S. Food and Drug Administration [FDA], www.fda.gov). Moreover, the FDA recommends dose correction when using simvastatin with certain drugs that cause increased concentrations of simvastatin, resulting in increased risk of myopathy. In patients with the C allele at the SNP rs4149056 in the SLCO1B1 gene, there are modest increases in myopathy risk even at lower doses of simvastatin (40 mg daily); if optimal efficacy is not achieved with a lower dose, alternate agents should be considered [11]. The TT genotype frequency in our study was 72 % in Kazakhs, compared with 91 %, 71 %, 60 %, and 98 % in the ASW, CHB, TSI, and YRI populations, respectively. Moreover, responses of individuals to statin drugs are associated with ABCB1 (rs2032582), ABCC2 (rs717620), ABCG2 (rs2231142), SLCO1B1 (rs2306283), CYP2C8 (rs10509681), and CYP2C9 (rs1799853, rs1057910). Comparative analysis of the frequencies of these SNPs in the Kazakh population with those in the ASW population showed significant differences for all SNPs, except for the SNPs in cytochrome P450 genes. In contrast, for the CEU population, only the SNPs in cytochrome P450 genes and SLCO1B1 (rs2306283) were significantly different from those in the Kazakh population.

The VKORC1 gene on chromosome 16 is one of the main genes associated with the dosage of coumarin anticoagulants, and several mutations in this gene are associated with enzyme deficiency. An allelic variant in VKORC1 (c.-1639G > A) determines up to 30 % of the variability in warfarin dosage [12, 13]. In a previous study, the VKORC1 c.-1639G > A mutation was found to be linked with VKORC1 c. 173 + 1369G > C (rs8050894) and VKORC1 c. 173 + 1000C > T (rs9934438) mutations [14]. Subjects carrying the 1173 T (rs9934438) allele required a lower maintenance dose of warfarin compared with that in subjects harbouring the CC genotype in African Americans and Caucasians. Before reaching the maintenance dose, only Caucasians with the T allele had a significantly increased risk of international normalized ratio compared with that in Caucasians harbouring the CC genotype. Polymorphisms in the VKORC1 gene are associated with the maintenance dose requirements of warfarin among both African Americans and Caucasians [15]. Interestingly, in VKORC1, the allele frequency of rs8050894 c. 173 + 1369G > C is as high as 94 % (G allele) in Asian populations, whereas that in Caucasians is about 37 % (G allele). In the Kazakh population, we found that the frequency of allele G was 63 %. Importantly, the response to anticoagulant drugs (e.g., warfarin) is associated with CYP1A1 (rs1048943) and CYP2C9 (rs1057910, rs28371685, and rs1799853). Comparative analysis of the frequencies of these SNPs showed that all of the SNPs listed above were significantly different between the Kazakh population and the YRI population, with the exception of rs28371685. The majority of the data were not present in the HapMap database (Table 2).

The treatment of cardiovascular diseases often involves administration of Plavix (clopidogrel). The influence of genetics on the pharmacokinetic and pharmacodynamic response to clopidogrel has been examined in previous studies [16]. Several polymorphic P450 enzymes are involved in the activation of clopidogrel. The CYP2C19 isoenzyme is involved in the formation of an active metabolite and intermediate metabolite, 2-oxoclopidogrel. The pharmacokinetics and antiplatelet effects of the active metabolite of clopidogrel, which were investigated by means of platelet aggregation ex vivo, vary depending on the genotype of the CYP2C19 isoenzyme. The CYP2C19*1 allele is responsible for normally functioning metabolism, whereas the CYP2C19*2 and CYP2C19*3 alleles are responsible for decreased metabolism. The frequency of the A (rs4244285) allele in our study was 17 % in Kazakhs, compared with 15.5 %, 28 %, and 14 % in CEU, JPT, and YRI populations, respectively. For rs4986893, the A allele frequency in our study was 4 % in Kazakhs; no HapMap data were available for other populations. Other alleles associated with reduced metabolism include CYP2C19*4, CYP2C19*5, CYP2C19*6, CYP2C19*7, and CYP2C19*8; however, these alleles were rarely found in our population.

The response to antiplatelet agents (Plavix) is also associated with ABCB1 (rs2032582), CYP1A1 (rs1048943), CYP1A2 (rs762551), CYP2B6 (rs3745274), CYP2C8 (rs10509681), CYP2C9 (rs1799853), and CYP2C19 (rs12248560). Comparative analysis of SNP frequencies showed that these SNPs were significantly different between the Kazakh population and the YRI population, with the exception of rs2032582. The majority of data were not available in the HapMap database. Significant differences in genes in the ATP-binding cassette system were not found between the Kazakh and JPT populations (Table 2).

Labetalol is a nonselective β-adrenergic antagonist with additional α1-adrenergic antagonist properties. CYP2C19 is involved in the metabolism of several important groups of drugs, including a number of β-blockers, such as propranolol and labetalol [17]. A previous study showed that the activity of labetalol is significantly affected by common CYP2C19 polymorphisms in individuals of Chinese ethnicity; specifically, subjects with the CYP2C19*2/*2 (rs4244285) genotype had a higher peak and area under the concentration-time curve than subjects with the CYP2C19*1/*1 genotype, and heterozygotes had intermediate values [18]. In the Kazakh population, genotype AA was found in 2 % of individuals, whereas 5.2 %, 6.8 %, and 3.4 % of individuals in the CEU, JPT, and YRI populations, respectively, carried this genotype.

Responses to β-blockers are associated with ABCB1 (rs1128503) and UGT1A1 (rs4148323 and rs4124874). All of these SNPs were significantly different between the Kazakh and YRI populations, although most data were not available in the HapMap database. Significant differences in genes in the ATP-binding cassette system and UDP glucuronosyltransferase were not observed between the Kazakh and JPT populations. Moreover, SNPs in the UGT1A1 gene did not differ between the CHD and TSI populations (Table 2).

SNPs in ABCB1 (rs1045642) and CYP2C19 (rs4244285) are associated with the response to β-blockers, anticoagulants, and antiplatelet agents. Importantly, the frequencies of these SNPs were significantly different between the Kazakh population and the ASW, CEU, GIH, MKK, and YRI populations for rs1045642 and between the Kazakh population and the CHB and JPT populations for rs4244285 (Table 2).

Analysis of the results of haplotype frequencies among the populations examined in this study showed substantial and significant variations. For example, only four populations generated the block in the SLCO1B3 gene, similar to the Kazakh population. The CHB population had the most similar haplotype frequency compared with the Kazakh population. However, there were variations in haplotypes among populations, with differences in GA, TG, and TA haplotypes for the CHB and in GA, TG, and GG haplotypes in the Kazakh population. Only eight populations generated blocks in the NAT2 gene, and 24 haplotypes were formed by the analysed SNPs. From these results, none of the examined populations were similar to the Kazakh population with regard to this gene. However, all 11 populations generated haplotype blocks in UGT2B7 and SLC15A2 genes, and the CEU population had the closest frequency for UGT2B7, whereas the YRI population had the closest frequency for SLC15A2 relative to the Kazakh population. Thus, for these three genes (UGT2B7, SLC15A2, and SLCO1B3), the Kazakh population showed similarities with three different populations. All three of these populations showed significant differences in these three genes.

Conclusion

In summary, our data provided important information for personalised medicine in the Kazakh population, supporting the genotyping of specific SNPs before administration of drugs with respect to the patient’s ethnicity. The allele frequencies of the studied SNPs were quite different in the Kazakh population compared with those for all of the other populations examined. Moreover, we could not classify the Kazakh population as Asian or Caucasian, indicating that the Kazakh population may have been formed from several populations belonging to different racial groups.

Our study had several limitations. First, we had only a small number of samples. In addition, it will be useful to perform comparative analysis of the frequencies of SNPs in the different Zhuzes in order to clarify that combining samples from all Zhuzes is acceptable. Unfortunately, in this study, we did not have sufficient data to classify individuals into Zhuzes, only by nationality. In future studies, we plan to increase the number of samples and to examine additional SNPs.

Methods

Characteristics of the study populations

A total of 320 individuals living in Astana during 2012–2013 and belonging to the Kazakh nationality participated in this study. All individuals included in the present study were unrelated and randomly selected from different regions of Kazakhstan. The mean (± standard deviation [SD]) age of the participants was 44.06 ± 17.98 years (age range: 19–86), and the population included 239 men and 81 women.

Blood samples were collected in clinics in the city of Astana (Republican research center of transfusion, National research cardiac surgery center and Medical center of the Presidential Administration of Kazakhstan). Blood samples were taken according to the study protocol, which was approved by the Ethics Committee of the National Center for Biotechnology of the Republic of Kazakhstan, Astana, Kazakhstan (No. 11, 14.02.2010), Republican research center of transfusion, National research cardiac surgery center and Medical center of the Presidential Administration of Kazakhstan.

Each participant was informed of the purpose and methods of the study, and written informed consent was obtained from all participants. Each volunteer filled out a questionnaire to collect standard personal data, including their nationality and the nationalities of their parents and grandparents. Based on the concept of Zhety ata, in which each Kazakh individual is expected to know seven generations of their ancestors, we were able to collect information on nearly seven generations from each volunteer. While the questionnaire included data only to the second generation, the ethnicities of ancestors from the third to seventh generations were determined according to a verbal survey. If an individual indicated that he or she had an ancestor who was not a Kazakh, the blood sample from this individual was excluded.

Genotyping

DNA was obtained from whole venous blood collected in EDTA-containing tubes. DNA was extracted by the salting-out method [19], and genotyping was performed using real-time polymerase chain reaction (PCR) with high-throughput OpenArray technology. Amplification was performed on a QuantStudio 12K Flex thermocycler (Life Technologies, USA) using pharmacogenomic PGx panels. The composition of the PCR mixture was as follows: OpenArray Genotyping Master Mix (2.5 μL/sample) and DNA sample of 50 ng/μL (2.5 μL/sample). The reaction volume was 5 μL. Each reaction mixture was covered by immersion oil. The PCR conditions were as follows: 10 min at 93°C; 50 cycles of 45 s at 93°C, 13 s at 94°C, and 2.14 min at 53.5°C; and incubation at 25°C for 2 min. Data processing was carried out using TaqMan Genotyper Software v. 1.3.

Statistical analysis

Statistical analysis was performed using Haploview 4.2 [20] and Arlequin 3.1 [21] software. The correspondence of the distributions of genotype frequencies to the Hardy-Weinberg equilibrium was assessed using the χ² criterion (preliminary analysis) and exact tests using a Markov chain. Data from the HapMap database were used for the comparative analysis of the differences in genotype and haplotype frequencies among Kazakh and world populations (HapMap Genome Browser release #27 [Phases 1, 2, & 3 - merged genotypes and frequencies]) [10]. The exact test of population differentiation (Markov chain) method was used for the analysis [9, 21].
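For a single biallelic SNP compared between two populations, the contingency table of allele counts is 2 × 2, and the reasoning behind an exact test of differentiation can be illustrated by enumerating all tables with the same margins (Fisher's exact test). This is a stand-in for, not a reproduction of, the Markov-chain procedure implemented in Arlequin. The function name is illustrative, and the allele counts are hypothetical values chosen to approximate the rs4244285 A-allele frequencies quoted in the Discussion (about 17 % in Kazakhs and 28 % in JPT).

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Rows are populations and columns are alleles. The p-value is the sum
    of the probabilities of all tables with the same margins that are no
    more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x copies of allele 1 in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical allele counts (A allele, other allele) per population.
kazakh = (109, 531)    # 640 chromosomes from 320 individuals, about 17 % A
other = (48, 124)      # assumed 172 chromosomes, about 28 % A

p = fisher_exact_2x2(*kazakh, *other)
print(f"p = {p:.4f}; differentiated at the 0.05 level: {p < 0.05}")
```

For SNPs or haplotypes with more than two alleles the number of possible tables grows quickly, which is the situation in which a Markov-chain approximation becomes the practical choice.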

Availability of supporting data

The data sets supporting the results of this article are included within the article and its additional files.