Background

In recent years, the global malaria epidemic has gradually scaled down, yet the death tolls remain high and the number of malaria cases in certain areas has increased [1]. Meanwhile, multiple technological restrains, including the difficulty to identify low-density Plasmodium infection [2], the expansion of drug resistant Plasmodium [3, 4], and the lack of new drugs in the anti-malarial pipeline [5], pose daunting challenges to the elimination of malaria. Although the local cases of malaria have not been reported since 2016 [6], China still needs to cope with a large number of imported malaria cases, especially from Africa and Southeast Asian countries [7]. From 2011 to 2018, 50.7% of the imported malaria cases in Yunnan Province were caused by Plasmodium vivax [8]. Therefore, the elimination of malaria calls for the strict enforcement to block the transmission of P. vivax.

Tafenoquine and primaquine are recommended by the World Health Organization (WHO) for the treatment of relapsed malaria owing to its efficiency of eradicating the hypnozoites of P. vivax via 5-hydroxy-primaquine (5-hydroxy-primaquine) formed by the enzyme CYP2D6 (Cytochrome P450, family 2, subfamily D, polypeptide 6) in human liver cells [9,10,11,12]. CYP2D6, also known as debrisoquine4-hydroxylase, is an isozyme in the Cytochrome P450 (CYP450) super family and an important phase I drug-metabolizing enzyme. Owing to its low capacity and high affinity, CYP2D6 is involved in the metabolism of ~ 30% of commonly used drugs, including primaquine although it only accounts for 4% of the total P450 enzyme protein in the liver [12]. Studies have found that the heterogeneity of CYP2D6 activity is predominantly governed by its genetic variation [13], and that decreased CYP2D6 isoenzyme activity caused by genetic polymorphisms would obstruct the generation of 5-hydroxy-primaquine, making large doses or repeated use of primaquine necessary to compensate for the declined efficacy of primaquine. Meanwhile, primaquine can cause life threatening haemolysis in patients with glucose-6-phosphate dehydrogenase (G6PD) deficiency, subsequently inflicting low efficacy in anti-relapse and acute hemolysis [9, 14, 15]. Therefore, the risk of G6PD deficiency should be identified prior to receiving anti-relapse treatment of vivax malaria, and that the genotype and enzyme activity of CYP2D6 should be tested as well [14, 16, 17].

CYP2D6 enzyme activity is measured mainly through phenotypic observation and genotypic analysis. Compared with phenotypic characterization, genotypic prediction manifests the advantages of greater stability, more convenience and being less affected by environmental or physiological factors [18,19,20]. CYP2D6 enzyme activity can be detected from the blood sample of the patients; genotypic sequencing is currently the common technical method to determine the allelic form of CYP2D6 gene [21,22,23,24,25] by identifying the locus status of the CYP2D6 gene [13] or allelic combinations [26, 27], thereby predicting the activity level of CYP2D6. However, as the number of alleles of CYP2D6 gene continues to increase, efforts should be made to improve the precision and inclusiveness of genotyping results [28]. The better understanding of the genetic polymorphism of CYP2D6, which metabolizes about 25% of all medications in the human liver, can not only help deal with individual adverse event and but also enable the more precise DNA genotyping-based prediction.

Primaquine is commonly used in the malaria endemic areas of Yunnan Province, yet the systematic survey on the genetic polymorphisms of CYP2D6 has been not conducted. In this study, Whole Genome Sequencing (WGS) was conducted to investigate the polymorphisms of CYP2D6 gene in the vivax malaria cases, thereby revealing the association between human CYP2D6 gene polymorphisms and the relapsed cases of vivax malaria.

Methods

Study subjects and sampling

A total of 120 subjects living in Yunnan province, China (97°31′ E to 106°11′ E; 21°8′ N to 29°15′ N) were enrolled in the study. The subjects were diagnosed of mono-infection of P. vivax by both microscopy examination and Plasmodium 18S rRNA gene detection (PCR) in Yunnan Province Malaria Diagnosis Referent Laboratory (YPMDRL) (Additional file 1). All the patients received oral chloroquine therapy (1550 mg in total) within 3 days, followed by the subsequent 8-day course of primaquine therapy (22.5 mg/day).

The subjects diagnosed from 2014 to 2018 were subdivided into suspected relapse of vivax malaria (SR group) and non-relapse of vivax malaria (NR group). The inclusion criteria of the SR group are: (1) the patients were re-diagnosed as P. vivax infection within the interval of 28–180 days after the initial diagnosis; (2) epidemiological investigation confirmed that the patients were not re-exposed to malaria endemic areas between the two malaria episodes; and (3) the comparative sequencing of CSP (Circumsporozoite Surface Protein) and MSP-1 (Merozoite Surface Protein 1) of P. vivax with those of the previously infected strains found 100% similarity and 0 variable sites.

A simple random sampling method was conducted to selected 75 NR cases from the 143 patients confirmed in 2018. The subjects in the NR group had no history of relapsed vivax malaria after 1 year of follow-up. The source of vivax malaria infection was determined through epidemiological investigations. The subjects had no travelling history to malaria endemic areas within 30 days before the onset of malaria were regarded as local cases, otherwise they were regarded as imported cases. The study was approved by Yunnan Institute of Parasitic Diseases and Ethical Committee document No. 201904 (January 25, 2019). Genotyping was performed by using the stored blood samples obtained from the suspected patients who experienced fever and chills. Informed consent was obtained from all the patients. The blood samples of the subjects were provided by the Centers for Disease Control and Prevention in Yunnan Province, according to the described protocol [29]. Dried blood spots (DBS) sampling was conducted by collected 0.6 ml of venous blood, which was stored in a dried tube at − 80 ℃ before the extraction of DNA.

Extraction of genomic DNA

Three dried blood spots (diameter = 5 mm) were made, and genomic DNA was extracted according to the manufacturer’s instructions of the QIAgen Mini Kit (Germany, QIAamp Company's DNA Mini Kit), according to the manufacture’s instruction. The extracted DNA was stored at − 20 ℃.

PCR amplification of CYP2D6 gene fragment

The primers of CYP2D6 for polymerase chain reaction (PCR) amplification were designed by using GenBank (https://www.ncbi.nlm.nih.gov/gene/), reference sequence (ID: NC_000022.11) was used as template, following the reaction condition in previous studies [23, 30]. The primers for first-round nested PCR covering the coding region of CYP2D6 gene exons1–4 were: 5′-CCAGTGACAGATAAGGGTGC-3′ and 5′-GACGTGGATAGGAGGTAC AGAG-3′; the primers for second-round nested PCR were: 5′-GGTGACTTCTCCGACCAG G-3′ and 5′-TTCCCAAACCCATCTATGC-3′. The final amplification region was 42,131,088–42,128,678, and the expected fragment length of the product was 2411 bp. The primers that cover exons 5–9 of CYP2D6 gene were: 5′-GCCGACTGAGCCCTGGGA GGTAGGTA-3′ and 5′-GCTGGGGCCTGAGACTT-3. The amplification area was 42,126,035–42,128,422, and the amplification products had an expected fragment length of 2388 bp. For all the PCR reaction systems, we used 2.6 μl DNA template, 14.0 μl 2 × Taq PCR hybrid system (QIAGEN, Germany), 0.7 μl upstream primer (20umol/L), 0.7 μl downstream primers (20umol/L). The total volume was adjusted to 25.0 μl with ddH2O. PCR reaction conditions were clarified as follows: 92 to 94 °C for 2–5 min; 92 to 94 °C for 10–30 s, 50 to 56 °C for 15–30 s, 68 to 72 °C for 2 to 3 min, 35 cycles; 68 to 72 °C for 7 min. The triplicated parallel repetition was adopted for each PCR reaction. The amplified products were observed using 1.5% agarose gel electrophoresis. The positive amplification products were sent to Shanghai Meiji Biomedical Technology Co, Ltd. for sequencing by using the Sanger method. Only the sequences that showed identical results in at least two tests were used for subsequent analysis.

CYP2D6 gene mutations analysis

The sequencing reads were aligned by using DNAStar and BioEdit software packages. The obtained DNA sequences were assessed by using the Basic Local Alignment Search Tool (BLAST, http://blast.ncbi.nlm.nih.gov/Blast.cgi) on the NCBI platform. The sequences having a homology of 100% and query cover of above 99% with the non-mutated “reference” sequence (ID: NC_000022.11) were considered as the human CYP2D6 gene sequence.

The sequence was further compared with the non-mutation reference sequence to identify 9 exon regions, the transcriptional starting point and the end point of CYP2D6 gene. The DNA sequences of the 9 exon regions were spliced into the maternal CDS (coding DNA sequence) of CYP2D6 gene in the order from exon 1 to exon 9. The paternal CDS was obtained by replacing the undisplayed diploid bases on the maternal CDS chain [31, 32], and the diploid bases were determined by searching the sequencing peaks (Additional file 2). This approach makes up for the shortcoming that the sequencer can only read one base signal for each site, whereas misses the other hybrid base. MEGA v5.04 software was used to analyse the missense mutations and synonymous mutation sites of maternal and paternal CDS, and the ID number of the defined mutation site was queried in Genbank. DnaSP 6.11.01 software was used to identify the haplotype of the CDS, to calculate the nucleic acid diversity index (π), expected heterozygosity (He) and other parameters [33].

The haplotype (allelic form) of CDSs was determined according to the criteria of Human Cytochrome P450 Allele Nomenclature Committee (NM_000106.6) [34] and previous study [35], and were termed to the level of sub-alleles (such as *10.001 and *39.001). The sub-alleles were then merged. Those haplotypes unmatched with known CYP2D6 alleles were grouped into the “other” category. The genotype of each sample was a diploid composed of the maternal CDS and parental CDS (such as *10/*39 and *1/*10). SPSS software (version 21.0; IBN; Chicago; IL) was used to conduct chi-square test on the frequency of genotypes and mutation loci co-existed in the SR cases and NR cases. The odds ratio (OR) of genotypes and mutation loci with statistical significance were calculated by comparing to the relapse rate of P. vivax. The significance level was set as P < 0.05; OR < 1 indicates that the haplotype is a protective factor, while OR < 1 indicates a risk factor.

Results

Demographics and clinical characteristics of the subjects and PCR amplification of human CYP2D6 gene

A total of 45 SR vivax malaria cases were confirmed by epidemiological investigation and gene sequence alignment (Additional file 3), including 43 cases with one suspected relapsed event, 1 case with two suspected relapsed events and 1 case with three suspected relapsed events. The demographic and clinical characteristics of the subjects are listed in Table 1. The male–female ratio was 3:1. Most of the patients presented only one suspected relapsed episode (95.6%).

Table 1 Demographic and clinical characteristics of the study cohort

Exons 1–4 and 5–9 of CYP2D6 gene in the blood samples collected from 45 SR patients and 75 NR patients were amplified by conducting PCR. The amplified product revealed a clear band at > 2000 bp, and therefore can be considered as the target band (Additional file 4). The CYP2D6 gene fragments of 75 NR patients and 44 SR patients were successfully amplified, expect for the PCR product of CYP2D6 gene was not effectively amplified for 1 SR patient.

Locus polymorphisms of CDS and its association with relapsed cases of vivax malaria

A total of 119 maternal CDS of CYP2D6 gene (Genbank accession number: MT339075-MT339193) were obtained. Base substitutions at 12 loci, including c.31 and c.100, on the 238 maternal and paternal CDS, were determined (Table 2). The proportions of third-base and first-base substitution in the triplet codon were 41.7% (5/12), and the proportion of second-base substitution was 16.6% (2/12). 7 missense mutation loci and 5 synonymous mutation loci were determined. 91.7% (11/12) of the mutation sites were found in the sequences of SR group, while 66.7% (8/12) of the mutation sites were determined in the sequences of NS group (Table 2).

Table 2 Polymorphisms of CDS mutation in CYP2D6 gene between different vivax malaria cases

The comparative results of 12 single nucleotide polymorphisms (SNPs) between the two groups were listed in Table 2. In the SR group, the frequency of mutation allele was the highest at c.1457 G > C (85.2%, 75/88), and c. 408 G > C (85.2%,75/88). In the NR group, the highest frequency of mutation was at c. 408 G > C (76.0%, 114/150), followed by c. 1457 G > C (72.0%, 108/150). Eight mutation loci found both in SR group and NR group were selected to analyse the association between loci mutation and vivax malaria relapse. Correlation analysis showed that the ratio of c.886C > T site mutation to the occurrence of vivax malaria recurrence was 2.167 (95% CI 1.104–4.252), and that statistically significance was determined (P˂ 0.05). Such result suggests that c.886 mutation increases the risk of recurrence of vivax malaria, which is 2.167 times higher than that of other SNPs (Table 2).

Haplotype diversity

A total of 23 haplotypes (Hap_1 ~ Hap_23) were identified amongst the 238 CDSs of CYP2D6 gene. Of them, 9 haplotypes were only observed in the sequences of SR group (π = 0.0015, He = 0.821), while 6 haplotypes were only found in the NR group (π = 0.0014 and He = 0.760). 8 haplotypes, including Hap_2, Hap_3, Hap_4, Hap_5, Hap_6, Hap_7, Hap_13 and Hap_14, were found in both SR and NR group. The compositions of every haplotype were shown in Fig. 1. Hap_3 did not contain mutation loci, and the rest haplotypes contained different mutation loci.

Fig. 1
figure 1

Haplotype structure of CYP2D6 gene. 9 exons are indicated by numbered boxes with DNA polymorphisms indicated on top. Predicted amino acid changes are indicated below; no change is synonymous mutation. Hap: Haplotypes; *m-*x: They could not meet the allele inclusion criteria provided by the Allele Nomenclature Committee

Alleles/genotypes and their association with vivax malaria relapse

Among the 23 haplotypes, 11 sub-alleles (such as *1.001, *1.011, *2.001 and *2.004) that could be accurately termed belonged to 5 star (*) alleles, including *1, *2, *4, *10 and *39. Moreover, 12 sub-alleles (*m-*x) had no referencing information. The most common alleles included *10 (45.4%, 108/238), *1 (19.7%, 47/238) and *39 (11.8%, 28/238) (Table 3). In addition, *10 alleles accounted for the highest proportion in both SR group and NR group, reaching the frequencies of 38.6% (34/88) and 49.3% (74/150), respectively, yet the difference did not reach statistically significant level (x2 = 2.560, P > 0.05). The frequency of alleles *1 in the NR group (x2 = 6.193, P < 0.05), and the frequency of alleles *2 in SR group was significantly higher (x2 = 16.177, P < 0.05).

Table 3 The association between CYP2D6 genotypes and the relapse of vivax malaria

The current study defined 32 sub-allelic genotypes, which can be classified into 24 star alleles genotypes (Additional file 5). Among them, the frequency of *10/*10 was the highest in both SR group and NR group, reaching 24.2% (12/44) and 32.0% (24/75), respectively (Additional file 5). In addition, 8 sub-allelic genotypes, including *4/*4, *4/*o, *2/*39, *39/*m, *39/*x, *1/*r, *1/*n, *v/*10, were only detected in SR group. We conducted correlational analysis on the 9 genotypes distributed in both two groups (including *1/*1, *2/*2, *1/*2, *1/*10, *10/*10, *10/ *39, *39/*39, *10/*s and *39/*s), and revealed that the odd ratios of all genotypes were not statistically significant (P > 0.05) (Table 3). This result suggested that the different CYP2D6 genotypes in this study did not affect the recurrence of vivax malaria.

Discussion

Located on chromosome 22q13.1, CYP2D6 gene (4383 bp in length) consists of 8–9 exons and 7–8 introns. Its CDS is 1338–1491 bp in length and encodes 446–497 amino acids. The heterogeneity of CYP2D6 monooxygenase activity varies from complete dysfunction to ultra-rapid enzyme metabolism, depending on the various genotypes of CYP2D6 [14]. Among the 150 identified alleles, multiple mutations such as CYP2D6*3, CYP2D6*4, CYP2D6*5 and CYP2D6*6 are considered to represent null CYP2D6 enzyme activity [30, 32, 33, 36].

The current study is the first one that attempts to analyse CYP2D6 polymorphisms in the entire coding region sequence in the blood specimen of relapsed vivax malaria patients in Yunnan. The current study reveals the association between CYP2D6 polymorphisms and thwarted CYP2D6 enzyme activity, which is accountable to the failed treatment of using primaquine. This study found that mutation at c. 886 locus of CYP2D6 gene elevated the risk of vivax malaria relapse risk by 2.167-fold (P < 0.05). In the findings of Wang et al. [37] and Gaedigk et al. [38], it is concluded that the relapse of vivax malaria may be relate to c.886. locus mutation of CYP2D6 gene in Yunnan Province.

Among the 23 alleles defined by the 23 haplotypes in all the CDSs, CYP2D6*10 accounts for the highest proportion, reaching as high as 45.4% (108/238). Consistently, previous studies found the highest frequency of CYP2D6*10 alleles mutation in the Chinese Han population and in Asian subjects [31, 39,40,41]. Although the frequency of CYP2D6*10 alleles did not show statistical significant difference between the SR group and the SR group, CYP2D6*10 mutation was considered to indicate decreased enzyme activity [24, 34]. This notion suggested that the wide distribution of CYP2D6 gene*10 alleles in the population of Yunnan should not be disregarded, as it will amplify the accumulated risk of reduced therapeutic efficacy of primaquine in eliminating P. vivax hypnozoites in Yunnan Province.

The current study found that CYP2D*2 allele was more frequently determined in the relapsed patients. This result was not in agreement with that of Brasil et al. [30], which reported that the frequency of CYP2D*2 mutation was significantly higher in the non-relapsed patients. Such inconsistent results might be attributed to the different subjects enrolled in the two studies. Specifically, the subjects included in this study were vivax malaria patients who live in the malaria endemic areas in Yunnan Province, hence the long-term malaria epidemic environment could positively screen the human defective genes [42,43,44].

In this study, all the alleles were identified based on the mutations in the exon regions of CYP2D6, which is different from allelic recognition that covers intron mutations [32, 45]. Therefore, the homogeneity of CYP2D6*2 allele in the two studies could not be ruled out. Although CYP2D6*2 allele is generally considered to be an allelic type of CYP2D6 normal enzyme activity, multiple mutation that contains c. 886 locus could result in lowered activity of CYP2D6 enzyme [38, 42]. At present, the complicated multi-directional change of CYP2D6 enzyme activity caused by *2 allelic mutation is yet to be validated [35, 37, 38, 45].

Eight mutant allelic types were found only in the SR group, yet the association between these different CYP2D6 genotypes and the relapse of vivax malaria could not be established. Given that the *10/*10 genotype still accounted for the largest proportion (30.3%) in the overall study cohort and the no-function *4/*4 genotype was found in the SR group [39], the risk of declined therapeutic efficacy of primaquine caused by defective CYP2D6 enzyme activity in malaria endemic areas should not be ignored. In addition, effort should be made to validate whether the one case of amplification failure case in this study was caused by full gene deletion (*5/*5) or other factors.

The limitations of the current study should be noted. Firstly, the sample size of suspected recurrence of vivax malaria is not large enough, and the alleles (*4, *m-*x,) of CYP2D6 genes with low frequency of distribution might not be quantitatively analyzed (Additional file 5). Secondly, while the re-infection with P. vivax was ruled out from the SR group, the influences of primaquine doses and primaquine resistance were not evaluated. Finally, polymorphisms in the entire coding region of CYP2D6 gene were investigated yetthe possible mutation in intron region and the copy number of CYP2D6 gene, which has been considered to disturb the activity of CYP2D6 enzyme, were not analysed. Thusly, the activity of CYP2D6 cannot be indirectly predicted based on the results of CYP2D6 genotyping in the present study. In view of such shortcoming, the sampling size should be expanded to conduct molecular epidemiological investigations on CYP2D6 genotype.

Currently, the number of CYP2D6 gene CDS has increased to 320, and consistent results pertaining to the loci mutation and allelic type were found. Moreover, multiplex ligation-dependent probe amplification (MLPA) and other advanced methods will be applied to optimize the accurate identification of human CYP2D6 genotypes, so as to swiftly predict CYP2D6 enzyme activity of individual subjects. These findings will contribute to the efficacy of primaquine in the treatment of vivax malaria in Yunnan province.

Conclusions

This pilot study reveals the polymorphisms of CYP2D6 gene coding region and its association with the recurrence of vivax malaria in the population of Yunnan Province, suggesting that the radical cure of vivax malaria by using primaquine could be hampered due to the impaired CYP2D6 enzyme activity in the population of Yunnan Province. The study revealed that (1) CYP2D6*10 allele accounts for the highest proportion in the vivax malaria cases of Yunnan Province; (2) mutation at c. 886 locus is most closely related to the relapse of vivax malaria and CYP2D6*2 allele containing this mutation locus was significantly higher in SR; and (3) The *4/*4 genotype that indicates null CYP2D6 enzyme function (poor metabolizers) was only found in the SR cases.