Heterogeneous genetic diversity pattern in Plasmodium vivax genes encoding merozoite surface proteins (MSP) -7E, −7F and -7L

Garzón-Ospina, Diego; Forero-Rodríguez, Johanna; Patarroyo, Manuel A

doi:10.1186/1475-2875-13-495

Research
Open access
Published: 13 December 2014

Volume 13, article number 495, (2014)

Malaria Journal

Diego Garzón-Ospina¹,², Johanna Forero-Rodríguez¹ & Manuel A Patarroyo¹,²

Abstract

Background

The msp-7 gene has become differentially expanded in the Plasmodium genus; Plasmodium vivax has the highest copy number of this gene, several of which encode antigenic proteins in merozoites.

Methods

DNA sequences of the P. vivax (pv) msp-7E, −7F and -7L genes from thirty-six Colombian clinical isolates were analysed to characterize the genetic diversity of these pvmsp-7 members, which are expressed during the intra-erythrocyte stage; natural selection signals that could have produced the observed variation pattern were also evaluated.

Results

The pvmsp-7E gene was highly polymorphic compared to pvmsp-7F and pvmsp-7L which were seen to have limited genetic diversity; pvmsp-7E polymorphism was seen to have been maintained by different types of positive selection. Even though these copies seemed to be species-specific duplications, a search in the Plasmodium cynomolgi genome (P. vivax sister taxon) showed that both species shared the whole msp-7 repertoire. This led to exploring the long-term effect of natural selection by comparing the orthologous sequences which led to finding signatures for lineage-specific positive selection.

Conclusions

The results confirmed that the P. vivax msp-7 family has a heterogeneous genetic diversity pattern; some members are highly conserved whilst others are highly diverse. The results suggested that the 3′-end of these genes encodes the MSP-7 proteins’ functional region whilst the central region of pvmsp-7E has evolved rapidly. The lineage-specific positive selection signals found suggested that mutations occurring in msp-7 genes during host switch may have succeeded in adapting the ancestral P. vivax parasite population to humans.

Background

Malaria remains a major public health problem worldwide. Plasmodium falciparum is the parasite species causing the lethal form of the disease whilst Plasmodium vivax has long been considered a parasite causing mild disease, thereby diverting research attention away from this species; however, recent studies have reported that this species also causes severe clinical syndromes[1, 2]. Even though both species infect humans, they emerged from different evolutionary lineages; whilst P. vivax shares a common ancestor with Asian non-human primate malaria parasites, P. falciparum has diverged from parasites infecting great apes[3].

The different evolutionary paths leading to the appearance of P. vivax and P. falciparum have also led to important differences regarding hosts being invaded by both species[4, 5]. In spite of such differences, initial interaction between the parasite and red blood cells (RBC) seems to be directed by the MSP-1 protein[6–8] which is present in all species from the genus. MSP-1 forms a complex with MSP-6 and MSP-7 in P. falciparum[9–11]; the latter protein is encoded by a gene forming part of a multigene family which has been differentially expanded amongst Plasmodium species[12]. Studies involving msp-7 family members have shown that the resulting protein products are located on the parasite membrane and that a 22 kDa C-terminal fragment (derived from proteolytic processing during parasite development)[10] has regions interacting with RBC[13]. The msp-7 knockout in P. falciparum (pfmsp-7I) and Plasmodium berghei (pbmsp-7B) has shown that even though its absence is not lethal, it does reduce mutant parasite invasion ability[14, 15]. These results, together with prior in silico analysis, have suggested that the members of this family could have functional redundancy[12, 15, 16] and their protein products (or some of them) could thus be involved in invasion. On the other hand, antigenicity studies have shown that some of these genes’ protein products are recognized by sera from infected patients[17, 18]. Antibodies directed against these proteins can inhibit parasite invasion of RBC[19], whilst immunization with members of the Plasmodium yoelii msp-7 family has shown that they can confer protection in vaccinated mice following experimental challenge[20].

The genetic variability patterns observed in msp-7 family members have been different between P. falciparum and P. vivax[21–24]; whilst members of the former species have low polymorphism[23, 24], some members of P. vivax (pvmsp-7C, pvmsp-7H and pvmsp-7I) are highly polymorphic[21]. However, other members, such as pvmsp-7A and pvmsp-7K, are amongst the most conserved P. vivax antigens[22]. There are thirteen msp-7 genes in this species’ chromosome 12; these have been named in alphabetical order according to their location regarding the PVX_082640 gene[12]. Eleven of these genes are transcribed, but only seven of them are transcribed during the last hours of the intra-erythrocyte stage[25]. The genetic diversity of four of these seven genes has already been evaluated[21, 22]; this study was therefore aimed at evaluating the genetic variability of the three remaining members (pvmsp-7E, pvmsp-7F and pvmsp-7L) which are expressed during the intra-erythrocyte stage. pvmsp-7E displayed high polymorphism and its central region had undergone rapid evolution whilst pvmsp-7F and pvmsp-7L were seen to be highly conserved. The genes’ 3′-ends tended to be conserved by negative selection, suggesting that they encode the functional region for these proteins. Similar to what happened with the msp-1 gene[26, 27], msp-7 genes seem to have diverged due to positive selection, which could have resulted from malaria parasites’ adaptation to different hosts.

Methods

Ethics statement

All P. vivax-infected patients who provided us with the blood samples were informed about the purpose of the study and all gave their written consent. All procedures carried out in this study were approved by the ethics committee of the Fundación Instituto de Inmunología de Colombia.

Parasite DNA and genotyping

Thirty-six peripheral blood samples from patients who tested positive for P. vivax malaria by microscope examination were collected from several of Colombia’s departments (Chocó and Nariño in the south-west, Guainía, Guaviare and Meta in the south-east, Tolima in the mid-west, and Atlántico, Antioquia and Córdoba in the north-west) between 2007 and 2010 (nine isolates in 2007, seven isolates in 2008, eight isolates in 2009 and twelve isolates in 2010). DNA was obtained using a Wizard Genomic DNA Purification kit (Promega), following the manufacturer’s instructions, and stored at −20°C until use. The parasite samples were genotyped by PCR-RFLP of the pvmsp-1 gene as previously described[28]. Samples having single P. vivax msp-1 allele infection were used for PCR amplification.

PCR amplification and sequencing

Primers were designed for amplifying pvmsp-7E, pvmsp-7F and pvmsp-7L DNA fragments, based on Sal-I sequences (PlasmoDB IDs: PVX_082665, PVX_082670 and PVX_082700, respectively). The pvmsp-7E gene fragment was amplified with 7Edto 5′ GCCGATCTGTTGTCTTTTCC 3′ and 7Erev 5′ CCTTACGACACGTCAAATGG 3′ primers. pvmsp-7F was amplified by using 7Fdto 5′ TCCTCTCCTTGCTGATACTCC 3′ and 7Frev 5′ CAGCCGCTTAAATCACTTC 3′ primers whilst pvmsp-7L was amplified with 7Ldto 5′ AGTACTATTCTTCTTGCCGTCC 3′ and 7Lrev 5′ TCCCCTCAGTAGTAAAACATCG 3′ primers. All PCR reactions were performed using KAPA HiFi HotStart Readymix containing 0.3 μM of each primer in a final 25 μL volume. Thermal conditions were set as follows: one cycle of 5 min at 95°C, 30 cycles of 20 sec at 98°C, 15 sec at 63°C, 30 sec at 72°C, followed by a 5 min final extension at 72°C. PCR products were purified using the UltraClean PCR Clean-up (MO BIO) kit, and then sequenced with a BigDye Terminator kit (Macrogen, Seoul, South Korea) in both directions. Three PCR products obtained from independent PCR amplifications were sequenced per isolate to discard errors. Sequences having a different haplotype to the previously reported ones were deposited in the GenBank database (accession numbers KM212276 - KM212302).
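As a quick sanity check on the primers listed above, the following minimal sketch reports the length, GC content and reverse complement of each one. The primer sequences are copied from the text; the helper names (`gc_content`, `reverse_complement`) are illustrative only and are not part of the original workflow.

```python
# Primer sequences (5'->3') copied from the PCR amplification section.
PRIMERS = {
    "7Edto": "GCCGATCTGTTGTCTTTTCC",
    "7Erev": "CCTTACGACACGTCAAATGG",
    "7Fdto": "TCCTCTCCTTGCTGATACTCC",
    "7Frev": "CAGCCGCTTAAATCACTTC",
    "7Ldto": "AGTACTATTCTTCTTGCCGTCC",
    "7Lrev": "TCCCCTCAGTAGTAAAACATCG",
}

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_content(seq):.2f}, "
          f"reverse complement = {reverse_complement(seq)}")
```

The reverse complement of each reverse primer is the sequence that would be expected on the coding strand of the reference gene.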

Phylogenetic analysis for Plasmodium cynomolgi msp-7 orthologous identification

An approach similar to that used for msp-7 identification in other Plasmodium species[12] was adopted for identifying msp-7 genes in Plasmodium cynomolgi (pc) and establishing their orthologous relationships. The genomic region (obtained from GenBank, accession number: NC_020405) flanked by the PCYB_122860 and PCYB_122720 genes (homologous to PVX_082640 and PVX_082715, which delimit the msp-7 region in P. vivax) was analysed using ORF Finder[29] and Gene Runner software for identifying open reading frames (ORFs) encoding proteins larger than 300 amino acids. Deduced amino acid sequences obtained with Gene Runner were aligned with P. vivax (12 proteins) and Plasmodium knowlesi (5 proteins) MSP-7 sequences using the MUSCLE algorithm[30]. The best model for amino acid substitutions was selected by Akaike’s information criterion using the ProtTest program[31]. Phylogenetic trees were inferred through Maximum Likelihood (ML) and Bayesian (BY) methods using the JTT+G model. The observed amino acid frequencies (JTT+G+F) were also considered in Bayesian phylogenetic analysis and the analysis was run for one million generations. ML topology reliability was evaluated by bootstrap, using 1,000 iterations, whilst the sump and sumt commands in Bayesian analysis were used for tabulating posterior probability and building consensus trees. MEGA v.5 software was used for ML analysis and MrBayes v.3.2 software for assessing Bayesian inference. The P. falciparum MSP-7H (Pf MSP-7H) sequence was used as outgroup in both methods.
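Model selection by Akaike’s information criterion can be illustrated with a short sketch. AIC is computed as 2k − 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood; the model with the lowest AIC is preferred. The log-likelihoods and parameter counts below are hypothetical values chosen for illustration only; they are not ProtTest output from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# HYPOTHETICAL (log-likelihood, number of free parameters) per model.
# +F adds 19 free amino acid frequencies in this toy example.
candidates = {
    "JTT": (-10250.0, 45),
    "JTT+G": (-10100.0, 46),
    "JTT+G+F": (-10095.0, 65),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("Preferred model:", best)
```

In this toy case the extra frequency parameters of the +F model improve the likelihood too little to offset the penalty, so the simpler model is retained.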

DNA diversity and evolutionary analysis in pvmsp-7 genes

CLC Main workbench (CLC bio, Cambridge, MA, USA) software was used to assemble forward and reverse sequences from three independent PCR fragments per isolate. Deduced amino acids from Colombian isolates’ pvmsp-7 sequences and those obtained from several sequencing projects (Sal-I, Brazil-I, Mauritania-I, India-VII and North Korean reference sequences)[4, 32] were aligned using the MUSCLE algorithm[30], followed by manual editing. PAL2NAL software[33] was then used for inferring codon alignments from the aligned amino acid sequences. The T-REKS algorithm[34] was used for searching for repeats having 90% similarity with the deduced msp-7 amino acid sequences.

DnaSP v.5 software[35] was used for calculating the number of segregating sites (Ss), the number of singleton sites (s), the number of parsimony-informative sites (Ps), the number of haplotypes (H), the Watterson estimator (θ_W) and nucleotide diversity per site (π) for all available sequences (reference sequences and Colombian ones), as well as for the Colombian population alone. Departure from the neutral model was assessed in the Colombian population by frequency spectrum-based tests (Tajima’s D, Fu and Li’s D* and F* statistics, and Fay and Wu’s H) and by tests based on the distribution of haplotypes (Fu’s Fs, the K-test and the H-test). For the H-test, the haplotype diversity obtained from DnaSP software was multiplied by (n-1)/n according to Depaulis and Veuille[35, 36]. DnaSP v.5 and/or ALLELIX software were used for these tests, coalescent simulations being used for obtaining confidence intervals[35]. Positions containing gaps or repeats in the alignment were not taken into account.
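The summary statistics named above can be sketched in a few lines. The example below is not DnaSP; it is a minimal illustration, on a toy alignment invented for this purpose, of how the number of segregating sites, nucleotide diversity (π), the Watterson estimator and Tajima’s D relate to one another. Function names are illustrative, and columns containing gaps are skipped, mirroring the rule stated in the text.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Indices of gap-free alignment columns carrying more than one base."""
    sites = []
    for i in range(len(seqs[0])):
        column = {s[i] for s in seqs}
        if "-" not in column and len(column) > 1:
            sites.append(i)
    return sites


def mean_pairwise_differences(seqs, sites):
    """Average number of differences between all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a[i] != b[i] for i in sites) for a, b in pairs)
    return total / len(pairs)


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and k."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# TOY alignment (not data from this study).
toy = ["ATGCATGC", "ATGCATGA", "ATGTATGA", "ATGTATGC"]
n, length = len(toy), len(toy[0])
sites = segregating_sites(toy)
s = len(sites)
k = mean_pairwise_differences(toy, sites)
pi = k / length                                   # nucleotide diversity per site
theta_w = s / sum(1 / i for i in range(1, n)) / length  # Watterson, per site

print(f"S = {s}, pi = {pi:.4f}, theta_W = {theta_w:.4f}, "
      f"Tajima's D = {tajimas_d(n, s, k):.3f}")
```

D is positive when π exceeds the Watterson estimate, that is, when intermediate-frequency variants are in excess; significance would still have to be assessed, for example by coalescent simulation as described above.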

Natural selection was assessed by using the modified Nei-Gojobori method[37] to calculate non-synonymous (d_N) and synonymous (d_S) substitution rates. Differences between d_N and d_S were assessed by applying Fisher’s exact test (suitable for d_N and d_S < 10[38]) and the Z-test available in MEGA software v.5[39]. The Datamonkey web server[40] was used for assessing codon sites under positive or negative selection at population level, along with the IFEL codon-based maximum likelihood method[41]. Positively or negatively selected sites were also assessed by FEL, SLAC, REL[42], MEME[43] and FUBAR[44] methods. A <0.1 p-value was considered significant for IFEL, FEL, SLAC and MEME methods, a >50 Bayes factor for REL and a >0.9 posterior probability for FUBAR. Recombination was considered before running these tests. The branch-site REL method was used for identifying branches (lineages) in which a proportion of sites has evolved under positive selection, for exploring the long-term selection effect. Non-synonymous divergence (K_N) and synonymous divergence (K_S) rates were also calculated using the modified Nei-Gojobori method[37] with Jukes-Cantor correction[45]. Positive and negative selection at every codon for P. vivax/P. cynomolgi msp-7 alignments were also evaluated by FEL, SLAC, MEME, REL and FUBAR methods.
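The Jukes-Cantor correction mentioned above converts an observed proportion of differing sites p into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3), which accounts for multiple hits at the same site. The sketch below applies it to hypothetical proportions of non-synonymous and synonymous differences; counting the sites themselves (the Nei-Gojobori step) is not reproduced here, and the input values are invented for illustration.

```python
from math import log


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * log(1 - 4 * p / 3)


# HYPOTHETICAL proportions of differences per non-synonymous / synonymous site.
p_n, p_s = 0.05, 0.10

k_n = jukes_cantor(p_n)   # corrected non-synonymous divergence
k_s = jukes_cantor(p_s)   # corrected synonymous divergence
omega = k_n / k_s         # ratio below 1 is compatible with negative selection

print(f"K_N = {k_n:.4f}, K_S = {k_s:.4f}, omega = {omega:.3f}")
```

The correction is small for low p and grows quickly as p approaches 0.75, which is why it matters more for between-species divergence than for within-population polymorphism.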

Linkage disequilibrium (LD) was evaluated by calculating the Z_nS statistic[46]. Linear regression between LD and nucleotide distances was evaluated to ascertain whether recombination was taking place in pvmsp-7 genes. Recombination was also assessed by the GARD method[47] and by ZZ[48] and RM tests[49]. RDP3 v3.4 software was used for detecting recombinant fragments in pvmsp-7 genes[50].
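The Z_nS statistic is the average of the squared allelic correlation (r²) over all pairs of segregating sites. A minimal sketch is given below, assuming biallelic sites only (sites with three or more alleles or with gaps are skipped, which is a simplification made here) and using toy alignments invented for illustration.

```python
from itertools import combinations


def biallelic_sites(seqs):
    """Indices of gap-free columns with exactly two alleles."""
    sites = []
    for i in range(len(seqs[0])):
        alleles = {s[i] for s in seqs}
        if len(alleles) == 2 and "-" not in alleles:
            sites.append(i)
    return sites


def r_squared(seqs, i, j):
    """Squared correlation between alleles at sites i and j."""
    n = len(seqs)
    a, b = seqs[0][i], seqs[0][j]          # arbitrary reference alleles
    p_a = sum(s[i] == a for s in seqs) / n
    p_b = sum(s[j] == b for s in seqs) / n
    p_ab = sum(s[i] == a and s[j] == b for s in seqs) / n
    d = p_ab - p_a * p_b                    # linkage disequilibrium coefficient
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(seqs):
    """Average r^2 over all pairs of biallelic segregating sites."""
    pairs = list(combinations(biallelic_sites(seqs), 2))
    if not pairs:
        return float("nan")
    return sum(r_squared(seqs, i, j) for i, j in pairs) / len(pairs)


# TOY alignments: complete association versus no association.
print(zns(["AA", "AA", "TT", "TT"]))
print(zns(["AC", "AA", "TA", "TC"]))
```

Plotting r² for each pair against the distance between the two sites, and fitting a line, gives the LD decay analysis described in the text; a negative slope is the pattern expected under intragenic recombination.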

Results and discussion

Genotyping natural isolates

The thirty-six samples used in this study were genotyped by PCR-RFLP of the pvmsp-1 marker. All samples were infected by a single strain (a single P. vivax msp-1 allele was detected) and considered for PCR amplification of pvmsp-7 genes. The RFLP pattern confirmed the presence of different genotypes in the isolates so obtained. In spite of all samples amplifying the pvmsp-1 fragment, no amplimers were detected in some samples for some of the msp-7 genes (pvmsp-7E n = 31, pvmsp-7F n = 36 and pvmsp-7L n = 31).

The msp-7 family structure in Plasmodium cynomolgi and phylogenetic analysis

Prior analysis has suggested that there are several msp-7 species-specific duplications in P. vivax[12]. The recent sequencing of the P. cynomolgi genome[51], a species phylogenetically close to P. vivax[3], has meant that new sequences from this multigene family are now available. The P. cynomolgi genomic region flanked by the PCYB_122860 and PCYB_122720 genes contained eleven ORFs, 0.9 to 1.4 kb in length, having the same transcription orientation. A shorter 0.5 kb fragment having 30% similarity with the identified ORFs was also observed. A 314 bp region having 75.8% identity with the 285 bp fragment in P. vivax between the pvmsp-7I and pvmsp-7K genes was also found (Figure 1). The P. cynomolgi msp-7 genes (and/or fragments) were named in alphabetical order, according to their location regarding the PCYB_122860 gene (Figure 1). Contrasting with PlasmoDB annotation, our group found that PcMSP-7C, −7F, −7H, −7I, −7K proteins might be encoded by a single exon like P. vivax MSP-7K[52] (Additional file 1). The deduced amino acid sequences from these ORFs had a signal peptide, but no membrane-anchoring regions. The domain characteristic of the MSP-7 family (MSP7_C, Pfam domain ID: PF12948) was absent in the deduced PcMSP-7L protein sequence due to a premature stop codon.

Orthologous relationships were established for P. cynomolgi MSP-7 (PcMSP-7) sequences by inferring phylogenies, using these sequences together with P. vivax (Pv) and P. knowlesi (Pk) MSP-7 sequences (Figure 2) with the P. falciparum MSP-7H (PfMSP-7H) as outgroup. The topologies revealed eleven clades having good statistical support (Figure 2); each P. cynomolgi MSP-7 had a counterpart in P. vivax or P. knowlesi. However, clustering for PcMSP-7B and PcMSP-7E differed regarding the other MSP-7s. PcMSP-7E formed a group with PvMSP-7E (Figure 2A) in ML topology, even though not having very high statistical support (72%). The BY topology method (Figure 2B) gave a subgroup formed by PcMSP-7B and PcMSP-7E which appeared as an external PvMSP-7B/PvMSP-7E group suggesting that these genes are inparalogous and, therefore, the duplication events occurred after the divergence of P. cynomolgi and P. vivax. Even though pcmsp-7B and pcmsp-7E did not form a group with pvmsp-7B and pvmsp-7E, respectively, their location regarding PVX_082640 and PCYB_122860 genes was the same (Figure 1). Moreover, genetic distances between pcmsp-7B and pcmsp-7E (and/or pvmsp-7B and pvmsp-7E) were similar to those between pcmsp-7B and pvmsp-7B (or pcmsp-7E and pvmsp-7E). Furthermore, genetic distance between pcmsp-7B and pvmsp-7B was smaller than that between pcmsp-7B and pvmsp-7E and the distance between pcmsp-7E and pvmsp-7E was less than the distance between pcmsp-7E and pvmsp-7B (Additional file 2). Consequently, there is little probability that duplication events following the divergence of these two species independently led to the same order of msp-7B and msp-7E genes. Evidence of gene conversion (a mechanism leading to paralogous homogenization) between P. vivax msp-7 members has been reported previously[21]; if such a mechanism occurs between msp-7B and msp-7E genes it would be expected that they would become clustered in the phylogeny and not with their counterparts from a sister taxon; however, further analysis is needed for confirming this hypothesis.

The aforementioned results have shown that P. vivax and P. cynomolgi share the whole msp-7 repertoire described to date, suggesting that the duplications which gave rise to the msp-7B, −7C, −7E, −7F, −7I, −7L and -7M genes occurred before the divergence between P. vivax and P. cynomolgi and after their divergence from P. knowlesi, and thus msp-7B, −7C, −7E, −7F, −7I, −7L and -7M are not exclusive to P. vivax.

pvmsp-7E genetic diversity

A total of 168 segregating sites were found in the pvmsp-7E gene (Table 1), showing that this gene is highly polymorphic. The nucleotide diversity (π) estimated for this gene (Table 1) was comparable to that found in other genes encoding surface proteins (pvmsp-1[53], pvmsp-3[54] and pvmsp-5[55, 56]) as well as other members of the pvmsp-7 family[21]. Even though genes such as pvmsp-1, pvmsp-3 and pvmsp-5 are highly polymorphic, their diversity at protein level is usually located in specific regions. These regions are usually immune response targets and have thus tended to evolve more rapidly, accumulating mutations which alter the protein sequence and thereby evading the host’s immune system. Around 60% of the polymorphism found in pvmsp-7E was located at the gene’s central region whilst the gene’s ends were relatively highly conserved (Additional files 3 and 4). This pattern has been previously observed in other pvmsp-7 genes[21].

Table 1 DNA polymorphism measurements for pvmsp-7 genes

Regarding DNA, 23 haplotypes have been found worldwide in pvmsp-7E (Table 1 and Additional file 4); fourteen different haplotypes have been found in this gene’s 3′-end, whilst eleven haplotypes have been found in the central region and five in the 5′-end (Additional file 3). Nineteen of these 23 haplotypes were found in the Colombian population (Table 1 and Additional file 4; haplotype 10: 26%; haplotypes 5, 9: 15%; haplotypes 7, 11, 13, 18: 10%; and haplotypes 6, 8, 12, 14–17, 19–23: 5%) over the course of a 3-year period (2007–2010) without any longitudinal or spatial trends. Thirteen haplotypes were found in Colombia at amino acid level (Additional file 5); ten haplotypes having similar frequencies were observed in the pvmsp-7E central region in the Colombian population whilst three and four haplotypes were distinguished towards the N- and C-terminals, respectively (Additional file 5).

The T-REKS algorithm did not find repeats in the deduced protein sequences. Of the thirty-six sequences analysed for this gene, the North Korean sequence had a premature stop codon, suggesting that pvmsp-7E is a pseudogene in this strain; however, the annotation in the Broad Institute for the North Korean pvmsp-7E gene (accession: PVNG_00513.1) suggests that this gene could have an intron. Further cDNA analysis is needed to resolve this issue regarding this strain.

pvmsp-7E neutrality and selection tests

Several tests based on the neutral model of molecular evolution were used with pvmsp-7E sequences from the Colombian population for evaluating whether this gene deviated from neutral expectations (Table 2 and Additional file 6). Tests based on the polymorphism frequency spectrum had significant values (Table 2). Fu and Li’s D* and F* estimators had values greater than zero whilst Fay and Wu’s H estimator had statistically significant negative values (Table 2), indicating deviation from the neutral model of evolution. In addition, the haplotype distribution-based tests also gave statistically significant values; Fu’s Fs test gave values greater than zero and the haplotype number (18) and haplotype diversity (0.922) were lower than expected under neutrality (Table 2).
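The two haplotype-based quantities used here, the number of haplotypes (K) and haplotype diversity, can be sketched as follows. The example computes Nei’s unbiased haplotype diversity, Hd = n/(n−1) × (1 − Σp²), and then applies the (n−1)/n factor described in the Methods to obtain the value used for the H-test. The sequences are toy data invented for illustration, and the observed values would still need to be compared against coalescent simulations to judge significance.

```python
from collections import Counter


def haplotype_stats(seqs):
    """Return (number of haplotypes, Nei's Hd, Hd rescaled by (n-1)/n)."""
    n = len(seqs)
    counts = Counter(seqs)                          # identical sequences = one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    hd_nei = n / (n - 1) * (1 - sum_sq)             # unbiased haplotype diversity
    h_rescaled = hd_nei * (n - 1) / n               # equals 1 - sum of squared frequencies
    return len(counts), hd_nei, h_rescaled


# TOY sample of four sequences carrying three distinct haplotypes.
toy = ["ATGC", "ATGC", "ATGA", "TTGC"]
k_haps, hd, h_test = haplotype_stats(toy)
print(f"K = {k_haps}, Hd = {hd:.4f}, H (rescaled) = {h_test:.4f}")
```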

Table 2 Neutrality, linkage disequilibrium and recombination tests for pvmsp-7 genes for the Colombian population

A sliding window for D, D*, F* and H statistics gave values greater than zero (D, D* and F*) in the gene’s central region and lower than zero in the 3′-end, this being the region where the most negative value for H was located (Additional file 7). The gene was divided into three fragments: 5′-end (nucleotide 1 to 390), central (nucleotide 391 to 747) and 3′-end (nucleotide 748 to 1158); the aforementioned neutrality estimators were calculated for each of them (Additional file 6). Fu and Li’s D* and Fu’s Fs tests gave values greater than zero at the 5′-end whilst Tajima’s D, Fu and Li’s D*, F* and Fu’s Fs test scores were greater than zero in the central region. The haplotype number and haplotype diversity were lower than expected under neutrality in the 5′-end and central region. Only Fay and Wu’s H tests gave statistically significant values at the 3′-end (Additional file 6).
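A sliding-window analysis simply recomputes a statistic on successive slices of the alignment. The sketch below uses nucleotide diversity as the statistic and a toy alignment; the window length and step size are arbitrary assumptions made for the example, since the values used in the study are not stated in this section.

```python
from itertools import combinations


def pi_per_site(seqs):
    """Nucleotide diversity: mean pairwise differences divided by length."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / len(pairs) / len(seqs[0])


def sliding_window(seqs, window, step, stat=pi_per_site):
    """Return (start, end, value) per window, using 1-based coordinates."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + 1, start + window, stat(chunk)))
    return results


# TOY alignment; window = 4 and step = 4 are ASSUMED values.
toy = ["ATGCATGC", "ATGCATGA", "ATGTATGA", "ATGTATGC"]
windows = sliding_window(toy, window=4, step=4)
for start, end, value in windows:
    print(f"{start}-{end}: pi = {value:.4f}")
```

Passing a different function as `stat` (for example one returning Tajima’s D) would give the corresponding profile along the gene; fixed partitions such as the three fragments defined above can be obtained by slicing the alignment at the stated coordinates instead.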

Natural selection seemed to act in different ways within the pvmsp-7E gene. A sliding window for the ratio of non-synonymous substitutions per non-synonymous site to synonymous substitutions per synonymous site (ω = d_N/d_S) gave values greater than 1 (ω > 1) in the central region and a peak at the 3′-end (Figure 3). The d_S rate at the 5′ and 3′-ends was significantly greater than d_N, whilst d_N was significantly greater than d_S in this gene’s central region (Table 3). Eight positively selected sites were identified in the central region and one in the 3′-end. Twenty-six negatively selected sites were identified, mainly in the 5′-end (9 sites) and 3′-end (14 sites) (Additional files 4 and 8).

Table 3 Average number of pvmsp-7 gene synonymous substitutions per synonymous site (d_S) and non-synonymous substitutions per non-synonymous site (d_N)

These results suggested that the central region was under positive selection. According to the Tajima and Fu and Li tests (which had significant values greater than zero), pvmsp-7E seemed to be under balancing selection favouring the existence of different alleles in the population. This type of selection frequently occurs in antigens exposed to the immune system; immune responses therefore seemed to be directed towards the pvmsp-7E central region, in which mutations were accumulated at a greater rate by positive selection (Additional files 4 and 8).

Interestingly, as Fay and Wu’s H tests gave significant negative values and as K-test and H-test values were lower than expected under neutrality (Table 2), a selective sweep would be probable and strong LD and low genetic diversity would thus be expected. Z_nS values suggested that pvmsp-7E had non-random polymorphism association, as expected in a selective sweep (Table 2). However, contrary to expectation, pvmsp-7E had high genetic diversity (Table 1). When recombination is present in a locus under selective sweep, genetic diversity would be expected to become reduced only close to the selection site[57]. There was evidence of recombination throughout the pvmsp-7E gene (Table 1 and Figure 4, see below) and since the deepest H “valley” was located at the 3′-end (Additional file 7), the selective sweep may not have affected the whole gene but just this region. The 5′-end and central regions showed no evidence of selective sweep (Additional file 6) and π was high in the central region and low at the 5′-end (Additional file 3). The 3′-end had significant values in H and Z_nS tests (Additional file 6), suggesting that the selection site should have been located in this region. The π value in the 3′-end was considerably reduced compared to that for the central region (Additional file 3). However, the number of SNPs seems to be higher than that expected in a selective sweep. Our results suggested that a selective sweep affected the pvmsp-7E gene, that the selection site was located in the gene’s 3′-end and that this seemed to be an incomplete selective sweep due to the presence of recombination, since not all variability had been lost. However, if the selective sweep has not been recent, new mutations could have become fixed following such sweep[57].

Fu’s Fs test gave values greater than zero (Table 2 and Additional file 6) which may have resulted from a reduction in haplotypes due to a recent bottleneck. If so, low genetic diversity would be expected throughout the gene pool of the Colombian P. vivax population. Prior studies have shown high genetic diversity in parasite antigens in the Colombian population[21, 55], meaning that such a demographic event is highly unlikely. This test’s result may have been due to a reduction of pvmsp-7E haplotypes by the selective sweep, causing the number of haplotypes to be lower than expected.

pvmsp-7F and pvmsp-7L genetic diversity

In contrast to pvmsp-7E, pvmsp-7F and pvmsp-7L had low genetic diversity (Table 1). This genetic diversity was similar to that observed in pvmsp-4[58, 59], pvmsp-8[60], pvmsp-10[22, 60], pv12, pv38[61], pv41[62], as well as in pvmsp-7A and pvmsp-7K[22]. Aligning the Colombian sequences with those obtained from the databases (reference sequences, Additional files 9 and 10) revealed that these genes only had four and six segregating sites, respectively. The π values and the number of haplotypes for these genes were low (Table 1). The most frequently occurring pvmsp-7F allele in Colombia was haplotype 2 (61%), followed by haplotype 1 (19%), haplotype 3 (17%) and haplotype 4 (3%), whilst haplotype 1 (61%) was the most frequent for pvmsp-7L, followed by haplotype 3 (16%), haplotype 2 (14%) and haplotypes 4, 5 and 6 (3%).

pvmsp-7F and pvmsp-7L neutrality and selection tests

Neutrality for pvmsp-7F and -7L genes in the Colombian population could not be ruled out as no statistically significant values were found for the tests based on the neutral model of molecular evolution (Table 2 and Additional file 6). Likewise, no natural selection signals were found to be acting on these genes when d_N and d_S rates were calculated (Table 3). However, when the effect of selection on each codon was evaluated, codon 424 in pvmsp-7F was seen to be under positive selection. Concerning pvmsp-7L, codons 159, 260 and 357 showed positive selection signals (Additional files 8, 9 and 10).

Intragene linkage disequilibrium (LD) and recombination in pvmsp-7 genes

As mentioned above, there were non-random associations regarding polymorphism for pvmsp-7E according to the Z_nS test (Table 2 and Additional file 6). No evidence of LD was found in pvmsp-7F or pvmsp-7L (Table 2 and Additional file 6), indicating that polymorphism within these genes was not associated. A linear regression between LD and nucleotide distance for the pvmsp-7 genes gave a line sloping downwards as nucleotide distance increased in pvmsp-7E, suggesting intragene recombination. Twelve minimal recombination (RM) events were found for pvmsp-7E whilst only one RM was found in pvmsp-7F and pvmsp-7L (Table 2). The ZZ test and GARD method suggested recombination in pvmsp-7E (ZZ: p < 0.05; GARD: 2 breakpoints, p < 0.0004) but not in pvmsp-7F or pvmsp-7L. Figure 4 shows the fragments produced by recombination in pvmsp-7E.

pcmsp-7 and pvmsp-7 genes appear to have diverged by positive selection

Natural selection’s long-term effect on evolutionary history can be evaluated by comparing the orthologous genes from phylogenetically-related species[60–64]. msp-7E was highly divergent when the pvmsp-7E and pcmsp-7E genes were compared (Figure 3). In spite of msp-7F and msp-7L being highly conserved in P. vivax, they were also shown to be highly divergent when compared to their P. cynomolgi orthologous genes (Figure 3). The random effects branch-site model (Branch-site REL) was used for determining how natural selection had acted during P. vivax and P. cynomolgi evolutionary history. This test displayed lineage-specific diversifying selection signals in msp-7E and msp-7L (ω > 1, Figure 5). Moreover, the sliding window for the ratio of the non-synonymous divergence per non-synonymous site (K_N) to the synonymous divergence per synonymous site (K_S) (divergence omega, ω = K_N/K_S) gave highly divergent areas in the central region and 3′-end of these three genes (Figure 3). A statistically significant K_N > K_S was found in the central region (p < 0.001) in pvmsp-7E (Table 4). No significant values were found in msp-7F, but in msp-7L, K_S was significantly higher than K_N (Table 4). However, when intraspecific polymorphism was compared to interspecific divergence using the McDonald-Kreitman test (MKT) no statistically significant values were found for these genes. The methods for estimating ω values for each codon (SLAC, FEL, REL, MEME and FUBAR) identified twenty-five (for msp-7E), four (for msp-7F) and seven (for msp-7L) codons under positive selection between pvmsp-7 and pcmsp-7 sequences (Additional files 4, 9, 10 and 11).
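The McDonald-Kreitman test compares the ratio of non-synonymous to synonymous changes among fixed differences between species with the same ratio among polymorphisms within a species, using a 2×2 contingency table. The sketch below implements a two-sided Fisher’s exact test with the standard library and also reports the neutrality index; the counts are hypothetical and are not the values obtained in this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(low, high + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def mkt(dn, ds, pn, ps):
    """Return (neutrality index, p-value) from fixed (dn, ds) and polymorphic (pn, ps) counts."""
    neutrality_index = (pn / ps) / (dn / ds)
    return neutrality_index, fisher_exact_two_sided(dn, ds, pn, ps)


# HYPOTHETICAL counts: fixed non-syn/syn differences, polymorphic non-syn/syn sites.
ni, p_value = mkt(dn=3, ds=1, pn=1, ps=3)
print(f"NI = {ni:.3f}, p = {p_value:.4f}")
```

A neutrality index below 1 points towards an excess of fixed non-synonymous differences, but as in this toy table, small counts often leave the test without enough power to reach significance.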

Table 4 Average number of msp-7 gene synonymous divergence per synonymous site (K_S) and non-synonymous divergence per non-synonymous site (K_N)

These results suggested that these genes have become diversified by positive selection, a pattern similar to that reported for the pvmsp-1 gene[26, 27]. Divergence due to positive selection in msp-1 coincided with Asian macaque radiation[26, 65] 3 to 6 million years ago, suggesting that it resulted from adaptation to newly available hosts[26, 65]. P. falciparum MSP-1 and MSP-7 form a protein complex involved in invasion[9, 10]. Assuming the formation of a protein complex between MSP-1 and MSP-7 in P. vivax, MSP-7s would be under the same selective pressures and may thus have evolved in a similar way. Theoretically[66, 67], it has been suggested that a strong selective sweep may result in population differentiation at the hitchhiking locus, provided that the gene flow between these populations is low. Since malarial parasites could become diversified by sympatric events[68, 69], msp-7 (similar to msp-1) may have become diversified by positive selection (Figure 5) as a mechanism for adapting the ancestral P. vivax population to a new host during the switch to humans[70] and thus the selective sweep detected in msp-7E might have been an effect of such adaptation.

Negative selection within and between species supports the idea that the 3′-end encodes the functional region in MSP-7 proteins

In spite of divergence by positive selection, msp-7 functional regions could have evolved more slowly due to their role during invasion, so that the substitutions accumulated there would have been mainly synonymous. K_S > K_N was found in msp-7E and msp-7L when comparing P. vivax and P. cynomolgi sequences (Table 4). Fifty-seven sites were found to be under negative selection in msp-7E, twenty-four in msp-7F and thirty-six in msp-7L (Additional files 4, 9, 10 and 11). A large percentage of negatively selected sites were located in the gene’s 3′-end, which encodes the msp-7 family’s characteristic domain (MSP7_C, Pfam domain ID: PF12948). The C-terminal region of the proteins encoded by these genes was highly conserved in pvmsp-7A, −7C, −7H, −7I, −7K[21, 22], −7E, −7F and -7L; furthermore, this region has been conserved for a long period of time (since 2.6 to 5.2 million years ago[3]), at least in msp-7E (84.8% similarity between P. vivax and P. cynomolgi), −7F (86.8%) and -7L (95.4%). The negative selection signals identified at the 3′-end of these three genes (Additional files 8 and 11) suggested that the structure encoded by this region has been stable and slowly evolving since the divergence of P. vivax and P. cynomolgi, due to its functional importance. These results support the idea that this region encodes this family’s functional domain[21].
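
The comparison of K_N and K_S along a gene can be sketched as follows. This is a simplified illustration in the spirit of the Nei-Gojobori counting method with the Jukes-Cantor correction, not the procedure or software used in the study. It assumes two aligned, gap-free coding sequences without ambiguous bases, counts changes to stop codons as non-synonymous, and ignores codons that differ at two or three positions (a deliberate simplification that underestimates divergence). All function names are hypothetical.

```python
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the usual TCAG-ordered table.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

def syn_sites(codon):
    """Number of synonymous sites (0 to 3) in one codon."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    total += 1 / 3
    return total

def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion of differences p; None if saturated."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        return None
    return -0.75 * log(1 - 4 * p / 3)

def kn_ks(seq1, seq2):
    """Return (K_N, K_S) for two aligned, gap-free coding sequences."""
    s_sites = syn_diff = non_diff = 0.0
    n_codons = len(seq1) // 3
    for i in range(0, n_codons * 3, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2
        changed = sum(x != y for x, y in zip(c1, c2))
        if changed == 1:  # simplification: codons with 2 or 3 changes are ignored
            if CODE[c1] == CODE[c2]:
                syn_diff += 1
            else:
                non_diff += 1
    n_sites = 3 * n_codons - s_sites
    ks = jukes_cantor(syn_diff / s_sites) if s_sites else None
    kn = jukes_cantor(non_diff / n_sites) if n_sites else None
    return kn, ks

def sliding_omega(seq1, seq2, window=30, step=3):
    """Yield (start, K_N, K_S, omega) per window; omega is None when K_S is 0 or undefined."""
    for start in range(0, len(seq1) - window + 1, step):
        kn, ks = kn_ks(seq1[start:start + window], seq2[start:start + window])
        omega = kn / ks if kn is not None and ks else None
        yield start, kn, ks, omega

# Toy sequences, NOT real msp-7 data: one synonymous difference (GGT vs GGC).
for row in sliding_omega("GGTGGTGGT", "GGCGGTGGT", window=6, step=3):
    print(row)
```

In such a sketch, windows where K_S exceeds K_N (ω < 1) would be read as consistent with negative selection, whereas ω > 1 would point towards positive selection; windows with K_S equal to zero are left undefined rather than forced to a value.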

The pvmsp-7 and pcmsp-7 sequences have different gene structures

Marked differences were observed between P. vivax and P. cynomolgi msp-7 genes. pcmsp-7F had a long insertion (one hundred and ninety-two nucleotides) compared to pvmsp-7F (Additional file 9); however, the ORF remained open. pcmsp-7L had a premature stop codon caused by the deletion of one or two nucleotides from the sequence (Additional file 10). The protein encoded by this gene would thus lack the domain characteristic of this family (MSP7_C, Pfam domain ID: PF12948); however, many synonymous substitutions between species were observed in the region encoding this domain (the gene’s 3′-end) when P. vivax and P. cynomolgi sequences were compared. Thirteen sites in this region were under negative selection in msp-7L (Additional files 10 and 11). The GeneScan algorithm[71] was then used to search for exon/intron splice sites in the pcmsp-7F and pcmsp-7L sequences. GeneScan analysis revealed regions which could act as donor and acceptor sequences in pcmsp-7L but not in pcmsp-7F. There was a thymine at pcmsp-7L nucleotide 609, whilst there was a cytosine at the homologous position in its P. vivax orthologue (nucleotide 615 in the Sal-I sequence). Such a change may have produced a putative donor (GT) site in pcmsp-7L, whilst a putative acceptor site was located at position 1,030/1,031 (Additional file 12); an intron was thus predicted in pcmsp-7L between nucleotides 608 and 1,031. This exon-intron-exon structure can also be seen in the pcmsp-7L annotation of the P. cynomolgi genome available from PlasmoDB; however, the intron predicted in PlasmoDB was shorter than that predicted by GeneScan. This exon-intron-exon structure allowed pcmsp-7L to encode a protein having the MSP7_C domain.
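
The reasoning above (a GT donor and an AG acceptor delimiting an intron whose removal reopens the reading frame) can be illustrated with a naive scan. This is not the algorithm of reference 71, which relies on a probabilistic gene model; it is only a sketch that lists GT...AG stretches whose removal leaves a single open reading frame. The function names, the minimum intron length and the toy sequence are assumptions made for illustration and are not pcmsp-7L data.

```python
STOPS = {"TAA", "TAG", "TGA"}

def first_stop(seq):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return i
    return None

def is_open(seq):
    """True when seq is a whole number of codons and only its last codon is a stop."""
    return len(seq) % 3 == 0 and first_stop(seq) == len(seq) - 3

def candidate_introns(seq, min_len=6):
    """Yield (start, end) slices that begin with GT, end with AG,
    and whose removal leaves an open reading frame."""
    for start in range(len(seq) - 1):
        if seq[start:start + 2] != "GT":
            continue
        for end in range(start + min_len, len(seq) + 1):
            if seq[end - 2:end] == "AG" and is_open(seq[:start] + seq[end:]):
                yield start, end

# Toy gene, NOT real data: exon 1, a 10-nt intron (GT...AG), exon 2.
toy = "ATGGCC" + "GTATAACTAG" + "GAAACCTAA"
print(first_stop(toy))                  # premature in-frame stop inside the intron
for start, end in candidate_introns(toy):
    print(start, end, toy[:start] + toy[end:])
```

In the toy sequence the unspliced reading frame hits a stop codon inside the intron, while removing the GT...AG stretch restores a frame that ends only at the terminal stop, which mirrors the argument made for pcmsp-7L.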

Conclusions

Our results confirmed that the P. vivax msp-7 family has a heterogeneous genetic diversity pattern. Some members were seen to be highly conserved whilst others had high genetic diversity. Consequently, P. vivax msp-7 genes must have evolved differently from those in P. falciparum, which have low polymorphism[23, 24]. The PvMSP-7s C-terminal region (the gene’s 3′-end) tended to be conserved within and between genes[21]. This region’s conservation tended to be maintained by negative selection in msp-7E, msp-7F and msp-7L, suggesting that this is the functional region for this group of proteins. On the other hand, highly diverse PvMSP-7 members (pvmsp-7C, −7H, −7I[21] and -7E) were seen to have undergone rapid evolution at the protein’s central region; immune responses would thus have been directed towards this portion of the protein. New alleles have consequently arisen in the population and been maintained by balancing selection as a mechanism for evading an immune response. In addition to this type of evasion, the P. vivax msp-7 family (similar to that suggested for the pvmsp-3 family[72]) would follow a model of multi-allele diversifying selection where functionally redundant paralogues[12] would increase evasion of the immune responses by antigenic diversity.

Our results have shown that P. vivax and P. cynomolgi share the whole msp-7 repertoire described to date and have revealed lineage-specific positive selection signals which are similar to those reported for pvmsp-1. Mutations occurring in msp-7 genes during host switch may thus have succeeded in adapting the ancestral P. vivax parasite to humans.

References

Singh J, Purohit B, Desai A, Savardekar L, Shanbag P, Kshirsagar N: Clinical Manifestations, treatment, and outcome of hospitalized patients with Plasmodium vivax malaria in two Indian States: A Retrospective Study. Malar Res Treat. 2013, 2013: 341862-
Jain V, Agrawal A, Singh N: Malaria in a tertiary health care facility of Central India with special reference to severe vivax: implications for malaria control. Pathog Glob Health. 2013, 107: 299-304. 10.1179/204777213X13777615588180.
Pacheco MA, Battistuzzi FU, Junge RE, Cornejo OE, Williams CV, Landau I, Rabetafika L, Snounou G, Jones-Engel L, Escalante AA: Timing the origin of human malarias: the lemur puzzle. BMC Evol Biol. 2011, 11: 299-10.1186/1471-2148-11-299.
Carlton JM, Adams JH, Silva JC, Bidwell SL, Lorenzi H, Caler E, Crabtree J, Angiuoli SV, Merino EF, Amedeo P, Cheng Q, Coulson RM, Crabb BS, Del Portillo HA, Essien K, Feldblyum TV, Fernandez-Becerra C, Gilson PR, Gueye AH, Guo X, Kang’a S, Kooij TW, Korsinczky M, Meyer EV, Nene V, Paulsen I, White O, Ralph SA, Ren Q, Sargeant TJ: Comparative genomics of the neglected human malaria parasite Plasmodium vivax. Nature. 2008, 455: 757-763. 10.1038/nature07327.
Iyer J, Gruner AC, Renia L, Snounou G, Preiser PR: Invasion of host cells by malaria parasites: a tale of two protein families. Mol Microbiol. 2007, 65: 231-249. 10.1111/j.1365-2958.2007.05791.x.
Chitnis CE, Blackman MJ: Host cell invasion by malaria parasites. Parasitol Today. 2000, 16: 411-415. 10.1016/S0169-4758(00)01756-7.
Rodriguez LE, Urquiza M, Ocampo M, Curtidor H, Suarez J, Garcia J, Vera R, Puentes A, Lopez R, Pinto M, Rivera Z, Patarroyo ME: Plasmodium vivax MSP-1 peptides have high specific binding activity to human reticulocytes. Vaccine. 2002, 20: 1331-1339. 10.1016/S0264-410X(01)00472-8.
Urquiza M, Rodriguez LE, Suarez JE, Guzman F, Ocampo M, Curtidor H, Segura C, Trujillo E, Patarroyo ME: Identification of Plasmodium falciparum MSP-1 peptides able to bind to human red blood cells. Parasite Immunol. 1996, 18: 515-526. 10.1046/j.1365-3024.1996.d01-15.x.
Kauth CW, Woehlbier U, Kern M, Mekonnen Z, Lutz R, Mucke N, Langowski J, Bujard H: Interactions between merozoite surface proteins 1, 6, and 7 of the malaria parasite Plasmodium falciparum. J Biol Chem. 2006, 281: 31517-31527. 10.1074/jbc.M604641200.
Pachebat JA, Ling IT, Grainger M, Trucco C, Howell S, Fernandez-Reyes D, Gunaratne R, Holder AA: The 22 kDa component of the protein complex on the surface of Plasmodium falciparum merozoites is derived from a larger precursor, merozoite surface protein 7. Mol Biochem Parasitol. 2001, 117: 83-89. 10.1016/S0166-6851(01)00336-X.
Trucco C, Fernandez-Reyes D, Howell S, Stafford WH, Scott-Finnigan TJ, Grainger M, Ogun SA, Taylor WR, Holder AA: The merozoite surface protein 6 gene codes for a 36 kDa protein associated with the Plasmodium falciparum merozoite surface protein-1 complex. Mol Biochem Parasitol. 2001, 112: 91-101. 10.1016/S0166-6851(00)00350-9.
Garzon-Ospina D, Cadavid LF, Patarroyo MA: Differential expansion of the merozoite surface protein (msp)-7 gene family in Plasmodium species under a birth-and-death model of evolution. Mol Phylogenet Evol. 2010, 55: 399-408. 10.1016/j.ympev.2010.02.017.
Garcia Y, Puentes A, Curtidor H, Cifuentes G, Reyes C, Barreto J, Moreno A, Patarroyo ME: Identifying merozoite surface protein 4 and merozoite surface protein 7 Plasmodium falciparum protein family members specifically binding to human erythrocytes suggests a new malarial parasite-redundant survival mechanism. J Med Chem. 2007, 50: 5665-5675. 10.1021/jm070773z.
Kadekoppala M, O’Donnell RA, Grainger M, Crabb BS, Holder AA: Deletion of the Plasmodium falciparum merozoite surface protein 7 gene impairs parasite invasion of erythrocytes. Eukaryot Cell. 2008, 7: 2123-2132. 10.1128/EC.00274-08.
Tewari R, Ogun SA, Gunaratne RS, Crisanti A, Holder AA: Disruption of Plasmodium berghei merozoite surface protein 7 gene modulates parasite growth in vivo. Blood. 2005, 105: 394-396. 10.1182/blood-2004-06-2106.
Mello K, Daly TM, Morrisey J, Vaidya AB, Long CA, Bergman LW: A multigene family that interacts with the amino terminus of Plasmodium MSP-1 identified using the yeast two-hybrid system. Eukaryot Cell. 2002, 1: 915-925. 10.1128/EC.1.6.915-925.2002.
Chen JH, Jung JW, Wang Y, Ha KS, Lu F, Lim CS, Takeo S, Tsuboi T, Han ET: Immunoproteomics profiling of blood stage Plasmodium vivax infection by high-throughput screening assays. J Proteome Res. 2010, 9: 6479-6489. 10.1021/pr100705g.
Wang L, Crouch L, Richie TL, Nhan DH, Coppel RL: Naturally acquired antibody responses to the components of the Plasmodium falciparum merozoite surface protein 1 complex. Parasite Immunol. 2003, 25: 403-412. 10.1111/j.1365-3024.2003.00647.x.
Woehlbier U, Epp C, Hackett F, Blackman MJ, Bujard H: Antibodies against multiple merozoite surface antigens of the human malaria parasite Plasmodium falciparum inhibit parasite maturation and red blood cell invasion. Malar J. 2010, 9: 77-10.1186/1475-2875-9-77.
Mello K, Daly TM, Long CA, Burns JM, Bergman LW: Members of the merozoite surface protein 7 family with similar expression patterns differ in ability to protect against Plasmodium yoelii malaria. Infect Immun. 2004, 72: 1010-1018. 10.1128/IAI.72.2.1010-1018.2004.
Garzon-Ospina D, Lopez C, Forero-Rodriguez J, Patarroyo MA: Genetic diversity and selection in three Plasmodium vivax merozoite surface protein 7 (Pvmsp-7) genes in a Colombian population. PLoS One. 2012, 7: e45962-10.1371/journal.pone.0045962.
Garzon-Ospina D, Romero-Murillo L, Tobon LF, Patarroyo MA: Low genetic polymorphism of merozoite surface proteins 7 and 10 in Colombian Plasmodium vivax isolates. Infect Genet Evol. 2011, 11: 528-531. 10.1016/j.meegid.2010.12.002.
Roy SW, Weedall GD, da Silva RL, Polley SD, Ferreira MU: Sequence diversity and evolutionary dynamics of the dimorphic antigen merozoite surface protein-6 and other Msp genes of Plasmodium falciparum. Gene. 2009, 443: 12-21. 10.1016/j.gene.2009.05.007.
Tetteh KK, Stewart LB, Ochola LI, Amambua-Ngwa A, Thomas AW, Marsh K, Weedall GD, Conway DJ: Prospective identification of malaria parasite genes under balancing selection. PLoS One. 2009, 4: e5568-10.1371/journal.pone.0005568.
Bozdech Z, Mok S, Hu G, Imwong M, Jaidee A, Russell B, Ginsburg H, Nosten F, Day NP, White NJ, Carlton JM, Preiser PR: The transcriptome of Plasmodium vivax reveals divergence and diversity of transcriptional regulation in malaria parasites. Proc Natl Acad Sci U S A. 2008, 105: 16290-16295. 10.1073/pnas.0807404105.
Sawai H, Otani H, Arisue N, Palacpac N, de Oliveira ML, Pathirana S, Handunnetti S, Kawai S, Kishino H, Horii T, Tanabe K: Lineage-specific positive selection at the merozoite surface protein 1 (msp1) locus of Plasmodium vivax and related simian malaria parasites. BMC Evol Biol. 2010, 10: 52-10.1186/1471-2148-10-52.
Tanabe K, Escalante A, Sakihama N, Honda M, Arisue N, Horii T, Culleton R, Hayakawa T, Hashimoto T, Longacre S, Pathirana S, Handunnetti S, Kishino H: Recent independent evolution of msp1 polymorphism in Plasmodium vivax and related simian malaria parasites. Mol Biochem Parasitol. 2007, 156: 74-79. 10.1016/j.molbiopara.2007.07.002.
Imwong M, Pukrittayakamee S, Gruner AC, Renia L, Letourneur F, Looareesuwan S, White NJ, Snounou G: Practical PCR genotyping protocols for Plasmodium vivax using Pvcs and Pvmsp1. Malar J. 2005, 4: 20-10.1186/1475-2875-4-20.
ORF Finder (Open Reading Frame Finder). [http://www.ncbi.nlm.nih.gov/projects/gorf/]
Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32: 1792-1797. 10.1093/nar/gkh340.
Abascal F, Zardoya R, Posada D: ProtTest: selection of best-fit models of protein evolution. Bioinformatics. 2005, 21: 2104-2105. 10.1093/bioinformatics/bti263.
Neafsey DE, Galinsky K, Jiang RH, Young L, Sykes SM, Saif S, Gujja S, Goldberg JM, Young S, Zeng Q, Chapman SB, Dash AP, Anvikar AR, Sutton PL, Birren BW, Escalante AA, Barnwell JW, Carlton JM: The malaria parasite Plasmodium vivax exhibits greater genetic diversity than Plasmodium falciparum. Nat Genet. 2012, 44: 1046-1050. 10.1038/ng.2373.
Suyama M, Torrents D, Bork P: PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006, 34: W609-612. 10.1093/nar/gkl315.
Jorda J, Kajava AV: T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm. Bioinformatics. 2009, 25: 2632-2638. 10.1093/bioinformatics/btp482.
Librado P, Rozas J: DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009, 25: 1451-1452. 10.1093/bioinformatics/btp187.
Depaulis F, Veuille M: Neutrality tests based on the distribution of haplotypes under an infinite-site model. Mol Biol Evol. 1998, 15: 1788-1790. 10.1093/oxfordjournals.molbev.a025905.
Zhang J, Rosenberg HF, Nei M: Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A. 1998, 95: 3708-3713. 10.1073/pnas.95.7.3708.
Nei M, Kumar S: Molecular evolution and phylogenetics. 2000, Oxford New York: Oxford University Press
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28: 2731-2739. 10.1093/molbev/msr121.
Delport W, Poon AF, Frost SD, Kosakovsky Pond SL: Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics. 2010, 26: 2455-2457. 10.1093/bioinformatics/btq429.
Pond SL, Frost SD, Grossman Z, Gravenor MB, Richman DD, Brown AJ: Adaptation to different human populations by HIV-1 revealed by codon-based analyses. PLoS Comput Biol. 2006, 2: e62-10.1371/journal.pcbi.0020062.
Kosakovsky Pond SL, Frost SD: Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol Biol Evol. 2005, 22: 1208-1222. 10.1093/molbev/msi105.
Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Kosakovsky Pond SL: Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012, 8: e1002764-10.1371/journal.pgen.1002764.
Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K: FUBAR: a fast, unconstrained bayesian approximation for inferring selection. Mol Biol Evol. 2013, 30: 1196-1205. 10.1093/molbev/mst030.
Jukes TH, Cantor CR: Evolution of protein molecules. Mammalian Protein Metabolism. Edited by: Munro HN. 1969, New York: Academic Press
Kelly JK: A test of neutrality based on interlocus associations. Genetics. 1997, 146: 1197-1206.
Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SD: Automated phylogenetic detection of recombination using a genetic algorithm. Mol Biol Evol. 2006, 23: 1891-1901. 10.1093/molbev/msl051.
Rozas J, Gullaud M, Blandin G, Aguade M: DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics. 2001, 158: 1147-1155.
Hudson RR, Kaplan NL: Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics. 1985, 111: 147-164.
Martin DP, Lemey P, Lott M, Moulton V, Posada D, Lefeuvre P: RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics. 2010, 26: 2462-2463. 10.1093/bioinformatics/btq467.
Tachibana S, Sullivan SA, Kawai S, Nakamura S, Kim HR, Goto N, Arisue N, Palacpac NM, Honma H, Yagi M, Tougan T, Katakai Y, Kaneko O, Mita T, Kita K, Yasutomi Y, Sutton PL, Shakhbatyan R, Horii T, Yasunaga T, Barnwell JW, Escalante AA, Carlton JM, Tanabe K: Plasmodium cynomolgi genome sequences provide insight into Plasmodium vivax and the monkey malaria clade. Nat Genet. 2012, 44: 1051-1055. 10.1038/ng.2375.
Mongui A, Perez-Leal O, Soto SC, Cortes J, Patarroyo MA: Cloning, expression, and characterisation of a Plasmodium vivax MSP7 family merozoite surface protein. Biochem Biophys Res Commun. 2006, 351: 639-644. 10.1016/j.bbrc.2006.10.082.
Figtree M, Pasay CJ, Slade R, Cheng Q, Cloonan N, Walker J, Saul A: Plasmodium vivax synonymous substitution frequencies, evolution and population structure deduced from diversity in AMA 1 and MSP 1 genes. Mol Biochem Parasitol. 2000, 108: 53-66. 10.1016/S0166-6851(00)00204-8.
Mascorro CN, Zhao K, Khuntirat B, Sattabongkot J, Yan G, Escalante AA, Cui L: Molecular evolution and intragenic recombination of the merozoite surface protein MSP-3alpha from the malaria parasite Plasmodium vivax in Thailand. Parasitology. 2005, 131: 25-35. 10.1017/S0031182005007547.
Gomez A, Suarez CF, Martinez P, Saravia C, Patarroyo MA: High polymorphism in Plasmodium vivax merozoite surface protein-5 (MSP5). Parasitology. 2006, 133: 661-672. 10.1017/S0031182006001168.
Putaporntip C, Udomsangpetch R, Pattanawong U, Cui L, Jongwutiwes S: Genetic diversity of the Plasmodium vivax merozoite surface protein-5 locus from diverse geographic origins. Gene. 2010, 456: 24-35. 10.1016/j.gene.2010.02.007.
Nurminsky D: Selective sweep. 2005, Georgetown, Tex. New York, N.Y: Landes Bioscience/Eurekah.com; Kluwer Academic/Plenum Publishers
Putaporntip C, Jongwutiwes S, Ferreira MU, Kanbara H, Udomsangpetch R, Cui L: Limited global diversity of the Plasmodium vivax merozoite surface protein 4 gene. Infect Genet Evol. 2009, 9: 821-826. 10.1016/j.meegid.2009.04.017.
Martinez P, Suarez CF, Gomez A, Cardenas PP, Guerrero JE, Patarroyo MA: High level of conservation in Plasmodium vivax merozoite surface protein 4 (PvMSP4). Infect Genet Evol. 2005, 5: 354-361. 10.1016/j.meegid.2004.12.001.
Pacheco MA, Elango AP, Rahman AA, Fisher D, Collins WE, Barnwell JW, Escalante AA: Evidence of purifying selection on merozoite surface protein 8 (MSP8) and 10 (MSP10) in Plasmodium spp. Infect Genet Evol. 2012, 12: 978-986. 10.1016/j.meegid.2012.02.009.
Forero-Rodriguez J, Garzon-Ospina D, Patarroyo MA: Low genetic diversity and functional constraint in loci encoding Plasmodium vivax P12 and P38 proteins in the Colombian population. Malar J. 2014, 13: 58-10.1186/1475-2875-13-58.
Forero-Rodriguez J, Garzon-Ospina D, Patarroyo MA: Low genetic diversity in the locus encoding the Plasmodium vivax P41 protein in Colombia’s parasite population. Malar J. 2014, 13: 388-10.1186/1475-2875-13-388.
Chenet SM, Pacheco MA, Bacon DJ, Collins WE, Barnwell JW, Escalante AA: The evolution and diversity of a low complexity vaccine candidate, merozoite surface protein 9 (MSP-9), in Plasmodium vivax and closely related species. Infect Genet Evol. 2013, 20: 239-248.
Pacheco MA, Ryan EM, Poe AC, Basco L, Udhayakumar V, Collins WE, Escalante AA: Evidence for negative selection on the gene encoding rhoptry-associated protein 1 (RAP-1) in Plasmodium spp. Infect Genet Evol. 2010, 10: 655-661. 10.1016/j.meegid.2010.03.013.
Carlton JM, Das A, Escalante AA: Genomics, population genetics and evolutionary history of Plasmodium vivax. Adv Parasitol. 2013, 81: 203-222.
Slatkin M, Wiehe T: Genetic hitch-hiking in a subdivided population. Genet Res. 1998, 71: 155-160. 10.1017/S001667239800319X.
Nurminsky DI: Genes in sweeping competition. Cell Mol Life Sci. 2001, 58: 125-134. 10.1007/PL00000772.
Perez-Tris J, Hellgren O, Krizanauskiene A, Waldenstrom J, Secondi J, Bonneaud C, Fjeldsa J, Hasselquist D, Bensch S: Within-host speciation of malaria parasites. PLoS One. 2007, 2: e235-10.1371/journal.pone.0000235.
Sutherland CJ, Tanomsing N, Nolder D, Oguike M, Jennison C, Pukrittayakamee S, Dolecek C, Hien TT, do Rosario VE, Arez AP, Pinto J, Michon P, Escalante AA, Nosten F, Burke M, Lee R, Blaze M, Otto TD, Barnwell JW, Pain A, Williams J, White NJ, Day NP, Snounou G, Lockhart PJ, Chiodini PL, Imwong M, Polley SD: Two nonrecombining sympatric forms of the human malaria parasite Plasmodium ovale occur globally. J Infect Dis. 2010, 201: 1544-1550. 10.1086/652240.
Mu J, Joy DA, Duan J, Huang Y, Carlton J, Walker J, Barnwell J, Beerli P, Charleston MA, Pybus OG, Su XZ: Host switch leads to emergence of Plasmodium vivax malaria in humans. Mol Biol Evol. 2005, 22: 1686-1693. 10.1093/molbev/msi160.
Burge C, Karlin S: Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997, 268: 78-94. 10.1006/jmbi.1997.0951.
Rice BL, Acosta MM, Pacheco MA, Carlton JM, Barnwell JW, Escalante AA: The origin and diversification of the merozoite surface protein 3 (msp3) multi-gene family in Plasmodium vivax and related parasites. Mol Phylogenet Evol. 2014, 78C: 172-184.

Acknowledgements

We would like to thank Jason Garry for translating the manuscript. This work was financed by the “Departamento Administrativo de Ciencia, Tecnología e Innovación (COLCIENCIAS)” through contracts RC # 0309–2013 and # 0709–2013. JF-R received financing through COLCIENCIAS cooperation agreement # 0719–13.

Author information

Authors and Affiliations

Molecular Biology and Immunology Department, Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26-20, Bogotá, DC, Colombia
Diego Garzón-Ospina, Johanna Forero-Rodríguez & Manuel A Patarroyo
Basic Sciences Department, School of Medicine and Health Sciences, Universidad del Rosario, Carrera 24 No. 63C-69, Bogotá, DC, Colombia
Diego Garzón-Ospina & Manuel A Patarroyo

Authors

Diego Garzón-Ospina
Johanna Forero-Rodríguez
Manuel A Patarroyo

Corresponding author

Correspondence to Manuel A Patarroyo.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

DG-O and JF-R devised and designed the study, performed the experiments, made the population genetics analysis and wrote the manuscript. MAP coordinated the study, and helped to write the manuscript. All the authors have read and approved the final manuscript.

Diego Garzón-Ospina and Johanna Forero-Rodríguez contributed equally to this work.

Electronic supplementary material

12936_2014_3635_MOESM1_ESM.txt

Additional file 1: Putative P. cynomolgi msp-7 gene sequences obtained from chromosome 12, whole genome shotgun sequence GenBank accession number: NC_020405. ORF Finder and Gene Runner software were used to identify open reading frames encoding P. cynomolgi MSP-7 proteins. (TXT 13 KB)

12936_2014_3635_MOESM2_ESM.pdf

Additional file 2: Genetic distance between pcmsp-7B, pcmsp-7E and pvmsp-7B, pvmsp-7E sequences from 5 P. vivax isolates. The number of nucleotide differences per site was estimated as well as the standard error regarding the pvmsp-7E and pvmsp-7B reference sequences and the pcmsp-7E and pcmsp-7B sequences. (PDF 186 KB)

12936_2014_3635_MOESM3_ESM.pdf

Additional file 3: DNA polymorphism measurements at the 5′-end, central region and 3′-end for pvmsp-7 genes in the Colombian population. Ss: number of segregating sites, S: number of singleton sites, Ps: number of parsimony-informative sites, H: number of haplotypes, θ^W: Watterson estimator, π: nucleotide diversity. (SD): standard deviation. 5′-end (pvmsp-7E: nucleotide 1–390, pvmsp-7F: nucleotide 1–432, pvmsp-7L: nucleotide 1–381), central (pvmsp-7E: nucleotide 391–747, pvmsp-7F: nucleotide 433–1,053, pvmsp-7L: nucleotide 382–816) and 3′-end (pvmsp-7E: nucleotide 748–1,158, pvmsp-7F: nucleotide 1,054–1,449, pvmsp-7L: nucleotide 817–1,275). Numbers based on Additional files 4, 9 and 10. (PDF 212 KB)

12936_2014_3635_MOESM4_ESM.pdf

Additional file 4: pvmsp-7E gene alignment. The alignment shows the 23 haplotypes found in pvmsp-7E together with the pcmsp-7E haplotype. Haplotype 1, Sal-I; haplotype 2, Brazil-I; haplotype 3, India-VII; haplotype 4, Mauritania-I; haplotype 5–23, Colombian isolates. Dots represent nucleotide identity. Codons under positive selection are shown in green (intra-species) and turquoise (inter-species) and those under negative selection are shown in yellow (intra-species) and fuchsia (inter-species). (PDF 162 KB)

12936_2014_3635_MOESM5_ESM.pdf

Additional file 5: Haplotype alignment at PvMSP-7E protein level in Colombia. The alignment shows the 13 haplotypes at protein level found in PvMSP-7E. Dots represent amino acid identity. (PDF 74 KB)

12936_2014_3635_MOESM6_ESM.pdf

Additional file 6: Neutrality, linkage disequilibrium and recombination tests at the 5′-end, central region and 3′-end for pvmsp-7 genes in the Colombian population. 5′-end (pvmsp-7E: nucleotide 1–390, pvmsp-7F: nucleotide 1–432, pvmsp-7L: nucleotide 1–381), central (pvmsp-7E: nucleotide 391–747, pvmsp-7F: nucleotide 433–1,053, pvmsp-7L: nucleotide 382–816) and 3′-end (pvmsp-7E: nucleotide 748–1,158, pvmsp-7F: nucleotide 1,054–1,449, pvmsp-7L: nucleotide 817–1,275). Numbers based on Additional files 4, 9 and 10. •: p <0.02, *: p <0.05. (PDF 203 KB)

12936_2014_3635_MOESM7_ESM.tiff

Additional file 7: Neutrality test sliding window for the pvmsp-7E gene. Tajima’s D (blue), Fu and Li’s D* (red), F* (green) and Fay and Wu’s H (purple). The gene was divided into 3 regions: the 5′-end (nucleotide 1 to 390), central (nucleotide 391 to 747) and 3′-end region (nucleotide 748 to 1,158). The bars at the bottom indicate that a test gave significant values in each region. Numbering based on the alignment shown in Additional file 4. (TIFF 523 KB)

12936_2014_3635_MOESM8_ESM.pdf

Additional file 8: Intra-species positively and negatively selected sites detected for pvmsp-7 genes. 5′-end (pvmsp-7E: nucleotide 1–390, pvmsp-7F: nucleotide 1–432, pvmsp-7L: nucleotide 1–381), central (pvmsp-7E: nucleotide 391–747, pvmsp-7F: nucleotide 433–1,053, pvmsp-7L: nucleotide 382–816) and 3′-end (pvmsp-7E: nucleotide 748–1,158, pvmsp-7F: nucleotide 1,054–1,449, pvmsp-7L: nucleotide 817–1,275). Numbers based on Additional files 4, 9 and 10. (PDF 57 KB)

12936_2014_3635_MOESM9_ESM.pdf

Additional file 9: pvmsp-7F gene alignment. The alignment shows the 8 haplotypes found in pvmsp-7F together with pcmsp-7F haplotype. Haplotype 1, Sal-I; haplotype 2, Brazil-I and North Korea; haplotype 3, India-VII; haplotype 4, Mauritania-I; haplotypes 5–8, Colombian isolates. Dots represent nucleotide identity. Codons under positive selection are shown in green (intra-species) and turquoise (inter-species) and those under negative selection are shown in fuchsia (inter-species). (PDF 99 KB)

12936_2014_3635_MOESM10_ESM.pdf

Additional file 10: pvmsp-7L gene alignment. The alignment shows the 7 haplotypes found in pvmsp-7L together with pcmsp-7L haplotype. Haplotype 1, Sal-I; haplotype 2, Brazil-I, India-VII and North Korea; haplotype 3, Mauritania-I; haplotypes 4–7, Colombian isolates. The dots represent nucleotide identity. Codons under positive selection are shown in green (intra-species) and in turquoise (inter-species) and those under negative selection are shown in fuchsia (inter-species). (PDF 91 KB)

12936_2014_3635_MOESM11_ESM.pdf

Additional file 11: Inter-species positively and negatively selected sites detected for msp-7 genes. 5′-end (msp-7E: nucleotide 1–390, msp-7F: nucleotide 1–432, msp-7L: nucleotide 1–381), central (msp-7E: nucleotide 391–747, msp-7F: nucleotide 433–1,053, msp-7L: nucleotide 382–816) and 3′-end (msp-7E: nucleotide 748–1,158, msp-7F: nucleotide 1,054–1,449, msp-7L: nucleotide 817–1,275). Numbers based on Additional files 4, 9 and 10. (PDF 58 KB)

12936_2014_3635_MOESM12_ESM.pdf

Additional file 12: pcmsp-7L putative donor and acceptor sites. An alignment was made between the Sal-I strain pvmsp-7L sequence, pcmsp-7L and the sequence resulting from GeneScan analysis (pcmsp-7L_mRNA). The red arrows indicate the putative donor and acceptor sites in pcmsp-7L. (PDF 62 KB)

Rights and permissions

This article is published under an open access license. Please check the 'Copyright Information' section either on this page or in the PDF for details of this license and what re-use is permitted. If your intended use exceeds what is permitted by the license or if you are unable to locate the licence and re-use information, please contact the Rights and Permissions team.

About this article

Cite this article

Garzón-Ospina, D., Forero-Rodríguez, J. & Patarroyo, M.A. Heterogeneous genetic diversity pattern in Plasmodium vivax genes encoding merozoite surface proteins (MSP) -7E, −7F and -7L. Malar J 13, 495 (2014). https://doi.org/10.1186/1475-2875-13-495

Received: 21 October 2014
Accepted: 10 December 2014
Published: 13 December 2014
DOI: https://doi.org/10.1186/1475-2875-13-495
