Malaria

Malaria is caused by protozoan parasites belonging to the genus Plasmodium. The latest WHO report (2016) estimates 216 million malaria cases and 445,000 deaths worldwide each year (World Health Organization et al. 2015). Africa is the continent most affected by malaria, with 90% of global cases; 7% are in Southeast Asia, approximately 2% are in the Eastern Mediterranean region, and less than 1% occur in Central and South America.
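As a rough illustration of what these percentages mean in absolute terms, the following sketch applies the regional shares quoted above to the 216 million estimate and relates deaths to cases. The region labels, function names, and rounding are illustrative assumptions; the share for Central and South America is reported only as "less than 1%", so it is treated here as an upper bound.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the text.
TOTAL_CASES = 216_000_000
TOTAL_DEATHS = 445_000

# Regional shares of global cases, in percent.
# The value for the Americas is an upper bound ("less than 1%").
REGIONAL_SHARE_PCT = {
    "Africa": 90,
    "Southeast Asia": 7,
    "Eastern Mediterranean": 2,
    "Central and South America (upper bound)": 1,
}


def cases_by_region(total, shares_pct):
    """Return the estimated number of cases per region from percentage shares."""
    return {region: total * pct // 100 for region, pct in shares_pct.items()}


def deaths_per_1000_cases(deaths, cases):
    """Return the number of deaths per 1000 reported cases."""
    return 1000 * deaths / cases


if __name__ == "__main__":
    for region, n in cases_by_region(TOTAL_CASES, REGIONAL_SHARE_PCT).items():
        print(f"{region}: ~{n / 1e6:.1f} million cases")
    ratio = deaths_per_1000_cases(TOTAL_DEATHS, TOTAL_CASES)
    print(f"Deaths per 1000 cases: {ratio:.2f}")
```

The deaths-to-cases ratio is a crude global figure and says nothing about differences between species, age groups, or regions.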

Despite Plasmodium being an ancient parasite (Carter and Mendis 2002) and the extensive knowledge gathered regarding its life cycle (Fig. 1), researchers of its biology, genetics, and epidemiology still face a challenge in tackling the threat it poses to millions of people. Plasmodium falciparum is responsible for the deadliest parasitic disease in history. Its distribution is wide, although the African continent is the most severely affected: in sub-Saharan countries it represents an enormous public health problem, since more than 90% of cases and 91% of deaths worldwide occur there, most of them in children. Plasmodium vivax, on the other hand, is highly prevalent in most malarial areas except Africa. This species generally causes a debilitating disease, although severe malaria cases have been reported in some regions (Naing et al. 2014). Other species such as Plasmodium malariae, Plasmodium knowlesi, Plasmodium ovale curtisi, and Plasmodium ovale wallikeri (Sutherland et al. 2010; Zaw and Lin 2017) seem to be less widely distributed, or their prevalence is underestimated. Traditionally, P. ovale curtisi and P. ovale wallikeri were considered subspecies. However, Ansari et al. (2016), who examined the diversity of the surface antigens together with the phylogenetic separation of the two forms, showed that they are in fact two different species. P. ovale curtisi, P. ovale wallikeri, and P. malariae can be found in Asia and, especially, in West Africa, while P. knowlesi is located in Southeast Asia. Co-infections are frequent (Zimmerman et al. 2004): mixed infections of P. falciparum and P. vivax (Imwong et al. 2011; Ginouves et al. 2015), or of P. ovale curtisi and P. ovale wallikeri with P. malariae, have been detected in patients living in areas where these species are co-prevalent (Dinko et al. 2013; Fançony et al. 2012).

Fig. 1 Plasmodium life cycle

Plasmodium species are numerous, and all have similar life cycles with an arthropod vector and a specific vertebrate host. Some non-human species infect reptiles, birds, or mammals such as rodents or apes. The Plasmodium life cycle is complex. Sexual reproduction takes place in mosquitoes of the genus Anopheles; in this phase, the parasite is briefly found as a diploid organism. Asexual reproduction (schizogony) takes place in the vertebrate, or intermediate, host, where the parasite has a haploid genome. In humans, the parasite has two multiplicative stages: one inside the liver cell (the exoerythrocytic phase) and another inside the erythrocyte (the intra-erythrocytic phase). The cycle begins with the bite of an infected female Anopheles mosquito, which inoculates sporozoites into the vertebrate host while feeding. These sporozoites, which remain in the skin for between 1 and 3 h (Ejigiri and Sinnis 2009), are highly motile and migrate through the bloodstream (Ménard et al. 2013; Formaglio and Amino 2015; Acharya et al. 2017) to the hepatic parenchymal cells. A single sporozoite invading a hepatocyte produces thousands of merozoites, a process that requires between 2 and 14 days depending on the Plasmodium species (Hall et al. 2005). In P. vivax and P. ovale, hypnozoites are also formed. These are dormant forms that remain in the liver cells and are responsible for relapse episodes, which can occur weeks, months, or even years later (Howes et al. 2016; Markus 2011). Subsequently, the hepatic merozoites invade the erythrocytes, and the parasites multiply again, releasing a new generation of merozoites that invade new red blood cells; in this way, the erythrocytic asexual cycle is repeated over and over again (Silvie et al. 2008; Gilson and Crabb 2009). At this stage, the symptoms and pathology of the disease manifest. During the erythrocytic cycle, some parasites are programmed to transform into male and female gametocytes; these are the forms that are infective to Anopheles mosquitoes.

In the mosquito midgut, both female and male gametocytes are transformed into gametes (gametogenesis). This process is rapid, and Bansal et al. (2017) have revealed the fundamental role of a calcium-dependent protein kinase (PfCDPK2) in the transformation of male gametocytes into gametes. Fertilization produces zygotes, which mature into ookinetes; these traverse the peritrophic membrane and midgut epithelium, and then differentiate into oocysts that lodge on the midgut’s outer surface (Barillas-Mury and Kumar 2005). The oocysts have four haploid genomes, which might be recombinants (Sinden 2015), and thousands of haploid sporozoites develop and are released into the hemocoel, finally invading the salivary glands (1–3 weeks). Parasites reach the next host when the female Anopheles mosquito takes a blood meal, thus completing the parasite’s life cycle.
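The order of stages described above can be summarized as a small data structure. The sketch below is only a reading aid: the stage names, hosts, and sites are taken from the text, whereas the class, list, and function names are hypothetical, and the hypnozoite branch of P. vivax and P. ovale is deliberately left out to keep the cycle linear.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    name: str
    host: str  # "human" or "mosquito"
    site: str  # where the stage is found, as described in the text


# Simplified linear cycle; the dormant hypnozoites of P. vivax and
# P. ovale are omitted on purpose.
LIFE_CYCLE: List[Stage] = [
    Stage("sporozoite", "human", "skin, then liver"),
    Stage("hepatic merozoite", "human", "hepatocyte"),
    Stage("erythrocytic merozoite", "human", "erythrocyte"),
    Stage("gametocyte", "human", "blood"),
    Stage("gamete", "mosquito", "midgut"),
    Stage("zygote", "mosquito", "midgut"),
    Stage("ookinete", "mosquito", "midgut epithelium"),
    Stage("oocyst", "mosquito", "outer midgut surface"),
    Stage("sporozoite", "mosquito", "hemocoel, then salivary glands"),
]


def next_stage(index: int) -> Stage:
    """Return the stage following LIFE_CYCLE[index], wrapping around at the end."""
    return LIFE_CYCLE[(index + 1) % len(LIFE_CYCLE)]


def stages_in(host: str) -> List[str]:
    """List the stage names found in the given host, in cycle order."""
    return [stage.name for stage in LIFE_CYCLE if stage.host == host]


if __name__ == "__main__":
    print("In the mosquito:", ", ".join(stages_in("mosquito")))
    print("After the salivary-gland sporozoite:", next_stage(len(LIFE_CYCLE) - 1))
```

The wrap-around in `next_stage` mirrors the point at which the mosquito's blood meal returns sporozoites to a human host.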

Objectives

The main objective of this work is to review the articles published on the genomes of the Plasmodium species capable of infecting humans, in order to bring to light certain aspects of malarial transmission and of the parasite’s interactions with its vertebrate and invertebrate hosts. Since 2002, when the P. falciparum genome was first sequenced, a large number of studies have focused on the genetics, genomics, and functional genomics of genes and gene families related to drug resistance. This review also covers the large number of genes that encode proteins at the interface of host-parasite interactions, with a view to establishing new antigenic candidates for the development of novel vaccines.

Epidemiology

In addition to the four species that have historically affected humans, P. knowlesi was recently recognized as a further species causing human malaria. This species was first identified in 1932 as a natural parasite of macaques in Southeast Asia, and the first human case of malaria was recorded in 1965. Almost 40 years later, Singh et al. studied a large number of malaria cases produced by P. knowlesi in Malaysian Borneo (Singh et al. 2004a). It is now considered as an emerging Plasmodium species in the Asian continent (Herdiana et al. 2016), where numerous human cases have been recorded over the last decade (White 2008) (William et al. 2013) (Yusof et al. 2014). This is the most common type of malaria in Malaysia, with the indigenous population and travelers who enter the jungle being most affected by the parasite (Millar and Cox-Singh 2015) (Barber et al. 2017). Potentially, the actual number of cases produced by P. knowlesi is even greater than that estimated due to the possibility of misdiagnosis—the blood stages of P. knowlesi are similar to those of P. malariae. Therefore, morphological identification is quite complicated and under complex epidemiological conditions, they might be indistinguishable from other human-affecting species (Singh et al. 2004a) (Herdiana et al. 2016) (Lubis et al. 2017).

In South America, two species have been found to infect humans: P. simium and P. brasilianum. The first is almost indistinguishable from P. vivax, and the second from P. malariae. P. simium is a native monkey parasite and, although its natural hosts (arboreal howler monkeys, woolly spider monkeys, and capuchin monkeys) are distributed throughout South America (Alvarenga et al. 2015), P. simium is found exclusively in the Atlantic forest of southern and south-eastern Brazil. The first human infection was described in Brazil in a scientific assistant naturally exposed to vector bites, probably infected with primate malaria parasites (Deane and Deane 1992). Costa et al. (2014) reported a high prevalence of P. simium in monkeys inhabiting this geographical region. Between 2006 and 2016, several human malaria cases were detected in the valleys of the Atlantic forest (Siqueira et al. 2016).

Later, Brasil et al. (Brasil et al. 2017) reported a malarial outbreak in the Atlantic forest of Rio de Janeiro state during 2015 and 2016. The epidemiological data seem to indicate that these cases were due to the encroachment of humans into the monkeys’ natural habitat; they therefore concluded that the zoonotic transmission of P. simium was unambiguous. Probably such transmission has always existed but the cases were misdiagnosed as P. vivax (Grigg and Snounou 2017)—either that or P. simium is becoming better adapted to infecting humans.

Further mitochondrial DNA studies of Plasmodium species found in human, simian, and mosquito samples suggest that, in the Atlantic Forest of Brazil, parasites are crossing over between humans and non-human primates (Buery et al. 2017).

P. brasilianum was first identified in 1908 by Gonder and Von Berenberg-Gossler, and described as a simian Plasmodium that infected several monkey species (Guimarães et al. 2012). It was found to be distributed in Panama in the 1930s–1950s (Collins 2002), but there is no recent information on its prevalence and risk to humans. More than a decade ago, the species was reported to be infecting the primate Alouatta palliata (the mantled howler monkey) in Costa Rica (Chinchilla et al. 2006). In this country, the difficulty of differentiating P. malariae from P. brasilianum in simian and human samples was recently demonstrated (Calvo et al. 2015; Fuentes-Ramírez et al. 2017).

The first human cases of P. brasilianum in the Amazonian and Atlantic forest regions of South America were recorded in 2015 by Lalremruata et al. (2015); these occurred in indigenous Yanomami people on the border between Venezuela and Brazil and resulted from anthropozoonotic transmission. It is very likely that the monkeys are acting as reservoirs for both species (Araújo et al. 2013; Figueiredo et al. 2017; Fuentes-Ramírez et al. 2017). P. cynomolgi is another primate parasite, found on the Asian continent, whose natural hosts are long-tailed macaques. The first human case was recently recorded in the Malaysian peninsula, where these macaques are widely distributed (Ta et al. 2014).

All of the above threaten the eradication of malaria. Consequently, it is essential to improve diagnoses using molecular techniques as well as to strengthen the epidemiological studies in order to determine if the monkeys are really acting as parasite reservoirs.

Clinical form and treatment

The clinical manifestations of malaria and their evolution can vary greatly. Both depend on the particular Plasmodium species, the host’s innate and acquired immunity, and the choice of a suitable and timely treatment. Malaria is usually classified into three types: asymptomatic, uncomplicated, and severe (WHO 2014).

Malaria is considered asymptomatic when blood parasites are present but there are no clinical symptoms (fever and chills), and therefore no antimalarial treatment is administered (Lindblade et al. 2013; Phillips et al. 2017). Mild or uncomplicated malaria presents with non-specific symptoms that may include fever and shaking chills, with parasitemia but no serious organ dysfunction. It can be caused by any Plasmodium species. Severe malaria, on the other hand, is the most dangerous form of the disease, with numerous complications such as severe anemia and damage to multiple organs, including the brain (cerebral malaria), lungs, and kidneys (Bartoloni and Zammarchi 2012; White et al. 2014). It causes high morbidity and mortality rates in African children under 5 years of age and is mainly produced by P. falciparum (Seydel et al. 2015; Maitland 2016). Nonetheless, it can also be caused, at a much lower frequency, by P. vivax (Arnott et al. 2012) or P. knowlesi (Bartoloni and Zammarchi 2012).

Immunity seems to be a determining factor in malaria symptoms. In areas of moderate to high transmission, repeated infections and continuous exposure produce partial immunity to the disease, resulting in a decrease of clinical symptoms (Filipe et al. 2007). On the other hand, it has been observed that this immunity depends less on the frequency of exposure than on the maturation of the immune system itself (Lindblade et al. 2013), so that adults and children over 5 years of age living in areas of P. falciparum or P. vivax transmission develop protective immunity against the parasite (Mueller et al. 2013). This anti-parasite immunity acts by controlling parasite replication and consequently lowering parasite density (Mohan and Stevenson 1998). In areas of high transmission, anti-disease immunity develops more rapidly than in areas of low transmission, and asymptomatic infections are more frequent (Hamad et al. 2000; Magesa et al. 2002). However, P. malariae, P. ovale curtisi, and P. ovale wallikeri are mostly detected as mixed infections combined with other malaria species (Rojo-Marcos 2011; Scuracchio et al. 2011; Dinko et al. 2013). Chen et al. (2016) proposed the term “chronic malaria” for asymptomatic infections (microscopic or submicroscopic) that may persist for long periods of time (Bousema et al. 2014) and should therefore be considered important reservoirs, since they may contribute to the disease’s transmission. Accordingly, several authors have shown the infective capacity of gametocytes in the mosquito host, even at submicroscopic levels (Schneider et al. 2007; Ouédraogo et al. 2009). However, there are discrepancies in the results obtained regarding the role the mosquitoes play as a reservoir (Nyboer et al. 2017; Lin et al. 2014; Gonçalves et al. 2016). These infections are more widespread than previously thought, even in low-endemic areas, and are very common in areas with seasonal malaria transmission (Golassa et al. 2015).

Uncomplicated malaria can be produced by any Plasmodium species and will depend largely on the degree of prior exposure (Bartoloni and Zammarchi 2012). P. malariae, P. ovale curtisi, and P. ovale wallikeri have been considered responsible for mild infections. However, it is known that P. malariae can cause important clinical complications or remain as a chronic infection for long periods of time (Collins and Jeffery 2007).

Severe malaria occurs in non-immune subjects infected with P. falciparum, yet the resulting complications can be avoided with early diagnosis and appropriate treatment. Certain genetic disorders in the host, such as thalassemia (López et al. 2010; Williams 2012) or sickle cell trait, the heterozygous state combining normal hemoglobin A (HbA) and sickle hemoglobin S (HbS) (Cholera et al. 2008), can also protect against severe malaria. The severity and pathogenesis of the disease depend on certain surface proteins expressed by the parasite (Phillips et al. 2017), such as those encoded by the var genes in P. falciparum (Wassmer et al. 2015; Gillrie et al. 2016). Cases of severe malaria associated exclusively with P. vivax are increasingly reported (Quispe et al. 2014). Patients from the Brazilian Amazon with severe P. vivax malaria showed increased expression of the genes involved in chloroquine resistance compared with patients with the mild form of the disease (Fernández-Becerra et al. 2009).

Genetics in malaria diagnosis

Microscopy has been the main method employed worldwide to determine the Plasmodium species. Two decades ago, rapid diagnostic tests were introduced, and they have been improved over the years. These are of great use in remote areas that lack laboratory facilities, but they only discriminate P. falciparum from the other Plasmodium species (such as P. vivax), and they provide only limited sensitivity for low-level parasitemias (Abba et al. 2014; Li et al. 2017). Their performance can further be affected by parasite polymorphism (Cheng et al. 2014). Moreover, to end malaria, one has to be able to accurately diagnose all malaria species affecting humans, including P. malariae, P. ovale curtisi, P. ovale wallikeri, P. knowlesi, and other zoonotic variants. Such diagnoses will also make it possible to update the geographical distribution of these species and to obtain epidemiological data.

In a symptomatic patient, the identification of P. falciparum is prioritized over the other species to prevent disease severity, especially in young children. It is known that uncomplicated P. falciparum malaria can become severe within 24–48 hours following diagnosis, even if the parasitemia is low. Moreover, it is common to find P. falciparum infections co-existing with other human malaria species that are not considered fatal. To address this problem, the molecular tools now available can be far more accurate and sensitive in detecting low parasite densities, mixed infections, treatment outcomes, and gametocyte loads. They can also be used to detect gene polymorphisms involved in drug resistance or relevant to vaccine development (Tangpukdee et al. 2009; Britton et al. 2016), and to detect other parasite characteristics that are useful for surveillance and for hastening malaria elimination.

Conventional molecular techniques (PCR assays) might not be sufficient because there is a cross-reaction effect between P. knowlesi and P. vivax (Sulistyaningsih et al. 2010). For this reason, mtCOI gene amplicons were analyzed to distinguish human cases of P. knowlesi in Indonesian patients (Setiadi et al. 2016).

The P. malariae and P. brasilianum genomes are very similar (Talundzic et al. 2017). In fact, the recent unexpected detection of P. malariae in Costa Rica could be due to P. brasilianum. By aligning the genomic sequences obtained, a 99% accurate identification of P. malariae was achieved when isolated from atypical human cases occurring in Asia; and a 99% identification accuracy was also achieved for a P. brasilianum sequence isolated from a non-human primate in Guiana. In Costa Rica, P. brasilianum was earlier reported in monkeys (Chinchilla et al. 2006). In spite of this, there is no straightforward, complete diagnostic method to distinguish between both species with certainty. The detection of P. malariae subtypes and their close genetic relationship to P. brasilianum suggest that both species might correspond to a complex group (Talundzic et al. 2017; Rutledge et al. 2017).
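One simple measure often used when comparing aligned sequences is percent identity, the share of aligned positions at which two sequences carry the same base. The sketch below shows how such a value can be computed. It is an assumption that a comparable measure underlies the 99% figures quoted above, since the text does not state how they were obtained, and the sequences in the usage example are invented for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("Sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("No ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy sequences, not real Plasmodium data.
    reference = "ACGTACGTAC"
    query = "ACGTACGTAT"
    print(f"{percent_identity(reference, query):.1f}% identity")
```

With genomes as similar as those described here, a high identity value alone cannot assign a sample to one species or the other, which is consistent with the lack of a definitive diagnostic method noted above.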

Microscopic examination alone might be insufficient to distinguish P. vivax from P. simium. In a recent study carried out in the Atlantic forest of Brazil, an attempt was made to differentiate these parasites, which infect humans and monkeys, respectively (Brasil et al. 2017). The analysis of the mitochondrial genome revealed the close genetic relationship between the two species and confirmed the presence of P. simium in a large number of samples by means of two single nucleotide polymorphisms (SNPs). Nevertheless, distinguishing P. vivax from P. simium is no easy task. More recently, a single differential mutation in the mitochondrial DNA was used to develop a molecular test to distinguish between the two species (de Alvarenga et al. 2018); this was made possible by analyzing a larger number of samples. However, it is necessary to monitor the method’s performance and probably to add new differential SNPs. In addition, the P. simium genome, as cited by Grigg and Snounou (2017), together with further population studies, might uncover new genetic markers and clarify the parasite’s genetic relationship to P. vivax and its transmission dynamics.
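The logic of discriminating two closely related species by a small number of diagnostic SNPs can be sketched as follows. The positions and alleles below are hypothetical placeholders and are not the published mitochondrial SNPs; the function and constant names are likewise assumptions made for illustration. A sample is assigned to a species only when every diagnostic site agrees.

```python
# Hypothetical diagnostic sites (0-based positions) and alleles.
# These are NOT the published P. vivax / P. simium SNPs.
DIAGNOSTIC_SNPS = {
    3: {"P. vivax": "T", "P. simium": "C"},
    7: {"P. vivax": "A", "P. simium": "G"},
}


def classify(sequence: str, snps=DIAGNOSTIC_SNPS) -> str:
    """Assign a species only if all diagnostic sites point to the same one."""
    calls = set()
    for position, alleles in snps.items():
        base = sequence[position].upper()
        matching = [species for species, allele in alleles.items() if allele == base]
        # None records a site whose base matches neither species.
        calls.add(matching[0] if matching else None)

    if len(calls) == 1 and None not in calls:
        return calls.pop()
    return "undetermined"


if __name__ == "__main__":
    # Invented toy sequences.
    for sample in ("AAATAAAAAA", "AAACAAAGAA", "AAATAAAGAA"):
        print(sample, "->", classify(sample))
```

Returning "undetermined" for conflicting or unexpected bases reflects the caution expressed above: a test based on very few sites needs its performance monitored, and additional differential SNPs would make the call more robust.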

The ultra-deep sequencing of the 18S rRNA, cytb, and clpC genes proved useful in determining the distinct species affecting humans in Gabon. Multiple genotypes from each species were determined, including those at low frequency, and even mixed infections of P. ovale curtisi with P. ovale wallikeri were detected (Lalremruata et al. 2017).
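The reason deep sequencing can reveal low-frequency genotypes is that each read is counted individually, so a minority variant appears as a small but measurable fraction of the reads. The sketch below illustrates this idea only; the genotype labels, the read counts, and the 1% reporting threshold are assumptions, and the approach is not presented as the pipeline used in the cited study.

```python
from collections import Counter
from typing import Dict, Iterable


def genotype_frequencies(reads: Iterable[str], min_fraction: float = 0.01) -> Dict[str, float]:
    """Return the fraction of reads per genotype, dropping those below min_fraction.

    The threshold is a hypothetical guard against sequencing errors being
    reported as genuine minority genotypes.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return {
        genotype: n / total
        for genotype, n in counts.items()
        if n / total >= min_fraction
    }


if __name__ == "__main__":
    # Invented example: each read is already reduced to a genotype label.
    reads = ["genotype_A"] * 97 + ["genotype_B"] * 3
    for genotype, fraction in genotype_frequencies(reads).items():
        print(f"{genotype}: {fraction:.1%}")
```

Raising `min_fraction` trades sensitivity for specificity: a stricter threshold hides rare genotypes but is less likely to report artefacts.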

Molecular tools have also been employed to distinguish P. cynomolgi from P. vivax in human infection (Snounou et al. 1993). P. cynomolgi and other non-human primate malarial parasites might be able to infect humans via mosquito bites more often than previously thought (Ta et al. 2014). In the forests of Vietnam, several non-human primate Plasmodium species were found infecting An. dirus, a mosquito that transmits human malaria (Maeno 2017). More genomic and genetic studies are needed on non-human primate parasites capable of infecting humans; these will hopefully assist in the development of reliable diagnostic tools and in understanding the parasites’ dynamics and adaptation processes. However, would this be sufficient to end malarial transmission?

Vaccines

The search for a vaccine against malaria continues to be the goal of many researchers today. There are various reasons why an effective vaccine against Plasmodium has not yet been developed: the parasite’s complex life cycle and enormous antigenic variability, insufficient knowledge of the immune responses it triggers, and the lack of adequate animal experimentation models. In spite of all this, the natural immunity acquired by residents of malaria-endemic areas and the sterilizing immunity shown by volunteers exposed to irradiated sporozoites suggest that developing a vaccine might be possible (Rieckmann et al. 1979; Hoffman et al. 2002; Roestenberg et al. 2011).

Most trials have focused on P. falciparum, as this is the species causing the most severe form of the disease. However, in recent years, P. vivax has also been studied because it is the most widespread species, and is even more serious than P. falciparum in some regions. Another reason is the frequent occurrence of mixed infections from both species in certain geographical areas (de Camargo et al. 2018).

Over the last few years, malarial epidemiology has undergone certain changes (Ceesay et al. 2008; O’Meara et al. 2008; Roca-Feltrer et al. 2010)—in 2013, the Malaria Vaccine Roadmap established new guidelines for the development of a vaccine (Moorthy et al. 2013). These stated that vaccines have to be effective against Plasmodium falciparum and Plasmodium vivax, and consider all malaria-endemic areas, not only sub-Saharan Africa. Immunization has to include all ages, not just children younger than 5 years of age. The objectives set for 2030 are the elimination of malaria in multiple places, demanding vaccines that are highly effective against the disease.

In 2011, the malERA (Alonso et al. 2011) introduced the term “vaccines that interrupt malarial transmission (VIMT)” including (i) anti-vector vaccines, directed at important molecules in the mosquito necessary for developing the parasite, (ii) pre-erythrocytic vaccines that act against sporozoites inoculated by the insect vector, thus preventing the invasion of the liver cells, (iii) erythrocytic vaccines acting against the merozoites, blocking the invasion of erythrocytes and reducing the number of blood forms, and (iv) vaccines directed at the sexual phases, known as altruistic vaccines; these do not prevent either the infection or the disease in the immunized person, but do prevent transmission to other people.

The VIMT concept is based on cycle bottlenecks, where the number of parasites is very low. This occurs firstly in the early exoerythrocytic phase when there is a low number of sporozoites inoculated by the vector, and secondly when gametocytes are ingested by the Anopheles females (Smith et al. 2014), where there is only a low presence circulating in human blood. Once the gametocytes pass to the mosquito’s midgut, there is a significant reduction in their number until the formation of the oocysts—this could explain why, even in high transmission areas, most mosquitoes are not infected by the parasite. It is for this reason that gametocytes are a good target for developing vaccines that impede transmission. For a long time, however, there have been attempts to develop vaccines against the asexual erythrocytic forms because these are responsible for the disease symptoms. Nonetheless, trials have since shown that human factors act by limiting gametocyte infectivity and therefore this should be the first step in reducing parasite numbers in the mosquito (Smith et al. 2014).

Currently, multiple vaccines are being evaluated in the preclinical or clinical phase (Tables of malaria vaccine projects globally http://www.who.int/immunization/research/development/Rainbow_tables/en/). The main objective is to induce humoral and cellular CD4+ and CD8+ responses, since the role of T cell exhaustion during malaria is known, as well as the induction of memory T and B cells (Wykes et al. 2014). The results of Bergmann-Leitner et al. (2014), obtained using the In Vivo Imaging System, confirm the effector role of antibodies and T cells against parasites.

Pre-erythrocytic vaccines aim for a humoral and/or cellular response, inducing antibodies that prevent the sporozoites from invading the liver cells and/or acting on the infected hepatocytes. The immunological basis of the asexual blood-stage vaccines comes from more than 50 years of results on the passive immunity transmitted through the serum of immune adults to infected children, significantly reducing the number of parasites and the disease’s clinical symptoms (Cohen et al. 1961). Many vaccines are being developed against different parasite proteins that target natural immunity using different technologies and formulations (Conway 2015). Subunit vaccines contain key specific antigens (one or a few) and can be constructed as recombinant proteins (soluble or forming virus-like particles), large synthetic peptides, recombinant plasmid DNA, or recombinant viral vectors. The results obtained with vaccines based on viral vectors in heterologous prime-boost regimens represent a hopeful way of inducing potent T cell responses (Venkatraman et al. 2017). It is also important to use appropriate adjuvants in the formulation of vaccines since these stimulate the immune response, whether humoral or cellular, increasing protection against infection and disease (Lee and Nguyen 2015; Sastry et al. 2017).

Advances in our understanding of host-parasite interactions (Acharya et al. 2017; Cowman et al. 2017) have allowed us to select key parasite development proteins and design possible vaccines that act on the asexual forms of the parasite: sporozoites in the hepatic or pre-erythrocytic phase, and merozoites in the erythrocytic phase. However, despite the 5507 genes contained in the P. falciparum genome, only 22 encode for proteins that are used in subunit vaccine development. The same antigens are sometimes used on different platforms, with different adjuvants, or in combination with other parasite antigens (Tuju et al. 2017). Today, multi-component/multi-stage/multi-antigen vaccines are at an early stage of development. It is hoped that these “next-generation vaccines” will have highly effective presentations (Draper et al. 2015).

Pre-erythrocytic vaccines

One of the most important antigens on the sporozoite surface is the circumsporozoite protein (CSP), the major surface coat protein of the Plasmodium parasite (Coppi et al. 2011). It is expressed at the beginning of the parasite infection, in the sporozoite and in the early liver stages. CSP is related to adhesion and hepatocyte invasion.

Currently, the vaccine in the most advanced stage of development is the anti-sporozoite subunit vaccine RTS,S/AS01B, marketed as Mosquirix and developed over about 30 years by GSK in collaboration with the Walter Reed Army Institute of Research (WRAIR). It is based on a recombinant protein that contains parts of the P. falciparum CSP combined with a surface antigen of the hepatitis B virus and a patented adjuvant (AS01), formed as virus-like particles without infective capacity (Wilby et al. 2012).

The final results obtained with this vaccine in phase 3 showed protection against the disease in children and infants for at least 3 years (D’Alessandro et al. 2015). In July 2015, the Committee for Medicinal Products for Human Use (CHMP) of the European Medicines Agency (EMA) adopted a positive scientific opinion regarding RTS,S for use outside the EU. In 2016, a cost-effectiveness study using mathematical models concluded that the use of this vaccine would represent a very favorable balance in areas with moderate to high transmission (Penny et al. 2016). In 2018, through the Malaria Vaccine Implementation Program (MVIP), the RTS,S vaccine will be tested on young children in selected areas of Ghana, Kenya, and Malawi.

The limited efficacy achieved by the RTS,S vaccine, together with the high protection against malaria achieved in humans by immunization with radiation-attenuated Plasmodium falciparum sporozoites inoculated by mosquitoes, has led to the development of whole-sporozoite vaccines (Clyde et al. 1973; Hoffman et al. 2002). Living parasites induce antibodies against multiple antigens, and immunization with these vaccines elicits a strong humoral and cellular response. This type of vaccine is being developed by a US biotech company, Sanaria, which has several candidate vaccines in various stages of development: PfSPZ Vaccine, with sporozoites attenuated by irradiation; PfSPZ-GA1, with sporozoites attenuated by knocking out the genes for two proteins, Pfb9 and Pfslarp, both essential for hepatic development (van Schaijk et al. 2014); and PfSPZ-CVac, with sporozoites attenuated in vivo by concomitant administration of an antimalarial drug. For the latter, Sanaria has developed a product called PfSPZ, which is equivalent to PfSPZ Vaccine but with non-attenuated sporozoites (Richie et al. 2015; Mordmüller et al. 2017). The company, together with its collaborators, is currently working on aspects of manufacturing, delivery, administration, and clinical development (Hoffman et al. 2015). PfSPZ Vaccine and PfSPZ-CVac have induced high-level protection in humans against controlled malarial infections, by mosquito bite or by intradermal injection, respectively.

It is very important to devote resources to finding new sporozoite proteins capable of producing high and adequate antibody responses so as to design vaccines that block hepatic infection by inhibiting the prior stages, from the point of mosquito inoculation to its entry into the hepatocyte. All these efforts have to be combined in an attempt to obtain a multi-stage vaccine (Sack et al. 2017).

Recent research has shown that antibodies present in the serum of subjects immunized with whole sporozoite vaccines and protected from controlled human malaria infection (CHMI) recognized a large number of antigens, and in some cases developed protection against infection (Aguiar et al. 2015; Peng et al. 2016). Some of these antigens have already been included in new vaccine projects and are now being tested.

Blood-stage vaccines

Blood-stage vaccines are an alternative to pre-erythrocytic vaccines (Miura 2016). The natural immunity acquired against the disease is mediated by antibodies fighting the blood forms of the parasite. The proteins present in the merozoite are very numerous, but not all induce immunological protection, and many merozoite antigens studied for use in possible vaccines are highly polymorphic, differentially expressed in populations, or functionally redundant. As a result, the development of these vaccines lags behind that of the pre-erythrocytic class. Such vaccines should decrease merozoite and gametocyte populations, limiting infection and the clinical symptoms of the disease.

Surface merozoite proteins, as well as certain apical proteins involved in the invasion of the erythrocyte, have proven to be good targets for vaccine preparation (Richards et al. 2013). Understanding the molecular mechanisms involved in the sequential process of red blood cell invasion has allowed us to better comprehend the role played by certain proteins involved in erythrocyte adhesion and penetration along with their capacity for producing host antibodies that block the ligands required for the merozoite invasion of the erythrocytes (Weiss et al. 2015).

At present, eight surface or apical organelle proteins (MSP1 (K1 allele), MSP2 (3D7 allele), MSP1, AMA1, MSP3, GLURP, SE36, and RESA) have been selected for use in blood-stage vaccines; all are important in erythrocyte invasion. Due to the great genetic variability of these proteins, none has yet shown significant results in infection or disease control. None is yet in phase 3 of development:

The MSP1 surface protein is the most abundant and forms a complex with other surface proteins; successful merozoite invasion of the host erythrocytes is dependent on this protein complex (Lin et al. 2016).

The K1-MSP1 allele is currently included in the Combination B vaccine formulation along with the 3D7-MSP2 allele, another protein from the surface protein complex, and RESA (Ring-infected Erythrocyte Surface Antigen), a protein discharged by the parasite into the red blood cell membrane that contains the merozoites (Mills et al. 2007); this interacts with the spectrin network, decreasing the deformation of the red blood cells (such deformation is fundamental to parasite survival during the ring stage).

Another merozoite surface protein that participates in vaccine development is MSP3, whether on its own or combined with a protein in the GMZ2 vaccine. This is a recombinant protein vaccine, composed of two Plasmodium falciparum blood-phase antigens: glutamate-rich protein (GLURP), which is the target of cytophilic antibodies, and merozoite surface protein 3 (Hermsen et al. 2007). This is the first blood-stage malaria vaccine tested in humans that decreases the incidence of malaria in children (Sirima et al. 2016; Amoah et al. 2017).

The apical membrane antigen 1 (AMA1), which is highly polymorphic, has also been tested as a vaccine against malaria. The results were not good, and no significant protection against clinical malaria was found. However, it is still considered a good potential candidate in a multi-component malaria vaccine (Thera et al. 2011).

SE36 is a new recombinant molecule based on the serine repeat antigen 5 (SERA5); this might be a good immunogen for a potential vaccine (Horii et al. 2010).

New vaccines try to use several antigens from different parasitic-cycle phases. These are the multi-stage vaccines, such as NYVAC-PF7, a testing phase vaccine against multi-stage parasite antigens. This is an attenuated virus vaccine containing genes encoding for proteins in the pre-erythrocytic phase (CSP, SSP, and LSA1), the blood stage (MSP1, AMA1, and SERA), and the mosquito-stage (Pfs25).

In other vaccines, synthetic protein peptides related to the invasion process are used; these include EBA175 (Tolia et al. 2005), RH5 (Volz et al. 2016), and P27A, a synthetic protein peptide exported through the membrane of the parasitophorous vacuole in the trophozoite stage (Kulangara et al. 2012).

All of the above focus on P. falciparum studies. However, in recent years, interest in P. vivax has increased and more and more projects are dedicated to finding an effective vaccine against this parasite. In addition to the difficulties already mentioned for P. falciparum, others can be added for P. vivax. These include its capacity to cause hypnozoite relapse in the liver and the culture problems present in the laboratory. In contrast to P. falciparum, most P. vivax vaccines are still in the preclinical development stage (Mueller et al. 2015; Phillips et al. 2017).

Based on what is known about P. falciparum, various authors agree that a high-efficacy vaccine for P. vivax transmission blocking requires the combination of multiple antigens (Mueller et al. 2015; Tham et al. 2017).

Current research into the mosquito stages

Transmission blocking immunity (TBI) entails the induction of antibodies against the parasite’s sexual stages (the gametocytes, gametes, zygote, and ookinete); these antibodies are capable of blocking parasite development in the mosquito’s midgut. In the pre-genomic era (before the year 2000), several molecules were discovered using biochemical and immunological approaches, and great advances were achieved in TB vaccines. Some of those molecules are still considered important vaccine candidates.

Malaria parasites have a complex but vigorous life cycle; to interrupt this, a variety of approaches needs to be developed. In the mosquito vector, malaria parasites express multiple molecules that can be intervention targets. Nevertheless, more knowledge is required concerning vector-parasite interaction and how the mosquito species drives parasite evolution. There are more than 40 mosquito species that transmit human malaria; these include Old and New World species that diverged around 95 mya (Moreno et al. 2010). Sinka et al. (Sinka et al. 2012) have demonstrated that there is an important differential distribution of the main Anopheline species in the world. The parasite developing in the mosquito midgut expresses hundreds of molecules that participate in parasite growth and evade the mosquito’s defense mechanisms. Parasite and vector populations might have adapted to different parasite strains and ecosystems (Eldering et al. 2017).

Genetic analysis has shown that Pvs25/28 (Chaurio et al. 2016) and Pvs47 (Molina-Cruz et al. 2015) are under selection and differentiation pressures; mtDNA is important for parasite development in the vector (Pacheco et al. 2017); and some mutations in the cytb gene are responsible for the interruption of parasite development in the mosquito (Goodman et al. 2016). This last result is of significant importance, given that atovaquone resistance selects cytb mutants, which cannot be spread by vector transmission.

Pre-fertilization stages

At present, advances have been made on P. falciparum and a P. berghei mouse model that interacts with Anopheline species (Akinosoglou et al. 2015). More than 500 molecules participate in parasite development in the mosquito (Bennink et al. 2016).

About 90 gametocyte proteins have been detected in proteomic studies and many of these might induce TBI. Pfs230, a gametocyte surface protein involved in gamete function, is a large protein of 3135 amino acids comprising complex domains and repeating six-cysteine (6-Cys) motifs with abundant disulfide bonds, anchored to the membrane surface by GPI (Williamson et al. 1995; Gerloff et al. 2005; Arredondo et al. 2012). The N-terminal region has been of interest as a TBV. Given that conformational epitopes are protective, one challenge has been to produce the proper folding for this protein. Antibodies expressed in the baculovirus system were able to reduce oocyst density in a dose-dependent manner and this inhibition increased when complement was added. The antibodies against this protein affected the formation of male gametes (Lee et al. 2017). Pfs48/45 (Carter et al. 1995; Van Dijk et al. 2001; Pradel 2007), another TBV that comprises protective conformational epitopes, has been tested in combination with Pfs25, and both elicited strong antibody responses (Datta et al. 2017). Pfs47 is expressed on the surface of female gametocytes and gametes (van Schaijk et al. 2006; Pradel 2007), and protects the parasite from a complement-like response mediated by TEP-1 and LRIM1 in the mosquito midgut through a selective process; Pfs47 shows high genetic structure (Molina-Cruz et al. 2013).

During the malaria season, antibodies against Pfs230 and Pfs48/45, and against other proteins such as Pfmdv1, Pfs16, PF3D7_1346400, and PF3D7_1024800, increased in almost all sample subjects (Skinner et al. 2015). Because the low antibody response might not be boosted by natural exposure, a different approach is required by printing certain modifications for expression in the vector that are capable of inducing strong, long-lasting immunity.

P. vivax expresses the orthologs of the main TBV candidates, Pvs230 (Tachibana et al. 2012), Pvs48/45, and Pvs47 (Tachibana et al. 2015). A locus associated with vector diversity was revealed that included Pvs47 (PVX_083240) and Pvs48/45 (PVX_083235). As previously found, there are continent-specific Pvs47 and Pvs48/45 SNPs (and haplotypes), consistent with the presence of different species of mosquito in each region; this resembles the pattern found in P. falciparum (Vallejo et al. 2016; Benavente et al. 2017b). However, it seems that the selective pressure on Pvs47 has been more recent than that on its P. falciparum ortholog (Hupalo et al. 2016). Recently, a study of P. malariae 48/45 showed little polymorphism amongst isolates from Southeast Asia (Srisutham et al. 2018).

Ookinetes and gliding motility

More than 500 transcripts have been detected in P. falciparum gametocytes, and more than half of these are under translational repression; for example, P25/28. This gene is expressed in the ookinete while other genes have been found that express in the oocyst and/or the sporozoites, suggesting that transcribed genes are under variable-term storage. This is a strategy to accomplish sexual reproduction and differentiation expression in an efficient and timely manner (Lasonder et al. 2016).

The ookinete expresses multiple proteins during its formation in the blood meal and its migration through the midgut epithelium. P25 and P28, the most abundant and conserved surface proteins amongst Plasmodium species, are promising vaccine candidates. They are transcribed during the pre-ookinete stages although their translation is maximal in the ookinete, as has been shown in P. falciparum (Saxena et al. 2007), P. vivax (Sattabongkot et al. 2003) and other malaria species. It has been suggested that P28 is ancestral to the duplicated P25 gene; antibodies against these proteins as well as knockout gene manipulation impede parasite development (Baton and Ranford-Cartwright 2005). Initially, two P28 types were detected in P. ovale (Tachibana et al. 2001), which were linked to a different variant of the rRNA type A gene (Tachibana et al. 2002). Genomic studies proved that these two types actually correspond to different species. In P. ovale wallikeri, P28 is encoded by four genes: one adjacent to P25, one P28 gene copy within an orphan contig, and two copies within contigs that contain a large array of oir genes (Ansari et al. 2016). P. ovale curtisi, on the other hand, has two P28 copies that have very low homology (Tachibana et al. 2001). It is not clear if gene expansion occurred in response to vector specificity. It is possible that the different Plasmodium species might have evolved differently or been under different selective pressures in the passage to humans and different vector species, selecting distinct genotypes at the local level, as observed in two P. vivax P25/28 phenotypes (González-Cerón et al. 2010).

Gliding motility and transcriptional regulation are mechanisms that allow successful parasite development in the mosquito. Unlike other invasive forms, the ookinete lacks rhoptries. The glideosome is a large group of proteins arranged in an actin-myosin motor. Proteins from the micronemes and the inner membrane participate, along with glideosome-associated proteins (GAP, e.g., GAP50, GALM2). The microneme proteome exposes hundreds of molecules (Lal et al. 2009). During the invasion process, these molecules are systematically secreted to the parasite surface (e.g., chitinase, von Willebrand Factor A domain-related protein (WARP), CTRP, SOAP, HSP70, CelTOS, PDI, A-M1 and others). Other proteins bind to the glideosome; one of these is PhIL1, which is integral to zoite development and is localized in the cytoskeleton and the apical end. Its expression is upregulated in the gametes and zygote (Saini et al. 2017).

Ookinete proteins are not greatly exposed to the human immune response unless they are also expressed in the blood or hepatic stages. Because of this, their polymorphism is probably the result of mosquito selective pressure. The PfWARP micronemal protein expressed in the ookinete and oocyst, which comprises a von Willebrand factor A domain, has shown limited polymorphism (Richards et al. 2006); antibodies against WARP, CTRP and chitinase reduce P. falciparum 3D7 infectivity to Anopheles gambiae and An. stephensi (Li et al. 2004). PfCelTOS, on the other hand, is expressed in ookinetes and sporozoites (Espinosa et al. 2017).

PfCTRP (CSP and TRAP-related protein) is a member of the TRAP-MIC2 family. Members of this family share the same structure: a signal sequence, an N-terminal domain, and at least one thrombospondin type 1 domain (TSP), whereas CTRP, TRAP and TLP also have a von Willebrand type A domain (vWA), a transmembrane helix and a C-terminal tail. It has been proposed that the binding of the vWA domain causes an extension of the TSP domain (Moreira et al. 2008). CTRP is the most complex; it comprises six vWA domains and seven TSP domains. There is a single copy located on chr3 (Trottein et al. 1995). Disruption of PfCTRP allows ookinete development but affects ookinete gliding, and no oocysts develop (Templeton et al. 2000).

Another important protein in parasite invasion is Enolase, which is found on the surface of Plasmodium ookinetes. It promotes the invasion process, interacts with the midgut epithelium and captures plasminogen (Ghosh et al. 2011). Pf PPLP4, in the same family as MOAP, is an apical protein which presumably mediates ookinete traversal through the midgut epithelium (Wirth et al. 2015). In P. falciparum, antibodies against alanyl aminopeptidase (AnAPN1) were effective in reducing mosquito infection.

Plasmepsin IV (known to be present in the asexual-stage food vacuole) was previously shown to be involved in Plasmodium gallinaceum infection in the mosquito midgut. Plasmepsins VII and X (not known to be present in the asexual-stage food vacuole) are upregulated in the Plasmodium falciparum mosquito stages (Li et al. 2016). Many of these proteins are orthologous in P. vivax and other species; for example, PvCelTOS, PvChit, PvCTRP and PvSOAP.

Sporozoite glideosome and its preparation for hepatocyte invasion

At present, three sporozoite types have been detected based on their protein expression profile and confirmed by genome analysis—in the mosquito midgut, in the salivary gland and inside the vertebrate host.

Several proteins have been identified on the parasite surface; these include CSP, STARP, LSA-3, SALSA, SPART, PfEMP3 in the micronemes; and TRAP, SPECT1, SPECT2 and MAEBL in sporozoites (Garcia et al. 2006). CelTOS is a microneme protein in ookinetes and salivary gland sporozoites, which participates in parasite migration through the midgut epithelium and the sinusoidal cell layer. Although it is expressed in the midgut sporozoite and the merozoite, it is not translated until the sporozoite has invaded the salivary gland (Kariu et al. 2006). PfCelTOS has low diversity; most of the non-synonymous variations are detected at the carboxyl end, most likely as a result of human immune pressure and intragenic recombination (Pirahmadi et al. 2018).

More recently, new molecules have been added to the list. These include the TRAP/MIC2 family, mostly micronemal proteins important for the ookinete and the sporozoite glideosome. Also, TRAP, aldolase, AMA1 and SPECT2 are essential for hepatocyte invasion (Buscaglia et al. 2004). AMA1 is expressed both in merozoites and in the liver stages (Yang et al. 2017). In sporozoites, CSP and microneme proteins such as TRAP are transcribed while in the skin or traveling to the liver. A comparison of transcriptomes and proteomes in gametocytes and sporozoites shows that many transcripts are under translational repression and are expressed only at a particular time.

In sera taken from humans immunized with irradiated sporozoites, a range of new antigens has been detected in a high percentage of individuals (> 50%), more than for CSP and TRAP antigens; for example, Ag2 was detected in 100% of immunized individuals. Some of these proteins have the dual function of invading both the salivary gland and the hepatocyte (e.g., CSP, TRAP, MAEBL). In the skin, before the sporozoite invades the hepatocyte, the molecular machinery needed to invade and develop in the hepatocyte is translated; this likely occurs in the irradiated sporozoite. In P. falciparum sporozoites incubated at 37 °C in the presence of hepatocytes, > 500 genes were upregulated, while in rodent malaria no orthologs were found for about 25% of those genes (Siau et al. 2008). In the sporozoites, the expression of CSP, TRAP, AMA1, aldolase, SPECT2, CelTOS and PPLI did not change. Only two molecules expressed in the human sporozoite were implicated in hepatocyte invasion (the sporozoite invasion-associated proteins SIAP-1 and SIAP-2 were detected on the parasite surface). It is interesting that SIAP-2 was only detected in primate malaria genomes (e.g., P. vivax and P. knowlesi) but not in rodent malaria, whereas SIAP-1 is present in all genomes. Likewise, LSAP1 and LSAP2 were exclusive to primate malaria (Siau et al. 2008).

The ApiAP2 family comprises 14 members, which have AP2 DNA-binding domains and participate in controlling transcription throughout the life cycle. The functions of some of these have been studied recently. AP2-G2 is a repressor in both the asexual and sexual stages. Four AP2s (-O/-O2/-O3/-O4) participate in transforming the zygote into the ookinete and oocyst. Other AP2s (SP/SP2) are required to form sporozoites, but AP2-SP3 is required for the sporozoites to be infective (Modrzynska et al. 2017).

Human malaria species

In 2017, Rutledge et al. (Rutledge et al. 2017) established the phylogenetic relationships in human malaria by constructing a maximum-likelihood tree using 1000 conserved single-copy core genes present in the six species (Fig. 2). In addition to these six, about 200 different species of Plasmodium have been described parasitizing other mammals, reptiles and birds. Taxonomically, Plasmodium is part of the family Plasmodiidae, of the order Haemosporida, the Class Aconoidasida and the Phylum Apicomplexa.

Fig. 2 Phylogenetic tree of the genus Plasmodium that causes human malaria

The use of more complex technology such as DNA arrays or massive sequencing has been fundamental in understanding the different mechanisms that the parasite uses; for example, to adapt to an environment, penetrate its host or escape from the action of certain drugs (Olszewski et al. 2009) (Doolan et al. 2014).

These technologies provide knowledge regarding the parasite’s persistence in different regions by deciphering what genomic changes favor its transmission. Some of the work being undertaken by the international scientific community focuses on identifying the genetic loci that might be associated with phenotypes related to drug resistance (Ranford-Cartwright and Mwangi 2012). Other work lines focus on the study of population genetics or genomics, with the intention of providing information on the spread of the disease and its origins (Hay et al. 2004). More knowledge of vector-parasite interactions will be useful in deciphering vector transmission. Understanding how the parasite is evolving in and across different ecosystems is a primary tool in definitively eliminating the disease.

The NGS methodologies are now being simplified considerably; this is providing scientists with a powerful tool that allows them to read DNA on a large scale, at an affordable price and over a short time (Garrido-Cardenas et al. 2017), even for blood infected with non-falciparum malaria at low parasitemia levels. The main goal of NGS has been the identification of single nucleotide polymorphisms (SNPs), indels (insertions and deletions) and microsatellites in order to detect regions under selective evolutionary pressure, as well as to assist advances in systems biology. High-throughput (HTP) applications are also capable of describing polyclonality, and deep sequencing can detect low-frequency genotypes.

Thanks to DNA sequencing, the P. falciparum genome was sequenced in 2002 (Gardner et al. 2002). Subsequently, the genomes of P. vivax (Carlton 2003) and P. knowlesi (Pain et al. 2008) were sequenced during the first decade of the twenty-first century. More recently, the genomes of P. malariae (Ansari et al. 2016; Rutledge et al. 2017), P. ovale wallikeri and P. ovale curtisi (Ansari et al. 2016) were sequenced, completing the six genomes of human malarial species. Other non-human primate Plasmodium species, capable of infecting humans by accident, or by a mosquito bite, are being targeted for genomic studies. The first draft of the P. brasilianum genome is in the process of being published (Talundzic et al. 2017). Consequently, using comparative and functional genomics, we might gain a full understanding of parasite biology, including its evolution, vector adaptation and pathology.

Additionally, the development of functional genomic, proteomic, metabolomic and transcriptomic analyses is contributing a great deal of information regarding the basic molecular function of the parasite (Tymoshenko et al. 2013). Although the identification of genes associated with drug resistance is a priority, identifying genes associated with phenotypes related to clinical manifestations of high pathogenicity may become critical in disease control. On the other hand, the analysis of different Plasmodium species populations might help in the parasite’s molecular barcoding, necessary for epidemiological surveillance. In fact, the variability in malarial epidemiology worldwide requires multiple methods and approaches.

Plasmodium falciparum

P. falciparum causes the most severe disease in humans. This species has been studied for decades because of its high parasitemia and ability to grow in vitro. Genomic data are continually being gathered from different laboratory strains and field isolates. At present, there are 36 different genome assemblies. The comparative analysis of these genomes aims to understand the evolutionary aspects of P. falciparum related to its biology, pathogenicity and drug resistance. Likewise, it is hoped that this analysis will yield information on mechanisms such as drug resistance, transmissibility and immune evasion.

The Plasmodium falciparum genome was first published by Gardner et al. in 2002 from the 3D7 strain; this was the result of the work carried out by the P. falciparum Genome Sequencing Consortium, established in 1996. The nuclear genome of P. falciparum 3D7 is organized into 14 linear chromosomes plus two circular fragments of extra-chromosomal DNA, one apicoplast and one mitochondrial genome. This genome is publicly available through GeneDB or PlasmoDB, the main databases for Plasmodium. The nuclear genome has a size of 23.3 Mb, a GC content of approximately 19.4%, and 5507 genes (Table 1). Most genes are between 2000 and 2500 bp in length and they do not contain introns, or at least there are very few. Most families of hypervariable and highly amplified genes are found in the subtelomeric regions of the chromosomes; the central regions, which are protected against instability, contain the genes related to the parasite’s metabolic functions (Rovira-Graells et al. 2012). The subtelomeric chromosomal regions in the P. falciparum genome are very dynamic in evolutionary terms. The size of this region ranges from 60 to 120 kb and has a well-studied structure, with 1 to 6 telomere-associated repeat elements and several virulence genes and surface antigens which play critical roles in virulence, immune evasion and antigenic variation (de Bruin et al. 1994). The P. falciparum mitochondrial genome is only 6 kb in size (Vaidya et al. 1989), and most of its DNA consists of short-sequence tandem repeats. It has only three genes that encode the proteins Cox1, Cox3 (subunits 1 and 3 of cytochrome c oxidase) and Cytb (cytochrome b) (Hikosaka et al. 2011). An important feature of the mitochondrial genome is that it does not encode any tRNA, so these must be imported from the cytosol. The apicoplast genome is circular, 35 kb in size, and encodes 30 proteins involved in essential pathways such as fatty acid or isoprenoid synthesis, as well as in the heme group pathway (Foth and McFadden 2003).
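
As an illustration of how a summary statistic such as the GC content quoted above is obtained, the following minimal Python sketch counts G and C bases among the unambiguous bases of a sequence. The function name gc_content and the toy fragment are assumptions made for this example; the fragment is invented and is not real P. falciparum sequence.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


if __name__ == "__main__":
    # Invented AT-rich fragment, used only to exercise the function.
    toy_fragment = "ATATTTAAATGCATATAAATTTATACGATA"
    print(f"GC content: {gc_content(toy_fragment):.1%}")
```

For a whole assembly, the same count would be accumulated over every chromosome before dividing, so that long and short chromosomes are weighted by their length.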

Table 1 Features of the nuclear genomic assemblies in parasites that cause human malaria

The P. falciparum 3D7 genome is being continually revised and reannotated. It has been used as a reference in numerous trials, such as that carried out by Chang et al. in 2013 (Chang et al. 2013), where they analyzed the data obtained from the complete genome sequencing of the 159 P. falciparum strain from Senegal. There was high variation in the single nucleotides and the number of copies, indicating that the parasite is subjected to significant selective pressure.

A study of the genomes of 65 Gambian P. falciparum isolates showed, as expected, a high level of polyclonal infections (57%) and balancing selection signatures in genes encoding antigenic molecules; the strongest signal was on the msp3-like gene belonging to a family with a role in immune evasion. In each schizont, a different gene type was found to be expressed. This, along with other blood-stage antigens, including one hypothetical protein, demonstrated the highest level of polymorphism, with a Tajima’s D value above 1.0. Such a result is exceptional, as most genes have negative Tajima’s D values reflecting their historical population expansion and purifying selection (Amambua-Ngwa et al. 2012).
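
The statistic referred to here, Tajima's D, compares the mean number of pairwise differences between sequences with the number of segregating sites. The Python sketch below applies the standard formula to a list of aligned sequences; the function name tajimas_d and the two toy alignments are assumptions made for illustration and do not reproduce the data or pipeline of the cited study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Compute Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating sites (columns with more than one allele).
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants of the standard Tajima (1989) variance estimate.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy alignments (invented): alleles at intermediate frequency vs. singletons.
    balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
    singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
    print("balanced:", round(tajimas_d(balanced), 2))
    print("singletons:", round(tajimas_d(singletons), 2))
```

In the toy data, variants held at intermediate frequency give a positive value, as expected under balancing selection, whereas an excess of rare variants gives a negative value, as expected after population expansion or under purifying selection.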

Understanding this variability will make it easier for scientists to determine the population structure of these parasites. Different studies have shown that genes related to diverse metabolic pathways, as well as genes related to protein degradation, are under directional selection. The genes, or gene families, that are of greatest interest in definitively eradicating malaria, and thus the most studied, are those related to immune evasion mechanisms and drug resistance.

Establishing gene families is a very useful strategy for several reasons, the main one being that, in genomic studies, this allows us to know the evolution over time of the gene functions (Ohta 2000). Another important advantage is that it is easier to predict its function once we have assigned a gene to a gene family. In general, genes of the same family have similar functions. From this, we can study the most important gene families in P. falciparum to fight the disease.

The pir gene family

The Plasmodium interspersed repeats (pir) multi-gene family is related to immune evasion and encodes several variant surface antigens (VSAs) (Janssen et al. 2004). The P. falciparum genes belonging to this family are known as rif (repetitive interspersed family) and stevor (subtelomeric variable open reading frame) genes. At present, the rif and stevor gene families are treated as a single family because the proteins they encode, STEVOR and RIFIN, are identical from a probabilistic model perspective. This is reflected in the Pfam model for the two protein families, presented by Bateman and Lawson (accession number PF02009).

The rif/stevor gene family acts by recognizing the Glycophorin C receptor on the red blood cell (Niang et al. 2014), making it possible for an infection to become chronic by means of antigenic variation that evades adaptive immunity. Most of the pir genes described so far have a common structure, consisting of two or three exons, the last of which encodes a transmembrane domain (Janssen et al. 2002).

In mouse models, only a small percentage of the parasites in an acute blood infection express pir genes, while almost all parasites express them in a chronic infection (Brugat et al. 2017). This mechanism, mediated by pir genes, facilitates the transmission of the parasite reservoir to other human hosts.

However, although immune evasion is understood to be the primary function of pir genes, analyses carried out on Plasmodium chabaudi, a parasite infecting mice, suggest that the proteins encoded by these genes might have other functions in a blood-stage infection, such as interaction with host molecules. It is also possible that they play an important role in signaling, trafficking and cell-adhesion mechanisms (Yam et al. 2016).

The var gene family

This gene family encodes for highly polymorphic proteins called PfEMP1 (Plasmodium falciparum erythrocyte membrane protein 1), also related to the process of antigenic variation. These PfEMP1 proteins are what the parasite uses in its interaction with the human host (Flick and Chen 2004).

The 3D7-strain genome contains 60 var genes, distributed across almost all the chromosomes. Proteins encoded by var genes are expressed on the surface of the infected erythrocytes, where they interact with different human endothelial receptors, thus preventing the elimination of the parasite (Chen et al. 1998). Each parasite expresses only one var gene at a time, in a phenomenon known as allelic exclusion; this means that the remaining genes present in the genome are transcriptionally silenced, waiting for the immune system to develop a response to the expressed protein, at which point they change the expression and restore their infective capacity (Voss et al. 2005).

Var genes can be divided into three groups, depending on their location on the chromosome (Lavstsen et al. 2003). Group A genes are found in subtelomeric regions, those of group B are scattered in the chromosome, and group C are only present in regions interior to the others. Regardless of their location, all the genes in this family share a gene structure consisting of two exons separated by a small well-conserved intron. The first of the exons has a series of characteristic domains: an N-terminal segment (NTS), plus several Duffy-binding-like (DBL) and cysteine-rich interdomain region (CIDR) domains. The second exon codes for the intracellular component of PfEMP1. Comparing different P. falciparum whole-genome sequences has identified dozens of structural variations, duplications, deletions, translocations and single nucleotide polymorphisms (SNPs) associated with var genes (Claessens et al. 2014). All these genetic variations give rise to the extensive polymorphisms found in PfEMP1 proteins. There are two promoters for each gene. One is upstream of the open reading frame and is responsible for mRNA production, and the other is located in the intron, promoting the production of sterile non-coding transcripts.

The PHIST gene family

Eighty-nine members of this highly amplified PHIST (Plasmodium helical interspersed subtelomeric) gene family have been predicted (Sargeant et al. 2006). These proteins have a similar number of amino acids, about 150, which have a significant presence of highly conserved aromatic residues in an alpha-helical structure (Warncke et al. 2016). All share a uniform pattern of export to the erythrocyte cytoplasm due to the presence of a PEXEL (Plasmodium export element) or HT motif, consisting of the pentameric amino acid motif RxLxE/Q/D, which directs the protein’s localization.
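
The pentameric motif RxLxE/Q/D described above can be written as a simple pattern: arginine, any residue, leucine, any residue, then glutamate, glutamine or aspartate. The Python sketch below scans a protein sequence for that pattern; the function name find_pexel_motifs and the example sequences are invented for illustration, and a pattern match alone is only a rough first filter, not a prediction that a protein is exported.

```python
import re

# R, any residue, L, any residue, then E, Q or D (one-letter amino acid codes).
# The lookahead lets overlapping occurrences be reported as well.
PEXEL_PATTERN = re.compile(r"(?=(R[A-Z]L[A-Z][EQD]))")


def find_pexel_motifs(protein: str):
    """Return (0-based start position, matched pentamer) for each motif found."""
    return [(m.start(), m.group(1)) for m in PEXEL_PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    # Invented sequences, used only to exercise the pattern.
    for toy_protein in ("MKKRSLAEGG", "MKKRSLAKGG"):
        print(toy_protein, find_pexel_motifs(toy_protein))
```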

PHIST proteins carry out different functions related to the remodeling of the erythrocyte structure during the asexual cycle; this facilitates the parasite’s survival and multiplication as well as its evasion of the immune system, so that the erythrocyte becomes a suitable host cell (Moreira et al. 2016). PHIST proteins are also involved in very important processes such as the surface display of PfEMP1, gametocytogenesis and changes in cell stiffness (Zhang et al. 2017).

Currently, PHIST proteins are classified into three subfamilies, which differ both in their sequence and in their function. The PHISTa subgroup consists of 26 very short proteins, present exclusively in P. falciparum. They have two characteristic conserved tryptophan residues. The family members are transcriptionally silenced in strain 3D7 (Scholz and Fraunholz 2008). The PHISTb subfamily consists of 24 members. Its primary structure is significantly longer than that of other PHIST proteins since it has a stretch in the C-terminal, of unknown function. It is thought that the function of these proteins is related to the remodeling of the iRBC (infected red blood cell) cytoskeleton, contributing to the malaria pathology (Tarr et al. 2014). PHISTc is the best known subgroup, consisting of 18 proteins entirely shared with P. vivax and P. knowlesi; it seems that their function is related to protein trafficking (de Koning-Ward et al. 2009).

It is expected that, given the role PHIST proteins play in the remodeling of the host, these are very important in processes such as gametocytogenesis or in different mosquito stages; however, at the moment, this knowledge is not available.

Artemisinin-resistance genes: kelch13

Genome studies have discovered that single point mutations in the kelch13 gene are associated with slow parasite clearance (RCA) in malaria patients treated with artemisinin-based combination therapy (ACT). The mutations map onto the β-propeller and BTB/POZ domains of the encoded kelch-like protein (PF3D7_1343700, on chromosome 13) (Cheeseman et al. 2012; Ariey et al. 2014). This can become a very serious problem in the fight against malaria since artemisinin and its derivatives are currently the most powerful drugs used against the disease (Burrows 2015). These drugs were introduced at the end of the twentieth century to treat parasites resistant to chloroquine or other drugs, and they reduced both malarial mortality and morbidity. However, artemisinin resistance in P. falciparum has recently been discovered (Mbengue et al. 2015).

At least 20 kelch13 mutations have been identified that are associated with a slow parasite clearance rate following treatment with artemisinin derivatives; these affect the encoded propeller of the kelch13 protein (Miotto et al. 2015). The majority of SNPs, whether they are high or low frequency, are associated with a similar prolongation of the parasite clearance half-life. Most mutations appear from amino acid 440 onwards, and the most widespread is C580Y, which emerged independently in several distinct geographic locations (Ashley et al. 2014). The association of this and other mutations with slow clearance was validated in vitro and/or in vivo. Therefore, parasite monitoring is necessary to warn of artemisinin efficacy decay below 90% (WHO 2016).

What draws our attention to these mutations is their geographic location. While forms of artemisinin resistance associated with kelch13 mutations have appeared in Vietnam, Thailand, Cambodia and Myanmar, they rarely do so in other Southeast Asian countries such as Laos or Bangladesh, nor do they in African countries. It has also been observed that there are background markers that seem to predispose the appearance of kelch13 mutations, the presence of which seems to be predominant in only some Southeast Asian countries. These background markers include arps10 and mdr2 on chromosome 14, fd on chromosome 13, and crt on chromosome 7 (Miotto et al. 2015). It is assumed that each background marker acts differently although their significance is not known exactly. What is clear is that these mutations appeared earlier than those of kelch13.

Other genes: crt, mdr, dhfr, dhps, and CAs

These genes play an important role in mechanisms related to resistance to different drugs, molecule biosynthesis and others.

Pfcrt (chloroquine-resistance transporter)

This gene is located on chromosome 7 and is related to chloroquine resistance in P. falciparum (Roepe 2009). The protein encoded by the Pfcrt gene is an integral membrane protein, with 10 transmembrane domains, present in the parasite’s acidic digestive vacuole; its function is unclear (Martin et al. 2009). The Lys76Thr substitution results in reduced chloroquine accumulation by the parasite. This mutation causes a loss of positive charge, changing the protein’s substrate specificity so that the protonated drug is transported out of the vacuole, away from its site of action (Martin et al. 2009).

Recent studies conducted in Zambia (Mwanza et al. 2016) have shown that, after discontinuing the use of chloroquine as an antimalarial drug, the mutation was no longer detected; that is to say, when the pharmacological pressure was eliminated, the drug resistance disappeared, reestablishing chloroquine’s great clinical efficacy.

Pfmdr1 (multi-drug resistance)

This gene is located on chromosome 5 and encodes the P-glycoprotein homolog protein (PGH1), expressed throughout the parasite’s asexual erythrocytic life cycle (Cowman et al. 1991). PGH1 is homologous to human P-glycoprotein, which is involved in drug resistance processes in cancer cells. The Pfmdr1 gene is related to chloroquine resistance and also to resistance to other drugs such as mefloquine and halofantrine (Sidhu et al. 2006). The first analyses of the Pfmdr1 gene concluded that the parasite’s resistance to different drugs was directly related to an increase in the number of copies of this gene (Price et al. 2004). However, it was later realized that this was not simply a numerical issue. Different haplotypes show different sensitivities to antimalarial drugs. Some haplotypes confer greater resistance to a certain drug than others, for an identical number of copies. Malmberg et al. (2013) showed that parasites presenting the Pfmdr1 N86/184F/D1246 haplotype persist at higher lumefantrine concentrations in the blood than those carrying the Pfmdr1 86Y/Y184/1246Y haplotype. This means that resistance to different drugs, mediated by the Pfmdr1 gene, is a parasite-strain-dependent phenomenon.

Pfdhfr (dihydrofolate reductase) and Pfdhps (dihydropteroate synthetase)

These genes are located on chromosome 4 and chromosome 8, respectively. Point mutations in these genes are implicated in the main mechanism of high-level sulfadoxine-pyrimethamine (SP) resistance (Wang et al. 1997). SP is an inhibitor of the folic acid pathway in P. falciparum; unfortunately, however, this drug’s effectiveness in malaria treatment is decreasing due to the continuous emergence of resistance events. The Pfdhfr gene encodes a protein of 608 amino acids and 71.7 kDa, with an A + T content of 75%; it has no introns (Bzik et al. 1987). In P. falciparum, the DHFR protein appears bound to a TS protein (thymidylate synthase) containing 94 amino acids, giving rise to the PfDHFR-TS bifunctional complex. The DHFR protein catalyzes the conversion of dihydrofolate to tetrahydrofolate, a cofactor used in the biosynthesis of certain amino acids and purine nucleotides (Peterson et al. 1988). The Pfdhps gene encodes a DHPS protein that also appears biologically as a bifunctional complex, joined to a PPPK protein (dihydro-6-hydroxymethylpterin pyrophosphokinase). The gene has 2118 bp encoding an 83 kDa protein, which contains 706 amino acids (Triglia and Cowman 1994).

Point mutations have been reported in different dhfr codons, which potentiate pyrimethamine resistance when acting synergistically (McCollum et al. 2008). These mutations are N51I, C59R, S108N, and I164L (Plowe et al. 1998). Similarly, mutations have been described in dhps at the S436A, A437G, K540E, A581G, and A613S codons, acting synergistically to enhance sulfadoxine resistance (Gregson and Plowe 2005).
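The mutation labels above follow the usual convention of reference residue, codon position, and substituted residue (e.g., N51I). As a minimal sketch, assuming a genotype is supplied as a mapping from codon position to the observed amino acid, the following Python code reports which of the listed dhfr or dhps mutations an isolate carries. The function names and the example isolate are hypothetical and serve only as an illustration.

```python
import re

# Resistance-associated mutations as listed in the text.
RESISTANCE_MUTATIONS = {
    "dhfr": ["N51I", "C59R", "S108N", "I164L"],
    "dhps": ["S436A", "A437G", "K540E", "A581G", "A613S"],
}


def parse_mutation(label):
    """Split a label such as 'N51I' into (reference residue, position, variant residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"Unrecognized mutation label: {label!r}")
    reference, position, variant = match.groups()
    return reference, int(position), variant


def mutations_present(gene, residues_by_position):
    """Return the listed mutations of `gene` whose variant residue is observed."""
    found = []
    for label in RESISTANCE_MUTATIONS[gene]:
        _, position, variant = parse_mutation(label)
        if residues_by_position.get(position) == variant:
            found.append(label)
    return found


# Hypothetical isolate: mutant at codons 51, 59 and 108, wild type at 164.
hypothetical_isolate = {51: "I", 59: "R", 108: "N", 164: "I"}
print(mutations_present("dhfr", hypothetical_isolate))  # ['N51I', 'C59R', 'S108N']
```

Because the mutations are described as acting synergistically, the number of mutations found per gene is a more informative summary than the presence of any single one, although this sketch makes no quantitative prediction of resistance.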

Pfcas (carbonic anhydrases)

This is a gene that encodes CA metalloenzymes. Different members of each of the five families described for this enzyme are present in all kingdoms of life (Alterio et al. 2012). The reaction carried out by the CA protein is the hydration of carbon dioxide to generate bicarbonate, which releases a proton. This bicarbonate will later be used in the de novo biosynthesis of pyrimidines. P. falciparum has two classes of Pfcas genes. One belongs to the so-called α-class and encodes a protein containing 235 amino acids (Krungkrai and Krungkrai 2011), while the other, recently described, belongs to the η-class and encodes a protein containing 358 amino acids (Del Prete et al. 2014). Amino acid sequence analysis of the CA enzyme in P. falciparum revealed that it differs from the analogous protozoan and human enzyme sequences, and that the protein has different catalytic properties from the human variant. Given the essential metabolic role of the CA protein in P. falciparum, different aromatic sulfonamides, which have been found to be efficient inhibitors of the enzyme, are being investigated as antimalarial drugs (Krungkrai et al. 2008); these limit P. falciparum gametocyte development.

In 100 P. falciparum clinical isolates from Guinea, 99,305 SNPs were detected; however, most of them (68%) were exclusive to one isolate. When compared to the Gambian parasites, evidence was detected of recent positive and balancing selection; this was probably due to antimalarial drugs and host immunity. Only in the Guinean parasites were selective sweeps detected around the Pfcrt and Pfmdr1 genes, consistent with chloroquine usage. Even though selection signatures around the Pfdhps gene were only detected in the Gambian parasites, they were also consistent with the use of SP as a first-line treatment (Mobegi et al. 2014).

Recently, P. falciparum genome-wide SNP variation was studied in parasites selected in vitro for resistance to potential antimalarial compounds, the so-called “resistome.” In 262 parasites, 159 gene amplifications were detected along with 148 non-synonymous nucleotide changes in 83 genes. Pfmdr1 mutations were associated with resistance to six different compounds (Cowell et al. 2018).

Plasmodium vivax

The main problem limiting our understanding of the biological parameters of the P. vivax parasite is the inherent difficulty of cultivating it in the laboratory and its low parasitemia. Studies of its genome are presented as an alternative route to developing tools that allow therapeutic and epidemiological approaches. The Salvador I strain was the one used for the first complete P. vivax genome sequencing project (Carlton et al. 2008). This assembly remains inaccurate due to the existence of a large number of unassembled scaffolds and the limited number of annotated genes. Genome drafts of other strains such as Brazil-I, India-VII, North Korea and Mauritania-I have also been used to gather useful data on the P. vivax genome. Recently, a new P. vivax sequence, PvP01 from a PNG isolate, has been assembled and annotated. Because of its superior quality, it can be used as a reference (212-fold coverage) (Auburn et al. 2016). The principal difference observed between the two analyses is the genome size. While the estimated size sequenced for the Salvador I strain is 26.8 megabases (Mb), the assembly is larger for PvP01, at 29 Mb. This difference is mainly due to better assembly of the subtelomeric sequences. In fact, of the 26.8 Mb sequenced in Salvador I, 4.3 Mb of small subtelomeric contigs were unassigned due to their repetitive nature (Carlton et al. 2008). In addition, thanks to the sequencing of the PvP01 strain, the complete sequence of the mitochondrial genome (5 kb) and a partial sequence of the apicoplast genome (29.6 kb) are also available. A second difference is observed in the number of genes found. While 5433 genes were identified in Salvador I, in PvP01 there were 6642 genes. This difference is mainly due to the genes belonging to subtelomeric-region families. In addition, the number of genes to which a function can be attributed has also increased, thanks to better assembly. On the other hand, the GC content in the Salvador I and PvP01 analyses is of a similar order (42.3% and 39.8%, respectively); this is much higher than the 19.4% observed for P. falciparum.
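For readers less familiar with the GC-content figures compared above, the following minimal Python sketch shows how the percentage is computed from a nucleotide sequence. The function name and the toy sequence are hypothetical; ambiguous bases such as N are excluded from the denominator, which is one common convention but an assumption here.

```python
def gc_percent(sequence):
    """Return the GC content of a DNA sequence as a percentage of unambiguous bases."""
    counted_bases = [base for base in sequence.upper() if base in "ACGT"]
    if not counted_bases:
        raise ValueError("Sequence contains no unambiguous bases (A, C, G, T).")
    gc_count = sum(1 for base in counted_bases if base in "GC")
    return 100.0 * gc_count / len(counted_bases)


# Hypothetical AT-rich toy fragment: 2 of 10 bases are G or C.
print(gc_percent("ATGCATTTAA"))  # 20.0
```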

As in the case of P. falciparum, several gene families can be found in the P. vivax genome. The analysis and knowledge of these can lead to the development of applicable tools for therapeutic studies.

The vir gene family

The genes in this family are variant surface antigen (VSA) expression genes. These are a homologous family to the pir gene family of P. falciparum, which comprise (along with the kir in P. knowlesi and the cir/bir and yir family in three rodent malarias) the main multi-gene family in malaria parasites. In P. vivax, these genes are related to the expression of certain cytoadherence proteins on the ICAM-1 endothelial receptor (Bernabeu et al. 2012), and also involved in antigenic variation (Fernandez-Becerra et al. 2009). However, in most cases, the function of this gene family is unclear, and the different members of the gene family can present different functions. In the Salvador I strain, 346 vir genes were found, which were grouped into 12 families named with the letters A-L. Analysis of the PvP01 strain has revealed the presence of more than 1200 genes in this family. The vir genes present multiple variants, with genes containing a single exon to genes with up to 5 exons, varying in size from 150 to more than 2000 bp.

As more isolates are sequenced, new information regarding parasite evolution emerges; e.g., the genomic analysis of an isolate from China-Myanmar (CMB-1), which revealed 78 novel vir genes. It has also been proposed that, in distinct samples, the gene families might cluster differently due to their rapid evolution (Chen et al. 2017).

Other genes

The P. vivax genomic analysis of global parasites showed that the most diverse antigenic genes are the merozoite surface proteins, (msp) 7 and 3, along with the serine repeat antigen (SERA) families (Hupalo et al. 2016). The CMB-1 strain showed high levels of genetic variability in genes belonging to the RBP, SERA, vir, MSP3 and AP2 families (Chen et al. 2017).

When comparing the P. falciparum and P. vivax genomes, about 4465 genes were found with sequence homology (Auburn et al. 2016). This is the case for the ortholog genes pv-phist, pv-crt, pv-mdr, pv-dhps and pv-dhfr. The functions of these genes are not yet clear in P. vivax. In fact, there is not even any conclusive evidence of the presence of polymorphism or their resistance to antimalarial drugs (chloroquine with Pvcrt/Pvmdr and sulfadoxine-pyrimethamine with Pvdhps/Pvdhfr), as there is in P. falciparum. Indeed, there is hardly any molecular baseline epidemiologic data (Huang et al. 2014).

At first, it was proposed that mutations at Pvmdr1 were associated with CQ resistance (Reed et al. 2000) but no association was detected with clinical outcomes in patients (Sá et al. 2005; Barnadas et al. 2008). Accordingly, the genome analysis exposed no signs of positive selection in these genes and no selective sweep was observed around the Pvmdr1 locus in Colombia or in other parts of the world (Hupalo et al. 2016). This gene seems to be a marker of parasite adaptation to local conditions similar to other genes involved in red blood cell invasion and drug resistance (Hupalo et al. 2016).

A distinct scenario has been shown in SP resistance. Polymorphisms at PvDHFR 57L, 58R, 117T/N and 173F have been associated with resistance to pyrimethamine (PM) in vitro (Hastings et al. 2005). These polymorphisms might be present at distinct frequencies in different geographic regions when using SP treatment for P. falciparum infections (Asih et al. 2015).

Genomic studies showed that in Colombian parasites, there is high diversity (33,853 SNPs; 42.7% of which were present in just one sample) (Winter et al. 2015); this is similar to parasites from other sites (Benavente et al. 2017b). A genetic sweep around Pvdhps was detected and the A383G amino acid substitution was present. Pvdhfr variation at the S58R and S117N residues has also been detected (Winter et al. 2015).

Furthermore, genomic analysis showed that selection intensity on chromosomes 5 and 14 (containing pvdhfr and pvdhps genes, respectively) was stronger in parasites from western Thailand than those from western Cambodia and Papua, Indonesia. Chr10 and Chr14 regions had a selection signal comprised of unknown proteins (Pearson et al. 2016). Hupalo et al. 2016 also demonstrated the presence of a selective sweep in loci comprising the Pvdhps and Pvdhfr genes.

Hypnozoites are a particular feature of P. vivax. To date, primaquine (PQ) is the only drug licensed to eliminate them. However, even the most effective treatment scheme is accomplishing little: PQ tolerance has been reported, and the maximal dose has reached a plateau in some regions due to G6PD deficiency (Baird 2015). Discovering the factors that determine dormancy and hypnozoite commitment is important for drug discovery. By cultivating P. cynomolgi (a non-human primate species that causes relapse episodes) in monkey hepatocytes, it was possible to employ NGS to obtain the transcriptome of small parasite forms, most likely hypnozoites or early liver stages. From the comparative genomics of the up- and downregulated genes, 12 were not detected in the P. falciparum genome; of these, 11 were present in P. vivax and P. ovale, and the other one was specific to P. cynomolgi. The ApiAP2 family is involved in stage transition; one of its members is PCYB_102390, the ortholog of which is present only in P. vivax and P. ovale. It was given the name AP2-Q and is probably involved in hypnozoite commitment. Other molecules that can contribute to quiescence and hypnozoite homeostasis might also participate in translational repression and post-transcriptional gene regulation (Cubi et al. 2017). Further functional studies are necessary to elucidate whether these molecules could be targets for new anti-hypnozoite drugs. Such results also highlight the importance of NGS technologies in studying other non-human primate malarias, making the genomes of many malaria species available.

Plasmodium knowlesi

Malaria produced by Plasmodium knowlesi in long-tailed Macaca fascicularis (Mf) and pig-tailed M. nemestrina (Mn) in Southeast Asia has been known about for quite some time. However, in 2004, several cases of human malaria were reported that were caused by the same parasite (Singh et al. 2004b). It is now known that P. knowlesi infections are widespread in humans in a large part of Southeast Asia (Kantele and Jokiranta 2011). Various genetic analyses have indicated that no systematic differences exist between the macaque and human isolates. On the other hand, mitochondrial DNA analysis reveals that P. knowlesi derives from an ancestral parasite that predates the settlement of Homo sapiens in Southeast Asia and, therefore, it had to be specific to monkeys. Hence, evolutionary analyses indicate that P. knowlesi is zoonotic. This is corroborated by the fact that the number of parasite genotypes is much higher in monkeys than in humans (Lee et al. 2011).

Phylogenetically, P. knowlesi is closely related to P. vivax, although at the phenotypic level there are important differences such as the absence of hypnozoites; consequently, in P. knowlesi there is no latent hepatic stage. The P. knowlesi nuclear genome sequence was published in 2008 (Pain et al. 2008), coming from the H strain, Pk1 (A1) clone; its size is 23.5 Mb and it presents extremely high genetic diversity. This polymorphism level is much higher than that in the P. falciparum and P. vivax genomes, and is of a similar order to that between human and macaque P. knowlesi. The genome has a GC content of 37.5%, and 5188 protein-encoding genes. Approximately 80% of these genes are orthologous to genes identified in both P. falciparum and P. vivax although there are two large specific gene families, called SICAvar (schizont-infected cell agglutination variant) and Kir (knowlesi interspersed repeat). A distinctive feature of these variant antigen genes is that they are widely dispersed throughout the genome. Additionally, five families of genes unique to P. knowlesi, referred to as Pkfam-a to Pkfam-e, were described, each with 4–15 paralog members. The function of these families’ genes is unknown although it has been suggested that some may be related to the export of proteins. Recently, a new reference genome was sequenced by combining PacBio RS-II (Pacific Biosciences Inc., USA) long-read and Illumina HiSeq short-read sequence data. The newly obtained genome had a length of 24.4 Mb, and the study also described the distribution of methylated bases (Benavente et al. 2017a).

At the genomic level, there are three highly divergent P. knowlesi subpopulations. Two of these appeared as a result of sympatric speciation in humans residing in Malaysian Borneo, probably because they were transmitted by different vectors. The third subpopulation was found in some laboratory isolates from other parts of Southeast Asia (Assefa et al. 2015).

The kir gene family (similar to rifin/stevor in P. falciparum)

This family of genes is responsible for the expression of variant surface antigens on parasitized red blood cells, and mediates cytoadherence processes, as do genes from related families such as pir in P. falciparum and vir in P. vivax. Gene families with similar characteristics also exist in rodent parasites, such as yir in P. yoelii (Fonager et al. 2007), bir in P. berghei (Pasini et al. 2013), and cir in P. chabaudi (Lawton et al. 2012); together, these constitute the major gene families in Plasmodium species (Janssen et al. 2004).

According to the H strain, Pk1 (A1) clone nuclear genome sequence, there are 68 kir genes, which give rise to 36–97 kDa proteins. Depending on whether the gene has 2, 3, 4 or 5 exons, the kir genes are classified into types I, II, III and IV, respectively; most of which (90%) belong to types I and II. The encoded proteins contain 1–3 HMM (Hidden Markov model) domains followed by a transmembrane domain referred to as KIR TM. Collectively, the KIR proteins contain sequences in the extracellular domain with an almost 100% identity to certain CD99 regions in the host; these represent an example of molecular mimicry—this may play a fundamental role in processes related to immunological resistance given that CD99 is critical to T cell functioning.
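The exon-based typing of kir genes described above can be expressed as a simple lookup. The following Python sketch is only an illustration of that rule; the gene identifiers and exon counts in the example are hypothetical and are not taken from the annotated genome.

```python
from collections import Counter

# Typing rule from the text: 2, 3, 4 or 5 exons correspond to types I, II, III and IV.
KIR_TYPE_BY_EXON_COUNT = {2: "I", 3: "II", 4: "III", 5: "IV"}


def classify_kir_gene(exon_count):
    """Return the kir type for a gene with the given number of exons."""
    try:
        return KIR_TYPE_BY_EXON_COUNT[exon_count]
    except KeyError:
        raise ValueError(f"No kir type is defined for {exon_count} exons") from None


# Hypothetical gene identifiers with illustrative exon counts.
hypothetical_exon_counts = {"kir_a": 2, "kir_b": 2, "kir_c": 3, "kir_d": 5}
type_tally = Counter(classify_kir_gene(n) for n in hypothetical_exon_counts.values())
print(dict(type_tally))  # {'I': 2, 'II': 1, 'IV': 1}
```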

The SICAvar gene family (similar to var in P. falciparum)

SICAvar is the largest gene family in P. knowlesi, and is related to immune evasion strategies through variant surface antigens (VSA) (Al-Khedery et al. 1999). The gene family has 136 genes categorized as type I (with 5–16 exons) or II (with 3–4 exons) (Lapp et al. 2017). They are dispersed uniformly across the parasite’s 14 nuclear chromosomes and encode a group of SICA proteins, similar to the highly polymorphic protein PfEMP1 of Plasmodium falciparum (Galinski et al. 2017).

SICAvar genes were initially reported with a structure of 10 exons encoding a 205-kDa SICA protein. Subsequently, they were redefined with 12 exons (Lapp et al. 2009). The protein structure to which they give rise, which has a variable number of cysteine-rich domains (CRD), is very similar to that of PfEMP1, although the structure of the genes is very different in the two cases, as there are only two exons in P. falciparum; to be precise, the introns begin at the start of the CRD domains. Regardless of whether the genes encoding the protein are type I or II, the first two exons contain a PEXEL motif and the two final exons encode a transmembrane domain and a conserved cytoplasmic domain.

Plasmodium malariae

Historically, Plasmodium malariae and P. ovale infections have not received much attention. This was due, in part, to their low prevalence and the low parasite density caused by either species in humans. Nevertheless, with the greater sensitivity of current molecular techniques, especially whole and deep-genome sequencing, a large number of mixed infections have been examined that contain P. malariae and/or P. ovale together with other Plasmodium spp., thus grabbing the attention of the international scientific community.

Malaria caused by P. malariae is characterized by a 72-h fever cycle, significantly longer than that of the rest of the human malaria parasites (48 h), and it is a chronic yet mild disease. The first draft of the P. malariae nuclear genome was sequenced by Ansari et al. in 2016 from the CDC Uganda I strain (Ansari et al. 2016). Subsequently, a more in-depth analysis was carried out by Rutledge et al. in 2017 (Rutledge et al. 2017). The most reliable data show that the P. malariae nuclear genome is 33.6 Mb in size, with a GC content of approximately 24% and 6540 genes. The first thing that draws one’s attention to this nuclear genome is the presence of two large, highly expanded gene families in the subtelomeric region, called fam-m and fam-l. These were initially considered as a single gene family called Pm-fam-a. The curious thing was that they had not been described in any other Plasmodium sp.; in P. malariae, there are almost 700 members. The next gene family found in the nuclear genome was the mir gene family, similar to the pir gene family in P. falciparum, with 250 members. A similar number is found in P. falciparum, though this is fewer than in P. vivax (1212). A differential feature is that in P. malariae, almost half of the pir genes are, in fact, pseudogenes. Finally, a third striking finding in the P. malariae genome was a 14-fold copy of a gene encoding the sexual-stage cytoplasmic protein P27/25. This gene is also found in the P. falciparum genome (only a single copy), and the protein it produces appears at the beginning of gametogenesis (Carter et al. 1989), playing an important role in gametocyte membrane integrity (Olivieri et al. 2009).

The fam-m and fam-l gene families

Initially, more than 550 genes were grouped into a single highly expanded gene family, Pm-fam-a, encoding proteins that together were related to the 2TM superfamily. In P. malariae, they can have 1–6 predicted TM domains, the majority being 2TM proteins. Subsequently, further similar genes were identified; these were subdivided into two families, fam-m and fam-l, with 283 and 396 members, respectively. According to the predictions made by Rutledge et al., the proteins encoded by these genes would present as heterodimers, thanks to the joint expression of the fam-m and fam-l doublets on chromosome 5 of P. malariae. In most cases, the proteins have a PEXEL export signal that will probably cause them to be exported to the surface of the infected red blood cells. Another interesting fact is that the three-dimensional structure of the proteins in these families overlaps perfectly with that of the RH5 protein in P. falciparum (a fundamental protein in erythrocyte invasion), despite the genes encoding them having only a 10% sequence similarity. These data point to the important role that the fam-l and fam-m genes play in cell adhesion. On the other hand, approximately half of the members of these families contain a domain of unknown function, which is also present in 33 P. vivax and 4 P. knowlesi proteins (Ansari et al. 2016).

Plasmodium ovale

This parasite was first described in 1922 (Stephens 1922). Two morphologically indistinguishable yet genetically distinct forms were named in 2010: P. ovale curtisi and P. ovale wallikeri (Sutherland et al. 2010). Nonetheless, as with the malaria caused by P. malariae, it has received little interest from the international scientific community and from malaria control programs. Nowadays, it is beyond doubt that P. ovale wallikeri and P. ovale curtisi are distinct species, not only because of their genetic differences but also because of the diversity observed in their surface antigens. The genomes of these two species were analyzed for the first time by Ansari et al. (2016) from four genomes (two of each). The two P. ovale wallikeri isolates were obtained from two Chinese workers returning from West Africa. Similarly, a third Chinese worker provided one of the P. ovale curtisi isolates while the other came from the CDC/Nigeria I strain. These genomes have a total length of 33.5 Mb, and a GC content of 29%. The P. ovale species have a greater number of genes than the other human malaria parasites, with 7132 for P. ovale curtisi, and 7052 for P. ovale wallikeri.

The two most important gene families found in the genomes of both species are the Plasmodium interspersed repeats (pir) multi-gene family, which in this case is called oir, and the surfin gene family, which encodes the SURFIN proteins. The number of oir genes present in the genomes of both P. ovale wallikeri and P. ovale curtisi is so great (1500–2000) that this fact alone means their genomes are considerably larger than those of other parasites responsible for human malaria, such as P. falciparum (33.5 Mb vs 23.3 Mb). This large number of oir genes is common to both P. ovale species, indicating that the expansion must have occurred in the proto-ovale ancestral lineage, before speciation occurred (Sutherland and Polley 2011).

Other genes of interest present in the ovale parasite genomes are those that encode for the reticulocyte-binding proteins (13–14 genes) and the genes with tryptophan-rich domains (40 genes); these have been proposed as potential candidate antigens for future vaccines; they are also present in other Plasmodium spp. such as P. vivax (Wang et al. 2015).

The surfin gene family

Surfin (surface-associated interspersed) genes are present in all human malaria parasites but to a much greater degree in the P. ovale variants, with 125 genes in P. ovale wallikeri and 50 genes in P. ovale curtisi. They mostly have a subtelomeric location and usually appear within or adjacent to pir/oir genes. The polymorphic proteins encoded by these genes are transmembrane proteins that are exposed on the surface of both the infected erythrocyte and the merozoite (Winter et al. 2005). As with SICAvar proteins in P. knowlesi and PfEMP1 proteins in P. falciparum, SURFIN proteins contain several copies of a tryptophan-rich domain in the cytoplasmic region, known as WRD, the function of which is unknown. In SURFIN proteins, the WRD consists of three blocks: a WRD-A block of 40–60 aa, and two WRD-b blocks of 40–50 aa. The comparative analysis of the conserved sequences in the WRD domain allows us to conclude that SURFIN proteins have an evolutionary origin common to the PfEMP1 and SICAvar proteins (Frech and Chen 2013). On the other hand, an analysis of the modification in the WRD domain expression of these proteins reveals the fundamental role they play in transport from the parasite-derived membranous structures established in the infected red blood cell cytoplasm (called Maurer’s clefts) to the membranes (Kagaya et al. 2015).

Conclusion

Eradicating malaria has become one of the key challenges facing the World Health Organization. In recent years, enough information has been accumulated on the disease, the parasite and the mosquito vectors to understand that only a global approach can put an end to malaria. The programs used so far have made it possible to control the disease in Europe and North America, with a residual number of cases in both regions. The problem lies in those countries where malaria is a real public health problem. This is the case, for example, in the countries of sub-Saharan Africa. In these, it will be necessary to combine all the tools available to science. One of the tools that has generated the most knowledge in recent years is genomics. Knowing and understanding the information derived from genomic analyses of the parasites causing human malaria should make it possible to approach the eradication of the disease from different fronts. The molecular tools used in recent years, such as next-generation sequencing (NGS), make it possible to attack the disease and act at an epidemiological, treatment or diagnostic level. To this must be added the great depth obtained with the new bioinformatics approaches, which save a great deal of time and facilitate decision-making. The development of vaccines, the detection of genes associated with drug resistance, and the knowledge of molecular hot spots in the life cycle of the parasite have to be addressed in the future in a global manner, taking into account a conscientious analysis of the genomics of Plasmodium. Of course, we cannot ensure that control of all aspects of genomics will lead to the end of malaria transmission. We are not in a position to know whether that knowledge will be sufficient. But we know that it will be necessary in the development of tools that enable rapid and reliable diagnosis, and that allow us to control adaptive parasite processes.
Similarly, molecular tools and genomic knowledge of the parasite must become the cornerstone on which to base the development of one or more effective malaria vaccines. The great genetic variability of the parasite, its complex life cycle and the lack of knowledge about the immune response it triggers are the main stumbling blocks for the development of an effective vaccine. But there is no doubt that these difficulties will be smaller when relying on genomic knowledge of the disease.