Background

The Arabian Peninsula, comprising Bahrain, Kuwait, Oman, Saudi Arabia, Yemen, Qatar and the United Arab Emirates, has made great strides towards malaria elimination over the past 50 years. Of the seven countries in the region, only Saudi Arabia and Yemen still experience ongoing local transmission [1]. Whilst transmission in Saudi Arabia is confined to two south-western provinces [2], Yemen struggles with ongoing transmission across the country, with an estimated 25 % of the population at risk of transmission (>1 case per 1000 population) [3].

Yemen is one of the poorest and least developed countries in the Arabian Peninsula. The combination of political unrest and widespread resistance to chloroquine, which was the first-line anti-malarial drug until 2005, resulted in a resurgence of malaria during the 1990s, exacerbated by widespread flooding and unusual rain patterns due to the El Niño climate oscillation [1]. In 2001, the government implemented a new malaria strategic plan, which focused on the use of indoor residual spraying (IRS), larviciding and insecticide-treated nets (ITN). In 2009, the first-line drug was switched to artemisinin-based combination therapy (ACT) [3]. Subsequently, malaria incidence appears to have fallen, although data on confirmed cases are sparse [1].

Fig. 1 Location of study district, South West Yemen

Wadi Sukhmal is a valley in a mountainous area, located in Wusab As Safil district, approximately 360 km from the capital Sana’a, in the west of the governorate of Dhamar. The district is partially situated in the Tihama Plains, which border the Red Sea in Yemen and Saudi Arabia. As an area with ongoing malaria transmission, this district was chosen as a site for a wider trial comparing the effects of combining long-lasting insecticide-treated nets (LLIN) and IRS versus LLINs alone. A baseline malaria survey of the population was conducted and filter paper samples were collected from residents in order to estimate the prevalence and genetic complexity of malaria infections. The genetic complexity and number of clones circulating in the parasite population in an area is often correlated with transmission intensity [4–7]. In addition, filter paper samples were tested for the presence of antibodies to Plasmodium falciparum and Plasmodium vivax antigens in order to estimate transmission intensity and reconstruct historical transmission patterns in the area. Seroconversion rate and antibody dynamics in a population have been shown to correlate with transmission intensity in a number of settings and can be particularly useful in areas of low transmission [8–10]. All measures were collected both as part of the epidemiological characterization of the study area and as potential endpoints for the intervention trial.

Methods

Setting

The study took place in Wusab As Safil district, situated in the western foothills of Yemen, with a population of approximately 150,000 (Fig. 1). Malaria across Yemen is unstable and related to altitude, rainfall and topography. In the study area, where altitude ranges between 601 and 1000 m above sea level, malaria occurs 10 months of the year, with peaks in transmission during and following the rains, and is predominantly due to P. falciparum. The main vectors are Anopheles arabiensis and Anopheles sergenti, which are thought to breed in the fresh-water wadi (riverbed/stream) in the valley. The majority of the district is agricultural and is made up of mountains and valleys. The size of the study area was approximately 100 km².

Survey and sample collection

Samples were collected as part of a baseline survey for a larger trial to assess the effects of combining LLIN and IRS compared with LLIN alone. Clusters were selected from villages near Wadi Sukhmal, which was chosen due to high malaria incidence recorded by local health facilities. The survey took place in 12 clusters in the wadi that were randomized after stratification by distance to the wadi and baseline prevalence of infection, which was measured by rapid diagnostic test (RDT) and microscopy in October 2011. The average population in the study clusters was 800 (range: 337–1677). Clusters were separated by at least 500 m. Filter paper samples were collected during a second baseline survey which took place in March 2012, prior to the trial intervention. Households were randomly sampled from within each cluster. Every child aged 15 and under who was present at the time of visit had a finger prick blood sample taken for subsequent microscopy, as well as for an RDT (Carestart HRP-2, a P. falciparum specific test), and a filter paper for molecular and serological analyses.

Laboratory methods

DNA extraction and PCR amplification

DNA was extracted from three punches of 3 mm diameter from each filter paper sample (corresponding to approximately 10 µl of blood) using the Chelex method, as documented previously [8]. The extracted DNA (5 µl) was amplified by 18S ribosomal DNA (rDNA) nested PCR [9] using the primers listed in Table 1. Subsequently, species-specific (P. falciparum, Plasmodium malariae, P. vivax) nested PCR was performed (Table 1). P. malariae and P. vivax PCR were only run on samples from three selected clusters due to the suspected low prevalence of these species. Nested PCR products were analysed on 1.5 % agarose gels.

Table 1 Primers and sequences used for nested PCR

MSP-2 genotyping

The msp2 genotyping was adapted from a published protocol [10]. Briefly, 5 µl of Chelex-extracted DNA was added to the primary PCR containing 500 nM forward and reverse primers (Table 1). The nested PCR contained 300 nM forward primer S1Tail and 300 nM fluorescently labelled reverse primers M5 (Fc27 family specific) and N5 (3D7 allelic family specific).

Samples were analysed together with 500-ROX size standard (Applied Biosystems) at the MRC Genomics Core Facility (Imperial College, Hammersmith Campus). The data were analysed using GeneMapper software (Applied Biosystems). All detected peaks were sorted by size and divided into 3-base-pair bins to call the alleles.
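The binning step can be illustrated with a minimal sketch. The text states only that peaks were sorted by size and divided into three base pair bins; the bin boundaries used here (multiples of 3 bp), the function names and the example peak sizes are assumptions for illustration, not the settings used in GeneMapper.

```python
BIN_WIDTH_BP = 3  # bin width stated in the text


def bin_fragment(size_bp, bin_width=BIN_WIDTH_BP):
    """Return the lower edge (in bp) of the bin containing a fragment size.

    Assumption: bins start at multiples of the bin width.
    """
    return int(size_bp // bin_width) * bin_width


def call_alleles(peaks):
    """Collapse (family, size_bp) peaks into distinct alleles.

    Peaks of the same allelic family that fall in the same bin are
    treated as a single allele.
    """
    alleles = {(family, bin_fragment(size_bp)) for family, size_bp in peaks}
    return sorted(alleles)


# Hypothetical peaks from one sample: (allelic family, fragment size in bp)
sample_peaks = [("3D7", 301.2), ("3D7", 302.6), ("3D7", 345.9), ("Fc27", 410.4)]
called = call_alleles(sample_peaks)
print(called)  # two 3D7 peaks share a bin, so three alleles are called
```

In this hypothetical sample the two 3D7 peaks at 301.2 and 302.6 bp fall into the same bin and are therefore counted once.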

Enzyme-linked immunosorbent assay (ELISA)

Filter paper samples were stored at 4 °C with desiccant until processed, as previously described [11]. Samples were tested at a final serum dilution equivalent of 1:1000 for human immunoglobulin G antibodies against P. falciparum and P. vivax merozoite surface protein 1₁₉ (MSP-1₁₉) and P. vivax apical membrane antigen-1 (AMA-1), and at 1:2000 for antibodies against P. falciparum AMA-1, using ELISA methods described in detail previously [11]. Optical densities (OD) were normalized against the standard curve on each plate to account for differences between ELISA batches. ELISA results for each antigen were classified as positive or negative using a mixture model, as previously described [11]. Individuals were classed as positive if they had antibodies to either antigen.

Data analyses

Statistical analysis was performed using Stata 13 (StataCorp, Texas, US). RDT and microscopy results were compared using Cohen’s kappa. Sensitivity and specificity of RDT and microscopy results were calculated using P. falciparum PCR as the gold standard. Standard errors of all prevalence estimates were adjusted using robust variance estimates [12], as implemented in the svy commands in Stata [13], to account for between-cluster variation of responses, using household or cluster as the primary sampling unit when presenting cluster-level or overall prevalence, respectively. To examine the effect of age on MOI and prevalence, age data were split into 0–1 year olds, 1–5 year olds and 5–15 year olds. The non-parametric Spearman’s Rho test was used to measure associations at an individual and cluster level.
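As an illustration of the agreement and accuracy measures named above, the following sketch computes sensitivity, specificity and Cohen’s kappa from 2 × 2 counts. The analysis itself was run in Stata; the function names and all counts below are hypothetical and are not taken from the study data.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity of a test against a gold standard.

    tp/fn: gold-standard positives that the test calls positive/negative.
    fp/tn: gold-standard negatives that the test calls positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def cohens_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two binary tests A and B on the same individuals."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n
    b_pos = (both_pos + only_b) / n
    # Agreement expected by chance if the two tests were independent
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: test result versus PCR as the gold standard
sens, spec = sensitivity_specificity(tp=40, fp=10, fn=60, tn=890)

# Hypothetical counts: agreement between two tests (A and B)
kappa = cohens_kappa(both_pos=45, only_a=5, only_b=5, both_neg=45)
print(round(sens, 3), round(spec, 3), round(kappa, 3))
```

Kappa corrects the observed agreement for the agreement expected by chance, which is why it is preferred to raw percentage agreement when prevalence is low.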

Genetic diversity and population structure

The multiplicity of infection (MOI) was defined as the number of different msp2 alleles detected in one sample. The expected heterozygosity index (He), which measures the diversity at a locus, was calculated using the Excel Microsatellite Toolkit software [14]. Heterozygosity can range from 0 to 1, where low heterozygosity indicates inbreeding or low population diversity, whilst high heterozygosity indicates the presence of genetic variability. Allelic richness, which is the mean number of alleles per locus normalized to allow for differences in sample size among populations, was calculated using FSTAT software. In addition, the fixation index (Wright’s F statistic), a measure of genetic differentiation between populations, was calculated using FSTAT allowing 55,000 permutations.
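The two simplest of these quantities can be sketched as follows. The heterozygosity function uses a commonly used sample-size-corrected form, He = n/(n − 1) × (1 − Σ p_i²), which is assumed here and may differ in detail from the calculation in the Excel Microsatellite Toolkit. Pooling every allele detected across samples to obtain allele frequencies is also an assumption, and the allele labels are hypothetical.

```python
from collections import Counter


def multiplicity_of_infection(alleles_in_sample):
    """MOI: number of different msp2 alleles detected in one sample."""
    return len(set(alleles_in_sample))


def expected_heterozygosity(allele_counts):
    """Sample-size-corrected expected heterozygosity at one locus.

    allele_counts maps allele label -> number of times it was detected.
    """
    n = sum(allele_counts.values())
    if n < 2:
        raise ValueError("at least two allele observations are required")
    sum_sq_freq = sum((count / n) ** 2 for count in allele_counts.values())
    return (n / (n - 1)) * (1 - sum_sq_freq)


# Hypothetical genotyping results: one list of called alleles per sample
samples = [["A", "B"], ["A"], ["C", "A"]]

moi_per_sample = [multiplicity_of_infection(s) for s in samples]
mean_moi = sum(moi_per_sample) / len(moi_per_sample)

# Assumption: allele frequencies are estimated by pooling all detected alleles
pooled = Counter(allele for s in samples for allele in set(s))
he = expected_heterozygosity(pooled)
print(moi_per_sample, round(mean_moi, 2), round(he, 2))
```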

Serological analyses

Seroprevalence to P. falciparum and P. vivax antigens was modelled against age using a simple reversible catalytic model as previously described [15]. In order to examine whether a change in transmission intensity had occurred, profile likelihood plots were generated to see whether fitting a model with two seroconversion rates may be more appropriate than a model with a single seroconversion rate [16]. A likelihood ratio test was used to determine which model fitted the data best.
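A minimal sketch of the two models being compared is given below, assuming the standard form of the reversible catalytic model, in which seronegative individuals seroconvert at rate λ and seropositive individuals serorevert at a rate denoted here by rho (a symbol not used in the text). The parameter values are hypothetical, and the fitting and likelihood ratio test are not reproduced; only the predicted seroprevalence and a binomial log-likelihood are shown.

```python
import math


def seroprev_single(age, lam, rho):
    """Predicted seroprevalence at a given age with one seroconversion rate."""
    k = lam + rho
    return lam / k * (1.0 - math.exp(-k * age))


def seroprev_two_rates(age, lam_old, lam_new, rho, change_years_ago):
    """Predicted seroprevalence when the rate changed some years before the survey.

    lam_old applied before the change point and lam_new after it.
    """
    if age <= change_years_ago:
        # Born after the change: only ever exposed to lam_new
        return seroprev_single(age, lam_new, rho)
    k_new = lam_new + rho
    # Seroprevalence reached under lam_old by the time of the change
    p_at_change = seroprev_single(age - change_years_ago, lam_old, rho)
    decay = math.exp(-k_new * change_years_ago)
    return p_at_change * decay + lam_new / k_new * (1.0 - decay)


def binomial_log_likelihood(predict, data):
    """Log-likelihood (up to a constant) of (age, positives, tested) data."""
    total = 0.0
    for age, positives, tested in data:
        p = predict(age)
        total += positives * math.log(p) + (tested - positives) * math.log(1.0 - p)
    return total


# Hypothetical age-grouped data: (median age, seropositive, tested)
data = [(2, 10, 100), (6, 30, 100), (12, 60, 100)]
ll_one = binomial_log_likelihood(lambda a: seroprev_single(a, 0.08, 0.05), data)
ll_two = binomial_log_likelihood(
    lambda a: seroprev_two_rates(a, 0.30, 0.06, 0.05, 9), data
)
print(round(ll_one, 2), round(ll_two, 2))
```

With equal rates before and after the change point, the two-rate function reduces to the single-rate model, which is what makes a likelihood ratio comparison between the two models meaningful.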

Ethics approval

Ethics approval was obtained from the National Committee for Health Research, Ministry of Health, Yemen. Informed consent was obtained from all participants or their guardians, prior to the collection of samples.

Results

Study population

A total of 2261 filter paper samples were collected, with an average of 185 children per cluster (range 163–216) aged <16 years. The median age was 6 years (interquartile range: 3–9) and 53 % were female. A total of 1341 individuals (60 % of the study population) were classified as living within 500 m of the wadi.

Plasmodium infections detected by RDT and microscopy

Plasmodium falciparum prevalence was 12.4 % (10.3–14.6) (n = 281) and 11.1 % (9.2–13.1) (n = 237) by RDT and microscopy, respectively, with a Cohen’s kappa of 0.83. Of the 281 RDT positive individuals, 62 were not positive by microscopy, whilst of 237 microscopy P. falciparum positive individuals, 16 were not positive by RDT. In addition, microscopy identified 19 P. malariae infections (overall prevalence of 0.8 %).

Plasmodium infections detected by PCR

PCR results were successfully obtained for 2181 samples. Of these, 19.6 % (95 % CI 17.3–21.9) (n = 430) were positive for P. falciparum. Further analysis for malaria species was performed in three clusters (clusters 5, 10 and 12) in which four P. vivax infections were found [all in cluster 12, (2.2 %)] and 17 P. malariae infections were found [in clusters 5 (2 %), 10 (6 %) and 12 (0.6 %)].

Prevalence of low-density malaria infections and spatial heterogeneity

Using P. falciparum PCR as the gold standard, RDT had a sensitivity of 45.8 % (41.0–50.6) and a specificity of 95.4 % (94.3–96.3), and P. falciparum microscopy had a sensitivity of 41.6 % (36.9–46.4) and a specificity of 97.1 % (96.3–97.9). Sensitivity for both diagnostic methods varied substantially by cluster, ranging from 0 % in cluster 1 to 78.3 % in cluster 4. In general, sensitivity was higher in clusters of higher prevalence, suggesting that more of the infections were high density in these areas (Fig. 2). Approximately 45 % of P. falciparum infections present were not detected by RDT and microscopy. This did not vary by age group. Only 5 of 11 microscopy-positive P. malariae infections that were tested with PCR were confirmed as positive.

Fig. 2 Proportion of patent (detectable by a RDT and b microscopy) and sub-patent (detected by PCR) P. falciparum infections by cluster

As with the other parasite prevalence measures, parasite prevalence by PCR varied substantially by cluster (Figs. 2, 3). The highest prevalence was detected in cluster 4 (46.0 %, 92/200), with cluster 2 also being high (37.6 %, 78/206). The lowest prevalence was in clusters 1 and 9, where prevalence was 2.2 % (4/185 and 4/182, respectively) (Table 2). Parasite prevalence by PCR was higher in clusters located on the wadi (26.8 %, 355/1325) than in clusters located >500 m from the wadi (8.5 %, 72/850; p < 0.001). This difference was even more pronounced with RDT or microscopy parasite measures (19.9 and 1.6 % for RDT and 16.8 and 1.3 % for microscopy) (Fig. 4).

Fig. 3 Prevalence for a RDT b Microscopy c PCR and d seroprevalence by cluster

Table 2 P. falciparum prevalence as measured by RDT, microscopy, PCR and antibodies to either P. falciparum MSP-1 or AMA-1, by cluster
Fig. 4 Proportion of patent (detectable by a RDT and b microscopy) and sub-patent (detected by PCR) P. falciparum infections by distance to the wadi

Seroprevalence to P. falciparum antigens

Seroprevalence for P. falciparum antigens was 23.1 % (n = 453), 21.2 % (n = 471) and 31.8 % (n = 729) for MSP-1₁₉, AMA-1, or either antigen, respectively. As with the parasitological results, seroprevalence ranged substantially between clusters, with clusters 2 and 5 having the highest prevalence (58.5 %, 121/216 and 56.4 %, 110/195 for either antigen, respectively) whilst clusters 1 and 11 were low (9.2 %, 17/185 and 3.5 %, 6/178 for either antigen, respectively) (Table 2).

Seroprevalence to P. vivax antigens

Whilst P. vivax-specific PCR was only performed on samples from three clusters, all samples were tested against P. vivax MSP-1₁₉ and AMA-1 antigens to look for P. vivax-specific antibodies. Seroprevalence was 1.4 % (n = 30), 2.2 % (n = 47) and 3.2 % (n = 75) for MSP-1₁₉, AMA-1, or either antigen, respectively. None of the four P. vivax infections detected by PCR were antibody positive to either antigen. No children under 1 year of age were seropositive for any P. vivax antigens, with the highest seroprevalence detected in 12–15 year olds (5.7 %). As with the other prevalence measures, seroprevalence to P. vivax antigens was higher in the clusters located closest to the wadi (4.5 %, 60/1341) than in those located further away (1.7 %, 15/880). Sixty-five percent (n = 49) of P. vivax seropositive samples were also positive for P. falciparum antibodies.

Historical patterns of transmission elucidated from serological data

Age seroprevalence curves for responses to either P. falciparum antigen were modelled using a simple reversible catalytic model for clusters within 500 m of the wadi and for those further away. The seroconversion rate in the clusters on the wadi was double that of the ones further away (λ 0.22 and λ 0.11 respectively, equivalent to EIRs of approximately 15 and 3 [15]). Profile likelihood analysis showed no evidence for a recent change in transmission in the low transmission clusters. However, in clusters close to the wadi, a model with a change in force of infection 9 years previously fitted the data best (p < 0.001) (Fig. 5). In these clusters, seroconversion rates for children under the age of nine were approximately half those of children over the age of nine (λ 0.33 and λ 0.74 in children under and over nine, respectively).

Fig. 5 Seroprevalence curves to either P. falciparum antigen for a people living within 500 m of the wadi and b further than 500 m from the wadi. Black circles represent actual data, plotted at median percentiles for age, and the black line represents the results from a reversible catalytic model. In the clusters close to the wadi, there was evidence for a change in transmission 9 years prior to the survey; this dataset has been fitted allowing for two lambdas

Genetic diversity of Plasmodium falciparum infections

msp2 markers were amplified from 339 (79 %) of the P. falciparum positive samples (N = 430). A total of 84 different msp2 alleles were successfully genotyped, ranging in size from 205 to 609 base pairs. Of these, 62 % (n = 52) and 38 % (n = 32) belonged to the 3D7 and Fc27 families, respectively (Fig. 6). The frequency of individual 3D7 alleles was low: all had a frequency below 5 %. Of the genotyped infections, 63 % (n = 211) were made up of two or more alleles (range 1–10). The mean MOI was 2.3, with the highest MOI detected in cluster 5 (3.1) (Table 3). The MOI decreased slightly with age group, although this was not significant. Nor did MOI differ significantly between clusters close to and far from the wadi, or between individual clusters; all clusters had a number of infections consisting of more than one allele. The estimated heterozygosity was above 0.9 in all clusters apart from cluster 11 (0.75) (Table 3). Allelic richness was 11.7 and ranged from 3.9 to 11.6 by cluster. The overall fixation index for the 12 clusters was 0.029, which is suggestive of little genetic differentiation and frequent gene flow between all clusters.

Fig. 6 Frequency of a 3D7 and b Fc27 alleles in P. falciparum samples, Yemen

Table 3 Multiplicity of infection (MOI), allelic richness and heterozygosity at the msp2 alleles in P. falciparum infections in Yemen

Correlations between metrics of transmission

At a cluster level, the parasite prevalence measures were strongly associated with P. falciparum seroprevalence (ρ 0.94 p < 0.001, ρ 0.86 p < 0.001 and ρ 0.90 p < 0.001 for RDT, microscopy and PCR, respectively). All P. falciparum prevalence and seroprevalence measures were strongly correlated with allelic richness (ρ 0.90 p < 0.001, ρ 0.85 p = 0.001, ρ 0.85 p = 0.001 and ρ 0.96 p < 0.001 for RDT, microscopy, PCR and seroprevalence, respectively) and expected heterozygosity (ρ 0.85 p < 0.001, ρ 0.83 p = 0.002, ρ 0.75 p = 0.008 and ρ 0.91 p < 0.001). As previously described, expected heterozygosity was strongly correlated with allelic richness (ρ 0.97 p < 0.001). Interestingly, cluster MOI was not significantly associated with any other measures. Correlations are shown in Additional file 1.
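For readers unfamiliar with the statistic, the cluster-level Spearman correlation can be sketched as the Pearson correlation of ranks, with tied values given their average rank. The analysis itself was run in Stata; the function names and the five cluster values below are hypothetical and are not taken from Table 2.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical cluster-level values (percent), one entry per cluster
pcr_prevalence = [3.0, 11.0, 19.0, 35.0, 44.0]
seroprevalence = [8.0, 16.0, 12.0, 55.0, 49.0]
rho = spearman_rho(pcr_prevalence, seroprevalence)
print(round(rho, 2))
```

Because the statistic depends only on ranks, it is robust to the skewed distribution of cluster prevalence, although with only 12 clusters individual estimates remain imprecise.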

Discussion

Malaria remains a significant public health problem in Yemen. This study demonstrates the wide range of malaria infection prevalence and exposure in communities separated by short distances in a single valley in Western Yemen using a variety of metrics. Parasite prevalence was notably high in some clusters (up to 46 %), although heterogeneity was marked. A large proportion of sub-patent infections were detected, indicating that diagnosis using RDTs or microscopy would leave a significant number of infections untreated. Moreover, the highly diverse nature of the parasite population in this study area is suggestive of a high degree of transmissibility and may pose a barrier for malaria control efforts in the region.

The marked heterogeneity of infection prevalence in this setting is surprising given the short distance between the clusters sampled: for example, prevalence was 2 % in cluster 1 and 46 % in cluster 4, although the two are only 2 km apart (Fig. 3). Heterogeneity in infection has been reported in many settings [17, 18], although rarely to the same extreme as seen in this setting over such short distances [19, 20]. Variation in infection prevalence can be due to variability in risk factors such as use of interventions, poor quality housing or location relative to mosquito breeding sites. A clear risk factor in this setting was the wadi, with clusters located within 500 m of the wadi demonstrating consistently higher prevalence. The riverine environment is likely to be a breeding site for the primary malaria vector in the area, An. arabiensis [21], and one explanation for the high heterogeneity is that the range of the mosquitoes is relatively restricted and that there is limited mosquito movement between clusters. A recent study conducted in different districts in Yemen also found a range of parasite prevalences, although over a wider distance than is reported here [22]. In that study, house structure and occupation were most likely to influence prevalence. This information was not collected in the current study, which limits the conclusions that can be drawn from the data collected.

The high proportion of sub-patent infections poses challenges for malaria control and elimination efforts. This study observed discordance between microscopy (which is the main diagnostic in Yemen) and RDT, as well as approximately 45 % of infections being missed by these methods. Whilst low parasite density infections are likely to be less infectious to mosquitoes [23], they still represent a risk for onward transmission and such a large reservoir is likely to play a role in transmission [24]. The proportion of sub-patent infections was higher further from the wadi, in the lower transmission clusters; 82 % of infections were sub-patent in the clusters further from the wadi, compared to 38 % in those close to the wadi. This is consistent with observations from a large-scale comparison of PCR and RDT/microscopy, which found the highest levels of sub-patent infections in low transmission settings [24]. As molecular diagnostics have become more sensitive, it has become clear that low-density infections are often present in high numbers, even in areas of relatively low transmission intensity [23, 25, 26]. The duration of these low-density infections, which are often asymptomatic, has been shown to be much longer than previously thought [27], although data are relatively sparse. Chronic, low-density infections are a barrier for malaria elimination in areas where Anopheles mosquitoes are still present.

The high diversity of the parasite population in this region was surprising, although it is consistent with other studies from Yemen [28, 29]. Previous molecular studies have noted that a highly diverse parasite population tends to be associated with higher malaria transmission [30]. In this study area, where there is a range of transmission intensity, the genetic diversity was high, even in lower transmission clusters. This apparent discordance may occur in areas where transmission has only recently reduced, such that there is only a subtle change in diversity [31], or when a change has not yet become detectable [32, 33]. However, the serological results suggest that transmission has been consistently low in the clusters further from the wadi, with no evidence for a recent change in transmission. High heterozygosity was also detected in nearby areas in a previous study [29] and is suggestive of a large parasite population circulating in Yemen. This could be influenced by population movement in the region, with the high level of parasite diversity in this study being similar to that seen in highly endemic African populations [10, 34].

In this study, no association was found between MOI and transmission intensity. Whilst many studies have shown that MOI is higher in areas of greater transmission intensity [10, 20], other studies have demonstrated no correlation [35]. It is possible that the timing of the survey influenced this finding: a study in Senegal showed that MOI increased during the transmission season [36].

Serological data indicate a significant difference in exposure to infection between children born 9 years previously compared with those born more recently, with older children experiencing approximately double the transmission rate. This effect is most apparent in the clusters located within 500 m of the wadi. This difference may be due to an age-related behavioural pattern, for example if children over ten spend more time outdoors and/or in high-risk areas, as has been observed in South East Asia [37, 38] and South America [39, 40].

Alternatively, this difference could be indicative of a fall in transmission 9 years previously, in this case plausibly reflecting the enhanced efforts of the malaria control programme, initiated in 2002. However, the fact that samples were only collected from those <16 years means it is difficult to fully reconstruct historical transmission patterns using serological data, and it is possible a different pattern would have been apparent if samples from all ages had been examined.

Although fewer samples were tested for P. vivax and P. malariae parasites using PCR, the data suggest that P. vivax prevalence is extremely low, with no infections detected by microscopy, in line with the national control programme estimates [3]. Seroprevalence to P. vivax antigens was below 10 % in all clusters, with only a small number of children with high antibody responses to either P. vivax antigen. Several P. malariae infections were identified in the clusters that were tested. This species has also been detected in previous molecular studies in Yemen [41]. The presence of multiple Plasmodium species has implications for diagnosis, with RDT and microscopy lacking sensitivity for species other than P. falciparum.

As well as being informative for transmission intensity levels, understanding the genetic diversity of parasites in Yemen may help elimination programmes in the region to accurately classify a case as imported or locally acquired. Such a diverse parasite population may pose challenges for the effectiveness of malaria vaccines, which need to target the most common allelic families present in a region [42]. As has been seen in other settings [20, 28], 3D7 alleles were more prevalent in this setting than the Fc27 family.

Conclusions

This study used a variety of metrics to identify extreme and relatively focal heterogeneity of malaria infection prevalence over small distances in the foothills of Yemen, including a high proportion of sub-patent infections. The parasite population was highly diverse, even in clusters of low prevalence, with transmission focused close to the wadi. Identifying risk factors for high prevalence, such as proximity to mosquito breeding sites or high proportions of the population travelling to high transmission areas, will enable intervention campaigns to target villages appropriately and may result in a more cost-effective approach to reduce transmission in Yemen.