Background

In recent years there has been a renaissance in mapping malaria distribution at the national [1–4], continental [5, 6] and global scales [7]. These maps have adopted a variety of approaches and data sources in their construction. The most widely cited, contemporary continental resolution map of Plasmodium falciparum transmission distribution for Africa was developed by the Mapping Malaria Risk in Africa (MARA) collaboration [5] (http://www.mara.org.za). It has formed the basis for several reports by the Roll Back Malaria partnership [8, 9] and was used during several recent estimates of the pan-African public health malaria burden to identify populations at risk [10–12]. The MARA model used a fuzzy membership approach, assigning 5 × 5 km areas to a suitability estimate for stable P. falciparum transmission based upon simple rainfall and temperature determinants of the parasite's sporogonic development and mosquito survival. The model did not attempt to define intensities of transmission; rather it determined the likelihood that stable transmission could occur. Using the MARA model, areas of low stable and high stable endemicity across the continent have been distinguished categorically by assuming that the greater the climatic likelihood of stable transmission, the more likely areas would support higher intensities of malaria transmission [11, 12]. In this paper, the validity of these assumptions and the likely margins of error are examined by comparing MARA climate suitability values with empirical P. falciparum parasite prevalence survey data in Kenya.

Methods

Plasmodium falciparum prevalence survey data among children aged 0–15 years in Kenya

A search of published and unpublished literature related to malaria infection prevalence surveys in Kenya was conducted as part of the MARA project [13, 14]. In brief, electronic database searches were performed using Medline® (SilverPlatter International, Boston, MA, USA 2000), Popline® (Johns Hopkins School of Hygiene & Public Health, Baltimore, MD, USA, 2000) and EmBase® (Elsevier Science Little Rock, Arkansas, USA 1999–2000). The following keywords were used in the search: Kenya, malaria, parasite and malaria and Kenya, Plasmodium falciparum, parasite rate, parasite prevalence and malaria transmission. For each publication, bibliographies were cross-referenced to identify additional sources of information from other studies. Where additional details could not be identified through the published sources, authors were contacted to provide more information on geographical location, survey dates and age-specific characteristics of the parasitological data. Postgraduate theses held in the libraries of four departments of the University of Nairobi were also searched (Medical school campuses at the Kenyatta National Hospital and Chiromo, Community Health Department and the Faculty of Science). Annual reports, journals and conference proceedings of national medical research institutes and non-governmental organisations were reviewed at the respective institutes' libraries. Results of routine parasite prevalence surveys undertaken by the Ministry of Health's (MoH) Vector Control Department were manually searched in archives at national headquarters and at seven Provincial offices in Kenya.

Using a variety of sources: 1:50 000 scale topographic maps [15], digital maps of administrative units in Kenya [16] or public domain digital gazetteers [17, 18], a longitude and latitude was ascertained for each parasitological survey in decimal degrees. These geo-references were imported into a geographical information system platform ArcView 3.2 (ESRI, Redlands, CA, USA), mapped and overlaid on administrative boundary maps for Kenya [16]. The national administrative boundary maps were first warped within ArcView 3.2 to United Nations approved national boundaries [19] and then used to check for inconsistencies in spatial positioning and to define the spatial coverage of each survey (see below).

All surveys undertaken in Kenya (n = 923) were subjected to a number of selection criteria for inclusion in the analysis. First, to allow for a contemporary assessment of infection risk, surveys were only selected if they were undertaken between 1980 and 2003 (n = 657). The historical data (1927–1979) will be described elsewhere (Omumbo & Snow, in preparation). Second, surveys were excluded if the survey formed part of clinic visits, drug sensitivity testing or included intervention arms of controlled trials. Only total population, randomly sampled or longitudinal community-based surveys were included. Several community-based surveys, particularly those undertaken by the Ministry of Health, provided little detail on the sampling strategy used to select the childhood populations. These surveys were retained but were coded separately in the database to distinguish them from truly random selections should there be differences in estimates of parasite ratio dependent upon sampling methods (n = 74). Third, surveys were excluded if they covered infants only or an age range that extended into adulthood (≥15 years; n = 12). Fourth, a minimum survey sample size of 50 was imposed on the selection to allow for adequate precision in the estimates of infection prevalence [20], and surveys were excluded if there were no details of the denominator or numerator (n = 21). Finally, repeat surveys by the same investigators within a twenty-four month period were combined into a single estimate. Surveys undertaken by several investigators at different times in the same location were reduced to one estimate by selecting the most recent survey, or the one with the largest sample size.
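The selection criteria above can be read as a sequence of filters applied to each survey record. The following is a minimal sketch of that reading, not the procedure actually used: the `Survey` record layout and its field names are hypothetical, the upper-age rule is interpreted as keeping surveys whose oldest participants were no older than 15 years (in line with the 0–15 year range in the section title), and "infants only" is interpreted as an upper age of one year or less.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Survey:
    # Hypothetical record layout; the original database schema is not described.
    end_year: int
    community_based: bool          # False for clinic visits, drug trials, intervention arms
    max_age: float                 # upper age (years) of the sampled children
    n_examined: Optional[int]      # denominator, None if not reported
    n_positive: Optional[int]      # numerator, None if not reported


def meets_criteria(s: Survey) -> bool:
    """Apply the stated inclusion rules in the order given in the text."""
    if not (1980 <= s.end_year <= 2003):
        return False               # contemporary surveys only
    if not s.community_based:
        return False               # clinic, drug-sensitivity and trial-arm data excluded
    if s.max_age <= 1 or s.max_age > 15:
        return False               # infants only, or extending into adulthood (assumed cut-offs)
    if s.n_examined is None or s.n_positive is None:
        return False               # no numerator or denominator reported
    return s.n_examined >= 50      # minimum sample size
```

Combining repeat surveys and choosing between overlapping surveys at the same location would be separate steps after this filter.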

MARA Fuzzy Climate Suitability Index

The MARA model describes climatic conditions that range from unsuitable (0) to completely suitable (1) for stable P. falciparum transmission [5]. The Fuzzy Climate Suitability (FCS) index is defined by a series of curves

where x is a climate parameter, U is the value of x when conditions are unsuitable, and S is the value of x when conditions are suitable. When S is greater than U, the suitability (1 − y) increases with x; when S is less than U, the suitability y decreases as x increases. The model defines a monthly increasing curve (S = 22 °C, U = 18 °C) and decreasing curve (S = 22 °C, U = 40 °C) for mean diurnal air temperature, a monthly increasing curve (S = 80 mm, U = 0 mm) for rainfall, and a single increasing curve (S = 6 °C, U = 4 °C) for annual minimum temperature.
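To illustrate how a fuzzy suitability curve of this kind behaves, the sketch below maps a climate value onto the 0 to 1 scale. The curve equation is not reproduced in this text, so the smooth sine-squared ramp used here is an assumption, as is the function name `fuzzy_suitability`; only the U and S values in the usage example come from the description above.

```python
import math


def fuzzy_suitability(x: float, unsuitable: float, suitable: float) -> float:
    """Return a suitability between 0 (at U) and 1 (at S).

    Assumed form: a sine-squared ramp between U and S, held constant outside
    that interval. It works for increasing curves (S > U) and for decreasing
    curves (S < U) because the position of x is measured relative to U.
    """
    t = (x - unsuitable) / (suitable - unsuitable)   # 0 at U, 1 at S
    t = min(1.0, max(0.0, t))                        # clamp outside the ramp
    return math.sin(t * math.pi / 2) ** 2


# Increasing temperature curve from the text: U = 18 °C, S = 22 °C
print(fuzzy_suitability(18, 18, 22))   # unsuitable end of the ramp
print(fuzzy_suitability(20, 18, 22))   # midway between U and S
# Increasing rainfall curve from the text: U = 0 mm, S = 80 mm
print(fuzzy_suitability(100, 0, 80))   # beyond S, fully suitable
```

How the monthly temperature and rainfall curves are combined into a single FCS value is described by Craig et al. [5] and is not attempted here.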

The FCS values for each parasite ratio data point were extracted from the FCS model developed by Craig et al. [5]. To provide a spatially congruent FCS value for each parasite prevalence survey, surveys were classified as representing one of five spatial dimensions: First, for surveys representing a single village, the central longitude and latitude was used to create a 2.5 km buffer surrounding the village using ArcView 3.2 (ESRI, Redlands, CA, USA). The size of the buffer was defined by the average flight range of Anopheles gambiae s.l. [21]. Using the MILA Utilities 3.2 update extension within ArcView 3.2 http://www.esri.com/arcscripts, the average of all the 5 × 5 km pixel FCS values within this buffer was extracted to represent the average FCS value for the respective village parasite prevalence survey. The second spatial classification reflected surveys that sampled from several villages but presented the data as a single parasite ratio estimate (n = 5). In this case, a polygon was created to connect the villages and a 5 km buffer created around the polygon to represent the wider spatial sampling. Average FCS values within this buffered polygon were summarised. The third, fourth and fifth spatial criteria corresponded to surveys undertaken at the 5th (sub-location), 4th (location) or 3rd (division) administrative unit levels in Kenya. These spatially distinct polygons were created within ArcView 3.2 using a digital administrative map of Kenya [16] and a 2.5 km buffer was created around each polygon extent before extraction of the average FCS values.
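The single-village case described above, averaging the 5 × 5 km pixel values that fall within a 2.5 km buffer, can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the ArcView extension: coordinates are taken to be planar and in kilometres, a pixel is counted if any part of it intersects the buffer, and every counted pixel contributes equally. The function name `buffer_mean_fcs` and the toy grid are hypothetical.

```python
import math

PIXEL_KM = 5.0  # pixel size stated in the text


def buffer_mean_fcs(cells, centre, radius_km=2.5):
    """Average the FCS of all pixels that intersect a circular buffer.

    cells  : iterable of (x_centre_km, y_centre_km, fcs_value)
    centre : (x_km, y_km) of the village
    Returns None if no pixel intersects the buffer.
    """
    cx, cy = centre
    half = PIXEL_KM / 2
    values = []
    for x, y, fcs in cells:
        # Distance from the village to the nearest point of the square pixel
        dx = max(abs(cx - x) - half, 0.0)
        dy = max(abs(cy - y) - half, 0.0)
        if math.hypot(dx, dy) <= radius_km:
            values.append(fcs)
    return sum(values) / len(values) if values else None


# Toy 2 x 2 grid of pixels (hypothetical FCS values)
grid = [(2.5, 2.5, 0.2), (7.5, 2.5, 0.4), (2.5, 7.5, 0.6), (7.5, 7.5, 0.8)]
print(buffer_mean_fcs(grid, (5.0, 5.0)))  # village on the shared corner of all four pixels
print(buffer_mean_fcs(grid, (2.0, 2.0)))  # buffer lies within the first pixel only
```

The polygon-based cases would follow the same pattern with the circle replaced by a buffered polygon.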

Data entry and statistical methods

All parasite prevalence survey data were entered twice in Microsoft® Access version 7 (Microsoft Corporation 1989–1996; Seattle, Washington, USA). Data entry errors and range inconsistencies were checked and verified against the original material. Data were summarized for matching to other data sources using Microsoft® Excel 2000 version 9.0 and then analysed using SPSS (SPSS, v10.01, 1999, Chicago, Illinois, USA).

Data were then re-classified in accordance with criteria used to define three categorical limits of FCS and parasite ratio risk for burden of disease analysis for the African continent to allow for differences in disease and mortality risks between classes of malaria endemicity [11, 12] (Table 1). First were areas with an FCS value of zero. Second were areas where populations are exposed to marginal risks of malaria transmission and those communities able to support cross-sectional estimates of parasite prevalence below 25% (hypo- to meso-endemic malaria transmission). These areas were assumed to be represented by an FCS greater than zero but less than 0.75. Finally, there were areas of the continent described by an FCS of 0.75 or greater, which might support parasite prevalence rates of 25% and above (meso- through to holoendemic malaria).
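The three-way classification can be written down compactly. The sketch below is illustrative only; the function names and class labels are hypothetical, and the boundary value of 0.75 is assigned to the highest class, following the ≥0.75 notation used in the tables and results.

```python
def fcs_class(fcs: float, threshold: float = 0.75) -> str:
    """Classify a Fuzzy Climate Suitability value into one of three classes."""
    if fcs == 0:
        return "none"
    return "stable" if fcs >= threshold else "low"


def prevalence_class(prevalence_pct: float) -> str:
    """Classify a parasite prevalence (%) into the matching three classes."""
    if prevalence_pct == 0:
        return "none"
    return "stable" if prevalence_pct >= 25 else "low"


# A survey agrees with the model when both classifications coincide
print(fcs_class(0.6), prevalence_class(40))                 # low stable
print(fcs_class(0.6, threshold=0.6), prevalence_class(40))  # stable stable
```

Passing a different `threshold` allows alternative cut-offs for the highest FCS class to be compared.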

Table 1 Agreement between parasite prevalence categories among 217 surveys and spatially congruent categorical values of the Fuzzy Climate Suitability (FCS) using the 0.75 threshold for stable endemic malaria.

The significance of agreement between the three classes of parasite ratio and FCS was tested using the kappa (k) statistic [22], which is a measure of the agreement between two classifications discounting for the probability that the agreement could be due to chance. Values of kappa range from 0 (no agreement) to 1 (full agreement) and Landis & Koch [23] suggest the following interpretation of agreement defined by the kappa statistic: poor where k ≤ 0.4, good where 0.4 < k < 0.75 and excellent where kappa ≥0.75.
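As a worked illustration of the kappa statistic, the sketch below computes it from a square agreement table and applies the interpretation bands quoted above. The example table is hypothetical and is not the data of Table 1; the function names are also illustrative.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts (rows: method A, columns: method B)."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


def interpret(kappa: float) -> str:
    """Interpretation bands as quoted in the text."""
    if kappa <= 0.4:
        return "poor"
    return "good" if kappa < 0.75 else "excellent"


# Hypothetical 2 x 2 table: observed agreement 0.7, chance agreement 0.5
example = [[20, 5],
           [10, 15]]
k = cohens_kappa(example)
print(round(k, 3), interpret(k))   # 0.4 poor
```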

The simple correlation between the continuous measures of the FCS values and their respective estimates of the parasite prevalence was also tested. The continuous association was further tested using weighted least-squares regression with the model weighting the dependent variable (parasite ratio) for its precision using the Log10+1 of the standard error of the parasite ratio. Covariates in this model included the end-year of the study (pre-1990, 1990–94, and 1995–2002), the survey sampling methodology (random, longitudinal or total population surveys versus surveys without precise details) and the maximum age range covered in the sample (up to 5 years, up to 10 years and up to 15 years). The proportion of variation in the parasite ratio explained by each variable in the model was calculated by comparing overall proportions of explained variation (R²) between subsequent models with additional variables entered in a stepwise fashion allowing for all variables already in the model.
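A single-predictor weighted least-squares fit can be sketched in a few lines. This is an illustration of the general technique, not the SPSS model used in the analysis: it has one predictor and no covariates, the weights are supplied by the caller because the exact transformation of the standard error into weights is not spelled out here, and the function names are hypothetical. The binomial standard error of a prevalence estimate is included as the usual starting point for such weights.

```python
import math


def proportion_se(n_positive: int, n_examined: int) -> float:
    """Binomial standard error of a prevalence estimate."""
    p = n_positive / n_examined
    return math.sqrt(p * (1 - p) / n_examined)


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the weighted
    proportion of variation in y explained by x.
    """
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_mean) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1 - ss_res / ss_tot
```

Comparing the R² of nested fits, with and without an added covariate, gives the incremental proportion of variation explained by that covariate; doing so requires a multi-predictor extension of this sketch.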

Results

A total of 217 spatially independent parasite prevalence surveys were identified that met the selection criteria. Eleven were reported in peer-reviewed journals, 168 were unpublished Ministry of Health or NGO/bilateral/multilateral organisation reports, 16 were reported in doctoral or masters theses and 22 were provided as unpublished data by malaria scientists working in Kenya. Twenty-five percent (n = 54) of the surveys were undertaken before 1990. The median sample size in the childhood surveys was 220 (inter-quartile range 118, 430). Of the surveys, 14.7% (n = 32) were undertaken among age groups covering the 0–5 year age range; 61.8% (n = 134) spanned the age range 0–10 years; and 23.5% (n = 51) included children between the ages of 0 and 15 years. The majority of surveys were regarded as random, longitudinal or total population surveys (65.9%, n = 143). The distribution of the survey data against the categorical distinctions for the MARA FCS values is shown in Figure 1.

Figure 1

MARA Fuzzy Climate Suitability (FCS) values grouped into three categories in Kenya (light grey, zero FCS; light red, FCS > 0 and < 0.75; red, FCS ≥ 0.75), showing the distribution of selected parasite survey data points (black dots, n = 217).

Table 1 compares the categorical definitions of parasite prevalence with categories of the FCS values used during recent malaria burden estimations [11, 12]. Of the parasite surveys reporting a zero prevalence of infection, five out of eight had an FCS value greater than zero. One hundred and thirty-seven surveys reported parasite prevalence greater than or equal to 25% (i.e. stable endemic transmission); however, only 79 (58%) had an FCS value of ≥0.75. For surveys reporting a low parasite prevalence of 1–24% (n = 72), 58 (81%) were characterised by an FCS value >0 but less than 0.75. Overall agreement between the categories was poor, k = 0.367, although statistically significant (p < 0.0001). Table 2 reflects a change in the FCS categories around 0.6 to represent zero, 1–24% and ≥25% parasite prevalence. These criteria improved the levels of agreement with the parasite ratio categories (k = 0.442, p < 0.001) and 68% of surveys reporting a parasite prevalence ≥25% were described by an FCS value ≥0.6. Further changes in the FCS criteria made little difference to the levels of agreement.

Table 2 Agreement between parasite prevalence categories among 217 surveys and spatially congruent categorical values of the Fuzzy Climate Suitability (FCS) using the 0.6 threshold for stable endemic malaria.

Regarding the parasite prevalence and the FCS values as continuous measures produced a weak positive correlation (R² = 0.364, p < 0.0001; Figure 2). Inclusion of the covariates (upper age, year of survey and sampling method) did not improve the association, nor did weighting the parasite ratio by its standard error during weighted least-squares regression.

Figure 2

Plasmodium falciparum parasite prevalence (%) among children under the age of 15 years surveyed in 217 spatially independent surveys by Fuzzy Climate Suitability values extracted for the same spatial areas (R² = 0.364, p < 0.0001)

Discussion

The analyses presented in this paper compare empirical parasite prevalence data among children sampled in 217 communities across Kenya with a climate-driven model that estimates the likelihood of stable malaria transmission. The results of the study suggest that there is a significant linear, albeit weak, association between these two measures of P. falciparum transmission (Figures 2 and 3). This is perhaps not surprising as the parasite ratio and the FCS value represent two very different transmission criteria. The parasite ratio reflects the intensity of transmission and has been routinely used as a marker of endemicity in Africa since the 1950s [24]. The FCS is a representation of the rainfall and temperature determinants of the parasite and vector's ability to coexist and thus enable stable transmission [5]. Furthermore, the MARA FCS is based on climatological averages for the 1951–1995 period (although recent analyses suggest this to have been surprisingly stable over the last century [25]), whereas the parasite ratio was sampled in specific years. The potential confounding influence of timing on the parasite prevalence sample is an area of on-going research.

Figure 3

A box plot of the distribution of Plasmodium falciparum parasite prevalence (%) according to three FCS categories: 0, >0 to <0.75 and ≥0.75. The median (central line), 25% and 75% quartiles (box length) and upper and lower limits (bars) are shown.

The positive, albeit weak, correlation with measures of the intensity of transmission lends some support for the MARA model's ability to define populations at risk of differing intensities of malaria transmission. There were too few surveys reporting zero infection prevalence (n = 8) to argue whether the FCS model can correctly distinguish areas of no transmission; however, five areas reporting zero prevalence did have an FCS value greater than zero. More striking was the ability of the FCS categories >0 and <0.75 to correctly identify populations at low risk of malaria infection, with parasite prevalences of >0 to <25% (81%; Table 1). During estimations of malaria burden it has been assumed that populations residing in these areas experience much lower risks of malaria-specific morbidity and mortality compared to populations located in areas described by an FCS ≥0.75 [11, 12]. Conversely, areas described as supporting stable, endemic transmission with parasite prevalence ≥25% were less well described using the criterion FCS ≥0.75. Only 58% of the parasite prevalence surveys reporting a parasite ratio ≥25% were classified as having an FCS value ≥0.75; the remaining 42% of surveys were classified as having FCS values >0 but less than 0.75. Altering the FCS criterion to ≥0.6 for stable, endemic transmission improved the classification of parasite prevalence surveys in Kenya, with 68% of surveys correctly identified (Table 2).

Conclusions

The results of this study suggest that applying climate suitability modelled estimates of transmission potential to distinguishing populations at differing levels of malaria infection intensity has several limitations. First, it may not adequately distinguish populations at no risk of infection. Second, populations exposed to stable, endemic transmission may be poorly defined using criteria of ≥0.75 FCS. These results relate only to Kenya and similar validations are required in other settings in Africa. Nevertheless, for Kenya the disease and mortality burden of malaria will have been under-estimated using the criteria shown in Figure 1. The extent of higher intensity transmission resulting in higher malaria burdens could be wider and the extent of no risk might be more restricted. The model developed by Craig et al. [5] represents the only available continental scale map upon which to define populations at risk of P. falciparum infection. Our analysis suggests that, at a crude level, it does distinguish communities exposed to differing levels of malaria transmission intensity. Given the large number of assumptions made regarding the paucity of attribute morbidity and mortality data to define continental scale DALYs for malaria, defining the spatial extents of population denominators is only one part of the problem that is at least soluble. New malaria risk models are being developed which go beyond the climatic determinants of P. falciparum transmission and new iterations of malaria risk models will hopefully become more robust with the inclusion of new global scale data on population settlement, land use and ecology.