Background

Malaria remains one of the biggest health problems within the tropical region despite the improvements in malaria control programmes at a global scale. There were approximately 212 million cases of infection and 429,000 malaria-related deaths in 2015; and more than 90% of these cases occurred in sub-Saharan Africa [1]. In Zimbabwe, malaria continues to be a major public health threat with an estimated over half of the population of 13.5 million at risk of contracting malaria [2]. However, by 2010, Zimbabwe had managed to reduce malaria incidence to 45 malaria cases per 1000 inhabitants per year thereby surpassing the Abuja 2010 set target of 68 cases per 1000 inhabitants [3]. Consistent with the national trend, Gwanda district located in the Matabeleland South Province in Zimbabwe has progressed to malaria pre-elimination phase. A malaria pre-elimination capacity assessment study conducted in Matabeleland South Province in 2011 reported malaria positivity rates of 8% and Anopheles larvae scoop of four for Gwanda district [4]. Malaria control in Gwanda district is mainly through indoor residual spraying (IRS), use of long-lasting insecticidal nets (LLINs) and larviciding [5].

Malaria transmission is heterogeneous at varying geographical scales even in the malaria pre-elimination zones [6]. Usually malaria endemicity levels, and especially in low incidence areas, malaria tends to cluster in ‘hotspots’ and ‘hot’ populations that become sources of continued infection [7]. Malaria hotspots have close spatial associations with vector-breeding habitats, and in certain ‘high-risk’ sub-sets of the population, having higher exposure to vector-breeding habitat: ‘hot-pops’ [8]. Active and timely identification of these hotspots and related factors is important for effective malaria control [7]. Malaria heterogeneity in time and space has also been attributed to risk factors, including altitude, climate, environmental parameters, and socio-economic factors [6, 7]. Rainfall, temperature and altitude are key factors in determining the habitat suitability of malaria vectors, including Anopheles arabiensis which is common in Zimbabwe, as well as determining malaria incidences [9,10,11,12]. In most tropical African countries, high habitat suitability of malaria vectors such as An. arabiensis mostly translates into increased malaria incidences [11]. Mabaso et al. [12] used a model to analyse the spatial and temporal role of climate in inter-annual variation of malaria incidence in Zimbabwe for the period 1988–1999 and their study demonstrated that mean values of temperature, rainfall and vapour pressure are strong predictors of malaria incidence. Gwitira et al. [13] also concluded that annual precipitation, precipitation of the wettest month, isothermality and temperature seasonality combined with altitude, are key predictors of An. arabiensis habitat suitability in Zimbabwe. In their study, the habitat suitability was significantly and positively correlated with recorded malaria incidences. Based on their observations they inferred that high malaria cases would be expected in areas of high vector habitat suitability. However, they also noted the need to consider malaria interventions in the study region in order to draw meaningful conclusions about the relationship between mosquito habitat suitability and malaria incidence.

Accurately assessing the local risk of transmission is fundamental for the development of malaria control programmes. Differences in malaria transmission exist, not just between different regions but also at local level [14,15,16]. As Gwanda district is moving towards malaria elimination phase a better understanding of the distribution of malaria cases/incidence at local scale is essential. In this regard, it is critical to understand the key factors for determining variability of malaria cases at micro-spatial scale for improving malaria control strategies in cases of relapses or outbreaks.

An exploratory study to investigate the relationship between the above-mentioned factors (rainfall, temperature, altitude, and other factors, including vegetation cover and wetness) at local scale is required. This paper reports on the spatial distribution of malaria incidence in 2015 based on health facility cases and related risk factors, in order to strengthen control measures in the pre-elimination phase of Gwanda district, Matabeleland South Province, Zimbabwe.

Methods

Study area

Zimbabwe experiences seasonal and spatial variation in malaria transmission that is related to the country’s rainfall pattern [12]. The malaria peak transmission season in Zimbabwe is between February and April. This study was conducted in Gwanda district, Matabeleland South Province, Zimbabwe (Fig. 1). Gwanda district receives annual rainfall of lower than 700 mm, and that has necessitated the construction of dams and establishment of irrigation schemes to improve peoples’ livelihoods as many of them rely on subsistence agriculture. However, this has also resulted in increased exposure and vulnerability of people to malaria. Gwanda recorded the second highest incidence of malaria and mortality due to malaria in Matabeleland South Province in the period 2009–2013 [5].

Fig. 1
figure 1

Gwanda district, Matabeleland South Province, Zimbabwe

Malaria control measures in Gwanda district include IRS; the use of LLINs and larviciding [5]. The planning of these control measures is done at district level and implemented at ward level. A ward is an administrative area under a district that consists of an average of 10 villages with each village comprised of an average of 100 households [5]. Gwanda has 24 and 10 rural and urban wards, respectively. Wards implement different malaria control programmes and also experience environmental conditions that could determine the spatial and distribution of malaria incidence/cases. The district implements IRS in some wards and the LLNs in other wards. This is a provincial decision where the areas with higher prevalence use IRS while those with lower prevalence use LLNs. The idea is to finally stop the use of IRS and focus on personal protection as the prevalence goes down. Given the variations across wards there is need to understand the transmission dynamics at ward level to improve the overall control programme to reach the elimination phase.

Data collection

Mapping malaria in low-transmission settings is a challenge due to low incidence and malaria transmission tends to concentrate in hot spots and hot pops (populations that maintain malaria transmission) [17]. Data from health facilities (passive surveillance) provide a means of detecting hot spots of transmission [18, 19]. In this regard malaria cases recorded at health facilities in Gwanda district for 2015 were considered in the analysis in this study. Cases and their corresponding Global positioning systems (GPS) coordinates for the households were obtained from District Health Information System version 2 (DHIS2) through Ministry of Health and Child Care, Zimbabwe. These cases from health facilities had been parasitologically confirmed through rapid diagnostic tests (RDTs) or microscopy, and were complete both spatially and temporally. The DHIS2 system was implemented in mid-2014, hence 2015 had a complete dataset by the time of data collection (August 2016). The estimated total population per ward was used to calculate the incidence per ward [20].

The environmental/ecological/climatic/topographic factors considered include rainfall, temperature, wetness, vegetation cover. Table 1 shows data on these variables indicating the sources. These variables were accessed and processed (including the calculation of averages per ward) through the IRI data library portal [21]. The seasonal variation and within ward heterogeneity were not considered in this study.

Table 1 Source of environmental data used at ward level for 2015 in Gwanda district, Matabeleland South Province, Zimbabwe

Malaria control measures considered in this study include the coverage of LLINs and IRS. The data on malaria control measures per ward in Gwanda district and the extent of coverage were obtained from Ministry of Health and Child Care, Zimbabwe. The data were presented as percentage coverage or percentage population protected (%). It was also considered whether the ward was rural or urban.

Spatial analysis

The Poisson purely spatial model [22] was used to determine the malaria hot spots/clusters and the geographically weighted Poisson regression (GWPR) model [23] to examine the potential role of environmental variables and malaria control strategies on the spatial patterns of malaria cases recorded at health facilities and aggregated to a ward level. The GPS coordinates from DHIS2 were plotted and overlaid on the Gwanda ward boundaries to obtain the number of cases per ward. Malaria incidence was calculated as number of cases per ward divided by the total population per ward multiplied by 1000. The Poisson purely spatial model in SatScan spatial scan statistic [22] was used to detect the malaria hot spots or clusters. In this model, the numbers of malaria cases follow a heterogeneous Poisson process [23, 24]. The spatial scan statistic is a local cluster test [24] which has been widely used in spatial-epidemiological studies to detect local clusters with statistically significant elevated risk of infectious diseases [24,25,26,27]. The input data for this model included the number of malaria cases per ward, the approximate total population per ward, and the centroid coordinates for each ward. Considering that Gwanda is in the pre-elimination phase characterized by low number of cases, the maximum population at risk was set not to exceed 2% of the total population. This improves the precision in detecting local clusters since the default value of 50% of the population at risk is more likely to produce clusters of no practical use [25, 28].

The GWPR model [23] in GWR4.09 package (GWPR development team) was used to explore the local variation of malaria cases in relation to the environmental factors and coverage of control measures at local scale (ward level). Exploratory disease mapping and local cluster tests have been used for identifying areas with statistically significant high risks—hot spots or clusters. GWPR has proved to be very effective for measuring the spatially varying association between possible factors and disease risk. This allows the development of disease control measures targeting specific population groups that are most at risk in specific locations across the landscape [25, 29,30,31].

In this study it was assumed that the numbers of malaria cases in each ward follow independent Poisson distributions. Hence, Y i denote the number of malaria cases observed in ward i in Gwanda district, i = 1,…, 34. GWPR allows the examination of the spatially varying coefficients over space and the semiparametric GWPR uses the local and global terms. The GWPR and semiparametric GWPR models are defined in Eqs. 1 and 2, respectively:

$$ y_{i} \sim Poisson\left( {N_{i} \exp \left( {\mathop \sum \limits_{k} \beta_{k} (u_{i} ,v_{i} )x_{i,k} } \right)} \right) $$
(1)
$$ y_{i} \sim Poisson\left( {N_{i} \exp \left( {\mathop \sum \limits_{k} \beta_{k} (u_{i} ,v_{i} )x_{i,k} + \mathop \sum \limits_{l} \gamma_{l} z_{l,i} } \right)} \right) $$
(2)

where y i , x i,k , and N i denote, respectively, dependent variable (the total number of malaria cases), kth independent variable including the constant term and the offset variable [population size at risk (ward population) in the ward i]. (u i , v i ) is the geographic coordinate of the centroid of the ith ward (the location of i). The coefficients β k (u i , v i ) are assumed to be smoothly varying conditional on their location. z l,i is the lth independent variable with a fixed coefficient γ l .

Three models were computed: (1) a global model that assumes that the process accounting for the disease is spatially constant throughout the study area; (2) a local model that assumed all environmental factors and control measures vary locally; and, (3) a model including local and global variables. The adaptive bi-square kernel for geographically weighting was used in this study. This is suitable for clarifying local extends for model fitting and keeping constant the number of areas to be included in the kernel [32]. The adaptive bi-square kernel is also more suitable for when one seeks a definitive local extent for model fitting [33]. The golden search was used as a bandwidth selection method. Considering that the sample size was small, the minimum and maximum bandwidth ranges were defined as 30 and 34 respectively in GWR4.09. The golden search helps to automatically reach or obtain the optimum bandwidth size based on the Akaike information criterion (AICc) [34].

The local model computed a spatial weights matrix based on the assumption of spatial autocorrelation. This means that the individual observation was influenced by the surrounding observations and the extent of this influence was inversely related to distance [35]. The local model provided locally varying parameter estimates, standard errors, as well as the respective pseudo t values as described by Nakaya et al. [23]. A pseudo t value less than − 1.96 or greater than + 1.96 indicates a p value  <  0.05. The weighting function, the bandwidth size of the matrix and the best model regarding different independent variable sub-sets were based on the small sample size bias corrected AICc comparison as suggested by Alves et al. [33]. The smaller AICc indicates a better model performance. If the difference between AICc is larger than 2, the model with lower AICc is selected [36]. The coefficients and the percent deviance explained (measure of local goodness-of-fit and is a type of pseudo-R2 [37]) were only showed for the best performing model based on the AICc. These coefficients were mapped using QGIS 2.2.0. The wards with significant coefficients were also shown on each local coefficient map.

Statistical analysis

Before performing the GWPR model, correlation analyses was conducted in Excel 2007 to ensure that the variables (Table 2) were not highly correlated. Strong multicollinearity can impair the model and produce artificial and erroneous effects [38]. The variance inflation factor (VIF) was used as an indicator of multicollinearity. Generally, VIF value of greater than 4–10 is regarded as severe multicollinearity [39,40,41]. Hence, the lower limit (> 4) was used as a cut-off in this study as also used by Ehlkes et al. [35].

Table 2 Characteristics of malaria clusters for year 2015 in Gwanda District, Matabeleland South Province, Zimbabwe

Results

Distribution of malaria in 2015

A total of 75 cases in a total population of approximately 137,105 in Gwanda was recorded in 2015. This translates to overall annual malaria incidence of 0.547 per 1000 inhabitants. The cases were distributed in the wards as shown in Fig. 2. The incidence per ward ranged from 0 to 2.59 per 1000 inhabitants (Fig. 3). Two significant malaria clusters for year 2015 were detected in Gwanda district. The properties of these clusters (Fig. 3) are shown in Table 2.

Fig. 2
figure 2

Malaria cases per ward in Gwanda district, Matabeleland South Province, Zimbabwe (2015)

Fig. 3
figure 3

Malaria incidence and clusters per ward in Gwanda district, Matabeleland South Province in Zimbabwe (2015)

The clusters were detected in both rural and urban wards; cluster 1 (ward 24) and cluster 2 (ward 9), respectively. Rural cluster had higher relative risk compared to the urban cluster (Table 2). The coefficients of the variables from the s-GWPR model in rural and urban areas including the clusters are shown in Fig. 4.

Fig. 4
figure 4

Variation of local coefficient estimates from GWPR a intercept, b Normalized Difference Vegetation Index (NDVI), c minimum temperature, d Normalized Difference Water Index (NDWI), e altitude in Gwanda district, Matabeleland South Province, Zimbabwe (2015)

Local variation of exploratory variables in determining the heterogeneity of malaria incidence

The semiparametric-GWPR model with mixed variables (after local to global) had an AICc of 70.882 and a percent deviance explained of 0.529. This model used 34 (all samples) as the optimum bandwidth based on the adaptive kernel and golden search for bandwidth selection. This model performed better than other models (Table 3) and was further used to explore the local variation of the exploratory variables in relation to malaria cases.

Table 3 Comparison of model performances based on AICc

After local to global variable selection, only three variables were moved to global. These variables include LLINs, IRS and rural or urban with the following estimated coefficients: 0.844, − 0.171, and − 0.873, respectively. The influence of these variables did not vary across the study area and remained in the models as global terms. The other exploratory variables were varying at ward level (Fig. 4); the summary of their coefficients is shown in Table 4.

Table 4 Summary of the coefficients of the locally varying variables based on the best s-GWPR model

The semiparametric-GWPR captured the spatially non-stationary relationships between malaria cases and some of the exploratory variables (minimum temperature, Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), and altitude) at the ward level. The coefficients of these variables and the intercept in the GWPR varied spatially, suggesting the effects of the exploratory variables were different between wards. Gwanda district is not entirely homogeneous as it was shown by wide ranges of estimated coefficients for NDWI, NDVI, minimum temperature and altitude. NDWI (proxy for surface water or wetness) estimated coefficients showed positive association with malaria cases per ward and was significant in the rural wards located to the southern and southeastern part of the study area. Altitude and minimum temperature estimated coefficients ranged from negative to positive but not significant. The estimated NDVI coefficients were negative over the whole study area and significant in all urban wards and most of the rural wards. The intercept coefficients were negative and significant across the whole study area.

Multicollinearity

The VIF for all the variables was ranging from 0.822 to 19.378. Three variables showed high correlation with other variables. These include rainfall, maximum temperature and EVI with VIF of 10.509, 19.378 and 4.929, respectively; hence these variables were not considered in the model.

Discussion

Results of this study have shown that the distribution of annual cases or incidence of malaria is heterogeneous even in the malaria pre-elimination zones, as observed in other studies [6,7,8]. Differences exist not just between different regions but also at local level [14,15,16]. Identifying the malaria clusters (areas with elevated number of cases) is critical in developing or improving malaria control strategies at local scale. In this study, clusters were detected in rural and urban areas. This indicates that malaria incidence could be high or above average in both rural and urban settings in Gwanda district. However the cluster had a higher relative risk compared to the urban cluster. These clusters may point to the areas that need immediate attention in terms of preparation and implementation of the disease control strategies.

Heterogeneity of malaria cases is driven by a variety of ecological, biological and sociological factors [6]. As noted by Ehlkes et al. [35] most studies assume homogeneous influence of exploratory variables [42,43,44,45] but this may not always be most appropriate [23]. In this study, the analysis showed that assuming that other variables vary at local level substantially improves the GWPR model performance. Allowing spatial heterogeneity within the regression model allows clearer interpretation regarding the true nature of potential associations [35]. That could be because the LLINs are mostly distributed to the wards that are known to have higher incidences in Gwanda district and there was limited data regarding the actual use of these nets. When assessing the associations between environmental variables and malaria cases one must consider the pathways in which these variables under study lie [35]. For example, the environmental variables: minimum temperature, NDVI, altitude, NDWI which influence the malaria cases were considered in this study as they determine the abundance of mosquito or their breeding habitats. Malaria control strategies (IRS and LLINS coverage) per ward were also considered in this study. These factors tend to reduce the cases of malaria. The interaction between these factors and malaria cases may bring out unexpected results, defying the norms regarding the relationship between environmental factors and malaria. It has been established that transmission potential decreases as the altitude increases [2, 46, 47]. This was also noted in this study as altitude showed its expected negative relationship with malaria cases in some of the wards but also showed positive relationship in the wards to the southern part of Gwanda district. However, it was not significant in any of the wards. NDWI coefficients showed the expected positive association with malaria cases and were significant in the rural wards located in south and southeastern part of Gwanda district. Altitude and minimum temperature estimated coefficients ranged from negative to positive over Gwanda district. This indicates that GWPR successfully captured the spatially non-stationary of these factors and how the global model can be misleading since averaging these local effects reveals a single impact assumed to hold across all regions [33]. The estimated NDVI coefficients were negative over the study area, which was against the expectation.

The weak positive and strong negative correlation coefficients between environmental factors and malaria incidences in some of the wards could be due to the protection effect of malaria control factors, such as vector control methods including LLINs and IRS. These malaria interventions contribute significantly to the decline in malaria cases particularly in areas progressing towards malaria elimination [47]. Gwitira et al. [13] also noted that in cases where there is effective malaria control, there will be weak correlations between habitat suitability and malaria cases. This was observed in this study based on the proxies for land cover (NDVI), wetness (NDWI). However, there is need to consider mapping the land use and land cover types and relate to malaria cases at micro-scale. Changes in land use and land cover have also been found to be critical in determining the survival of Anopheles malaria vectors [35, 48]. Significant land cover and land use changes may lead to increase in the abundance of malaria vectors and consequently increased malaria transmission.

Geographically weighted regression (GWR) has shown its ability to handle the socio-economic variables in relation to disease transmission. For example, the semiparametric-GWR (s-GWR) managed to detect schistosomiasis hot spots based on socio-economic and environmental factors at household level in Ndumo area, uMkhanyakude in South Africa [35]. Active surveys are required to capture the data on socio-economic factors, including housing structure and entomological data. Previous studies showed that malaria transmission tends to be higher in houses built with mud and thatch than those with asbestos or iron sheets and are built using cement [9]. All these factors influence the spatial and temporal distribution of malaria incidence. The assessment of the effect of malaria control measures may need to be done in spatio-temporal modelling. The temporal aspect would be able to show the decrease or increase in malaria cases in relation to control measures and environmental factors over time. There was also no information regarding larviciding for 2015 and 2016. In the pre-elimination and elimination phases, interventions have to be targeted to entire villages or towns with higher malaria incidence until only individual episodes of malaria remain and become the centre of attention [49].

This study has shown the feasibility of using passive surveillance data from health facilities to map malaria cases and detect clusters. The passively identified health facility cases reflected malaria transmission levels in places where malaria cases tend to cluster at ward level as also noted by Rulisa et al. [7]. These cases were parasitologically confirmed using RDTs or microscopy and were complete both spatially and temporally. The clinical malaria cases were previously used in trend analysis of malaria transmission in relation to climatic and environmental factors in Tubu village, Botswana [50]. However, these data are usually incomplete [51], and future studies may consider approaches that adjust for health facility utilization and under-reporting [52, 53]. Sturrock et al. [17] noted that mapping malaria in low transmission settings is a challenge, given that as incidence drops, transmission concentrates in hot spots and hot pops. It is challenging to identify hot pops operationally because only a sub-set of febrile individuals may seek treatment at formal health facilities [53]. In low malaria transmission settings, the treatment-seeking patterns may be determined by individual immune status [54]. For example, the onset of fever in low-immunity populations may lead to presentation at peripheral health centres, while populations highly exposed (hot pops) are less likely to seek treatment [55]. These malaria hot spots may serve to perpetuate residual malaria transmission in low-transmission seasons and hinder efforts to eliminate malaria [56]. Active and timely identification of these hot spots and associated risk factors through active surveys is essential for targeting interventions to optimize malaria control [49].

This study presents an analysis of a single year of cross-sectional data that was temporally aggregated, hence the temporal dimension was not considered. Using only 1 year of data may only reflect where cases occurred during that single year, but not temporally stable areas of high malaria risk. This ignores and masks any seasonal pattern, which is highly important in areas approaching elimination. However the results of this study will inform further exploratory studies in pre-elimination zones also considering the amount of data which will be collected through DHIS2. The analysis in this study was also restricted by the limited number of wards in Gwanda district which impacted on sample size. Ward is the smallest political administrative unit with reliable data on population size. This might have impacted on the analysis as Paez et al. [57] advised that GWR may not be used in cases with sample size less than 160 as the small sample size issue could result in inaccurate estimates in statistical modelling [57,58,59,60]. However, Li et al. [61], noted that after considering the spatial heterogeneity in the county-level data, with sample size less than 160, the GWPR outperformed the traditional generalised linear models (GLM) in predicting fatal crashes in individual counties [61]. Therefore this study contributes to other GWR studies on disease and risk factors, most of which show that global statistical models may produce misleading results [62]. GWPR models are capable of capturing the spatial heterogeneity of phenomenon compared to global estimates/models [63]. The local coefficient maps in this study also show the magnitude, significance and direction of the relationships between malaria cases and exploratory variables. For local planning, such as district or ward (as in this case), the local GWR models seem to be more appropriate, since global models may not capture local changes or variation [63].

Conclusions

This study has confirmed that malaria incidence is heterogeneous in low-transmission zones including those in pre-elimination phase. It explored local variations in the relationship between malaria cases and environmental and malaria control factors at ward level in Gwanda district, Matabeleland South Province, Zimbabwe. In 2015, Gwanda district had two significant clusters: ward 24 (rural) and 9 (urban). The NDVI and NDWI showed significant association with malaria cases in some of the wards in Gwanda district. Despite the tendency to underestimate malaria burden, routine data from health facilities are helpful in reflecting the spatial distribution of malaria, especially in low malaria incidence settings. The results of this study can be used in planning and implementation of malaria control strategies at both district and ward levels.