Introduction

Malaria is a major public health concern and remains one of the major focuses of the Sustainable Development Goals1. Efforts to reduce malaria incidence have yielded some positive results: reports show that incidence and mortality rates have decreased globally by 37% and 60%, respectively2. Much still needs to be done to further reduce the incidence of malaria, however, as over 200 million cases and more than 400,000 malaria-associated deaths were reported in 2017 alone3. Fifteen countries carry the highest burden of malaria, with 80% of deaths concentrated in these countries2. Sub-Saharan African countries account for 92% of the global malaria burden4. Nigeria is one of the most affected countries in the world, with an estimated 53.7 million cases and 79,800 deaths, corresponding to one-fourth and one-fifth of the global totals, respectively3.

A number of environmental and ecological factors are known to influence transmission of malaria. Chief among these are rainfall and temperature. Rainfall, especially when heavy, washes away many of the breeding sites of the mosquito vectors of malaria parasites, while temperature determines the duration of development of mosquito larvae in the environment and of parasite development within the vector5,6. Precipitation is directly related to rainfall and is an important factor that influences the bionomics of mosquito vectors of malaria parasites7. Modification of the environment caused by impoundment for dam construction and irrigation schemes can influence the type and distribution of mosquito breeding sites8.

The application of spatial analysis of risk factors, including environmental factors that aid transmission, is very important in the fight against several vector-borne diseases, including malaria. Three environmental variables, namely stream density, road density, and land surface, have been observed to be significantly associated with West Nile Virus using least squares regression (LSR) spatial analysis9. Visceral leishmaniasis transmission hotspots were also identified using global and local autocorrelation analyses10. In Bangladesh, spatial models showed the normalized difference vegetation index (NDVI) to be the best leading indicator of malaria incidence; vegetation greenness was negatively correlated with incidence of malaria11. Despite the burden of malaria in Nigeria, the use of spatial statistics to examine the interrelationship between incidence of malaria and prevailing environmental factors is still relatively understudied. This method is very useful in identifying disease hotspots within a specific region for possible intervention.

The identification of malaria transmission hotspots through the use of spatial statistics is important for targeted intervention because, if interventions are not targeted, residual malaria transmission is likely to persist in hotspots12. Studies in East and West African countries have supported the observation that malaria hotspots persist following an overall reduction in malaria transmission13,14. This can have serious implications for malaria control, as transmission hotspots may stall intervention programmes. A logical and viable control intervention will therefore focus more on malaria hotspots. Given the low availability of resources in many malaria-endemic regions, spatial analysis for identification of hotspots makes targeted control more cost-effective. This study therefore investigates the spatial and temporal variation in malaria incidence rates using the nationally representative Malaria Indicator Survey of 2015, which covered 326 clusters in the six geopolitical zones of Nigeria for the period 2000–2015. It is proposed that certain environmental factors significantly influence the incidence rate of malaria in Nigeria in space and time. To establish this, a number of exploratory and spatial statistical models were used. It is expected that the models will identify malaria transmission clusters in Nigeria for necessary interventions.

Results

Malaria incidence in Nigeria

The incidence rate of malaria is presented in Table 1. Generally, the incidence of malaria was higher in the Northern than in the Southern region of Nigeria. The incidence of malaria was significantly higher in the North Central region than in the rest of the country (P < 0.05). No significant variation in incidence of malaria was observed among the three southern geopolitical zones, namely the South East, South South and South West (P > 0.05). The malaria incidence rate in rural residential areas (0.430 ± 0.103) was significantly higher than in urban settings (0.368 ± 0.123) (P < 0.05) (Table 2).

Table 1 Malarial incidence rate in the six geopolitical zones of Nigeria.
Table 2 Distribution and incidence of malaria by residence type.

Between 2000 and 2005, there was no significant difference in the incidence rate of malaria in Nigeria. However, a significant drop in mean incidence from 0.436 ± 0.112 in 2005 to 0.377 ± 0.120 in 2010 was observed (Table 3). In 2015, a further significant decrease in mean incidence of malaria was observed (P < 0.05). The variations in the mean incidence rate of malaria and in the environmental factors that influence malaria incidence in the six geopolitical zones of Nigeria (2000–2015) are presented in Table 3. The mean incidence of malaria was consistently higher from 2000 to 2005 in all the geopolitical zones of Nigeria except the North West, where the incidence was significantly lower in 2005 (0.428 ± 0.092) than the value reported in 2010 (0.476 ± 0.113).

Table 3 Variations in mean malarial incidence rate and environmental factors in Nigeria (2000–2015).

Generally, Southern Nigeria recorded lower mean temperatures, ranging from 31.253 ± 0.209 °C in the South South in 2005 to 31.887 ± 0.412 °C in the South West in 2010. There was no specific pattern in mean temperature variation across the six regions of Nigeria. Aridity generally increased in the Northern part of Nigeria from 2005 to 2015, and a significant increase in aridity from 2010 to 2015 was common to all the Northern regions (Table 3). The South South, with an aridity index of 58.175 ± 7.759 in 2000 and 52.22 ± 8.050 in 2015, was the most humid. There was a direct relationship between aridity and decrease in rainfall: the Northern regions with increased aridity also recorded lower mean rainfall. There was no significant difference in the mean rainfall patterns in the North East region between 2000 and 2015. The South South region, with the highest mean rainfall (2698.811 ± 610.783 mm) in 2010, also recorded the highest mean precipitation (196.391 ± 15.62 mm).

Non-spatial relationships between incidence of malaria and environmental variables

A significant positive correlation occurred between the incidence rate of malaria and maximum temperature (r = 0.094, P < 0.05), and proximity to water (r = 0.216, P < 0.01). A significant negative correlation, however, was recorded between malaria incidence rate and aridity (r = −0.133, P < 0.01), rainfall (r = −0.094, P < 0.05) and precipitation (r = −0.100, P < 0.05). Rainfall correlated negatively with maximum temperature (r = −0.791, P < 0.001).

Non-spatial versus spatial regression on impact of environmental variables on incidence rate of malaria

Annual rainfall, precipitation and proximity to water had positive and significant effects on the incidence rate of malaria (P < 0.01) in the ordinary least squares (OLS) model, whereas aridity was negatively related to malaria incidence rate (P < 0.001) in the same model. The incidence rate of malaria increased significantly with increasing temperature in the spatial lag model (SLM) and the spatial error model (SEM) (Table 4). In the SEM, precipitation had a negative and significant impact on incidence of malaria, while the coefficients of temperature, aridity and proximity to water were positive and significant. The SEM, with the smallest information criterion values (AIC = −2686.945, BIC = −2645.809), provided the best explanation of the impact of the selected environmental factors on malaria incidence. The non-spatial OLS model performed poorly compared to the spatial models.
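The information criteria used for this ranking follow the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, so that BIC − AIC = k(ln n − 2) regardless of the likelihood. The short sketch below is only an arithmetic illustration of these definitions. The sample size n = 1264 is taken to be the number of retained cluster-year samples, and the number of estimated parameters k = 8 is an assumption that is not stated in the text; it is chosen because it approximately reproduces the reported gap between the two criteria.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik

reported_aic = -2686.945   # SEM value reported in the text
reported_bic = -2645.809   # SEM value reported in the text
n_obs = 1264               # ASSUMPTION: retained cluster-year samples used as n
k_assumed = 8              # ASSUMPTION: parameter count, not stated in the text

# BIC - AIC does not involve the log-likelihood: it equals k * (ln n - 2).
implied_gap = k_assumed * (math.log(n_obs) - 2)
reported_gap = reported_bic - reported_aic

# Log-likelihood implied by the reported AIC under the assumed k.
implied_log_lik = (2 * k_assumed - reported_aic) / 2

print(round(implied_gap, 3), round(reported_gap, 3))
```

Because lower values of both criteria indicate a better trade-off between fit and complexity, the model with the smallest AIC and BIC is preferred.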

Table 4 Spatial models showing correlation between incidence rate of malaria and environmental variables.

Moran’s I statistics for determination of spatial autocorrelation

A significant Moran’s I statistic of 0.440 was observed (P < 0.05; Fig. 1). Figure 2 shows the Moran’s I scatter plot of the incidence rate of malaria. Points in quadrant I represent clusters with a high malaria incidence rate (relative to the average of the 344 clusters) surrounded by clusters with high incidence rates (HH); points in quadrant II represent clusters with a low incidence rate surrounded by clusters with high incidence rates (LH); points in quadrant III represent clusters with a low incidence rate surrounded by clusters with low incidence rates (LL); and points in quadrant IV represent clusters with a high incidence rate surrounded by clusters with low incidence rates (HL). The plot showed more points in quadrants I and III.

Figure 1

Mean Moran’s I values for local spatial autocorrelation for malaria incidence at varying spatial lags.

Figure 2

LISA cluster map.

The LISA significance map of the local Moran’s I test for local spatial autocorrelation patterns in the incidence rate of malaria is presented in Fig. 3. The bright green and green shaded clusters represent regions where the malaria incidence rate showed significant local spatial autocorrelation (P < 0.05).

Figure 3

LISA significance map.

Spatio-temporal distribution of malaria

The spatio-temporal distribution of malaria is presented in Fig. 4. Further LISA analysis showed that there were 344 hotspot locations distributed across the six geopolitical zones. The Northern geopolitical zones had the larger share of these hotspots, with the North Central, North West and North East accounting for 33.7%, 29.1% and 18.0%, respectively, whereas the South South, South East and South West accounted for 8.4%, 7.0% and 3.0%, respectively (Table 5). The distribution of malaria incidence hotspots depicted in Fig. 5 showed a similar pattern across the geopolitical zones over time, with the North Central leading, followed closely by the North West. However, there was a systematic decrease in the number of hotspot clusters in each geopolitical zone from year to year. Overall, the number of hotspot clusters fell by 51.5% between 2000 and 2010.

Figure 4

Spatial-temporal distribution of malaria.

Table 5 Analysis of local indicator of spatial autocorrelation (LISA) by geopolitical Zones.
Figure 5

Spatial-temporal distribution of incidence of malaria by geopolitical zones.

Discussion

Malaria continues to be a serious threat in all regions of Nigeria. Studies across Nigeria have reported a higher prevalence of malaria, as high as 70–99%, in the South15,16,17. The reasons given are the higher rainfall, more numerous water bodies and heavy forest that characterise the South and aid malaria transmission in the region18. However, this study showed that the incidence of malaria, that is, the number of new malaria cases during the 2000–2015 period, was significantly higher in Northern than in Southern Nigeria. Although the North East recorded a higher incidence than all the Southern regions, its incidence was lower than that of the other two geopolitical zones in the North. This lower incidence could be attributed to the larger coverage of insecticide-treated nets (ITNs) in the region compared to the rest. In the Nigeria Malaria Indicator Survey of 2010, 67.4% of individuals from the North East claimed ownership of at least one ITN, while 32.7% and 59.7% were reported for the North Central and North West, respectively19.

Although the prevailing climatic conditions appeared to be negatively correlated with malaria transmission in Northern Nigeria, areas surrounding the confluence of the Rivers Niger and Benue in the North Central and many isolated areas of the North East and North West have been reported to have a malaria prevalence as high as 70%18. Poor access to health care and public health services in isolated areas of the North and the influence of the Rivers Niger and Benue could be responsible for such high endemicity of malaria in these regions. Higher malaria incidence in the rural areas of Nigeria could be attributed to prevailing cultural practices that could predispose people to infection by malaria parasites. Many rural areas in both the North and the South are isolated and very difficult to access. Besides, poor socio-cultural development and lack of basic social amenities often discourage health workers posted to these places. Because of these fundamental problems, health service delivery in those areas is often poor, which undermines people’s access to good health care services. Although malaria incidence in urban centers was lower in this study, there is nevertheless stable transmission of malaria in Nigerian urban regions. One important reason is that some urban centers in Southern Nigeria are located in coastal regions, which provide suitable breeding sites for mosquito vectors of malaria parasites. Urban agricultural development involving irrigation is common in the North and could facilitate malaria transmission in the region. Poor drainage systems and the creation of artificial vector breeding sites such as ditches and tyre tracks during heavy downpours are common in Nigerian urban centers.

The Roll Back Malaria (RBM) Partnership initiated in 2000 seemed not to have yielded any significant positive result, as malaria incidence in some regions of Nigeria either remained the same or rose significantly between 2000 and 2005. The failure of the RBM Partnership necessitated the development of the Global Malaria Action Plan in 2008. The new initiative appeared to yield some positive results in Nigeria, as there was a significant reduction in the incidence of malaria between 2010 and 2015 in all six geopolitical zones. This is further supported by the overall significant decrease in incidence of malaria between 2010 and 2015.

Malaria is a disease whose transmission is greatly influenced by environmental factors. These factors are good predictors of transmission but could share a non-linear relationship with mosquito abundance and malaria transmission18. The impact of environmental variables on malaria transmission can be adequately established by spatial statistical models, which can predict the transmission of malaria in both space and time. A rise in temperature shortens the blood-meal-seeking behaviour of female Anopheles mosquitoes, causing a corresponding decrease in ovulation and production of juvenile mosquitoes. A temperature as high as 34 °C, which is the average upper limit recorded in Northern Nigeria, has been reported to reduce the gonotrophic cycle length of mosquitoes20. The general temperature range (27–29 °C) in many areas of Nigeria falls within the optimum for the development of sporozoites within the mosquitoes21. The daily survival of mosquitoes is also influenced by temperature: a daily survival rate of about 90% has been attributed to temperatures between 16 °C and 36 °C21. Either way, it is clear that temperature is a very important factor that aids transmission of malaria in both the Northern and the Southern parts of Nigeria. This claim was supported by the positive correlation between incidence of malaria and temperature. More importantly, the highly significant relationships with temperature in the SLM and SEM spatial models make the latter a significant predictor of malaria transmission in Nigeria. In fact, the positive and significant spatial lag coefficient in the SLM indicates that the malaria incidence rate in one cluster depends directly on the incidence rate in its neighbouring clusters affected by temperature.

Rainfall and precipitation are two further factors that affect incidence of malaria in Nigeria. However, because rainfall patterns differ between the Northern and Southern parts of Nigeria, these factors may affect malaria transmission dynamics differently. The Southern regions have a longer rainy season than the North, and transmission is usually higher at the onset of the rainy season and the beginning of the dry season22. The characteristic rainfall patterns in Southern Nigeria create shallow water pockets suitable for breeding of Anopheles gambiae, the main mosquito vector of the malaria parasite in Nigeria23. The negative correlation in the non-spatial statistical analysis suggests that continuous heavy rainfall, especially during the peak rainy season, may have a negative impact on malaria vectors and on eventual transmission of the disease. This could explain why the incidence of malaria is relatively lower in the region than in the Northern part of Nigeria. Precipitation has been considered the most important climatic factor influencing incidence of malaria in the lowlands24. Our study showed that precipitation is strongly correlated with rainfall. The impact of precipitation on malaria transmission is both direct and indirect, especially where dams are situated: it raises the reservoir’s water level and creates potential mosquito breeding sites along the shorelines24. Previous findings from Nepal using generalized additive mixed models (GAMM), however, showed that maximum temperature and rainfall were not significantly associated with malaria incidence25. The same was observed for rainfall in Bangladesh11, but our findings are similar to reports from India and Sri Lanka of a negative correlation between rainfall and incidence of malaria26,27.

The difference in the sign and significance of the parameters between the OLS and the spatial models confirms the assertion that OLS estimates remain unbiased but inefficient in the presence of spatially autocorrelated errors (SEM) and become inconsistent when a spatially lagged dependent variable is present (SLM). Misleading conclusions are inevitable when the OLS technique is used to analyse sample data collected for regions or points in space. The significance of the spatial autoregressive parameter ρ (Rho) in the SLM and λ (Lambda) in the SEM indicated that spatial autocorrelation exists in the data and that a spatial model is more appropriate than the standard aspatial model, which is prone to misleading results and to under- or over-estimation of the parameters. This result agrees with Anselin28 and LeSage and Kelly29 that OLS results are inconsistent and inefficient in the SLM and SEM models, respectively.

Aridity is higher in the North than in the South, and an increase in aridity influences malaria transmission by reducing the mosquito biting rate and shortening the adult lifespan relative to the extrinsic incubation period of the malaria parasite24,30. Following the classification of De Martonne31, the North East and North West zones of Nigeria are semi-arid, while the North Central is semi-humid. The South West, South East and South South regions of Nigeria are humid, very humid, and extremely humid, respectively.

The significant Moran’s I statistic likewise indicates spatial autocorrelation and confirms that malaria incidence rates in nearby clusters are more closely related than those in distant clusters. The univariate Moran’s scatter plot showed more points in quadrants I and III, denoting a positive spatial autocorrelation pattern in incidence of malaria among clusters in different regions of Nigeria. The extent of this autocorrelation was tested with LISA, a class of spatial statistics that provides cluster-specific information and estimates the extent of spatial autocorrelation of malaria incidence in a particular cluster in relation to its neighbours. The over 700 clusters with significant local spatial autocorrelation patterns in incidence of malaria, as revealed by the LISA significance plots, show that there is indeed spatial association in incidence of malaria in Nigeria. The reduction in malaria hotspot clusters from 2000 to 2015 indicates that the various interventions by government and international agencies to combat malaria in the country have been productive.

One limitation of spatial modelling is that while infectious disease data has a lot of intra- and inter-annual variability depending on epidemic and non-epidemic periods, the regression analysis assumes the association between exposure and outcome to be stationary over time32.

Conclusion

Our study has shown that malaria is still a serious problem in all the regions of Nigeria with environmental factors like rainfall, temperature and aridity playing important roles in transmission of the disease. There is more malaria incidence in the North than in the South and rural than urban areas. The spatial statistical models adopted are important to design a prompt and early malaria transmission mitigation support system in suspected regions. The models can help to generate malaria risk map and spatially channel available resources to the disease hot spots.

Materials and Methods

Study area

The study was carried out in Nigeria, a country in the sub-Saharan African region, located between latitudes 4°16′ and 13°53′ North and longitudes 2°40′ and 14°41′ East. The country has a total surface area of approximately 923,768 square kilometers and a population density of 212.04 individuals per square kilometer. Malaria is one of the country’s most severe public health problems, and the country’s climatic conditions make it suitable for recurrent malaria transmission. There have been various interventions by government and international agencies to mitigate the burden of this tropical disease.

Data source

The Demographic and Health Survey Programme (DHSP) assists countries worldwide in the collection and use of data to monitor and evaluate population, health, and nutrition programmes. Data emanating from the survey are processed and made available upon request for download through the Demographic and Health Surveys (DHS) Programme website. The data often come with geospatial covariates and it is often difficult to link these covariates with the DHS Programme’s data to determine the impact of location on health outcomes. To alleviate the difficulty, the DHS Programme Geospatial Team developed a set of standardised files of the most commonly used geospatial covariates already linked with the dataset.

The covariates came from two types of data: raster and vector. Raster data, such as images and modeled surfaces, rely on pixels or cells to convey their data values. Vector data, such as points, lines, and polygons, show the discrete location or boundary of a feature. Because of the differences between the data types, the methods needed to extract meaningful values varied. Firstly, geospatial covariate layers (i.e. modeled surfaces) relevant to the DHS Programme indicators were acquired from Digital Globe (~35 cm resolution) remotely sensed imagery. GPS coordinates representing the location of each survey cluster were obtained from the DHS Programme. In addition to modeled surfaces, vector (polygon and line) data obtained from various publicly available sources were also included. Secondly, raster and vector datasets were imported and linked to the GPS coordinates using a standalone Python script and ArcGIS, respectively.

The study used data from the 2015 Nigeria Malaria Indicator Survey (NMIS), accessed at the DHS website. The 2015 NMIS was implemented by the National Malaria Elimination Programme (NMEP), the National Population Commission (NPC), the National Bureau of Statistics (NBS) and other international agencies from October 2015 through November 2015. ICF International provided technical assistance as well as funding to the project through the DHS Programme, a project funded by the United States Agency for International Development (USAID)19.

Rainfall data was obtained from a satellite-based rainfall product called the Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) which has high temporal and spatial resolution33. Maximum temperature and precipitation data were obtained from the Climate Research Unit (CRU) of the University of East Anglia, UK, which produces a range of global climate time series gridded data, derived from meteorological stations across the world’s land areas. The datasets were provided on high resolution (0.5 × 0.5 degrees) grids over the period 1901–201634,35. Aridity was modeled using data available from the WorldClim Global Climate Data and was updated for the period of 2000, 2005, 2010 and 2015 using high resolution grids obtained from the CRU datasets35. Proximity to water data was extracted from lakes dataset (L2) at full resolution and the shoreline dataset (L1), also at full resolution, in the Global Self-consistent, Hierarchical, High-resolution Shoreline (GSHHG) database. The datasets used were based on the World Vector Shorelines, CIA World Data Bank II, and Atlas of the Cryosphere36,37.

Sampling procedures

A two-stage sampling strategy was adopted for the 2015 NMIS. In the first stage, nine clusters (enumeration areas, EAs) were selected from each state, including the Federal Capital Territory (FCT). The sample selection was done in such a way that it was representative of each state. The result was a total of 333 clusters throughout the country, 138 in urban areas and 195 in rural areas. The geospatial covariates of the 2015 NMIS contained data on malaria incidence (defined as the average number of people per year who show clinical symptoms of Plasmodium falciparum malaria within the 2 km (urban) or 10 km (rural) buffer surrounding the DHS survey cluster location) as well as environmental variables measured by remote sensing within the same buffers. Data for 326 clusters at five-year intervals (2000, 2005, 2010 and 2015) were used. To ensure completeness of the dataset, all empty cells and inconsistent cases were removed, leaving 1264 retained samples out of the original 1304 (96.9% of the total). The distribution of the retained samples by geopolitical zone and residence type is shown in Tables 1 and 2.

Statistical and spatial analyses

Descriptive statistical analysis, mean difference and association between malaria incidence and the environmental variables were done using frequency counts, percentages, independent t-test, Pearson’s Product Moment Correlation (PPMC) and one-factor analysis of variance (ANOVA).

The main motivation for applying a spatial statistical model is the existence of spatial autocorrelation. This is analogous to serial autocorrelation in time series, except that it is multidirectional while serial autocorrelation is unidirectional. Global spatial autocorrelation is commonly detected in georeferenced data by the Moran’s I test statistic28, which is given as:

$$I=\frac{n}{{S}_{o}}\frac{{\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{w}_{ij}({x}_{i}-\bar{x})({x}_{j}-\bar{x})}{{\sum }_{i=1}^{n}{({x}_{i}-\bar{x})}^{2}}\,{\rm{or}}\,{\rm{in}}\,{\rm{matrix}}\,{\rm{form}}\,{\rm{as}}\,I=\frac{n}{{S}_{o}}\frac{x^{\prime} Wx}{x^{\prime} x}$$
(1)

where n is the number of spatial units and x is the n × 1 vector of a random variable which has been standardised such that its mean and variance are 0 and 1, respectively. W is an n × n row-standardised (row sums equal to 1) spatial weight matrix and So is the sum of the elements of W. W captures the nature of connectivity among the spatial units in the data, which can be conceived in terms of the topological notion of neighbourhood. In this study, the Queen Contiguity criterion is adopted, which stipulates that two areas are neighbours when they share a common side or vertex. A first-order queen contiguity matrix is defined as \({W}_{ij}=1\) if clusters i and j share a common side or vertex and zero otherwise. The diagonal elements of W are constrained to be zero so as to prevent a cluster from being a neighbour to itself. Torres-Preciado et al.38 reported that such a matrix facilitates the interpretation of the neighbourhood phenomenon underlying the administrative breakdown and improves the efficiency of algorithms during the estimation process. Moran’s I takes values between −1 and 1 and can be interpreted as a product moment correlation coefficient. Positive values of Moran’s I indicate that observations with similar values occur as neighbours, whereas negative values signify that high and low values occur as neighbours. A Moran’s I value of zero signifies a random spatial distribution. A local indicator of spatial autocorrelation (LISA)34, the so-called local Moran’s I, tests for local spatial autocorrelation. The LISA indicates significant spatial clustering around each observation, and the sum of the LISA values is proportional to the global Moran’s I39. It is possible for a dataset to have significant local spatial clustering but no global spatial autocorrelation.
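The definition in Eq. 1 can be illustrated with a minimal sketch in plain Python. It assumes a regular 4 × 4 grid of cells in place of the survey clusters, and the two test patterns and the function names `queen_weights` and `morans_i` are hypothetical; this is not the software used in the study.

```python
def queen_weights(rows, cols):
    """Row-standardised first-order queen contiguity weights for a rows x cols grid."""
    n = rows * cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Queen neighbours share a side or a vertex (up to 8 surrounding cells).
            neighbours = [
                (r + dr) * cols + (c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            for j in neighbours:
                W[i][j] = 1.0 / len(neighbours)  # each row sums to 1; diagonal stays 0
    return W

def morans_i(x, W):
    """Global Moran's I as in Eq. 1, computed from deviations about the mean."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den

W = queen_weights(4, 4)
# Hypothetical patterns: high values grouped on the left half vs. a checkerboard.
clustered = [1.0 if c < 2 else 0.0 for r in range(4) for c in range(4)]
checkerboard = [float((r + c) % 2) for r in range(4) for c in range(4)]

print(morans_i(clustered, W) > 0, morans_i(checkerboard, W) < 0)
```

The grouped pattern gives a positive index and the checkerboard a negative one, matching the interpretation of the sign of Moran’s I given above.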

Because malaria incidence in a given cluster is likely to be influenced by the incidence in nearby clusters, the Moran’s diagram was employed to obtain a rapid overview of the global spatial autocorrelation in malaria incidence, while LISA was used to detect the locations of hot- and cold-spot clusters in the sample. As described earlier, a positive value of Moran’s I is interpreted to mean that similar values of malaria incidence are grouped together in space, whereas a negative value signifies that dissimilar values of malaria incidence occur together geographically. If it is zero, spatial dependence is absent in the variable and the assumption of independence holds. The cluster and significance maps show the hot- and cold-spot locations.

Prior to the production of the diagram and map, a connectivity matrix among the clusters was created using the coordinates of the clusters, which are displaced by up to 2 kilometers (for urban points) and 10 kilometers (for rural points)40, based on the first-order queen contiguity criterion described earlier. In the diagram, the values of malaria incidence on the x-axis were plotted against the average values of malaria incidence for the neighbouring observations, Wy (lagged malaria incidence), on the y-axis. The diagram has four quadrants as shown in Fig. 1. The value above the diagram is the global Moran’s I index. If the value is close to zero, malaria distribution is spatially random, while a positive value indicates spatial clustering41.
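The construction of the Moran scatter plot and of the local Moran’s I can be sketched as follows. This is an illustration under stated assumptions: six units arranged on a line stand in for the survey clusters, adjacency on the line replaces queen contiguity, the incidence values are hypothetical, and the permutation test normally used to assess LISA significance is not shown.

```python
def chain_weights(n):
    """Row-standardised weights for units on a line; neighbours are the adjacent units."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W

def local_moran(x, W):
    """Return local Moran's I values and Moran scatter plot quadrant labels."""
    n = len(x)
    mean = sum(x) / n
    sd = (sum((v - mean) ** 2 for v in x) / n) ** 0.5
    z = [(v - mean) / sd for v in x]                                  # standardised values
    lag = [sum(W[i][j] * z[j] for j in range(n)) for i in range(n)]   # spatial lag Wz
    local_i = [z[i] * lag[i] for i in range(n)]
    labels = []
    for zi, li in zip(z, lag):
        if zi >= 0 and li >= 0:
            labels.append("HH")   # quadrant I
        elif zi < 0 and li >= 0:
            labels.append("LH")   # quadrant II
        elif zi < 0 and li < 0:
            labels.append("LL")   # quadrant III
        else:
            labels.append("HL")   # quadrant IV
    return local_i, labels

incidence = [0.50, 0.48, 0.46, 0.30, 0.28, 0.26]   # hypothetical incidence rates
W = chain_weights(len(incidence))
local_i, labels = local_moran(incidence, W)

# With row-standardised weights and standardised values, the global index
# is the average of the local values.
global_i = sum(local_i) / len(incidence)
print(labels, global_i > 0)
```

In this toy example the three high values fall in quadrant I and the three low values in quadrant III, which is the configuration associated with positive spatial autocorrelation.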

Due to the spatial nature of the data and the possibility that malaria incidence in one location may be influenced by similar values in another location, three regression specifications were used to model the relationship between incidence of malaria and environmental factors: the non-spatial regression, the Spatial Lag Model (SLM), and the Spatial Error Model (SEM), shown in Eqs 2–4. The Ordinary Least Squares (OLS) estimation method was used for Eq. 2, while Eqs 3 and 4 were estimated by the maximum likelihood method, because OLS estimation of Eq. 3 has been reported to be inconsistent42,43, while in the case of Eq. 4 it remains unbiased but inefficient29.

$$OLS;y=X\beta +\varepsilon $$
(2)
$$SLM;y=\rho Wy+X\beta +\varepsilon $$
(3)
$$SEM;y=X\beta +\varepsilon ;\varepsilon =\lambda W\varepsilon +u$$
(4)

The OLS model is aspatial and behaves well under the assumptions of independent observations and homoskedastic error terms. Sample data collected for regions or points in space are not independent, but rather spatially dependent44. Firstly, data recorded at proximal locations tend to be either positively or negatively correlated, which is called spatial dependence. Secondly, in a spatial data setting the homoskedasticity assumption cannot hold because of a lack of structural stability across space, such as varying parameters or functional forms. Because the classical assumptions of independent observations and homoskedastic error terms are violated, models that account for spatial structure in their specification are needed. The most common way of adjusting model 2 to accommodate spatial structure is to add a spatial lag of the dependent variable or of the disturbance term to the model. Models 3 and 4 are spatial regression models in that the spatial lag of the dependent variable (Wy) and that of the disturbance term (Wε), respectively, have been added to their specification. The two models revert to the aspatial model (model 2) when the spatial effect parameters (ρ and λ) are equal to zero.
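The role of the spatial lag term in Eq. 3 can be made concrete with a small numerical sketch. It assumes seven units on a line with adjacency weights in place of queen contiguity, and a hypothetical input in which Xβ + ε is non-zero at the middle unit only; the fixed-point iteration is simply one convenient way of solving Eq. 3 for y when |ρ| < 1 and W is row-standardised, and is not taken from the study.

```python
def chain_weights(n):
    """Row-standardised weights for units on a line; neighbours are the adjacent units."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W

def solve_slm(rho, W, xb_plus_eps, iters=500):
    """Solve y = rho * W y + (X beta + eps) by fixed-point iteration (needs |rho| < 1)."""
    n = len(xb_plus_eps)
    y = list(xb_plus_eps)
    for _ in range(iters):
        y = [
            rho * sum(W[i][j] * y[j] for j in range(n)) + xb_plus_eps[i]
            for i in range(n)
        ]
    return y

n = 7
W = chain_weights(n)
signal = [0.0] * n
signal[3] = 1.0   # hypothetical: X beta + eps is non-zero only at the middle unit

y_aspatial = solve_slm(0.0, W, signal)   # rho = 0: Eq. 3 collapses to Eq. 2
y_spatial = solve_slm(0.5, W, signal)    # rho > 0: the effect spills over to neighbours

print(y_aspatial)
print([round(v, 3) for v in y_spatial])
```

With ρ = 0 the outcome equals the input, as in the aspatial model, whereas with ρ = 0.5 the neighbouring units take positive values even though their own inputs are zero.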

The maximum likelihood (ML) estimation technique has been derived and suggested for the SLM and SEM28,45,46. In this approach, the joint probability distribution (likelihood) of all observations is maximized with respect to the relevant parameters. If the regularity conditions for the log-likelihood function are satisfied, the resulting ML estimates achieve the desirable properties of consistency, asymptotic efficiency, and asymptotic normality. Moreover, in most situations, the resulting estimates of the regression parameters of the models are also unbiased28.
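For the SLM, the likelihood can be concentrated so that only ρ remains to be searched: with e0 and eL denoting the OLS residuals of y and Wy on X, the concentrated log-likelihood is, up to a constant, ln|I − ρW| − (n/2) ln[(e0 − ρeL)′(e0 − ρeL)/n]. The sketch below illustrates this on simulated data and is not the estimation code used in the study. It assumes a ring of 400 units in which each unit has two neighbours with weight 1/2 (so that the eigenvalues of W are cos(2πk/n)), a single hypothetical covariate, and a simple grid search in place of a numerical optimiser.

```python
import math
import random

def ring_lag(v):
    """Spatial lag W v on a ring: each unit's two neighbours receive weight 1/2."""
    n = len(v)
    return [0.5 * (v[i - 1] + v[(i + 1) % n]) for i in range(n)]

def ols_residuals(y, x):
    """Residuals from regressing y on an intercept and a single covariate x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def concentrated_loglik(rho, e0, el, eigvals):
    """Concentrated SLM log-likelihood in rho, up to an additive constant."""
    n = len(e0)
    sigma2 = sum((a - rho * b) ** 2 for a, b in zip(e0, el)) / n
    log_det = sum(math.log(1.0 - rho * lam) for lam in eigvals)  # ln|I - rho W|
    return log_det - 0.5 * n * math.log(sigma2)

random.seed(0)
n, rho_true, beta = 400, 0.5, 1.0                         # hypothetical settings
x = [random.gauss(0, 1) for _ in range(n)]
shock = [beta * a + random.gauss(0, 1) for a in x]        # X beta + eps

# Simulate y = rho W y + X beta + eps by fixed-point iteration.
y = list(shock)
for _ in range(200):
    y = [rho_true * l + s for l, s in zip(ring_lag(y), shock)]

eigvals = [math.cos(2 * math.pi * k / n) for k in range(n)]  # eigenvalues of the ring W
e0 = ols_residuals(y, x)
el = ols_residuals(ring_lag(y), x)
grid = [i / 100 for i in range(-95, 96)]
rho_hat = max(grid, key=lambda r: concentrated_loglik(r, e0, el, eigvals))

print(rho_hat)
```

Once ρ has been estimated in this way, the regression coefficients follow from an OLS regression of y − ρWy on X.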