1 Introduction

Sustainable agriculture is essential to ensure food security for the increasing population in sub-Saharan Africa (SSA) while reducing rural poverty and the degradation of natural resources [1]. However, the gradual decline of soil fertility in SSA soils induced by management, especially in intensively cropped areas, is a major cause of decreased yields and food production per capita [2]. This has led to undesirable mid- to long-term soil and environmental degradation [3]. Therefore, enhancement of the soil nutrient resource base through sound agronomic management practices is vital to increase crop productivity and household food security in the smallholder sector of Zimbabwe [4].

In nature, soil is inherently variable due to the variations in soil-forming processes at different spatial scales. Soil variability may also occur as a result of anthropogenic influences such as land use, cultivation and erosion. Thus, the soil varies in space and time because of geochemical processes and soil management practices such as fertilization and irrigation [5, 6]. This has implications for how the soil should be managed. Challenges such as acidity, erosion risk, and input requirements, vary from place to place and an optimal response (for productivity, profitability and environmental impact) should be based on local soil information. Therefore variation of soil fertility properties must be examined to understand the effects of land use and management systems on soil functions [7, 8,9,10,]. It is important to understand this variation and how far it reflects the influence of long-range factors. This might mean that recommendations for management should, in principle, differ between farms and between fields within farms, allowing more efficient, cost-effective, and less environmentally damaging use of agricultural inputs [9,10,11,12]. A better understanding of the variation of soil fertility could therefore help farmers to achieve increased soil productivity and the objectives of sustainable agriculture [13,14,15].

Smallholder farmers in SSA have different access to resources depending on resource endowment and may manage some fields differently from others [2, 16]. Most farmers are resource-constrained and often have access to limited amounts of inorganic and organic fertilizers which they therefore apply to the most productive fields at the expense of less productive fields whose fertility gradually declines. Strong gradients of decreasing soil fertility are usually found with increasing distance from the homestead as fields nearer to homesteads (homefields) preferably receive more nutrients. Homefields tend to have higher fertility and balanced nutrition compared to fields further away from homesteads (outfields) [17,18,19,20]. However, cases where outfields are more fertile than homefields have also been reported [18].

The spatial variation of soil fertility properties can be investigated by geostatistical methods [21]. Whereas most conventional statistical methods treat observations as independent on the basis of an appropriate sampling design, geostatistics involves the modelling of spatial dependence in data, showing the spatial scales at which they vary [22]. On the basis of this spatial model, local predictions are created as an optimal combination of nearby observations which minimizes the mean squared error of the prediction, which is also calculated, the kriging variance [23]. Geostatistical methods thus produce spatial predictions of soil fertility properties which make the best use of the available data [24] and which allow one to account for the uncertainty in soil information based on limited sampling by computing, for example, the probability that a soil property at some location falls above or below some threshold value such as an advisory nutrient concentration index [25]. Geostatistical predictions, and the kriging variances which quantify their uncertainty, are useful information to support management decisions about soil, such as the local fertilizer requirement or risk of acidity problems [22]. By basing decisions on sound spatial information we should be able to improve productivity and profitability, and to reduce environmental impact of agriculture [26]. Thus, modern geostatistical methods provide a basis for mapping the spatial variation of the soil fertility properties on the basis of relatively limited sampling, and so providing information (with measures of uncertainty) which might help direct advice and management interventions better.

Ward 10 of Hurungwe District is typical of many smallholder areas in Zimbabwe and SSA as a whole. There is little information on the variation of soil fertility properties and recommendations are generally uniform so might be suboptimal in many places. Also, due to variation in farmers’ resource endowment, some fields may receive more inputs than others. In Zimbabwe, little work has been done to examine the spatial variation of soil nutrient status and pH at these scales using the methods of spatial statistics [3, 17, 18]. The aim of this study was to examine how some key soil fertility properties vary spatially. The objectives of the study were to determine if there were: (1) trends or other long-range variations which might mean that recommendations should vary across the ward. (2) differences between homefields and outfields that may contribute substantially to this variation (3) other factors such as soil texture that may also contribute to this variation. Lastly, soil fertility properties that showed evidence for spatial dependence were mapped using geostatistics and geographic information system (GIS) facilities.

2 Materials and methods

2.1 Study site description

The study was conducted in ward 10 of Hurungwe District which is located about 235 km north-west of Harare and lies between 16 and 17oS and 29° E and 30° E (Fig. 1) and covers an area of 252.9 km2. The area is predominantly under smallholder farming. Zimbabwe is demarcated into five agro-ecological zones based on rainfall and yield potential for crop production [27]. The study location is in Agro-ecological zone II. The climate is sub-humid, with annual average rainfall of 750–1000 mm and a mean annual temperature of 16–19 °C. It has large potential for crop yields and is suitable for intensive cropping and livestock production.

Fig. 1
figure 1

Study area of ward 10 in Hurungwe district, Zimbabwe

Soils in the ward 10 of Hurungwe District are mainly granitic sands, Lixisols in the FAO-UNESCO classification [28, 29], with inherently poor nutrient supply potential but there are some smaller exceptional areas with fertile clay soils (Luvisols in the FAO-UNESCO classification) derived from dolerite intrusions [30].

Production systems in the area characteristically integrate livestock and crops, with cattle and goats as the dominant livestock whilst maize (Zea mays L.) is the main crop. Livestock that are kept in kraals near the homestead provide manure for soil fertility enhancement while in winter, when grazing areas are inadequate, crop residues in the individually owned but communally grazed fields provide supplementary feed to the livestock. Other crops grown in the area include tobacco (Nicotiana tabacum L), groundnut (Arachis hypogaea L), cowpea (Vigna unguiculata (L) Walp.), soyabean (Glyxine max L.), common bean (Phaseolus vulgaris L.), bambara nut (Vigna subterranean (L) Verdc.) and cotton (Gossypium hirsutum L.).

2.2 Sampling sites selection procedure

An appropriate sampling design depends on the objectives of the study and the methods which are to be used to analyze the data. In this case, our objective was to characterize the spatial variation of soil properties in fields used to grow maize across the sample domain (Ward 10), and, where possible, to map this variation spatially. The data were to be analyzed using model-based geostatistical methods [31]. Sampling to support such analyses does not require a probability sample (in the sense of [32] to allow assumptions of independence precisely because the model does not assume independence of the observations but models their spatial correlation. Furthermore, the objective of spatial mapping is best-served by a sampling design which gives good spatial coverage. Our sampling was therefore purposive in the sense of de Gruijter et al. [32] in that sample units were selected to support the objective of spatial mapping.

The sample frame was defined as all fields used to grow maize in seven villages in ward 10 in the 2017–2018 growing season. There are eleven villages in the ward, in the four excluded villages most farmers grew crops other than maize, predominantly tobacco. Within this sampling frame, sampling was exhaustive at field level. That is to say, all fields where maize was grown in the season under investigation were sampled. The identification of these fields was facilitated by the farmers and agricultural extension officials in the Ministry of Lands, Agriculture, Water and Rural Resettlement from Hurungwe District of Zimbabwe. Because all units in the sample frame were selected the sample design, at field level, does not introduce bias.

Within each field, sampling was undertaken in a circular plot of radius 10 m centred at the middle of the field to avoid any edge effects. The size and shape of the volume of material from which a sample is drawn is called the sample’s support in geostatistics. All statistics are conditional on the support. Had we sampled from a smaller area in each field than we used we would expect the variance of the data to be larger than we observed. For geostatistical mapping, using the methods described below, a sample support which is small relative to the region of interest is required. It is also important that the sample support is consistent over all the observations; otherwise, we cannot assume a fixed variance of the variable, under the stationarity assumptions of the statistical model that we use. This is the reason for not forming a composite sample across the whole of each field, as these vary markedly in shape and size.

Within the plot 10 soil cores were collected from locations selected by simple random sampling. These cores were then combined into a single composite sample. The cores were collected by auger over the depth interval of 0–15 cm which is the layer normally sampled for soil testing and fertilizer recommendations purposes in the smallholder sector [33]. A GPS was used to record the location of the central point of the plot.

A total of 250 fields were sampled in this way. Because some farmers no longer grow crops on outfields, due to lack of adequate resources such as seed, labour, draft power etc., most of the selected fields were homefields (178) with a smaller number of outfields (72). Homefields were on average less than 50 m away from the homesteads and on fairly flat slopes (< 2%) ranging from 0.1 to 0.4 ha in area. Most outfields were fairly on flat slopes (< 2%) and more than 1 km away from the homesteads with fields ranging from 0.4 to 2 ha in area.

The total sampled area in this study is characteristic of the agro-ecological zone to which it belongs, and a substantial sample effort has been concentrated in one ward to allow the spatial analysis to be undertaken, and to gain insight into spatial variation of soil fertility properties in this setting.

2.3 Sample preparation and analysis

The collected samples were air-dried and crushed using a wooden pestle and mortar to pass through a 2-mm sieve and were stored in paper bags at room temperature. The samples were then analyzed for particle size distribution (hydrometer method), pH (0.01 M CaCl2 method), organic C (modified Walkely–Black method) [34], KCl-extractable N (incubation technique) [35], bicarbonate P (resin membrane technique) [36] and exchangeable K (acidified ammonium acetate method with the concentration determined by atomic emission spectrophotometry [34].

2.4 Statistical and geostatistical methods

The data were analyzed using a linear mixed model (LMM) [37] and Pearson's correlation analysis to reveal the magnitude and direction of relationships between soil fertility properties was computed with GenStat version 14 statistical software (Lawes Agricultural Trust, Rothamsted Experimental Station, UK). In LMM analysis the variable of interest is treated as a linear combination of fixed effects (which define the expected value) and random effects. In this study, we considered a random effects model with a spatially correlated random component, in addition to an independently distributed residual term [38]. For models with such a random effect, the value of the variable of interest can be computed for an unsampled site (for which the fixed effects are defined) by a process equivalent to the kriging interpolation method used in geostatistics [23]. In the mixed model setting, this is called the empirical best linear unbiased prediction (EBLUP).

The linear mixed model may be written as

$${\mathbf{z}} = {\mathbf{M\beta }} + {\mathbf{S\eta }} + {{\varvec{\upvarepsilon}}},$$
(1)

where z is our vector of observations, length n, \(\mathbf{M}\) is known as the design matrix of the fixed effects, \({\varvec{\upbeta}}\) is the vector of the fixed effects coefficients, \({\varvec{\upeta}}\) a normal random variable which has a mean of zero, a variance of c1 and an autocorrelation matrix which expresses the spatial dependence of the variable and ε is an independently and identically distributed normal residual, known as the nugget effect in a geostatistical context. The nugget effect has mean zero and variance c0. The estimation of this model makes the assumption of second-order stationarity [23]. This requires that any systematic spatial trend in the variable z is reflected in the fixed effects. For example, if there is a linear trend in the variable from north to south, then this may be captured by a design matrix M which consists of a column of entries all equal to one, and a second column which contains the northings of the observations. In this case the coefficients in \({\varvec{\upbeta}}\) correspond to the intercept and slope of a linear regression on the northing. The fixed effects may also be factors (categorical variables). We do not discuss how the LMM is fitted in detail, and refer the reader to Lark et al. [38] and Webster and Oliver [23]. In summary, we assumed that the autocorrelation matrix of \({\varvec{\upeta}}\) can be characterized by an exponential correlation function such that the correlation between two observations separated by distance h is given by

$$\rho \left( h \right) = {\text{e}}^{{ - {\raise0.7ex\hbox{$h$} \!\mathord{\left/ {\vphantom {h \phi }}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{$\phi $}}}} ,$$
(2)

where ϕ is a distance parameter. The correlation between values of the random effect decays to small values at distances in excess of 3 × ϕ. The parameters 3 × ϕ, c0 and c1 are estimated by residual maximum likelihood. These estimates are then used to obtain generalized least squares estimates of the coefficients in \({\varvec{\upbeta}}\) and their standard errors [− 0.2, 0.2].

We undertook exploratory analysis of the data, computing summary statistics. These comprise the mean, median, standard deviation, coefficient of skewness and the octile skewness due to Brys et al. [39]. The latter is particularly useful as a robust statistic to measure the asymmetry of the distribution of data without undue influence of a few outlying values. The analysis of spatial data with the linear mixed model makes an explicit assumption that the random variation of the observations conforms to a normal distribution. While the modelling process is quite robust to deviations from that assumption [40], the efficiency of model estimation is reduced if data are strongly skewed. Following Webster and Lark [41], we considered transformation to for variables with an octile skewness outside the range [− 0.2, 0.2]. Predictions on the log-scale are of limited use to farmers or other stakeholders, and so predictions of variables which had been transformed this way were back-transformed to the original units by simple exponentiation following Pawlowsky-Glahn and Olea [42]. This is recommended because the resulting prediction is median unbiased, which is particularly suitable for a skewed variable. We also examined plots of the data values against eastings and northings to identify any potential trends.

The first model that we considered for any variable had a constant mean as a fixed effect, so M is a n × 1 vector, all elements set to one. However, in some cases, this was not a plausible model. Exploratory analysis suggested a possible trend north to south in Ward 10, and the variance parameter c1 tended to a large value, with an associated large value of ϕ. This is suggestive of a spatial trend, and so a model was considered in which a second column of the design matrix contained the latitudes of the observations.

Having established a basic “null” model for the data (with the only fixed effect a constant mean, or a north–south trend), we then considered including an effect of location (homefield or outfield). To do this we added a further column to the design matrix which takes the value 1 for all outfield samples, and zero otherwise. The corresponding fixed effects coefficient is the mean difference between outfields and homefields. To test the significance of the location effect we computed a log-likelihood ratio statistic to compare the null model with the full model including the location effect. This was done with a recomputation of the null model so that the residual likelihoods were comparable, following the approach proposed by Welham and Thompson [43]. The log-likelihood ratio statistic has an asymptotic chi-square distribution if the null model is correct, with degrees of freedom equal to the number of additional parameters in the fitted model (one here), so a test of evidence for the fitted model can be conducted.

We then evaluated the evidence for a spatially dependent random effect in the model by calculating Akaike’s information criterion (AIC) [44] for the fitted model and for an alternative in which the only random effect was an independent nugget component. The model with a correlated random effect will have at least a larger residual loglikelihood than the model with an independent random effect only, but it is also more complex, with an additional parameter. Akaike’s information criterion is given by

$$A = 2P - 2L,$$
(3)

where P is the number of parameters in the model and L is the log of the maximum likelihood. The criterion can be used to compare models for the same data with different numbers of parameters. The model is preferred for which A is smallest, and this is a pragmatic rule for selecting a model for prediction, as the alternative expected to be closest to the unknown model generating the data [45]. The use of AIC to select between alternative geostatistical models is recommended by standard texts (e.g. [23]. If this was the model with a correlated random effect, then this model was used to compute predicted values of the target variable on a grid of points across Ward 10, by the EBLUP. Two sets of predictions were computed, one assuming a homefield, and the other assuming that the target point was in an outfield. The EBLUP computation also returns a prediction error variance. Assuming normal prediction errors, one can use this to compute the probability that the true value at a target location falls above or below a threshold.

We then considered an additional fixed effect, the soil texture class. The significance of soil texture was tested by the log-likelihood ratio test described above. We then computed expected value of the soil fertility property for each texture class and for homefields or outfields. In the case where a spatial trend model had also been fitted we computed the expected value for a location at the mean latitude and at the 10th and 90th percentile of latitudes in the sample. The 95% confidence intervals for these expected values were also computed. By plotting these values it is possible to visualize the variation of the soil fertility properties associated with spatial trend, soil texture and management (outfields or homefields).

3 Results

3.1 Descriptive statistics

Table 1 summarizes descriptive statistics for soil physicochemical properties of 250 soil samples collected from Ward 10 in Hurungwe District. Soil pH ranged from 4.0 to 6.9, SOC ranged from 0.2 to 2.02%, mineral N ranged from 7 to 120 mg kg−1, soil available P ranged from 1 to 230 mg kg−1 and exchangeable K ranged from 0.08 to 1.76 meq/100 g. Data for organic carbon, available P and exchangeable K were transformed to logarithms because octile skewness was larger than 0.2 (although note that K remained quite strongly skewed even after transformation with an octile skewness value of 0.26). Soil variability data in Table 1 indicate moderate variability for all soil fertility properties measured, as suggested by Wang et al. [46]. A variable is considered weakly variable, moderately variable or strongly variable when the coefficient of variance (CV) % is less than 10%, between 10 and 100%, and above 100%, respectively.

Table 1 Summary statistics of selected soil fertility properties in ward 10 of Hurungwe district, Zimbabwe

Table 2 shows the Pearson’s correlation coefficients among the five variables. The results indicated a significant (p < 0.001) and weak positive correlations between soil pH and organic C (r = 0.35), soil pH and exchangeable K (r = 0.39), organic C and exchangeable K (r = 0.44) and available P and exchangeable K (r = 0.34). Moreover, there was a significant (p < 0.001) and moderate positive correlation between soil pH and available P (r = 0.56). Lastly the results indicated a significant (p < 0.01) and weak positive correlation between organic C and mineral N (r = 0.20) and organic C and available P (r = 0.24).

Table 2 Pearson correlation coefficient among selected soil fertility properties in ward 10 of Hurungwe district, Zimbabwe

3.2 Linear mixed model

The null model for soil pH included a north–south trend. Note (Table 3a) that the estimated coefficient for latitude is large compared to its standard error, so the evidence for this trend is strong, with pH increasing north to south. The effect of location (homefield or outfield) was strongly significant (p < 0.0001) with the soil of outfields on average 0.34 pH values smaller than homefields. There was also strong evidence for a difference between the soil texture classes, (Fig. 2a); sands were the most acid and sandy clay loams the least, the latter with a pH nearly 0.5 units larger on average than that for sands. Note that the spatial trend between north and south is of similar size to the texture class effect. However, both the spatial trend and the texture effect are significant in the model, so the trend is not accounted for purely in terms of variations in soil texture. There is potential variation of soil pH of over 1.5 units from sandy clay loam soils in a homefield in the south of the ward (expected to be in the Neutral class), to coarser textured outfields in the north of the ward (expected to be in the Strongly Acid class). Table 4 shows that, of models for soil pH with north–south trend and the location the only fixed effect, the model with spatially correlated random effects had the smallest AIC. The difference is very small but, nonetheless, shows that the larger likelihood for this model cannot be attributed just to its greater complexity. The small difference in the AIC is consistent with the small value for the correlated variance (an order of magnitude smaller than the nugget variance), that is to say the variance of soil pH around the north–south trend is dominated by uncorrelated noise. Under these circumstances the EBLUP, computed from all the data will be very close to the predictions based on the trend model only. However, given the smaller value of AIC, it is preferable to use the model which accounts for spatial dependence in the error terms as this avoids the risk of bias in the variance estimates (from treating the errors as independent). The prediction error variances that we have computed are reliable, and quantify the inevitable uncertainty in predictions from variable data. We computed the EBLUP values for soil pH on a fine grid on Ward 10 bounded by the minimum and maximum latitude in the set of observations. Figure 3 shows the predicted values, and Fig. 4 shows the prediction error variances. The probability that soil pH is 5 or less (strongly or very strongly acid) is shown in Fig. 5.

Table 3 Inferences about fixed effects and their estimates in linear mixed models fitted for each soil property; a pH, b organic carbon, c soil mineral nitrogen, d Available P, e Exchangeable K. Fixed effects are listed in the first column. The second two columns provide fixed effects coefficients (estimate) and their standard errors (SE) for those fixed effects in the null model (either a constant mean, or a constant with a trend with latitude). The next two columns consider the effect of adding location (homefield or outfield) as a fixed effect to the terms in the null model. The first three rows provide the log-likelihood ratio statistic, degrees of freedom and p-value for the null hypothesis of no effect of adding location. The remaining columns contain the fixed effects coefficients and their standard errors for the model with location added. Similarly, the final two columns of the table provide the statistics to test the null hypothesis of no difference between the soil textural classes, and the estimates and standard errors for the fixed effects coefficients of the linear mixed model with textural class added
Fig. 2
figure 2

Expected values of a pH; b mineral N; c available P and d exchangeable K with 95% confidence intervals. In each plot the values are in groups for different soil texture classes (S = sand, LS = loamy sand, SL = sandy loamy, SCL = sandy loam clay). The solid symbols are homefields and the open symbols are outfields. The homefields or outfields within any textural class are for the 10th percentile of latitude, mean latitude and 90th percentile of latitude from left to right (south to north). In the case of Figures; c and d the expected values are transformed back to the original scales of measurement and so are median-unbiased

Table 4 Estimates of the random effects parameters for spatial linear mixed models for soil properties with fixed effects including location (homefield or outfield) and latitude where the latter was included in the null model. The random effects parameters are a distance parameter ϕ; the uncorrelated “nugget” variance c0; and the correlated variance c1. Akaike’s information criterion, AIC, see Eq. (3), for the spatial model, and for a linear mixed model with the same fixed effects but the only random effect an identically and independently distributed error (non-spatial) are also presented
Fig. 3
figure 3

Maps showing the best linear unbiased prediction of soil pH across ward 10 of Hurungwe district based on the linear mixed model for that variable presented in Table 4. Mapped values are for a homefields and b outfields

Fig. 4
figure 4

Maps showing the prediction error variances for the predicted values of soil pH (shown in Fig. 3 for a homefields and b outfields across ward 10 of Hurungwe district

Fig. 5
figure 5

Maps showing the probability that soil pH < 5 for a homefields and b outfields across Ward 10 of Hurungwe district. These probabilities are from the conditional distribution of pH at these locations given the map for the variable presented in Fig. 3 and the soil data

There was no evidence to incorporate a trend in the case of soil organic carbon, but as for pH there is strong evidence for a difference between homefields and outfields, and between soil textural classes (Table 3b). As seen in Fig. 6, the effect of soil texture is particularly marked. The expected value for sandy clay loams falls in the Medium to High organic carbon category, whilst sandy soils are expected to have Very Low carbon content, particularly outfields. As seen in Table 4, there was evidence for spatial dependence in organic carbon content, and so the EBLUP predictions were mapped as for soil pH (see Figs. 7, 8, 9, 10) after back-transformation.

Fig. 6
figure 6

Expected values of soil organic carbon content backtransformed to the original scale of measurement with 95% confidence intervals. The values are in groups for different soil texture classes (S = sand, LS = loamy sand, SL = sandy loamy, SCL = sandy loam clay). The solid symbols are homefields and the open symbols are outfields

Fig. 7
figure 7

Maps showing the best linear unbiased prediction of soil OC across ward 10 of Hurungwe district based on the linear mixed model for that variable presented in Table 4. Mapped values are for a homefields and b outfields

Fig. 8
figure 8

Maps showing the prediction error variances for the predicted values of soil OC (shown in Fig. 7) for a homefields and b outfields across Ward 10 of Hurungwe district

Fig. 9
figure 9

Maps showing the probability that that soil OC < 0.75% for a homefields and b Outfields across ward 10 of Hurungwe district. These probabilities are from the conditional distribution of soil OC at these locations given the map for the variable presented in Fig. 7 and the soil data

Fig. 10
figure 10

Maps showing the probability that soil OC < 0.5% for a homefields and b outfields across ward 10 of Hurungwe district. These probabilities are from the conditional distribution of soil OC at these locations given the map for the variable presented in Fig. 7 and the soil data

For mineral nitrogen, available P and exchangeable K there was evidence for a north–south trend, for differences between homefields and outfields and for differences between soil texture classes. Again, the expected values (back-transformed to median-unbiased values on the plots in Fig. 2) show that differences due to spatial trends, soil texture and management are all of similar order, and span intervals which are significant in terms of judgements about nutrient deficiency. However, none of these nutrients showed evidence for a spatially dependent random effect, and so no attempt was made to produce a map (which would simply exhibit the north–south trend).

4 Discussion

Soil pH was significantly and positively correlated with soil OC. This could be due to the ability of soil OC and soil organic matter hydroxyl groups released during stabilization to buffer against acidity [47]. It is also possible that farmers are adding more input of plant residues to less acid soils which are better for crop production. However, our finding differs from Shi et al. [48] and Zhang et al. [49] who noted that soil OC content was significantly and negatively correlated with pH and concluded that acidification inhibits soil OC decomposition,thus, C loss from soils is insignificant. The divergence may be attributed to variations in soil type and climatic conditions. The significant and positive correlation between pH and exchangeable K and pH and available P could be related to the geochemical processes that influence reactivity and solubility of nutrients. Low pH result in reduction in CEC and deficiency of K and increase acidity-induced P-fixation [33, 50]. Furthermore organic C and mineral N and organic C and available P were positive correlated in accordance with the results reported by Metwally et al. [51]. This shows that soil OC is an important parameter of the soil which affects soil physical, chemical and biological properties influencing soil nutrients’ availability [52].

The pH data show quite a strong trend from north to south, relative to the sampled fields map (as shown in Fig. 3) and there is evidence for spatial dependence around this trend. The climate of the study area (for both the north and south) is the same but soil type changes significantly as we move from north to south [27, 30]. The northern part of ward 10 consists mostly of sand and loamy sands which are prone to leaching resulting in low pH. Soil texture does vary north to south, but that both the trend and texture are significant in the model so the north–south trend cannot be entirely due to differences in the sand content of the soil. The strong evidence for pH difference between the soil texture classes could be related to variations in buffering capacity and cation exchange capacity of the soil types [30]. Sandy soils with low pH buffering capacity tend to be most acid with sandy clay loams usually associated with an increased CEC having high soil pH [53]. Therefore, site specific management of soil pH for improved crop productivity is imperative in Ward 10 of Hurungwe district. Figure 5 shows that liming is urgently required in some parts (those with high probability that pH < 5) of Ward 10, but not all. The strong evidence for difference between homefields and outfields in terms of soil pH may be attributed to management of different field locations by smallholder farmers. Preferential addition of limited soil fertility amendments, such as organic material and lime to small areas around the homesteads at the expense of outfields is common in the smallholder sector [18, 54].

Soil OC is very variable spatially because it is affected by soil texture, vegetation, topography and management [55, 56]. The linear mixed model shows that there was strong evidence for a difference between home and outfields, and between soil textural classes (Table 3b) and spatial dependence in the random component of the model (Table 4). These findings suggest that soil OC was affected by both the natural processes occurring in the cropping environment and human activities. Lack of clay-induced physical protection of soil OC from microbes accelerates mineralization in sandy soils resulting in very low carbon content [57]. Soil OC concentrations are also smaller in outfields where limited organic resources are added. This variation in soil OC, related to both texture and management, will have implications for fertilizer requirement [58], and information on soil OC variation could therefore support more targeted fertilizer recommendations.

Mineral N, available P and exchangeable K showed no evidence for a spatially dependent random effect in Ward 10, but there are trends from north to south, and there are differences between textural classes and between homefields and outfields for all these nutrients (Fig. 2). The differences between homefields and outfields and between soil texture classes in terms of N, P and K can be attributed to variations in resource allocation, CEC, leaching, K retention on soil exchangeable complex and acid-induced P fixation of the soil [50, 59, 60]. This variation seems particularly significant in the case of P. For example, considering sandy loams, a homefield in the south may be expected to be high in P, whereas an outfield in the north is expected to be marginal to deficient. Potassium is also variable but it is acutely deficient or deficient everywhere, so the variation is of limited practical relevance.

Our results show that there is spatial variation in soil nutrient status, pH and organic carbon content across Ward 10. Some of this variation may have implications for optimal soil management, consider, as noted above the range of variation in soil P content, and also in pH. Not all variation is relevant, consider the example of soil K which varies with space, but which is inadequate for crop production across the ward. However, such intensive soil sampling as done in this study may not be feasible ward-by-ward to support agricultural extension. Furthermore, it is known that the uncertainty of field-scale estimates of soil properties may be such that field-specific recommendations, at least in the case of very variable fields under smallholder production, are not feasible, [61]. However, our results suggest that one might identify important variation by the analysis of aggregated soil samples across soil types, for broader regions within the ward and according to management history, and that, in collaboration with farmers who readily recognize heterogeneity in soil fertility and crop growth present in their fields [62], it should be possible better to match recommendations to soil conditions than would be achieved by uniform recommendation at the scale of the ward, or coarser.

5 Conclusion

The study showed that spatial variation in selected soil fertility properties namely soil pH, OC, mineral N, available P and exchangeable K, can be identified at Ward scale, and related to both basic properties of the soil and soil management. The soil fertility properties showed long-range variations, and, in some cases, spatially dependent variations about these trends. Field location, that is homefields and outfields, and soil texture contributed substantially to this variation. Generally homefields and sandy clay loam soil had more fertile conditions (larger nutrient and OC concentrations and higher pH) compared to the outfields and other soil texture classes respectively. This variation can be mapped using the linear mixed model, and represented in a GIS environment. This could provide a basis for improved recommendation on fertilizer rates, lime requirement and other management decisions. The importance of differences between soil texture classes and management could also provide a basis for more effective recommendations based on coordinated sampling across Wards rather than attempting to estimate properties within variable individual fields. Finally, we note that one must be cautious offering recommendations for N, P and K based on a single sample, as these properties are temporally variable, reflecting weather and management-driven processes. Soil information generalized appropriately, should be interpreted by farmers and their advisors, based on the farmer’s understanding of heterogeneity in soil fertility and crop growth present in their fields.