Modelling cropland expansion and its drivers in Trans Nzoia County, Kenya

Kipkulei, Harison Kiplagat; Bellingrath-Kimura, Sonoko Dorothea; Lana, Marcos; Ghazaryan, Gohar; Boitt, Mark; Sieber, Stefan

doi:10.1007/s40808-022-01475-7

Original Article
Open access
Published: 06 August 2022

Volume 8, pages 5761–5778, (2022)

Modeling Earth Systems and Environment

Harison Kiplagat Kipkulei ORCID: orcid.org/0000-0003-0643-2077^1,2,3,
Sonoko Dorothea Bellingrath-Kimura^1,2,
Marcos Lana^4,
Gohar Ghazaryan^1,
Mark Boitt^5 &
…
Stefan Sieber^1,2

Abstract

Population growth and increasing demand for agricultural production continue to drive global cropland expansion. This expansion leads to the overexploitation of fragile ecosystems, propagating land degradation and the loss of natural diversity. This study aimed to identify the factors driving land use/land cover changes (LULCCs) and subsequent cropland expansion in Trans Nzoia County in Kenya. Landsat images were used to characterize the temporal LULCCs over 30 years and to derive cropland expansions using change detection. Logistic regression (LR), boosted regression trees (BRTs), and evidence belief functions (EBFs) were used to model the potential drivers of cropland expansion. The candidate variables included proximity, biophysical, climatic, and socioeconomic factors. The results showed that croplands replaced other natural land covers, expanding by 38% between 1990 and 2020. The expansion in croplands has been at the expense of forestland, wetland, and grassland, which declined in coverage by 33%, 71%, and 50%, respectively. All the models identified elevation, proximity to rivers, and soil pH as the critical drivers of cropland expansion. Cropland expansions dominated areas bordering the Mt. Elgon forest and Cherangany hills ecosystems. The results further revealed that the logistic regression model achieved the highest accuracy, with an area under the curve (AUC) of 0.96. In contrast, the EBF and BRT models yielded AUC values of 0.86 and 0.77, respectively. The findings exemplify the relationships between different potential drivers of cropland expansion and contribute to developing appropriate strategies that balance food production and environmental conservation.

Introduction

Population growth and urbanization have pressured terrestrial landscapes, increasing land utilization to meet socioeconomic needs (Bowler et al. 2020; FAO 2017). As a result, agricultural production follows unsustainable practices that focus on enhancing the output per unit of land area. These practices may fail to achieve their intended purpose while continuing to degrade the environment, because food production and ecosystem functions are interdependent (Pellikka et al. 2013). The situation is expected to worsen, with a projected 2.5 billion people added to the global population by mid-century. Thus, the global demand for food will increase significantly, with adverse environmental effects (Tilman et al. 2011).

Globally, agricultural production demand is central to LULCCs on the Earth's surface. These changes involve transformations within and between various land uses. The most widespread form of LULCCs relates to cropland expansion. This transformation is often accompanied by losses in forestlands, grasslands, wetlands, and other features of ecological importance (Lark et al. 2020; Zeng et al. 2018). Empirical evidence suggests that human actions are central to LULCCs (Mwaniki and Möller 2015). These changes vary across diverse spatial scales and magnitudes based on underlying biophysical and climatic conditions. Globally, cropland expansion resulting from LULCCs has been associated with the growing population, poorly formulated government action plans, environmental influences, and technological advancements (Hassan et al. 2016; Jellason et al. 2021; Kindu et al. 2015; Nakalembe et al. 2017; Pham and Smith 2014; Winkler et al. 2021).

In developing regions, cropland expansion portrays similar patterns and trends. Underlying this fact are the common challenges faced by smallholder farmers, who are the primary players in the food production chain in these regions. The challenges stem from an interplay of production factors such as land, income, market access, and prevailing climate conditions (Giller et al. 2021). Unlike in developed regions, unsustainable land-use practices such as charcoal burning, illegal encroachments, overgrazing, and relaxed enforcement of the law are the prevalent drivers of forestland, grassland, and wetland losses (Baldyga et al. 2008; Ewane 2021; Mwangi et al. 2020; Nakalembe et al. 2017). Consequently, these losses induce massive cropland conversions that have severe implications for ecosystem service provision (Song and Deng 2017), hydrological balances (Baldyga et al. 2008), and food production (Hoque et al. 2020). Therefore, understanding LULCCs and their intrinsic drivers is a step towards developing tenable and coherent landscape practices that drive sustainable agricultural production (Kindu et al. 2015).

Regular and up-to-date information on land use dynamics and cropland expansion is required to formulate sound policies that foster sustainable human-environmental interactions. Moreover, information on the drivers of cropland expansion is paramount to offering precise and timely solutions to land-use decisions and regulatory measures. Remote sensing information combined with geospatial approaches provides the most feasible, cost-effective way of obtaining cropland expansion dynamics. The technology thus helps to address the issue of data limitation, especially in the data-sparse environments in developing regions. Kenya has experienced rapid conversions of natural ecosystems to croplands in the recent past (Bullock et al. 2021). The expansions challenge the ecosystems' provision capacities and expose the land to degradation, soil erosion, and biodiversity loss (Mulinge et al. 2016). The expansion in croplands is gradual in high potential agricultural production zones (Kogo et al. 2021). Consequently, it poses a threat to the sustainability of agricultural production, given that only 12% of Kenya's land mass falls under the high potential zones for production (Kabubo and Karanja 2007). However, the effects of various drivers on cropland expansion in these zones remain uncertain, and comprehensive analysis has been lacking to date.

In recent years, various modelling approaches involving qualitative and quantitative data analysis have gained prominence in assessing drivers of cropland expansion. These approaches integrate remote sensing information and geospatial analysis, allowing explicit assessments of LULCCs. Some studies used linear and spatial regression to assess the drivers of deforestation and agricultural expansion. For example, de Espindola et al. (2021) combined satellite information and variables related to proximity, land management, technological resources, and the environment to assess drivers of LULCCs in the Amazon basin. Mwangi et al. (2020) combined boosted regression trees and geographically weighted regression to determine the significance and model the spatial influence of the drivers of LULCCs in Central Kenya. Similarly, in Kenya, Were et al. (2014) employed a logistic regression approach to uncover the drivers of LULCCs in Kenya-Afromontane forest environments. Other studies have utilized machine learning approaches such as random forest (RF) classification to determine and evaluate the importance of various drivers of LULCCs in the northeastern United States of America (Zhai et al. 2020). Still others combined qualitative and quantitative data analysis, such as Kindu et al. (2015), who evaluated drivers of LULCCs in Ethiopia. Moreover, Munthali et al. (2019) combined qualitative data analysis and geographic information systems (GIS)-based processing to assess the drivers of LULCCs in Malawi.

The reviewed studies modelled observed LULCCs derived through the analysis of remotely sensed imagery as a function of socioeconomic and biophysical attributes of the landscape. Subsequently, they linked the geographical distribution of land-use transitions to ancillary data to establish the significant drivers and uncover the underlying reasons for the observed patterns. Although these approaches have been applied successfully in LULCC studies, the use of evidence belief functions to assess drivers of LULCCs remains limited. Furthermore, applying several approaches together draws on the inherent strengths of each. Therefore, this study combined logistic regression (LR), boosted regression trees (BRTs), and evidence belief functions (EBFs) to assess the drivers of cropland expansion in Trans Nzoia County. Campbell et al. (2005) concluded that complexities in LULCC processes, especially their linkages with social, ecological, economic, and institutional contexts, require multiple approaches to disentangle the drivers of LULCCs.

The present study thus complements the literature in the following ways. First, three modelling techniques were applied to assess the accuracies of cropland expansions and the underlying processes. Second, spatial prediction was conducted to depict varying probabilities of cropland expansion across the study area. Finally, spatially modelled raster surfaces were used to enhance the definition of proximity variables by combining cost functions and linear network analysis in a geospatial environment. This yields a more realistic measure of proximity than the Euclidean and buffering approaches common in past studies (Sarkar and Chouhan 2020). Thus, this study aimed to achieve the following objectives:

1. To assess LULC changes in Trans Nzoia County and their contributions to cropland expansions.
2. To analyse the key drivers of cropland expansion using LR, BRTs, and EBFs.
3. To assess the approaches for usability and for the quality and accuracy of their predictions of cropland expansion at a county scale.

Materials and methods

Study area

This study was conducted in Trans Nzoia County, situated in the western part of Kenya and bordering Uganda to the west (Fig. 1). Agriculture is the main economic activity, characterized by both small- and large-scale farming. Small-scale farmers cultivate crops such as maize, beans, potatoes, and sorghum, while large-scale farmers focus on producing wheat, tea, and sugarcane (Mwaura and Kenduiywo 2021). Livestock keeping, poultry rearing, fishing, and apiculture are practised for subsistence and commercial purposes. Climatically, the county exhibits a bimodal rainfall pattern. The long and short rainy seasons occur between March and May and between October and December, respectively. The average annual precipitation is approximately 1300 mm, while the mean annual minimum and maximum temperatures are 12 °C and 26 °C, respectively (Nyberg et al. 2020). The county hosts the Mt. Elgon and Cherangany forest ecosystems, part of Kenya's prominent water towers (Langat 2018). These ecosystems are catchments for the Nzoia and Suam rivers, which drain into Lakes Victoria and Turkana, respectively. The county population is approximately 990,000 people, according to the 2019 Kenya population and housing census (KNBS 2019). Trans Nzoia County was selected for this study due to its leading role as the country's central food basket. In addition, recent substantial LULCCs in the region pose a serious challenge to food security and environmental sustainability.

Data and data sources

This study used various datasets generated through primary and secondary data surveys, including archived remote sensing images, existing GIS databases, field observations, and discussions with land-use experts. The primary data collection was conducted between May and September 2021. The data collected during this period include ground observations used in the training and validation process of the RS image classification. In addition, land management experts from the Trans Nzoia County lands department provided information about land-based transformations and potential drivers.

The third set of data was sourced from secondary databases. The obtained variables include soil physical and chemical properties, precipitation, temperature, population density, accessibility to water sources, distance to major roads, and proximity to major trading centres. Soil data were obtained from the International Soil Reference and Information Centre (ISRIC), https://www.isric.org/explore/isric-soil-data-hub. Precipitation and temperature variables were sourced from the climatology laboratory of the University of California, https://www.climatologylab.org/terraclimate.html. Population density data were obtained from the Gridded Population of the World (GPW) Version 4 of the Socioeconomic Data and Applications Centre, https://sedac.ciesin.columbia.edu/data/collection/gpw-v4/sets/browse. Road network data were obtained from the Humanitarian Data Exchange of the United Nations, https://data.humdata.org/dataset/kenya-roads. Rivers data were sourced from the World Resources Institute (WRI), https://datasets.wri.org/dataset/permanent-and-non-permanent-rivers-in-kenya. Market centres data were obtained from the Trans Nzoia County Department of Finance and Economic Planning, https://www.transnzoia.go.ke/. The proximity to roads, market centres, and rivers was modelled into raster surfaces using ArcGIS's cost distance functions. The detailed procedure for preparing the proximity variables and the raster maps of all the variables used in this study are available in the online resource of this article (ESM_1).
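To illustrate what a cost distance surface represents, the following is a minimal sketch rather than the ArcGIS procedure used in the study. It accumulates the least travel cost from a set of source cells (for example, road cells) over a friction raster using Dijkstra's algorithm. The function name `cost_distance`, the 4-neighbour movement rule, the unit cell size, and the friction values are all assumptions made for illustration; the step cost is taken as the mean friction of the two cells involved.

```python
import heapq

def cost_distance(cost, sources):
    """Least accumulated cost from any source cell (4-neighbour moves only)."""
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed step cost: mean friction of the two cells, unit cell size.
                nd = d + (cost[r][c] + cost[nr][nc]) / 2
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist

# Hypothetical friction raster: the centre cell is expensive to cross.
friction = [
    [1, 1, 1],
    [1, 9, 1],
    [1, 1, 1],
]
surface = cost_distance(friction, sources=[(0, 0)])
for row in surface:
    print(row)
```

Unlike a Euclidean buffer, the resulting surface routes around the high-friction cell, which is the behaviour that motivates cost-based proximity measures.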

The RS satellite images were acquired from different Landsat sensors, including thematic mapper (TM), enhanced thematic mapper (ETM+), and operational land imager/thermal infrared sensor (OLI/TIRS). The data was processed within the Google Earth Engine (GEE) platform, but the individual scenes are available from https://glovis.usgs.gov/. Three sets of Landsat images from 1990, 2005, and 2020 provided the data for mapping LULCCs. The spectral bands used in this study include the blue, red, NIR, and SWIR bands, with spatial resolutions of 30 m. The processed images were acquired during a relatively dry season between November and March of the succeeding year. The period allowed for the best comparison assessments across the time epochs, as the phenologies of the land features appear relatively similar. Table 1 outlines the sources, descriptions, and purposes of both the primary and secondary data used in modelling.

Table 1 Overview of datasets used, sources, and their purpose in the study

Image processing and LULC classification

The study used the GEE cloud computing environment to process the Landsat images and generate a time series of land cover maps for three epochs: 1990, 2005, and 2020. The platform permits large-scale data computing, thus minimising the tedious data downloading and storage requirements (Gorelick et al. 2017). Level 2 surface reflectance products were derived for the three epochs. These multitemporal products are already corrected radiometrically and geometrically, as well as for the atmospheric effects of absorbing and scattering gases and aerosols. The cloud cover threshold was set at 20% to minimize the effect of clouds on the images. Any clouds present in the selected images were masked and replaced with pixels from images in the Landsat archive acquired within 60 days of the acquisition date. The cloud score algorithm, which computes a cloud-likelihood score from 0 (no clouds) to 100 (most cloudy) based on a Landsat quality image file, was used to mask pixels with high cloud cover. Finally, the normalised difference vegetation index (NDVI) and the red, blue, NIR, and SWIR bands were used as input features for the land use and land cover (LULC) classifications.
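The normalised difference vegetation index used as a classification feature is computed per pixel from the NIR and red bands as (NIR − Red) / (NIR + Red). The short sketch below shows the arithmetic only; the reflectance values are hypothetical and the zero-denominator convention is an assumption, not something specified in the study.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention for no-data pixels
    return (nir - red) / total

# Hypothetical surface reflectance pairs (NIR, red), not taken from the study.
pixels = {"dense vegetation": (0.45, 0.05), "bare soil": (0.30, 0.25)}
for name, (nir, red) in pixels.items():
    print(name, round(ndvi(nir, red), 3))
```

Vegetated pixels give values close to 1, while bare or built-up surfaces give values near 0, which is why the index helps separate cropland and forest from other classes.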

Supervised classification was performed using the random forest (RF) classifier (Breiman 2001) to map various LULC classes in the region. RF is a nonparametric ensemble learning algorithm. It solves classification problems by building multiple decision trees from the training datasets and assigning each pixel the class that receives the most votes across the trees. The classifier is robust, achieves high accuracy, and handles outliers and noisy datasets more effectively than other image classifiers (Belgiu and Drăguţ 2016). In this study, the number of decision trees was set at 500 to achieve a good balance between classification speed and accuracy (Belgiu and Drăguţ 2016). The default value was used for variablesPerSplit (√(n_bands)), and the fraction of the input to bag per tree was set to 0.5.
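The voting step of the classifier can be sketched in a few lines. This is only the aggregation rule, not a random forest implementation; the helper name `majority_vote` and the vote counts for the 500 trees are hypothetical.

```python
from collections import Counter

def majority_vote(tree_votes):
    """Return the class chosen by most trees and its share of the votes."""
    counts = Counter(tree_votes)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(tree_votes)

# Hypothetical votes from 500 trees for a single pixel.
votes = ["cropland"] * 310 + ["grassland"] * 150 + ["forestland"] * 40
label, share = majority_vote(votes)
print(label, share)
```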

The classification was conducted based on existing LULC classes in the study area, which include croplands, forestlands, wetlands, grasslands, built-up areas, and other lands. The other land category comprises barren lands, unclassified areas, and other exposed surfaces that do not fall in the former LULC categories. The training and validation datasets were collected from field surveys, existing topographical maps, documented historical land use plans, the local knowledge of the two authors, and visual interpretation of high-resolution imagery derived from Google Earth. The period of study determined the training and validation data used. For instance, historical information was used for the 1990 and 2005 image classifications, whereas ground survey data and updated county spatial plans were used for the 2020 image classification. The number of sample points in 1990, 2005, and 2020 was 975, 1105, and 1544, respectively. The samples were split into training samples for training the RF classifier (70%) and verification samples for accuracy verification (30%).

Accuracy assessment

Accuracy assessment is an integral part of digital image processing, as it reveals the quality and reliability of the classified images. The accuracy of the LULC maps was assessed based on a confusion matrix using an independent validation set of ground-based data. Accuracy assessment metrics, such as producer accuracy, user accuracy, and overall accuracy, were used to evaluate the overall classification process (Congalton 1991). Producer accuracy indicates how often ground features are correctly shown on the classified map. In contrast, user accuracy reveals how often a class on the classified map is actually present on the ground. The overall accuracy provides the percentage of correctly classified pixels across all class types. The recommendations of Olofsson et al. (2014) were adopted for accuracy assessment. The method outlines good practices for area estimation and accuracy assessment of RS image classification. In addition, it provides guidelines for proper reference sample selection and precise allocation of different class strata to achieve the desired samples.
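The three metrics follow directly from the confusion matrix. The sketch below assumes rows hold the classified (map) labels and columns the reference labels; the matrix values and the function name `accuracy_metrics` are hypothetical and serve only to show the arithmetic.

```python
def accuracy_metrics(matrix, classes):
    """Overall, user's, and producer's accuracy from a confusion matrix.

    Assumed layout: rows = classified map, columns = reference data.
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    users, producers = {}, {}
    for i, name in enumerate(classes):
        row_total = sum(matrix[i])                       # pixels mapped as this class
        col_total = sum(matrix[r][i] for r in range(n))  # reference pixels of this class
        users[name] = matrix[i][i] / row_total
        producers[name] = matrix[i][i] / col_total
    return correct / total, users, producers

classes = ["cropland", "forestland", "grassland"]
# Hypothetical validation counts, not the study's results.
matrix = [
    [85, 5, 10],
    [3, 42, 5],
    [6, 2, 42],
]
overall, users, producers = accuracy_metrics(matrix, classes)
print(round(overall, 3))
print({k: round(v, 3) for k, v in users.items()})
print({k: round(v, 3) for k, v in producers.items()})
```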

Change analysis of land cover maps

Change analysis is a post-classification procedure that detects and quantifies changes in independently produced LULC classifications for different dates. The method provides transitions between land covers, quantifies the land cover changes, and presents information on the distribution of changes in the landscape. In this study, the analysis of changes and their distributions was used to derive binary maps of areas that were converted to croplands in the two time periods. Subsequently, they were used to assess the potential drivers of cropland expansion.
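A minimal sketch of post-classification change detection is given here, assuming two co-registered class grids of equal size. It derives the binary cropland-conversion map and a tally of class transitions. The single-letter class codes, the function names, and the tiny grids are hypothetical.

```python
from collections import Counter

def cropland_conversion(before, after, crop="C"):
    """Binary map: 1 where a non-cropland pixel became cropland, else 0."""
    return [
        [1 if a == crop and b != crop else 0 for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]

def transitions(before, after):
    """Count pixels for every (from_class, to_class) pair."""
    return Counter(
        (b, a)
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    )

# Hypothetical classified grids: C cropland, F forest, G grassland, W wetland.
lulc_1990 = [["G", "F", "C"], ["W", "G", "C"]]
lulc_2005 = [["C", "F", "C"], ["C", "G", "G"]]

converted = cropland_conversion(lulc_1990, lulc_2005)
matrix = transitions(lulc_1990, lulc_2005)
print(converted)
print(matrix[("G", "C")], matrix[("W", "C")])
```

Pixels that were already cropland at the earlier date are coded 0, so the binary map isolates expansion rather than persistence.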

Modelling cropland expansion

Logistic regression

LR is a regression technique that assesses the relationship between a categorical (here binary) dependent variable and a set of continuous or categorical independent variables (Peng et al. 2002). LR involves logit transformation of the dependent variable. The model has the following form:

$$\mathrm{Logit }\,Y=\mathrm{ ln}\left (\frac{\uppi (x)}{1-\uppi (x)}\right)=\alpha +\sum_{i=1}^{n}{\beta }_{i}{x}_{i}+\varepsilon$$

(1)

$$\uppi (x)=P\left(Y=1|X={x}_{1}\dots {x}_{n}\right)=\frac{{\mathrm{e}}^{\mathrm{\alpha }+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+\dots +{\beta }_{n}{x}_{n}}}{1+{\mathrm{e}}^{\mathrm{\alpha }+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+\dots +{\beta }_{n}{x}_{n}}}$$

(2)

where π (x) is the probability of the outcome of interest, α is the y-intercept, β represents the regression coefficients, ε corresponds to the model error term, and x represents a set of explanatory variables. The antilog of Eq. 1 yields Eq. 2, which predicts the probability of the occurrence of the outcome of interest. The parameters α and β are estimated using the maximum likelihood (ML) method. LR is ideal for handling dichotomous outcomes and can be applied in instances of nonnormality of the dependent variable.
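The relationship between Eq. 1 and Eq. 2 can be checked numerically. In this sketch the intercept, coefficients, and predictor values are hypothetical, and the error term is omitted; the point is only that applying the logit to the predicted probability recovers the linear predictor.

```python
import math

def probability(alpha, betas, xs):
    """Eq. 2: probability of conversion given the linear predictor."""
    z = alpha + sum(b * x for b, x in zip(betas, xs))
    return math.exp(z) / (1 + math.exp(z))

def logit(p):
    """Eq. 1 (left-hand side): log-odds of a probability."""
    return math.log(p / (1 - p))

# Hypothetical coefficients and standardised predictor values.
alpha, betas, xs = -1.0, [0.8, -0.5], [2.0, 1.0]
p = probability(alpha, betas, xs)
print(round(p, 4), round(logit(p), 4))
```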

Cropland conversion maps were created and used to define the binary outcome. Converted areas were coded as 1, whereas nonconverted zones were coded as 0. The potential drivers of cropland conversion included proximity to rivers, population density, soil type, precipitation, soil organic carbon, time to the nearest urban centres, time to the nearest major road, elevation, soil pH, and slope (Fig. 2). The spatial structure of the drivers was assessed using a semivariogram and interpolation conducted in ArcGIS version 10.8.1. A total of 5,000 random points were generated in the ArcGIS environment and used as representative samples to model the relationship between cropland expansion and the potential drivers. Spatial dependency effects were minimized by maintaining a minimum distance of 200 m between each sample pair.
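Random sampling under a minimum-separation constraint can be sketched with simple rejection sampling. This is an illustration, not the ArcGIS tool used in the study: the rectangular 10 km × 10 km extent, the sample size of 50, the seed, and the function name are assumptions, and coordinates are treated as projected metres.

```python
import math
import random

def sample_points(n, width, height, min_dist, seed=0, max_tries=100000):
    """Draw up to n random points that are at least min_dist apart."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(0, width), rng.uniform(0, height))
        # Reject any candidate that falls too close to an accepted point.
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

# Hypothetical extent in metres; the 200 m spacing follows the text.
points = sample_points(n=50, width=10_000, height=10_000, min_dist=200)
print(len(points))
```

The pairwise check is quadratic in the number of points, which is acceptable for a sketch; a spatial index would be preferable for samples of several thousand points.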

The selected points formed the basis for extracting the explanatory variables from the corresponding interpolated surfaces. The data were converted into ASCII format for import into R. Subsequently, LR was implemented using the generalised linear model function (R Core Team 2020). Eighty percent of the samples were used for training the model, and 20% were used for model validation. First, a full model was fitted, followed by a multicollinearity assessment among the predictors using the variance inflation factor (VIF) function in the companion to applied regression (car) package in R (Fox et al. 2019). Stepwise regression using the backward selection procedure was then used to select the statistically significant variables at a 5% significance level. The technique starts from the full model and iteratively drops the predictors that contribute least to the outcome variable. The model with the lowest Akaike information criterion (AIC) was selected as the parsimonious model for generating probability surfaces of cropland expansion.
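The AIC used for model selection is defined as 2k − 2 ln L, where k is the number of estimated parameters and L the maximised likelihood. The sketch below computes it for a binary outcome from fitted probabilities. The observations, the fitted probabilities, and the parameter counts are hypothetical, and the fitting itself is not shown.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of observed 0/1 outcomes given probabilities."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi)
               for yi, pi in zip(y, p))

def aic(y, p, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood(y, p)

# Hypothetical outcomes and fitted probabilities from two candidate models.
y = [1, 0, 1, 1, 0, 0]
p_full = [0.85, 0.20, 0.75, 0.80, 0.30, 0.15]      # assumed 4 parameters
p_reduced = [0.80, 0.25, 0.70, 0.80, 0.30, 0.20]   # assumed 3 parameters
print(round(aic(y, p_full, 4), 3), round(aic(y, p_reduced, 3), 3))
```

A dropped predictor lowers the AIC whenever the loss in log-likelihood is smaller than the penalty of one parameter, which is the trade-off that backward selection exploits.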

Boosted regression tree modelling

Boosted regression tree (BRT) modelling is an ensemble method that combines regression trees and boosting algorithms to generate nonparametric statistical models (Schapire 2003). In contrast to conventional statistical models, it fits multiple models to improve prediction accuracy. The rationale is that fitting multiple trees from several approximate rules and combining them is easier than obtaining a single highly predictive model. The strengths of the BRT model include the potential for handling missing data, accommodation of different types of predictor variables, and robust modelling of nonlinear interactions between variables (Elith et al. 2008). The BRT model was implemented using the generalised boosted models (gbm) package in R statistical software (Ridgeway 2005). The parameters specified for the model include a bagging fraction of 0.5, as recommended by Elith et al. (2008), a tree complexity of 5, and a learning rate of 0.005. The bagging fraction sets the proportion of the training data randomly drawn to fit each tree, tree complexity controls whether interactions are fitted, and the learning rate determines the contribution of each tree to the growing model. The sample points used for training and evaluating the LR were also used in the BRT modelling.
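The roles of the learning rate and the bagging fraction can be illustrated with a deliberately simplified booster. This sketch is not the gbm implementation: it uses single-split stumps (tree complexity 1 rather than 5), a squared-error loss instead of a Bernoulli deviance, one predictor, and a toy step-shaped dataset. All names and parameter values in the code are assumptions.

```python
import random

def fit_stump(xs, residuals):
    """Least-squares stump: the single split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_trees=200, learning_rate=0.05, bag_fraction=0.5, seed=1):
    """Stagewise boosting: each stump is fitted to residuals on a random bag."""
    rng = random.Random(seed)
    prediction = [sum(ys) / len(ys)] * len(xs)  # start from the mean response
    bag_size = max(2, int(bag_fraction * len(xs)))
    for _ in range(n_trees):
        bag = rng.sample(range(len(xs)), bag_size)
        bag_x = [xs[i] for i in bag]
        bag_res = [ys[i] - prediction[i] for i in bag]
        threshold, left_mean, right_mean = fit_stump(bag_x, bag_res)
        # The learning rate shrinks the contribution of every new tree.
        prediction = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                      for p, x in zip(prediction, xs)]
    return prediction

def mse(ys, prediction):
    return sum((y - p) ** 2 for y, p in zip(ys, prediction)) / len(ys)

# Toy data: the response switches from 0 to 1 halfway along the predictor.
xs = list(range(20))
ys = [0.0 if x < 10 else 1.0 for x in xs]
mse_before = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_after = mse(ys, boost(xs, ys))
print(mse_before, mse_after < mse_before)
```

A smaller learning rate requires more trees to reach the same fit, which is why slow rates such as the 0.005 quoted above are normally paired with a large number of trees.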

Evidential belief function

The evidence belief function (EBF) model is founded on the Dempster–Shafer theory of belief (Dempster 1968). It is a data-driven approach that computes mass functions of belief, disbelief, plausibility, and uncertainty using spatial occurrences of phenomena on the Earth's surface (Park 2011). The concept behind estimating these functions is that the locations of geographical phenomena caused by diverse earth processes can be utilized to determine the probabilities of confounding variables. Accordingly, the confounding factors are categorised into several class groups, which are then used to document the various EBF functions. The EBF model is ideally suited for assessing spatial integration processes such as LULC changes (Arasteh et al. 2019). Accordingly, an evidential map layer of the geographical phenomenon is required to compute the various functions.

Based on the functions, high belief values indicate a high likelihood of a factor contributing to an event within a class category, whereas high disbelief values indicate a lower chance. Therefore, computations of the belief and disbelief functions integrate the total number of unit cells or pixels within a class category, the number of unit cells of the evidential map layer within the class category, and the total number of unit cells in the exploration area. Equations 3–6 were used to compute the belief and disbelief values, where F_ij represents i confounding factors (drivers) with j class categories. N (F_ij) represents the total number of unit cells in class j, whereas N (F_ij ∩ A) is the number of unit cells in class j that were converted to cropland. N (A) and N (T) indicate the total number of unit cells converted to cropland and the total number of unit cells in the exploration area, respectively.

$${\mathrm{Bel }}_{{F}_{ij}}=\frac{{W}_{{F}_{ij(\mathrm{Converted \, pixels})}}}{\sum_{j=1}^{n}{W}_{{F}_{ij} (\mathrm{Converted\, pixels})}}$$

(3)

$${W}_{{F}_{ij} (\mathrm{Converted \, pixels})}= \frac{N\left({F}_{ij}\cap A\right)/N\left({F}_{ij}\right)}{\left[N\left(A\right)-N\left({F}_{ij}\cap A\right)\right]/\left[N\left(T\right)-N\left({F}_{ij}\right)\right]}$$

(4)

The numerator and the denominator in Eq. 4 correspond to the proportion of unit cells in class j that were converted to croplands and the proportion of unit cells outside class j that were converted to croplands, respectively.

$${\mathrm{DIS }}_{{F}_{ij}}=\frac{{W}_{{F}_{ij(\mathrm{Non}-\mathrm{converted \, pixels})}}}{\sum_{j=1}^{n}{W}_{{F}_{ij} (\mathrm{Non}-\mathrm{converted \, pixels})}}$$

(5)

$${W}_{{F}_{ij} (\mathrm{Non}-\mathrm{converted \, pixels})}= \frac{\left[N\left({F}_{ij}\right)-N\left({F}_{ij}\cap A\right)\right]/N\left({F}_{ij}\right)}{\left[N\left(T\right)-N\left(A\right)-N\left({F}_{ij}\right)+N\left({F}_{ij}\cap A\right)\right]/\left[N\left(T\right)-N\left({F}_{ij}\right)\right]}$$

(6)
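The belief and disbelief calculations can be worked through for a single driver with a handful of classes. In the sketch below the cell counts are hypothetical. The disbelief weight divides the share of non-converted cells in the class by the share of non-converted cells outside it, taken over N(T) − N(F_ij) to mirror the denominator of Eq. 4; treat that reading as an assumption of the sketch.

```python
def evidential_belief(class_cells, class_converted):
    """Belief and disbelief per class for one driver.

    class_cells[j]     = N(F_ij): unit cells in class j
    class_converted[j] = N(F_ij ∩ A): cells in class j converted to cropland
    Assumes no class contains all cells or all converted cells
    (otherwise a denominator becomes zero).
    """
    n_t = sum(class_cells)       # N(T)
    n_a = sum(class_converted)   # N(A)
    w_conv, w_non = [], []
    for n_f, n_fa in zip(class_cells, class_converted):
        outside = n_t - n_f
        w_conv.append((n_fa / n_f) / ((n_a - n_fa) / outside))
        w_non.append(((n_f - n_fa) / n_f)
                     / ((n_t - n_a - n_f + n_fa) / outside))
    belief = [w / sum(w_conv) for w in w_conv]
    disbelief = [w / sum(w_non) for w in w_non]
    return belief, disbelief

# Hypothetical counts for three elevation classes (not the study's data).
cells = [400, 300, 300]
converted = [200, 60, 40]
belief, disbelief = evidential_belief(cells, converted)
print([round(b, 3) for b in belief])
print([round(d, 3) for d in disbelief])
```

The class with the highest conversion rate receives the highest belief and the lowest disbelief, which matches the interpretation given in the text.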

Evaluation of the BRT, LR, and EBF models

The models were evaluated using the area under the curve (AUC) of the receiver-operating characteristic (ROC) computed across classification thresholds. The ROC curve is a plot of sensitivity against 1 − specificity (Mas et al. 2013). Sensitivity gives the proportion of the positive class that was correctly classified, while specificity indicates the proportion of the negative class that was correctly classified. The AUC ranges from 0 to 1. A perfect model yields an AUC value of 1, which indicates an exact agreement between the predicted values and the observations. The ROC analysis was implemented using the pROC package in R (Robin et al. 2011).
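The AUC can also be read as the probability that a randomly chosen converted cell receives a higher predicted score than a randomly chosen non-converted cell. The sketch below computes it through that pairwise comparison, which is equivalent to integrating the ROC curve; it is not the pROC routine, and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical validation labels and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
model_auc = auc(labels, scores)
print(round(model_auc, 3))
```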

Results

Land use and land cover changes in Trans Nzoia County

The land cover maps for Trans Nzoia County reveal changing land use and land cover dynamics. Across the study area, the dominant land cover class in the studied epochs was cropland. The initial coverage of cropland in 1990 was 33% of the study area, and the coverage further increased to 66% and 72% in 2005 and 2020, respectively. The changes demonstrate a rapid expansion of croplands between 1990 and 2005, followed by a slower but continued expansion until 2020. Over the period, the area under croplands grew at the cost of forestlands, wetlands, and grasslands. Spatially, the land cover distribution indicates that croplands occupied the central areas of the county, whereas forestland and grassland classes dominated the western and northeastern parts, respectively (Fig. 3).

Built-up areas recorded positive growth over the study period. The coverage in 1990 was 2.4 km², which translates to approximately 0.1% of the total land area. During the period, few pockets of built-up zones were evident in the main town, located at the centre of the county. However, the coverage increased to approximately 10 km² by 2005, with most expansions occurring in the surrounding areas of the main town. Additionally, some sections of the western and northern parts of the county experienced a notable increase in built-up coverage. The highest coverage of built-up areas was recorded in 2020, with an approximate area of 36 km². In this recent period, built-up areas expanded exponentially and extended along the main transport corridors through the county. Although the built-up area class accounted for the smallest proportion of the total land area, the findings of this study show that it experienced the highest growth in the study period.

Forestland declined over the 30 years. In 1990, the area under forest cover was approximately 500 km². However, the coverage was reduced to 447 km² and 335 km² in 2005 and 2020, respectively. The decline in forest cover was higher (−25%) in the 2005–2020 period than in the 1990–2005 period (−11%). The reduction was more pronounced in the forested areas of Mt. Elgon and Cherangany hills. The land cover maps further indicate that tree cover vegetation along river courses and creeks was drastically removed. Similarly, planted forest patches were gradually cleared. The effect is evident from the 2020 LULC map (Fig. 3c). Likewise, wetlands exhibited a continuous decline over the 30 years. The study area's wetlands comprise permanent streams, open water, seasonal and permanent marshes, riverine vegetation, scrub, forested wetlands, and seasonal flood plains. In 1990, wetlands occupied 270 km². However, the coverage declined to 230 km² and 73 km² in 2005 and 2020, respectively. The decline was higher in the 2005–2020 period (−68.2%) than in the 1990–2005 period (−15%).
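The period-wise rates quoted for forestland and wetlands follow from the reported areas as (new − old) / old × 100. The sketch below reproduces that arithmetic from the areas given in the text; the results agree with the quoted percentages to within rounding.

```python
def percent_change(old, new):
    """Relative change between two areas, in percent."""
    return (new - old) / old * 100

# Areas in km² as reported in the text.
areas = {
    "forestland": {1990: 500, 2005: 447, 2020: 335},
    "wetland": {1990: 270, 2005: 230, 2020: 73},
}
for name, a in areas.items():
    first = percent_change(a[1990], a[2005])
    second = percent_change(a[2005], a[2020])
    print(name, round(first, 1), round(second, 1))
```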

Grasslands covered 19% of the total area in 1990 and declined by 16% in 2005. In 2020, there was a slight increase of 1% in grassland cover. Grasslands dominated the western region of the county and the periphery of the Cherangany hills forest in 1990. The LULC dynamics show that extensive grassland areas were rapidly transformed into croplands in the 1990–2005 period. Approximately 72% of grassland cover was converted to croplands during this period (Fig. 3a,b).