1 Introduction

Urbanization, characterized by rapid population growth, concentration, and urban expansion, has significantly transformed land use patterns and disrupted local ecosystems (Alphan 2003; López et al. 2001; Sato and Yamamoto 2005; Turner 1994). As cities expand, the delicate balance between human activities and ecological functioning becomes crucial for sustainable development (Jiyuan et al. 2005). In developing countries, the consequences of urbanization are particularly pronounced. Slum development, population surges, environmental degradation, and unemployment have become prevalent (Arimah 2010; Tariq et al. 2021c). Remote sensing and GIS techniques have emerged as powerful tools for monitoring and analyzing spatial patterns over time (Liu et al. 2021; Tariq et al. 2021d). Since the 1970s, satellite imagery has been used to map and quantify urban expansion, with early studies utilizing Landsat data to detect land use changes in metropolitan areas (Howarth and Boasson 1983; Jensen and Toll 1982). These initial efforts laid the groundwork for more sophisticated urban remote-sensing applications in subsequent decades. As computational capabilities advanced, researchers began developing simulation models to predict future urban growth scenarios. Machine learning (ML) algorithms have emerged as valuable tools for handling these complex tasks (Guan et al. 2011). Cellular automata (CA) models gained popularity in the 1980s and 1990s as a method for simulating complex urban systems (Batty et al. 1999; Herold et al. 2005). The integration of artificial neural networks (ANN) with CA models further enhanced their predictive power by incorporating machine learning techniques (Li and Yeh 2001).

Researchers have developed numerical models to predict urban growth. These models are essential for assessing the impact of urbanization, managing land transformation, optimizing land use, and implementing policies in rapidly growing cities (Jiyuan et al. 2005; Li and Yeh 2002). Scholars have employed Markov chains (Selmy et al. 2023), cellular automata-Markov (CA-Markov) models (Aryal et al. 2023), logistic regression (Pande et al. 2024), SLEUTH models (Hadi et al. 2014), and artificial neural networks (ANNs) (Li and Li 2015) to forecast land use and land cover (LULC) changes. Among these approaches, Markov chain analysis (MCA), cellular automata (CA), CA-Markov, ANN, binary logistic regression, and fractal models have gained prominence in the field. However, despite their widespread use, systematic comparisons that determine which method yields the most reliable results remain scarce. Furthermore, because individual studies are limited in scope, comprehensive evaluations across different study areas are lacking. For instance, Seto et al. (2011) used logistic regression models to analyze urban expansion in Asian megacities, including Karachi; however, logistic regression has limitations in capturing spatial interactions and non-linear relationships. Baqa et al. (2021) explored Markov chain models, which predict land use transitions based on historical data, but these models struggle with complex urban dynamics.

The CA-ANN (Cellular Automata-Artificial Neural Network) model for simulating land use and land cover change is based on several fundamental assumptions (Li and Yeh 2001). First, it considers spatial dependency, assuming that land use changes are influenced by neighboring cells. The cellular automata component incorporates the state of surrounding cells when determining transitions. Second, the ANN component allows for modeling non-linear relationships between driving factors and land use changes, recognizing that land-use transitions are not always linearly related to influencing variables. Third, temporal dynamics play a role, as the model assumes that past land use patterns inform future predictions, using historical data for training and calibration. Multiple driving factors, including socioeconomic, spatial, and environmental variables (e.g., slope, distance to roads, urban centers, land fertility, and population density), influence land-use transitions (Wang et al. 2023). Transition potential is calculated for each cell based on these driving factors and neighborhood conditions. Land use changes occur in discrete time steps, capturing significant shifts. Spatial heterogeneity accounts for variations in land use patterns across different areas. Leveraging the learning capability of ANN, the model improves prediction accuracy by capturing complex patterns from historical data. Finally, stochasticity is incorporated to address inherent uncertainty in land use change processes. The integration of ANN with CA enhances prediction capabilities by capturing non-linear relationships and complex patterns in land use change dynamics (Gharaibeh et al. 2020; Jayabaskaran and Das 2023; Lauret et al. 2016). The CA-ANN model combines artificial neural networks with cellular automata, featuring an input layer with spatial variables, hidden layers for learning transition rules, and an output layer representing land use types or transition probabilities (Pijanowski et al. 2002). The model’s performance depends on neighborhood size and cell resolution, with larger sizes generally improving accuracy. Proximity to roads, railways, and city centers also influences land-use transitions (Guan et al. 2011). Overall, CA-ANN models have demonstrated superior predictive abilities compared to traditional cellular automata models (Li and Li 2015).

Urban expansion and sprawl have been extensively studied in Pakistan, as well as globally. Mahboob et al. (2015) utilized remote sensing and GIS to assess urban sprawl in Karachi, revealing significant expansion from 1979 to 2013, with the built-up area increasing from 9.8% to 37.9% of the total land area. The city’s rapid and often unplanned growth has been a focal point, as highlighted by Baqa et al. (2021). On a broader scale, Ahmed et al. (2008) compared urban transportation and equity issues between Karachi and Beijing, emphasizing the challenges faced by developing countries during rapid urbanization. Rui and Othengrafen (2023) proposed innovative urban planning approaches to address livability issues arising from sprawl. Shah et al. (2022) analyzed urban expansion in the Islamabad metropolitan area, emphasizing the need for sustainable policies. Ahmad et al. (2022) investigated Lahore’s urban sprawl, advocating for strict zoning regulations. Similarly, Parveen et al. (2019) studied Faisalabad’s urbanization, recommending comprehensive land use policies.

Thus, this study attempts to explore the spatial transition of urban features and associated mechanics within Karachi, Pakistan during the last three decades, and predict the changes in land use patterns in the upcoming three decades. It is the first step to conduct a comprehensive and long-term land use retrieval, as well as the simulation of potential urban spatial dynamics within a developing city. The integration of CA and the machine-learning-based Artificial Neural Network (ANN) models can combine and leverage the advantages of both data analytic approaches. Therefore, more options for developing well-informed policies within the local context can be established from results obtained in this study, and urban policymakers can acquire evidence-based insights in understanding challenges and opportunities of future urban development potentials within Karachi, as well as in neighboring cities of Pakistan.

2 Materials and methods

2.1 Study Area

Karachi, the former capital and current economic capital of Pakistan, is the most densely populated metropolis of the country, with a population of 20.382 million according to the 2023 Census Report (https://www.pbs.gov.pk) and an average density of 5774 persons/km². It is also the 12th largest urban agglomeration in the world, and has faced many socio-economic problems due to its rapid population growth, for example, a lack of proper urban planning strategies, health and epidemic concerns, and squatter settlements. As shown in Fig. 1, the city extends over 3575.76 km², and is located on the coastline of the Arabian Sea in the Sindh Province of Pakistan, between latitudes 24°45’N‒25°15’N and longitudes 66°37’E‒67°37’E. It consists of 5 major districts, serves as a local transport hub, and is home to the two largest seaports of Pakistan, thus acting as the largest contributor to GDP in Pakistan. As a metropolitan city and the industrial and commercial hub of a developing country, its population has grown continuously due to large-scale internal and external migration flows. Despite having better health and education facilities than other urban areas of the Sindh Province, Karachi is facing numerous social and environmental challenges, for example, inappropriate allocation of infrastructure, crises in solid waste management services, air and water pollution, as well as ineffective governance of urban expansion plans within the city. Around 400 slums (known as ‘katchi abadis’ in Pakistan) accommodate 40–60% of the residents of Karachi (Thaver et al. 1990), but living conditions in these settlements are far from satisfactory. Residents living in slum areas usually have poor health status, because the local government fails to provide reasonable quality care and preventive health advice to these people (Thaver et al. 1990).

Fig. 1 The geographical location of Karachi, Pakistan, together with the land use distribution based on the false color combination of Landsat datasets

2.2 Data Acquisition and Preprocessing

In this study, remotely sensed data were acquired from the United States Geological Survey (USGS, Reston, Virginia) (https://earthexplorer.usgs.gov/). In total, 4 Landsat Thematic Mapper (TM) and Operational Land Imager (OLI) images from 1990, 2000, 2010, and 2020 were obtained, which provide objective assessments of spatial land use changes and transitions within the prescribed periods. Karachi is covered by path 152 and rows 42 and 43 of the TM and OLI sensors at regular revisit intervals. Detailed acquisition times, spatial and temporal resolutions, as well as a brief description of the corresponding Landsat datasets are provided in Table 1. The 4 sets of Landsat images were acquired during September, October and November, so that a clear distinction between built-up areas and other land use types could be achieved. The selected scenes were free of cloud cover, which implies that the satellite images obtained from the respective sensors onboard Landsat 5 and 8 could reflect the actual environmental changes and transitions of land use types within Karachi. In addition, to enhance the classification accuracy of different land cover classes, night light data of the corresponding years obtained by the Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) and the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the National Polar-Orbiting Partnership (NPP) spacecraft were also acquired (Boston College; NOAA).

Table 1 Details of remotely sensed datasets obtained from Landsat 5 and 8, the corresponding acquisition time, and spatial and temporal resolutions of images obtained for this study

Afterwards, the Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) algorithm from Exelis Visual Information Solutions Inc., Boulder, CO, USA was applied in the Environment for Visualizing Images (ENVI) software to conduct radiometric calibration and atmospheric correction, and thereby enhance data quality for further processing (Esri Indonesia). The details of the FLAASH algorithm and its operational characteristics are described on ENVI’s webpage (ENVI; L3HARRISTM). Throughout this study, datasets that cover the Karachi administrative boundary (shown in Fig. 1) were extracted, and Landsat images were utilized to detect existing and potential changes in land use features. Furthermore, re-projection, resampling, and radiometric correction were also applied to the aforementioned night light data, so that its spatial resolution and coordinate system were consistent with those of the Landsat images and the other environmental attributes ingested into the model for land use prediction. Landsat datasets have several features that are particularly useful from the perspective of data analytics, i.e., for detecting, tracing, and explaining substantial inter- and intra-pixel changes of urban environments. These include sufficiently fine spatial and temporal resolutions of digital images acquired in urbanized areas, with consistent spectral and radiometric resolutions throughout all years of observation (Almazroui et al. 2017; Liu et al. 2021; Wang et al. 2018), and the provision of a comprehensive record of land use attributes.

2.3 Image Classification Algorithm Semi-Automatic Classification Plugin (SCP)

Various image classification software and algorithms, like ArcGIS, Erdas Imagine, PCI Geomatica of Catalyst, and Trimble eCognition (ArcGIS Pro. 2D and 3D GIS Mapping Software; eCognition Trimble Geospatial; ENVI; Hexagin; PCI Geomatica) have been utilized in previous land use studies (Kucharczyk et al. 2020). However, some of these are commercial software and may require lengthy processes to determine optimal parameters for image classification, while only one image processing algorithm can be run at a time. Thus, a more automated classification approach that offers multiple numerical algorithms is desired, especially when one wishes to obtain more precise land use mapping and assignments at the grid level. In this study, SCP, an open-source plugin for the QGIS (v3.32.3) software (Tempa and Aryal 2022), was applied to classify each land use feature, and the Kappa coefficient and various statistical metrics were adopted for assessing the overall accuracy of classification. The SCP approach enables the use of numerous classification algorithms on a range of remotely sensed land use imagery (e.g., Landsat, MODIS, Sentinel I and II) (Congedo 2021), so that supervised and unsupervised classification of these datasets becomes semi-automated (Lotto et al. 2022). In the pre-processing stage, satellite images were radiometrically calibrated and atmospherically corrected. In particular, dark objects like water and forests are expected to have reflectance values close to 0, so any radiance recorded over them is attributed mainly to atmospheric scattering. Whenever these pixels appeared, the image-based Dark Object Subtraction (DOS) approach was adopted, which uses the minimum value of such pixels to estimate the atmospheric contribution. The atmospheric dispersion effect was also taken into account when the surface reflectance \(\left(R_{\text{surface}}\right)\) was calculated as in Eq. (1) (Prieto-Amparan et al. 2018).

$$R_{\text{surface}}=\frac{\pi d^{2}\left(L_{\text{satellite}}-L_{0}\right)}{E_{OA}\cos\theta}$$
(1)

Here, \(d\) is the Earth-Sun distance, \(L_{\text{satellite}}\) and \(L_{0}\) represent the spectral radiance at the satellite sensor and the path radiance backscattered by the atmosphere respectively, \(E_{OA}\) stands for the solar spectral irradiance outside the atmosphere, and \(\theta\) denotes the solar zenith angle.
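As a minimal sketch of how Eq. (1) might be evaluated for a single pixel, the function below follows the symbols defined above. The function name, the sample radiance values, and the clamping of the result to [0, 1] are illustrative assumptions and are not taken from SCP or from this study.

```python
import math


def surface_reflectance(l_satellite, l_0, d, e_oa, zenith_deg):
    """Evaluate Eq. (1): R = pi * d^2 * (L_satellite - L_0) / (E_OA * cos(theta)).

    l_satellite : at-sensor spectral radiance
    l_0         : path radiance (atmospheric backscatter)
    d           : Earth-Sun distance in astronomical units
    e_oa        : solar irradiance outside the atmosphere
    zenith_deg  : solar zenith angle in degrees
    """
    cos_theta = math.cos(math.radians(zenith_deg))
    if cos_theta <= 0:
        raise ValueError("solar zenith angle must be below 90 degrees")
    reflectance = math.pi * d ** 2 * (l_satellite - l_0) / (e_oa * cos_theta)
    # Assumption: clamp to the physically meaningful range [0, 1].
    return min(max(reflectance, 0.0), 1.0)


# Illustrative (made-up) values for one pixel.
r_surface = surface_reflectance(
    l_satellite=60.0, l_0=10.0, d=1.0, e_oa=1000.0, zenith_deg=60.0
)
print(round(r_surface, 4))
```

In practice the same expression would be applied band by band to every pixel of the scene, with \(L_{0}\) estimated from the darkest pixels of each band.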

During the processing stage, components of the pre-processed image were classified based on their respective spectral signatures. For supervised classification, one must iteratively acquire sufficient training samples for each land use type within the region of interest (ROI), then apply the classification algorithms provided in SCP, which include minimum distance, maximum likelihood, spectral angle mapper, parallelepiped classification, land cover signature classification and algorithm raster. As a result, a thematic map of the land cover within the ROI can be obtained. SCP also provides a series of post-processing techniques to convert the processed images into appropriate outputs, for example, merging land use classes, conducting accuracy and statistical assessments, as well as converting classified raster datasets into vector format. Within this study, with the aid of the SCP dock panel in QGIS, sufficient training sample polygons were selected from the processed Landsat datasets for all 4 acquisition periods, and then the SCP approach was used to select proper thresholds, so that each pixel was eventually assigned a specific land use class, and the maps that describe the land use patterns in 1990, 2000, 2010 and 2020 were created. Overall, there were 6 land use classes, namely built-up area, grassland, rocky bare, bare soil, waterbody, and mangroves. Built-up surfaces were associated with artificial structures like concrete, stone, and rooftops, while rocky bare consisted of large areas of rock with nothing growing within the pixel. These satellite-retrieved spectral signatures allow land use changes within a developing city like Karachi to be quantified.

2.4 Accuracy Assessments

For validation purposes, ground truth points (GTP) of each land use category were collected from all 4 Landsat images of Karachi via stratified random sampling (Rozenstein and Karnieli 2011), then the Google Earth Engine was used for conducting ground validation during the prescribed periods. Several statistical metrics were selected for this spatial assessment of land use types, namely Producer’s Accuracy (PA), User’s Accuracy (UA), Overall Accuracy (OA), and Kappa Coefficient (KC), which have proven appropriate in previous studies. PA is derived from the reference dataset, and is defined as the proportion of reference pixels of a specific land use type that are correctly classified; UA is the ratio of the number of correctly classified pixels to the total number of pixels classified as a specific land use type; while OA represents the ratio of the total number of correctly classified pixels to the total number of assessed pixels within the ROI, i.e., Karachi in this study. There were altogether 6 classified land use types, thus the corresponding PA and UA of the kth land use type (where \(k=1,2,\ldots,6\)) and the OA were calculated by Eqs. (2)-(4). Notations adopted in these equations are listed in Table 2.

$$\text{Producer's Accuracy (PA) of the }k^{\text{th}}\text{ land use type}=\frac{x_{kk}}{\sum\limits_{i=1}^{6}x_{ik}}$$
(2)
$$\text{User's Accuracy (UA) of the }k^{\text{th}}\text{ land use type}=\frac{x_{kk}}{\sum\limits_{j=1}^{6}x_{kj}}$$
(3)
$$\text{Overall Accuracy (OA) of the land use classification}=\frac{\sum\limits_{k=1}^{6}x_{kk}}{\sum\limits_{j=1}^{6}\sum\limits_{i=1}^{6}x_{ij}}$$
(4)
Table 2 Notations associated with the definitions of PA, UA and OA in Eqs. (2)-(4)

For KC, a value exceeding 0.6 indicates that the land use classification is generally accurate, while a value exceeding 0.8 implies that around 64–100% of the data are reliably classified. In contrast, if the KC value is lower than 0.2, the classification is neither trustworthy nor reliable, because at most 4% of retrieved pixels are assigned the correct land use types.
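The accuracy metrics can be illustrated with a short sketch. It assumes a square confusion matrix whose rows are the classified types and whose columns are the reference types, which matches the index pattern of Eqs. (2)-(4). The Kappa Coefficient is computed with the standard Cohen's kappa formula, which is not written out in this section. The 3-class matrix is made-up toy data, not a result of this study.

```python
def accuracy_metrics(x):
    """Return (PA list, UA list, OA, kappa) for a square confusion matrix x.

    Assumes x[i][j] counts pixels classified as type i whose reference type
    is j, and that every class has at least one classified and one reference
    pixel (otherwise a division by zero occurs).
    """
    n = len(x)
    total = sum(sum(row) for row in x)
    row_totals = [sum(x[k]) for k in range(n)]                        # classified totals
    col_totals = [sum(x[i][k] for i in range(n)) for k in range(n)]   # reference totals
    diagonal = [x[k][k] for k in range(n)]

    pa = [diagonal[k] / col_totals[k] for k in range(n)]  # Eq. (2)
    ua = [diagonal[k] / row_totals[k] for k in range(n)]  # Eq. (3)
    oa = sum(diagonal) / total                             # Eq. (4)

    # Cohen's kappa: agreement beyond what is expected by chance.
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return pa, ua, oa, kappa


# Toy 3-class confusion matrix (illustrative only).
toy_matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]
pa, ua, oa, kappa = accuracy_metrics(toy_matrix)
print([round(v, 3) for v in pa], [round(v, 3) for v in ua])
print(round(oa, 4), round(kappa, 4))
```

For the 6 land use types of this study the same function would simply receive a 6 x 6 matrix.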

2.5 Landscape Metrics

Landscape metrics are quantitative attributes for characterizing landscape configuration, patterns, and changes within a prescribed time (Hu and Dong 2013). These metrics can be used to enhance the accuracy of simulation models of land use patterns or urban expansion and, as a result, provide a reasonably accurate prediction of land use transitions in the near future, which can serve as a reference for policymakers making informed decisions in urban planning. In this study, 3 landscape metrics were considered, namely Percentage of Landscape (PLAND), Largest Patch Index (LPI), and Number of Patches (NP), and were incorporated into the CA-ANN model (see Sect. 2.6).

By detecting landscape composition, PLAND was used to determine whether a particular patch could easily be converted into other land-use classes (Yang et al. 2016). The conversion of the land use type of a pixel is readily influenced by the abundance of similar patches in its neighborhood, thus PLAND, which is defined as the percentage of the landscape that is of a specific land use type, can serve as a good indicator of potential changes. Denote \(i\) as the patch type (or class), \(j\) as the index of a patch of type \(i\) within the ROI (with \(j=1,\ldots,n\)), \(a_{ij}\) as the area of the patch \(ij\) (in m²), and \(A\) as the total land cover area (in m²). PLAND is then calculated as in Eq. (5).

$$\text{PLAND (of type }i\text{)}=\frac{\sum\limits_{j=1}^{n}a_{ij}}{A}\times 100\%$$
(5)

In addition to PLAND, LPI and NP were also adopted as landscape metrics in this study. LPI quantifies the percentage of the total land cover area that is occupied by the largest patch of a given patch type (Herold et al. 2005), and is calculated as in Eq. (6). It has been shown that LPI is negatively related to economic indicators, natural factors and social factors. As for NP, it is the number of patches of a particular land use type (or class), i.e., \(\text{NP}=n_{i}\) (where \(i=1,2,\ldots,6\)). It indicates the degree of fragmentation of the landscape.

$$\text{LPI (of type }i\text{)}=\frac{\max\limits_{1\le j\le n}a_{ij}}{A}\times 100\%$$
(6)
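The three metrics can be sketched on a small classified grid. The example below labels patches with a breadth-first search and then applies Eqs. (5) and (6). The 8-neighbor patch rule, the 900 m² cell area (a 30 m pixel), the class codes, and the toy grid are all assumptions made for illustration; this is not how Fragstats is implemented internally.

```python
from collections import deque


def patch_areas(grid, target, cell_area, eight_connected=True):
    """Return the area of every patch of class `target` in a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            # Flood-fill one patch and count its cells.
            seen[r][c] = True
            queue = deque([(r, c)])
            cells = 0
            while queue:
                y, x = queue.popleft()
                cells += 1
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(cells * cell_area)
    return areas


def landscape_metrics(grid, target, cell_area=900.0, eight_connected=True):
    """Compute PLAND (Eq. 5), LPI (Eq. 6) and NP for one class."""
    areas = patch_areas(grid, target, cell_area, eight_connected)
    total_area = len(grid) * len(grid[0]) * cell_area  # A in Eqs. (5)-(6)
    pland = 100.0 * sum(areas) / total_area
    lpi = 100.0 * max(areas) / total_area if areas else 0.0
    return {"PLAND": pland, "LPI": lpi, "NP": len(areas)}


# Toy grid with hypothetical class codes: 1 = built-up, 2 = waterbody, 0 = other.
toy_grid = [
    [1, 1, 0, 0, 2],
    [1, 0, 0, 2, 2],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
metrics = landscape_metrics(toy_grid, target=1)
print(metrics)
```

Here the built-up class forms one patch of 3 cells and two isolated cells, so NP is 3 while LPI reflects only the 3-cell patch.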

2.6 Cellular Automata-Artificial Neural Network (CA-ANN) Simulation

Based on existing LULC datasets and the Modules for Land Use Change Evaluation (MOLUSCE) plugin in the QGIS (v3.32.3) software, together with a combination of statistical frameworks and parameters (e.g., ANN, Logistic Regression (LR), Weights of Evidence (WoE) and Multi-Criteria Evaluation (MCE)), simulations of the spatial and temporal dynamics of land use can be effectively conducted (Bugday and Erkan Buğday 2019). The MOLUSCE plugin is recognized for its well-established algorithms for urban modeling and for analyzing changes in land cover dynamics (Kafy et al. 2021), and consists of 7 main components, namely inputs, correlation evaluation, area changes, transition potential modeling, CA simulation, validation, and messages. In particular, when conducting LULC simulation, CA provides the spatiotemporal framework, while ANN is one of the 4 MOLUSCE-based models that can be used to formulate local transition probabilities of the 6 land use types within this study (Yang et al. 2016). ANN has shown promising statistical performance in capturing nonlinear and complex features in large datasets and is effective in identifying the connections between different attributes through neurons in different layers (Tayyebi and Pijanowski 2014).

In our study, the multi-layer perceptron-based ANN approach was adopted, where the input layer included the categorical one-band raster data files of the validated land use maps in 1990, 2000, 2010, and 2020, the landscape metrics described in Sect. 2.5, and additional raster data layers in the MOLUSCE plugin, for example, population density, digital elevation model, and distance from road (as shown in Fig. 2). All these quantities were set as neurons in the ANN input layer. The aforementioned spatial variables were necessary to obtain accurately predicted land use patterns because they can effectively explain how human activities and habits affect changes in LULC and urban development. Further, road network and terrain can also impact the infrastructural development and expansion of a city (Abd El-kawy et al. 2019), thus these MOLUSCE-based attributes of Karachi were included in the neuron sets as well. Based on the area investigation module, the changes in these quantities from 1990 to 2020 were recorded and used to estimate the corresponding values at each pixel in future years (e.g., 2030, 2040, and 2050). Iteration count and stretching of the respective proportion of changes were conducted accordingly. After all input neurons were ready, data processing within the CA-ANN framework began, which included initial pre-processing stages like dummy coding and normalization, followed by sampling and data training. Datasets were normalized to the range [–1, 1] as in Eq. (7) to avoid overfitting and underfitting (Mehdi et al. 2023), where \(D_{\text{normalized}}\) represents the normalized data, \(\left\{x_{i}\right\}\) denotes the set of all actual data, and \(x_{i}\) is the value of the ith actual data.

$$D_{\text{normalized}}=-1+\frac{2\left(x_{i}-\min\left(\left\{x_{i}\right\}\right)\right)}{\max\left(\left\{x_{i}\right\}\right)-\min\left(\left\{x_{i}\right\}\right)}$$
(7)
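Eq. (7) can be written as a short helper. This is a sketch only: the function name and the handling of a constant input (mapped to 0.0 to avoid a division by zero) are assumptions and do not describe MOLUSCE's internal behavior.

```python
def normalize_to_unit_range(values):
    """Rescale values linearly to [-1, 1] following Eq. (7)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a constant layer carries no information, map it to 0.
        return [0.0 for _ in values]
    span = highest - lowest
    return [-1.0 + 2.0 * (v - lowest) / span for v in values]


# Illustrative input, e.g. distances from road in arbitrary units.
normalized = normalize_to_unit_range([0, 5, 10])
print(normalized)  # [-1.0, 0.0, 1.0]
```

The minimum of the input always maps to -1 and the maximum to 1, so layers with very different units (elevation, population density, distance) end up on a common scale before training.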
Fig. 2 Spatial variables as raster data layers in the MOLUSCE plugin platform

For the learning and training processes of ANN, we followed the parameter values listed in Bugday and Erkan Buğday (2019), i.e., 9 neighborhood pixels, a learning rate of 0.001, a maximum of 100 iterations, 10 hidden layers, and a momentum value of 0.05, with classical back-propagation imposed; while for validation, 5 iterations were conducted within CA (Kamusoko et al. 2009), by comparing the land use map generated by ANN-trained datasets with the corresponding satellite-derived map acquired from Landsat. In total, 80% of the data was used for training, and the remaining 20% was for validation. Transition probabilities were generated by neurons, and each pixel in the predicted map was assigned the land use class with the highest transition probability, which was reflected in the final output dataset, i.e., the change map. The map was produced at the transition modeling stage and illustrates the land use dynamics that had taken place within the period. The general mathematical equation of ANN (with \(n\) neurons in the hidden layer) is provided in Eq. (8) (Lawal and Idris 2020), where \(x_{i}\) and \(y_{k}\) are the input and output variables, \(f\) represents the transfer function, and \(b_{k}\) and \(w_{ki}\) are the bias within the hidden layer and the weight factor between the input neurons and hidden layers respectively. Furthermore, the overall framework and procedures of the entire study are outlined in Fig. 3.

$$y_{k}=f\left(b_{k}+\sum_{i=1}^{n}w_{ki}x_{i}\right)$$
(8)
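To make Eq. (8) and the class-assignment rule concrete, the sketch below evaluates one layer of neurons for a single pixel and then picks the class with the highest output. The hyperbolic tangent transfer function, the weights, the biases, and the three class names are hypothetical choices for illustration; they are not the trained parameters of this study.

```python
import math


def layer_output(inputs, weights, biases, transfer=math.tanh):
    """Eq. (8): y_k = f(b_k + sum_i w_ki * x_i) for every neuron k."""
    return [
        transfer(b_k + sum(w_ki * x_i for w_ki, x_i in zip(row, inputs)))
        for row, b_k in zip(weights, biases)
    ]


def most_likely_class(scores, class_names):
    """Assign the class whose score (transition potential) is highest."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best_index]


# Hypothetical normalized inputs for one pixel (e.g. distance from road, slope).
pixel_inputs = [0.5, -1.0]
# Hypothetical weights: one row per output neuron, one column per input.
weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
biases = [0.0, 0.0, 0.0]
class_names = ["built-up", "grassland", "bare soil"]

scores = layer_output(pixel_inputs, weights, biases)
assigned = most_likely_class(scores, class_names)
print(assigned)
```

A multi-layer perceptron chains several such layers, feeding the outputs of one layer into the next; only the final layer's outputs are compared when assigning a class.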
Fig. 3 The overall framework and important ingredients of this study, as described in Sect. 2

Fig. 4 Landsat-based land cover maps of Karachi in 1990, 2000, 2010, and 2020 respectively

2.7 Results and Analysis

2.7.1 Accuracy Assessment of LULC Classification

The 6 land cover types based on the 4 Landsat images of Karachi in 1990, 2000, 2010, and 2020 were first classified and evaluated. Here, the Tasseled Cap Transformation (TCT) was adopted for data volume compression and for providing brightness and greenness indices, and the night light data from DMSP and VIIRS were used to enhance the accuracy of the land use classification results. Based on the semi-automated classification in SCP, the land cover maps of Karachi in recent decades were retrieved, as shown in Fig. 4. There was a significant increase in the amount of built-up area (i.e., impervious surface) and grassland, and a sharp decline in the amount of bare soil during recent decades.

Table 3 shows the statistical metrics of accuracy assessment conducted for each Landsat-derived land use map. All PA and UA values of all 6 land use types exceeded 87% in all years, where the statistical performance in 2010 and 2020 was significantly better than that in 1990 and 2000, with all accuracy values exceeding 90%. In particular, PA and UA of the retrieval conducted in 2020 all exceeded 94%, and some values were very close to 100%. Further, the OA and Kappa index agreement throughout all 4 decades were above 94.38% and 0.90, which indicated that the historical land use maps were accurately retrieved and classified, especially for built-up areas and bare soil. Thus, these land use maps could serve as trustworthy inputs for predicting future land use distribution within Karachi.

Table 3 Statistical Metrics in Accuracy Assessment of Landsat images (1990, 2000, 2010 and 2020)

2.8 Spatial and Temporal Trends of Land Cover during 1990–2020

From the spatial land cover classification maps shown in Fig. 4, there was a large increase in the number of grassland and built-up area pixels from 1990 to 2020, especially in central and southern Karachi, accompanied by a significant decrease in rocky bare and bare soil pixels. The northern and northeastern parts of Karachi remained the major areas of rocky bare and bare soil; however, more grassland pixels also appeared in these areas in 2020. Further, some rocky bare areas in central and southern Karachi were gradually replaced by built-up areas, grassland, and bare soil. This indicates the transition of land use patterns throughout the past few decades: rocky bare (40.1%), bare soil (34.7%), and grassland (15.7%) were the major land cover types in 1990, whereas grassland (39.7%), rocky bare (28.6%) and built-up areas (19.1%) had become the 3 dominant land use types of Karachi in 2020. From the individual plots of Fig. 4, it was observed that extensive urban expansion had been taking place within many areas of northeastern, central, and southern Karachi. Grassland areas had expanded more noticeably than built-up areas, because before urban development can take place, the terrain must be cleared and leveled, and during such a transition period, only grass grows in these areas rather than infrastructure or other manmade structures. The increase in grassland areas therefore provided evidence of urban expansion during recent decades, and the geographical areas showing this phenomenon were potential regions of urban development.

The bar charts of Fig. 5 illustrate the exact areas of each land use type in each investigated year (1990, 2000, 2010, and 2020), so that the respective temporal changes can be easily traced. The built-up area increased by 423.75 km² (from 258.50 km² in 1990 to 682.25 km² in 2020), with the gain coming mainly from rocky bare (214.14 km²), bare soil (107.97 km²) and grassland (104.47 km²). Over the same years, 8.05 km², 2.72 km², 1.65 km², 0.91 km², and 0.08 km² of the built-up area were converted to grassland, bare soil, waterbody, rocky bare, and mangroves respectively. A tremendous increase in grassland areas was observed from satellite data, from 560.88 km² in 1990 to 1417.85 km² in 2020, mainly due to the contribution from rocky bare (541.59 km²) and bare soil (494.36 km²). During the same period, around 200 km² of grassland areas were converted into built-up areas, rocky bare, bare soil, and waterbodies. Conversely, rocky bare and bare soil areas experienced a sharp decline throughout the past several decades, from 1435.20 km² to 1023.21 km² and from 1241.76 km² to 365.40 km² respectively. The majority of converted rocky bare areas had become grassland (541.59 km²) and built-up area (214.14 km²), while most bare soil pixels had become rocky bare pixels (103.61 km²). All these temporal changes and figures were again concrete evidence of the spatial transition processes that had taken place in Karachi throughout the last few decades. Water bodies and mangroves showed negligible changes in area, which implies that the relevant land use was quite stable in Karachi. The exact areas, the change in the area of different land use types, and the resulting discrepancy between 1990 and 2020 are displayed in Table 4.
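The class-to-class conversions reported above are the kind of figures obtained by cross-tabulating two classified maps pixel by pixel. The sketch below shows one possible way to do this. The toy maps, the class labels, and the helper names are hypothetical; the default cell area of 0.0009 km² corresponds to a 30 m pixel.

```python
from collections import Counter


def transition_areas(before, after, cell_area_km2=0.0009):
    """Cross-tabulate two same-sized class maps into {(from, to): area}."""
    counts = Counter()
    for row_before, row_after in zip(before, after):
        for cls_before, cls_after in zip(row_before, row_after):
            counts[(cls_before, cls_after)] += 1
    return {pair: n * cell_area_km2 for pair, n in counts.items()}


def net_change(transitions, cls):
    """Area gained by `cls` from other classes minus area it lost to them."""
    gained = sum(a for (src, dst), a in transitions.items()
                 if dst == cls and src != cls)
    lost = sum(a for (src, dst), a in transitions.items()
               if src == cls and dst != cls)
    return gained - lost


# Toy 2 x 3 maps (illustrative only), using 1 km² cells for readability.
map_start = [["rock", "rock", "soil"],
             ["grass", "built", "soil"]]
map_end = [["grass", "built", "grass"],
           ["built", "built", "soil"]]

transitions = transition_areas(map_start, map_end, cell_area_km2=1.0)
built_change = net_change(transitions, "built")
grass_change = net_change(transitions, "grass")
print(transitions[("rock", "built")], built_change, grass_change)
```

In this toy case built-up gains one cell each from rocky bare and grassland, while grassland gains two cells and loses one, which mirrors how grassland can expand overall even as part of it is converted to built-up area.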

Fig. 5 Areas of each land use type in Karachi in 1990 (blue), 2000 (orange), 2010 (grey) and 2020 (yellow) (km²)

Table 4 Spatial and temporal trends of each land cover type from 1990 to 2020

2.9 Urban Landscape Metrics

Three indices were used in this study to understand the spatial variations of different land cover types in Karachi in the past and the future. The values of these indices in historical and current years were fed into the CA-ANN model to predict future land cover maps from 2030 to 2050. Percentage of Landscape (PLAND), Largest Patch Index (LPI), and Number of Patches (NP) from Fragstats were used to assess the land use classes. Other Fragstats metrics were not considered in the current study because they showed no clear variation over the time series. Built-up area increased dramatically from 1990 to 2020 (258.5 km² in 1990, 344.5 km² in 2000, 463.5 km² in 2010, and 682.25 km² in 2020), and many areas in northern and central Karachi were cleared to grassland during the 2010–2020 period. These areas are therefore expected to be converted into built-up areas in the upcoming 10–20 years. Based on the landscape metrics of historical and present years, together with other attributes (discussed in Sect. 2), the CA-ANN model produced the future land use maps of Karachi following the steps in Sect. 2.6, and the corresponding landscape metrics were then calculated from the predicted maps. The values of PLAND, LPI, and NP for built-up areas from 1990 to 2050 (at 10-year intervals) are shown in Table 5. Both PLAND and LPI of the built-up area showed generally increasing trends from 1990 to 2050: PLAND increased from 9.16% in 1990 to 16.22% in 2020 and is predicted to reach 34.74% by 2050, while LPI almost doubled during the 1990–2020 period and is projected to reach 32.37% by 2050. In contrast, the NP of the built-up area showed a mixed trend over the historical years (it decreased slightly from 11,041 in 1990 to 10,109 in 2000, rebounded to 14,764 in 2010, and decreased again afterward).
Based on our model prediction, NP will reach its maximum of 17,163 by 2030 and then decrease significantly over the following decades. Overall, these metrics indicate a steady growth of contiguous urban areas during the last three decades and imply continued growth in the upcoming decades.
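For readers unfamiliar with the three metrics, the sketch below computes PLAND, LPI, and NP for one class on a small grid. It is an illustration only and not a substitute for Fragstats: the toy grid is invented, and the 8-cell neighbor rule used to delineate patches is an assumption, since the rule applied in the study is not stated.

```python
from collections import deque


def patch_sizes(grid, target, eight_connected=True):
    """Return the pixel count of every connected patch of `target` in `grid`."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    if eight_connected:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    else:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill over one patch.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == target):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            sizes.append(size)
    return sizes


def landscape_metrics(grid, target):
    """PLAND and LPI as percentages of the whole landscape, NP as a count."""
    total = sum(len(row) for row in grid)
    sizes = patch_sizes(grid, target)
    return {
        "PLAND": 100.0 * sum(sizes) / total,
        "LPI": 100.0 * max(sizes, default=0) / total,
        "NP": len(sizes),
    }


# Toy landscape (not real data): 1 marks the class of interest, e.g. built-up.
toy_grid = [[1, 1, 0, 0, 0],
            [1, 0, 0, 0, 1],
            [0, 0, 0, 0, 1],
            [0, 1, 0, 0, 0]]
toy_metrics = landscape_metrics(toy_grid, target=1)
```

In this toy case the class covers 6 of 20 cells in three patches, the largest of which has 3 cells. A falling NP combined with a rising LPI, as projected for Karachi after 2030, is what one would see if separate patches merge into larger ones.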

Table 5 Urban landscape metrics from 1990–2050

2.10 Predicted Spatial and Temporal Changes of Land Covers during 2020–2050

Based on the CA-ANN framework, predicted land cover maps of Karachi in 2030, 2040, and 2050 were obtained (Fig. 6). The spatial transformation of land cover types reveals rapid urban growth over the upcoming two to three decades. The majority of pixels will have become either built-up areas or grassland by 2050, which implies that most parts of Karachi will already have been developed or will be undergoing further development. Some local urban development will likely take place during the 2030–2040 period, as seen in the maps of 2030 and 2040, where many pixels in northern and central Karachi are converted from rocky bare to grassland, or from grassland to built-up areas. In southern and south-eastern Karachi, some infrastructure and facilities will have been built, as many of the cleared grassland pixels are also converted into built-up areas. Such conversion will become more pronounced during the 2040–2050 period, once the long-term urban development plans of Karachi are laid down and implemented gradually and consistently. It is interesting to note that some bare soil regions in southern Karachi will remain until 2050, while waterbody pixels in northern Karachi will not be transformed into grassland until 2040. As in all previous years, mangroves will remain a negligible land use type across Karachi.
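To make the cellular automata part of the framework more concrete, the sketch below performs one synchronous CA update in which a cell becomes built-up when a combined score of neighborhood influence and transition potential passes a threshold. This is a simplified stand-in rather than the MOLUSCE implementation: the `potential` grid plays the role of the ANN output, and the equal weighting, the 0.5 threshold, and the toy grids are all hypothetical.

```python
BUILT_UP = 0  # hypothetical class code for built-up cells


def built_up_fraction(grid, r, c):
    """Share of built-up cells among the Moore neighbours of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    neighbours = 0
    built = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                neighbours += 1
                built += grid[nr][nc] == BUILT_UP
    return built / neighbours if neighbours else 0.0


def ca_step(grid, potential, threshold=0.5):
    """One synchronous update: non-built cells may convert to built-up.

    `potential` stands in for the ANN transition potential (values in 0..1).
    The 50/50 weighting with the neighbourhood share is an illustrative choice.
    """
    new_grid = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, cls in enumerate(row):
            if cls == BUILT_UP:
                continue  # this sketch does not model built-up reverting
            score = 0.5 * potential[r][c] + 0.5 * built_up_fraction(grid, r, c)
            if score >= threshold:
                new_grid[r][c] = BUILT_UP
    return new_grid


# Toy example (not real data): 0 = built-up, 1 = any non-built class.
toy_grid = [[0, 0, 1],
            [0, 1, 1],
            [1, 1, 1]]
toy_potential = [[0.0, 0.0, 0.2],
                 [0.0, 0.9, 0.2],
                 [0.2, 0.2, 0.2]]
next_grid = ca_step(toy_grid, toy_potential)
```

Only the central cell converts here, because it combines a high potential with three built-up neighbors. Repeating such steps decade by decade is the general idea behind producing the 2030, 2040, and 2050 maps, although the actual transition rules in the study are learned by the ANN.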

Fig. 6 CA-ANN-based predicted land cover maps of Karachi in 2030, 2040, and 2050 respectively

Figure 7 shows the predicted areas of each land use type in Karachi in 2030, 2040, and 2050. Significant temporal dynamics will occur for built-up areas and bare soil. Compared with the 2020 value in the last bar of Fig. 5, the built-up area will increase from 682.25 km² in 2020 to 1244.18 km² by 2050, mainly through conversion of grassland and rocky bare (419.78 km² and 106.33 km² respectively). Further, 40.27 km² of bare soil, 39.88 km² of waterbody, and 12.17 km² of mangroves will be converted into built-up areas during the 2020–2050 period. In contrast, a minor part of the built-up area will be converted into other land use types: 50.10 km² into grassland, 6.13 km² into rocky bare, and around 1 km² into bare soil, waterbody, and mangroves. This may be attributed to ongoing redevelopment and the sustainability considerations of the local government, under which some land will be reassigned to other purposes and must pass through a grassland stage before being rebuilt. In addition, grassland will increase slightly during the 2020–2050 period, from 1417.66 km² to 1515.60 km², mostly through conversion of rocky bare (467.53 km²) and bare soil (103.91 km²). Rocky bare areas will show an overall decreasing trend, from 1023.21 km² in 2020 to 803.39 km² by 2050. This decrease will not be as large as that of bare soil, of which 214.91 km², 103.91 km², and 40.27 km² will be converted into rocky bare, grassland, and built-up areas respectively. Waterbody areas will also decrease consistently over the upcoming decades, from 78.85 km² in 2020 to 58.56 km² by 2030, and then to 25.12 km² and 5.8 km² by 2040 and 2050 respectively, while less than 1 km² of other land cover types will be converted into waterbody. A similarly marked decrease will be observed for mangroves. Together, these projections point to continued urban development in Karachi over the upcoming few decades.
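As a quick consistency check, the projected 2050 built-up area can be reproduced approximately from the conversion figures quoted above. The sketch below is simple bookkeeping on those numbers; the only assumption is that the "around 1 km²" lost to bare soil, waterbody, and mangroves is taken as exactly 1.0 km².

```python
# Built-up area in 2020 and conversions during 2020-2050 (km2), as quoted above.
built_up_2020 = 682.25

gains_from = {
    "grassland": 419.78,
    "rocky bare": 106.33,
    "bare soil": 40.27,
    "waterbody": 39.88,
    "mangroves": 12.17,
}

losses_to = {
    "grassland": 50.10,
    "rocky bare": 6.13,
    # Assumption: "around 1 km2" to bare soil, waterbody and mangroves combined.
    "bare soil / waterbody / mangroves": 1.0,
}

net_gain = sum(gains_from.values()) - sum(losses_to.values())
built_up_2050_estimate = built_up_2020 + net_gain

reported_2050 = 1244.18
difference = reported_2050 - built_up_2050_estimate
```

The bookkeeping gives roughly 1243.45 km², within about 1 km² of the reported 1244.18 km², a gap that is plausibly explained by rounding in the quoted conversion areas.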

Fig. 7 Areas of each land use type in Karachi in 2030 (blue), 2040 (orange), and 2050 (grey) (km²)

2.11 Accuracy Assessment and Machine Learning-based Cross Validation

As discussed in Sect. 3.1, the land use classifications obtained via the CA model for 1990, 2000, 2010, and 2020 were validated with several accuracy assessment metrics, namely PA, UA, OA, and KC. The CA-ANN model, together with the MOLUSCE plugin and urban landscape metrics, was used to produce predicted land use maps for 2030, 2040, and 2050. To evaluate the framework, the retrieved land use maps of 1990 and 2000 (or 2000 and 2010) were used as inputs to simulate the land use map of 2010 (or 2020), and these “predicted” maps were then compared with the actual land use maps of the same years. An OA of 89.5% was achieved, which indicates that the results simulated by the CA-ANN framework are reliable enough to support its use for future land use prediction. The retrieved land use maps of 2010 and 2020 were then used to simulate the land cover distribution of Karachi in 2030; next, the retrieved map of 2020 and the simulated map of 2030 were input into the model to obtain the predicted land use map of 2040 after model training and validation. Finally, the simulated maps of 2030 and 2040 were used to predict the land cover of Karachi in 2050.
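The accuracy metrics named above can all be derived from a confusion matrix that compares a simulated map with the actual map of the same year. The sketch below shows the standard formulas on an invented two-class matrix; it assumes that rows hold the reference (actual) classes and columns the simulated classes, and that PA, UA, OA, and KC denote producer's accuracy, user's accuracy, overall accuracy, and the kappa coefficient.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def producers_accuracy(matrix):
    """Per-class diagonal count divided by the reference (row) total."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def users_accuracy(matrix):
    """Per-class diagonal count divided by the simulated (column) total."""
    col_totals = [sum(col) for col in zip(*matrix)]
    return [matrix[i][i] / col_totals[i] for i in range(len(matrix))]


def kappa_coefficient(matrix):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    observed = overall_accuracy(matrix)
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Toy confusion matrix (not real data): rows = actual, columns = simulated.
toy_matrix = [[45, 5],
              [10, 40]]
oa = overall_accuracy(toy_matrix)    # 85 correct out of 100
kc = kappa_coefficient(toy_matrix)   # chance agreement is 0.5 here
pa = producers_accuracy(toy_matrix)
ua = users_accuracy(toy_matrix)
```

For the toy matrix the overall accuracy is 0.85 and kappa is 0.70. The 89.5% OA reported for the CA-ANN hindcast would be obtained in the same way from the full six-class matrix.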

In addition, further validation was conducted with the aid of neural network learning curves (NNLC). A learning curve plots the optimal value of a model’s loss function during training against the loss function during the validation stage (Mohr and van Rijn, 2022). Denote the training dataset as \(\left\{\{x_i\}_{i=1}^{n}, \{y_i\}_{i=1}^{n}\right\}\) and the validation dataset as \(\left\{\{x_i'\}_{i=1}^{n}, \{y_i'\}_{i=1}^{n}\right\}\). The optimization process seeks a \(\theta^{*}\) that minimizes the loss function \(L\left(f_{\theta}\left(\{x_i\}_{i=1}^{n}\right), \{y_i\}_{i=1}^{n}\right)\), where \(f\) represents the model function for prediction based on the training dataset. The learning curve is then the plot of the two curves \(L\left(f_{\theta^{*}}\left(\{x_1, x_2, \dots, x_i\}\right), \{y_1, y_2, \dots, y_i\}\right)\) and \(L\left(f_{\theta^{*}}\left(\{x_1', x_2', \dots, x_i'\}\right), \{y_1', y_2', \dots, y_i'\}\right)\) against \(i\). Whether the NNLC indicates under-fitting, over-fitting, or a good fit reflects the reliability of the prediction results, and the curves also provide a cross-validation of the simulated future datasets. Further, the internal parameters of the model were optimized progressively during training via deep learning approaches, so it is no surprise that the NNLCs of the current study all showed a good fit. In particular, as shown in Fig. 8, the NNLCs based on the training (80%) and validation (20%) datasets for 2030, 2040, and 2050 showed good statistical performance: the gap between the training and validation points was small, and the two trends generally agreed.
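The idea of comparing training and validation losses can be illustrated with a small, self-contained example. The sketch below is not the MOLUSCE network: a logistic model trained by gradient descent stands in for the ANN, the data are synthetic, and the two losses are recorded per training epoch rather than per sample index, all of which are assumptions made to keep the example short. Only the 80%/20% split mirrors the study.

```python
import math
import random


def predict(weights, x):
    """Logistic model: weights[0] is the bias, the rest multiply the features."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(weights, samples):
    """Mean binary cross-entropy over (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in samples:
        p = min(max(predict(weights, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)


def learning_curves(train, valid, epochs=50, lr=0.5):
    """Train with full-batch gradient descent and record both losses per epoch."""
    weights = [0.0] * (len(train[0][0]) + 1)
    train_curve, valid_curve = [], []
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in train:
            err = predict(weights, x) - y
            grad[0] += err
            for j, xi in enumerate(x, start=1):
                grad[j] += err * xi
        weights = [w - lr * g / len(train) for w, g in zip(weights, grad)]
        train_curve.append(mean_log_loss(weights, train))
        valid_curve.append(mean_log_loss(weights, valid))
    return train_curve, valid_curve


# Synthetic data (not real): label is 1 when the two features sum to more than 1.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
samples = [((a, b), 1 if a + b > 1.0 else 0) for a, b in points]
split = int(0.8 * len(samples))  # 80% training, 20% validation
train_curve, valid_curve = learning_curves(samples[:split], samples[split:])
final_gap = abs(train_curve[-1] - valid_curve[-1])
```

A good fit shows both curves decreasing and ending close together, so `final_gap` stays small. A validation curve that turns upward while the training curve keeps falling would instead point to over-fitting.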

Fig. 8 NNLC of land use simulation via CA-ANN model in 2030 (left), 2040 (middle) and 2050 (right) respectively, where the green curve indicates the training dataset, and the red curve indicates the validation dataset

2.12 Overall LULC Transition from 1990 to 2050

Figure 9 shows the urban expansion map of Karachi during different stages of the 1990–2050 period, where the areas newly built up within each 10-year period are indicated by the corresponding color in the legend. As of 1990, central southern Karachi had already experienced urban growth; the surrounding districts and towns were then gradually converted from rocky bare or grassland to built-up areas during the following two decades (1990–2010), although the degree of expansion was much less than that in 1980–1990. From 2010 onward, south-eastern Karachi underwent significant urban expansion, in line with the temporal dynamics obtained from the satellite data. Based on the CA-ANN simulated results, this process will continue and become more pronounced in south-eastern Karachi during 2020–2030, while the remaining non-built areas will also transition from other land cover types to built-up areas in the upcoming decades. As visualized in Fig. 9, more than half of southern Karachi will experience urban growth during 2020–2050, while some inland areas will also undergo expansion. Details of the overall LULC transition from 1990 to 2050 are displayed in Fig. 10, which illustrates the growth of built-up areas and grassland, as well as the conversion from bare soil to rocky bare and then to either grassland or built-up areas in the long run. This again supports the importance of conversion into grassland before land is put to practical use or undergoes explicit urban expansion, development, or redevelopment. The spatial map and associated temporal transition graph can provide policymakers and urban planners with useful references for policies on human settlement, sustainable development, and the allocation of land for specific uses, as well as for alleviating the social problems induced by rapid urbanization.

Fig. 9 Growth of built-up areas within different 10-year periods, from 1990 to 2050

Fig. 10 LULC transition of Karachi from a particular land use to another type during 1990–2050

3 Discussions

3.1 Development and Mode of Urban Expansion in Karachi

The lack of comprehensive datasets and indices poses a significant challenge in many developing regions, impeding the efforts of scientists and policymakers to understand and mitigate socio-economic and environmental risks. Although such studies are rare in developing countries like Pakistan, this study investigates the spatial dynamics of urban features and the associated mechanisms in Karachi over the past three decades, and predicts future changes in land use patterns over the next 30 years. We used remotely sensed data to assess land use changes in Karachi, obtaining four Landsat TM and OLI images from 1990, 2000, 2010, and 2020 that allowed us to analyze spatial transitions objectively over time. Specifically, we focused on path 152 and rows 42 and 43 of the TM and OLI sensors. Based on the resulting LULC datasets, the MOLUSCE plugin was used with a multi-layer perceptron-based ANN, whose input layers included categorical land use maps and additional data (e.g., population density, elevation, distance from roads), for future prediction.

From 1990 to 2020, considerable changes in land cover took place in different parts of Karachi: built-up areas and grassland increased, accompanied by a proportional decline of rocky bare, bare soil, waterbody, and mangrove regions during each transitional stage, and this pattern was most obvious in central and southern Karachi. The land use maps predicted by the CA-ANN model indicate that built-up area and grassland will continue to increase in the upcoming three decades: by 2050, the built-up area will be almost five times that of 1990 (1244.18 km² versus 258.50 km²), while grassland will have almost tripled. Further, urban expansion within Karachi gradually extended from the southern districts to the central region, as well as to the northeastern districts and towns, while sparse urban areas in southern Karachi will develop into dense urban settlements in the upcoming 30 years. Urban growth trends in a city can generally be divided into three categories, namely (1) leapfrog expansion, (2) in-fill expansion, and (3) linear expansion (Glockmann et al. 2022). The addition of urban infrastructure and manmade structures has altered the landscape and provided new potential places for human settlement. This more convenient setting has also stimulated rural-to-urban migration within Karachi, as well as the movement of people and service centers for daily necessities from neighboring towns and cities of Pakistan towards the urban cores of Karachi. Meanwhile, waterbodies and mangrove forests have both experienced a steady decrease in the city. Karachi faces a severe and worsening water scarcity crisis due to rapid population growth, climate change, and inefficient water management.
Currently, the city receives only half of its daily water requirement of 1.1 billion, a situation exacerbated by dwindling stream flows in the Indus Basin and inadequate infrastructure (Khan and Arshad 2022). If current trends continue, Karachi’s water crisis will deepen, necessitating urgent action to improve resource management, infrastructure, and governance to ensure a sustainable water future. Mangrove forests in Pakistan, meanwhile, confront several threats (Gilani et al. 2021). Rapid urbanization and industrialization have converted mangrove areas into urban land, while overharvesting for fuel persists due to economic challenges. Livestock grazing, pollution (including plastic waste and sewage), hydrological changes, and climate change further stress these ecosystems (Bhatti et al. 2023). Despite an overall expansion of mangrove cover in Pakistan, mature forests near Karachi continue to face pressures. Conservation efforts are crucial to protect these forests, which play a vital role in air quality and sea surge defense. Rapid urbanization has significantly affected land use and land cover (LULC) patterns, leading to substantial socioeconomic and environmental challenges. The study’s findings reveal a significant increase in built-up area, and this trend is projected to continue, with the predicted built-up area reaching 1244.18 km² by 2050. However, it is crucial to consider external factors that could influence these predictions. Economic fluctuations, policy changes, and unforeseen events like natural disasters can significantly alter urban growth trajectories (Kumar et al. 2013). For instance, economic downturns might slow urbanization rates, while pro-development policies could accelerate them. Natural disasters, such as floods or earthquakes, could reshape urban landscapes and redirect growth patterns (Kumar et al. 2018). Karachi’s coastal areas also pose risks due to natural factors (uneven sandbars, rip currents) and human-induced pollution, and the lack of safety measures and awareness increases the danger (Hameed et al. 2012).

Additionally, technological advancements and shifts in societal preferences, such as increased adoption of remote work, could affect future urban development. These factors underscore the need for flexible and adaptive urban planning strategies that can respond to changing circumstances while addressing the challenges associated with rapid urbanization. The socio-economic transition that took place led to urbanization and population growth, which in turn caused LULC change within a short period. This mode of urban expansion has been observed in many developing cities (Raut et al. 2020). The urban expansion of Karachi, Pakistan, is influenced by several key factors. Rapid population growth, from 1.13 million in 1951 to 20.3 million in 2023 (https://www.pbs.gov.pk; population figures from 1951 to the present are shown in Fig. 11), has been a primary driver. Migration, both internal and international, contributes to the demand for housing and infrastructure. Karachi’s economic prominence, generating 11.4% of Pakistan’s GDP, attracts job seekers and further fuels urban expansion. Industrialization affects land use patterns, necessitating new residential and commercial areas. Housing schemes and mixed land use policies facilitate growth, while geographical factors like coastal expansion and climate change affect the city’s development. Effective urban planning is crucial to manage this growth and address the associated challenges.

Fig. 11 Available population statistics of Karachi from 1951–2023 (Source: https://www.pbs.gov.pk)

3.2 Potential Environmental Impacts of Urban Development in Karachi

Despite its economic benefits, urban sprawl in Karachi has had devastating impacts on the surrounding environment. Owing to population inflow, rapid urban development took place during the past decades. To meet the housing needs of the growing population, slums were established (especially before 1990 and during 1990–2000) and housed 40–60% of Karachi’s residents (Thaver et al. 1990). These slums had poor sanitary and health conditions, so residents faced increased risks of disease and illness, which in turn imposed a heavy social burden on the local government (Nejad et al. 2021). Moreover, although very few studies have investigated how urban expansion affects the thermal environment and meteorological conditions in Karachi, the land use conversions found in our study are largely in line with those in other cities of Pakistan and in other developing countries (Hassan et al. 2015; Tariq et al. 2021a), so similar environmental changes may have been induced in Karachi. In particular, Rahman et al. (2023) used the Random Forest (RF) algorithm and Google Earth Engine (GEE) to assess the connection between urban expansion and Land Surface Temperature (LST) in Larkana, and found a gradual increase in both maximum and minimum LST in the urban thermal environment over the last three decades. Baqa et al. (2022) showed that, among the land use types in Karachi, built-up areas and bare land had the highest LST, and that LST increased when an area changed from vegetation or grassland into bare land.

Further, the connection between urbanization and LST has also been explored in other cities; for example, daytime urban expansion led to a steady decrease in the spatial heterogeneity of LST in Changchun, China (Yang et al. 2022). A thorough assessment of how current and future urbanization could affect the local and city-wide temperature of Karachi is therefore essential and will be one of our future goals. There are two major reasons: (1) compared with a dispersed city, a compact city is less efficient in reducing mean urban heat island (UHI) intensity but experiences relatively less thermal loading at the regional scale; and (2) urbanization can readily change climatic conditions, which reduces thermal comfort and traps pollutants within street canyons, so that particulates and traffic emissions accumulate in living environments and urbanized areas, with devastating effects on residents’ health. Given these undesired environmental impacts, apart from monitoring temperature changes within urbanized areas, attention should also be paid to assessing and quantifying the statistical relationships between the transition of land use patterns, regional traffic, and mobility patterns, as well as how varying meteorological conditions affect the distribution of pollutants at sufficiently fine spatial scales (Tariq et al. 2021b). For developing cities like Karachi, which lack a sufficient ground monitoring network, satellite remote sensing and low-cost monitoring networks have become particularly vital for tracking these trends and connections (Maharjan et al. 2021).

3.3 Limitations and Enhancement of the CA Model

The CA model provides the spatiotemporal framework of this study and is combined with an ANN for future land use simulation. Adopting such a model has pros and cons from several perspectives. (1) In terms of development and implementation, urban CA models can be processed efficiently because of their simplicity; however, this simplicity makes it harder for the model to capture actual urban phenomena, so substantial alterations and various relaxation parameters are required from time to time. (2) In terms of application, our CA-ANN framework can be adapted to different environmental retrieval processes and spatial assessments; however, confusion easily arises when there is no universally accepted definition of transition rules, so the balance between standardization and flexibility of the CA model has to be taken into account. (3) In terms of software and data volume, urban CA models can be used to explore speculative hypotheses because they are descriptive models; however, software that generates generic urban CA models usually comes with various constraints and is difficult to use, so users need to redesign the model to meet the needs of the assigned computing task (Xia et al. 2019). Two approaches may help address some of these limitations. (i) Adopt other neural network-based models, such as recurrent neural networks (e.g., Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU)) within the CA framework, or a Variational AutoEncoder (VAE) for reducing the dimension of unlabeled datasets. These approaches have worked well in text classification (Elizabeth et al. 2022) and image retrieval, and have been experimentally confirmed to achieve higher validation and prediction accuracies than standard neural network approaches, because they can retain attributes over an extended period and recall long-term dependencies during the process. (ii) Incorporate the driving forces of land use change into the CA-ANN model when simulating future scenarios. These driving forces include economic, environmental, and social factors, so that different spatiotemporal and geoprocessing models can be used to gain a better understanding of urban growth trends, which eventually enhances the prediction accuracy of the model. Further, traditional CA models usually assume linear trends among all spatial and temporal processes, which does not necessarily reflect realistic conditions in Karachi or other developing cities; other neural network-based models can therefore also be adopted to capture nonlinear and complex relationships among different quantities (Gharaibeh et al. 2020).

To improve coordination and infrastructure upgrades for urban development in Karachi, we recommend a tiered approach. At the top level, an elected mayor should oversee city-wide initiatives, working closely with provincial and federal representatives (the second tier). These representatives can allocate resources, address policy matters, and ensure coordination across different administrative levels. At the local level (third tier), district and municipal representatives should focus on community-specific needs. Streamlined communication and decision-making among these tiers would make it possible to prioritize infrastructure upgrades, such as roads, public transportation, and utilities. Sustainable and effective urban development also requires practical planning and citizen engagement. We propose keeping development away from industrial zones to prevent pollution and health hazards, and locating raw materials enterprises in the capital’s periphery to balance economic growth with environmental concerns. We also recommend participatory planning, in which citizens are involved in discussions about neighborhood improvements, green spaces, and public amenities, with feedback channels created to gather community input. Lastly, open datasets should be leveraged for informed decision-making, ensuring transparency and accountability. Regular environmental impact assessments during development and redevelopment stages will help maintain a sustainable, people-centric urban environment.

4 Conclusion

The primary objective of this study is to analyze historical land use and land cover (LULC) changes using satellite imagery and subsequently model future urban growth patterns in Karachi. The Landsat satellite data were processed to create LULC maps at 10-year intervals from 1990 to 2020, achieving an overall accuracy exceeding 90%. These maps, along with landscape metrics, were then input into a Cellular Automaton-Artificial Neural Network (CA-ANN) model to project land use patterns for 2030, 2040, and 2050.

The research findings indicate significant growth in built-up areas and grassland over the past three decades, with built-up areas expanding from 258.50 km² to 682.25 km² and grassland increasing from 560.88 km² to 1417.85 km². Simultaneously, rocky bare and bare soil regions were reduced. The model projections suggest continued urban expansion, with rocky bare and bare soil areas likely converting to grassland or built-up areas by 2050, contingent on development stages. Urban growth is anticipated primarily in the eastern and northeastern parts of Karachi due to favorable topography for land utilization and resource management, contributing to a more sustainable city. Furthermore, the methodology employed in this study can be extended to analyze land use patterns in other developing cities. Enhancements to the model’s predictive capabilities would be valuable. The insights gained from this research can inform urban governance and guide the implementation of environmental and social policies, aligning with sustainable development goals (SDGs) by 2030 or 2040.