Introduction

Land use management is commonly supported by modeling land cover changes (Islam et al. 2018; Saputra and Lee 2019), which is used to show deforestation and reforestation phenomena (Aksoy and Kaptan 2021), as well as to examine the socio-economic impacts of land change quantitatively and spatially (Beygi et al. 2019; Dinda et al. 2019). Model accuracy is therefore important, and various studies have addressed it by proposing different modeling approaches (Kourosh et al. 2019; Kindu et al. 2018).

Various modeling approaches have been developed, such as Cellular Automata (CA) (Achmad et al. 2021), Markov Chain (MC) (Beygi et al. 2019), Logistic Regression (LR) (Kindu et al. 2018), CLUE-S (the Conversion of Land Use and its Effects at Small regional extent) (Kucsicsa et al. 2020), and SLEUTH (Slope, Land use, Exclusion, Urban, Transportation, and Hill-shade) (Parchianloo et al. 2021). Furthermore, some Artificial Neural Network (ANN) models, such as MLP-CA-Markov or MLP-Markov (Multilayer Perceptron-CA-Markov) (Leta et al. 2021; Christensen and Jokar Arsanjani 2020), are increasingly in demand because of their higher objectiveness and analytical strength in producing an accurate model (Xu et al. 2019; Gharaibeh et al. 2020).

Besides the modeling method, the selection of an appropriate calibration period and interval is also a major concern. Tripathy and Kumar (2019) suggested short intervals (5 years), whereas Viana and Rocha (2020) and Yalew et al. (2016) recommended long periods of up to 23 years. However, these reports did not analyze the differences that arise from using different calibration intervals. Shooshtari and Gholamalifard (2015) selected appropriate calibration periods and intervals by considering the accuracy of 5 scenarios, but did not explain the patterns by which these time selections affected the appropriateness of the model. Therefore, this study aims to quantitatively test the assumptions of calibration interval and period selection through a more complete scenario-based modeling approach.

To develop a complete scenario, multi-temporal land cover maps are needed, although data availability and production capacity are still a major challenge in Indonesia (Susetyo et al. 2021; Rizaldy and Mayasari 2016). Google Earth Engine (GEE) offers both opportunities and challenges in providing multi-temporal maps for land cover change modeling under calibration interval scenarios. This technology is well suited to map classification because it integrates a web-based Geographic Information System (GIS) with multi-sensor and multi-temporal Remote Sensing (RS) data in a cloud-based processing environment (Christensen and Jokar Arsanjani 2020). However, the geographical conditions of East Kalimantan mean that remote sensing data free from cloud cover is very limited (Capolupo et al. 2020).

This study area was selected because existing reports on its land cover change and deforestation are few, partial, and qualitative. Some related literature predicted that approximately 230 thousand hectares of forest area would be lost between 2003 and 2013 in East Kalimantan (Harris et al. 2008). Based on interviews and literature in 2012–2013, deforestation and land degradation were caused by plantation and mining activities in Paser Regency, East Kalimantan (Angi and Wiati 2017). According to another report, forest cover decreased at a rate of 0.29%/year between 1992 and 2012 in the Kandilo sub-watershed (Nugroho et al. 2018). Analyses between 2018 and 2020 also showed that deforestation was continuing, with reduced forest cover in the Bukit Soeharto Grand Forest Park (Arthayani 2020). Since the Spatial Plan for the East Kalimantan Province in 2016–2036 is being revised, mapping the most recent developments in land cover change and deforestation is indispensable. Therefore, this study aims to carry out the following objectives.

  1. Analyze the classification and validation of multi-temporal land cover maps of East Kalimantan Province, using higher remote-sensing data and Google Earth Engine.

  2. Evaluate the optimal model investigation method based on the calibration interval scenario, to select the appropriate period.

  3. Synthesize the predictions of land cover in 2036 and analyze the spatial patterns of changes for 2016–2036.

Method

Workflow and study area

This experiment began by examining the availability of remote-sensing data and land cover maps in East Kalimantan, Indonesia (Fig. 1). The interval scenario-based model was then developed based on data availability. Each scenario was analyzed to determine which model is optimal for predicting the future. The optimal model was then used to predict and analyze land cover in East Kalimantan in 2036. The province is located between 113º44ʹ E–119º00ʹ E and 2º33ʹ N–2º25ʹ S (Fig. 2). The study area is directly adjacent to North and South Kalimantan Provinces on its northern and southern borders, and to the Makassar Strait and Sulawesi Sea in the east. In the west, it is bordered by Central and West Kalimantan Provinces, as well as the State of Sarawak (East Malaysia).

Fig. 1 Workflow

Fig. 2 Study site

Data availability

The availability of remote sensing data in East Kalimantan centers on the Landsat time series satellite images, which were accessed through the Earth Engine data catalog (Table 1). Cloud-free mosaics per year were obtained based on a 1-year recording time, starting from July 1 of YYYY-1 to June 30 of YYYY.
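The 1-year recording window described above can be sketched in a few lines of Python. This is only an illustration of the date rule stated in the text; the function name mosaic_window is hypothetical and is not part of the Earth Engine API.

```python
from datetime import date
from typing import Tuple


def mosaic_window(year: int) -> Tuple[date, date]:
    """Return the (start, end) dates of the recording window for map year `year`.

    The window runs from July 1 of the previous year to June 30 of `year`,
    as stated in the text. The function name is a hypothetical helper.
    """
    return date(year - 1, 7, 1), date(year, 6, 30)


# Example: the window used for the 2021 land cover map.
start, end = mosaic_window(2021)
print(start.isoformat(), end.isoformat())
```

The two dates could then be passed as the start and end of a date filter when the annual mosaic is assembled.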

Table 1 Primary data used

Model scenario

Inspired by Shooshtari and Gholamalifard (2015) and based on data availability in East Kalimantan, 14 scenarios were arranged (Table 2), each with a different calibration interval and period (T0-T1). In this case, T2 was the land cover prediction for the first year, corresponding to the 1st iteration of the calibration interval. Predictions continued in the same way until the 10th-year prediction (T11). In addition, the 14 selected scenarios were the calibration periods used to predict land cover in 2036, through different numbers of iterations. In this process, 11 multi-temporal land cover maps were needed and were classified using the Random Forest Classification method.

Table 2 Calibration interval scenario

Random forest classification

Random Forest (RF) is the most widely applied machine learning (ML) algorithm in the Google Earth Engine (GEE) environment for land cover map classification (Christensen and Jokar Arsanjani 2020; Tassi et al. 2021; Ozelkan 2020). This method reportedly produces better classification accuracy than other ML techniques (Rudiastuti et al. 2021). RF classification uses bagging aggregation to build multiple Decision Trees (DT) and combines their outputs by majority voting to develop accurate predictions (Tassi et al. 2021). RF classification was conducted with the ee.Classifier.smileRandomForest algorithm in the GEE, using 10 DT (Tassi et al. 2021). In this process, 70% of the training data was used to train the classifier with the ee.Classifier.train algorithm, producing a trained RF model. The land cover was then classified from the Landsat image input data (5, 7, and 8) with the ee.Image.classify algorithm, using the trained model.
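Two ideas in the paragraph above, the 70/30 division of the sample data and majority voting across decision trees, can be illustrated with a minimal pure-Python sketch. It does not call the GEE API; the function names, the random seed, and the class labels are assumptions made only for illustration.

```python
import random
from collections import Counter


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and validation parts.

    The 70% default follows the text; the seed is an arbitrary assumption.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]


# 100 hypothetical sample IDs split 70/30.
train, validation = split_samples(range(100))
print(len(train), len(validation))

# Hypothetical votes of 10 decision trees for a single pixel.
votes = ["forest"] * 6 + ["plantation"] * 3 + ["mixed crop"]
print(majority_vote(votes))
```

In a real random forest each tree is trained on a bootstrap sample of the training part; the sketch only shows how the individual tree outputs are combined.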

To eliminate the “salt and pepper” effect, the ee.Image.focalMode algorithm was used with the following parameters: (i) radius of 30, (ii) square kernel type, (iii) unit in meters, and (iv) 2 iterations. The accuracy of the land cover maps was tested using the following methods:

  1. Using 30% of training data (Tassi et al. 2021; Rudiastuti et al. 2021).

  2. Implementing the reference map from the MoEF.

  3. Adopting 350 Independent Check Points (ICP) identified from high-resolution imagery available in the Google Earth application (Wibowo et al. 2016).
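Each of the three tests above yields a confusion matrix from which Overall Accuracy (OA) and Kappa (K) are derived. The following sketch shows the standard computation; the 2-class matrix is hypothetical and does not come from this study.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute Overall Accuracy and Cohen's Kappa from a square confusion matrix.

    matrix[i][j] is the number of check points of reference class i
    that were classified as class j.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    oa = correct / total

    # Expected chance agreement from the row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    pe = sum(r * c for r, c in zip(row_totals, col_totals)) / (total ** 2)

    kappa = (oa - pe) / (1 - pe)
    return oa, kappa


# Hypothetical confusion matrix for two classes and 100 check points.
example = [
    [45, 5],
    [10, 40],
]
oa, kappa = overall_accuracy_and_kappa(example)
print(round(oa, 3), round(kappa, 3))
```

Kappa is lower than OA because it discounts the agreement that would be expected by chance.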

Optimal model investigation

The optimal model investigation was carried out using the Land Change Modeler (LCM) module in the TerrSet 2020 software. This module is commonly used in land cover change reports through the MLP-Markov method because it is easy to implement (Christensen and Jokar Arsanjani 2020; Rudiastuti et al. 2021; Silva et al. 2020). In the CA-Markov method, land cover allocation is often implemented through a suitability map and Cellular Automata (CA), whereas the LCM module uses a potential transition map in each sub-model to allocate land cover changes (Eastman 2022). The model investigation was carried out through validity and accuracy tests of the predicted land cover maps (T2 to T11). Based on a literature study, the modeling process considered 14 driving variables: the distances from the primary road, rivers and water bodies, transportation infrastructure, city center, business center, agricultural land, forest and mangrove edges, and built-up area, together with altitude, land slope, slope orientation, rainfall, surface temperature, and population density (Angriani et al. 2018; Christensen and Jokar Arsanjani 2020; Descals et al. 2019; Dinda et al. 2019; Grinand et al. 2020; Gupta and Sharma 2020; Hadi et al. 2016; Hakim et al. 2021; Hamidah and Santoso 2021; Iizuka et al. 2017; Islam et al. 2018; Kourosh Niya et al. 2019; Leta et al. 2021; Navarro Cerrillo et al. 2020; Pazúr et al. 2020; Rimal et al. 2017; Shooshtari et al. 2020; Shooshtari and Gholamalifard 2015; Tripathy and Kumar 2019; Li et al. 2017; Xu et al. 2019; Yalew et al. 2016; Zakiy Muwafiq et al. 2018).

Prediction of land cover in 2036

The selected calibration period and interval (T0-T1) were used to predict land cover in 2036. The map obtained was subsequently analyzed spatially and temporally with T0 and T1. At this stage, land cover change analysis was carried out to determine the dynamics of land cover change in East Kalimantan Province between 2016 and 2036.

Results and discussion

Result in multi-temporal land cover map

Eleven multi-temporal land cover maps were successfully classified with good accuracy. Table 3 shows the Overall Accuracy (OA) and Kappa (K) values for each map. The validation with 30% of the training data showed very good accuracy, with OA > 97% and K > 0.97. This high accuracy indicated a level of consistency in obtaining training data for each land cover class. Maps from government agencies were therefore also used as a reference to validate the classification outputs.

Table 3 Accuracy of land cover maps from Random Forest classification result

Against the reference land cover map of the MoEF, OA ranged from 73.47 to 78.17%, although the K-value ranged only from 0.581 to 0.654. The marked difference in accuracy levels was due to the different methods used to produce the land cover maps. The reference map was produced by on-screen digitization, updating the areas that had changed, while the present analysis used the Random Forest classification model. Based on the ICP test, good classification outputs were observed, with OA and K values ranging from 71.43 to 85.14% and from 0.667 to 0.827, respectively. Since the outputs for 2000 and 2002 showed a Kappa value of less than 0.7, these years were not included in the modeling scenario.

Selected calibration period and interval

The results of the optimal model investigation are presented in a matrix, which included the accuracy value of the predicted land cover map in the T2-T11 (Table 4). Regarding scenarios 1, 5, 6, and 8, the accuracy level of the maps decreased with prediction time. This indicated that a longer calibration interval led to a greater accuracy decrease over the prediction time. These results were in line with Valencia et al. (2020) and Qiang and Lam (2015), where the predictions for longer periods were unreliable.

Table 4 Optimal model investigation matrix

Based on Table 4, the prediction quality of T2-T5 remained acceptable up to the 3rd iteration, with kappa and accuracy values greater than 0.7 and 70%, respectively. This was not in line with Valencia et al. (2020), where the average ideal prediction period was 1–2 times the calibration period. These results indicated that predictions of future land cover remained reliable up to the 3rd prediction iteration. According to Valencia et al. (2020), maximum accuracy was obtained by starting modeling from the last year of calibration, meaning that a calibration period closer to the prediction year led to more accurate outputs. These conditions were observed in scenarios 1, 3, 4, 5, 6, 7, 8, 9, 11, and 12.

According to Estoque and Murayama (2012), model validation is an important step in land cover change modeling to produce an optimal system. The validation was generally carried out by testing the outputs in the first prediction year (T2). When the accuracy and Kappa levels are more than 70% and 0.7, respectively, the model is eligible to predict the next iteration (Christensen et al. 2020). Meanwhile, Rafaai et al. (2020) stated that the accuracy of the transition sub-model serves as a benchmark for the appropriateness of future land cover prediction. In this case, only a sub-model with an accuracy greater than 80% should be used in forecasting. Scenario 8 supported this: with transition sub-model accuracy ranging from 81.78 to 91.67%, land cover predictions in the 3rd iteration still maintained an accuracy of > 80%.
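The two screening rules in this paragraph can be written as simple threshold checks. The sketch below uses the thresholds quoted in the text (70%, 0.7, and 80%); the function names and the sub-model names and values in the example are hypothetical.

```python
def model_is_eligible(accuracy_pct, kappa, min_accuracy=70.0, min_kappa=0.7):
    """Check the T2 validation rule: accuracy above 70% and Kappa above 0.7."""
    return accuracy_pct > min_accuracy and kappa > min_kappa


def usable_submodels(submodel_accuracy, min_accuracy=80.0):
    """Keep only the transition sub-models whose accuracy exceeds 80%."""
    return [name for name, acc in submodel_accuracy.items() if acc > min_accuracy]


# Hypothetical validation result for the first prediction year (T2).
print(model_is_eligible(accuracy_pct=82.5, kappa=0.74))

# Hypothetical sub-model accuracies (percent).
accuracies = {
    "transition A": 91.67,
    "transition B": 81.78,
    "transition C": 74.10,
}
print(usable_submodels(accuracies))
```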

There are three considerations in choosing the calibration period from several scenarios. The first is the length of the prediction time. The prediction year to be achieved is 2036. The second is the accuracy of the model validation results in the first year of prediction (T2). The third is the number of prediction iterations whose accuracy can still be relied upon, namely up to the 3rd iteration. Based on these three considerations, the best calibration interval that can be chosen is scenario 14 with the period 1996–2006. However, this period has long intervals (10 years). Predictions with longer intervals are unreliable (Valencia et al. 2020; Qiang and Lam 2015), and this has also been proven in the model simulations that have been carried out.

However, considering the findings in scenario 8 and the availability of data, the calibration period can be moved closer to the predicted year without needing to be validated at T2. Valencia et al. (2020) also agreed that the closer the calibration period is to the prediction year, the more accurate the prediction results can be. By choosing transitional sub-models with an accuracy of more than 80%, the prediction accuracy in the 3rd iteration can still be relied upon. This finding is also in line with Rafaai et al. (2020). Therefore, the calibration period was not selected from the 14 scenarios but was synthesized from the empirical findings. The 2016–2021 period was concluded to be the optimal calibration period because its moderate interval (5 years) carries a smaller risk for longer future predictions, and because predictions for 2036 can be reached with 3 prediction iterations.
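The number of prediction iterations follows directly from the calibration interval and the target year. A minimal sketch of that arithmetic is given below; the function name is hypothetical.

```python
def prediction_iterations(t0, t1, target_year):
    """Return how many calibration-interval steps lead from T1 to the target year.

    The interval is T1 - T0. A ValueError is raised when the target year is
    not reachable with a whole number of steps.
    """
    interval = t1 - t0
    if interval <= 0:
        raise ValueError("T1 must be later than T0")
    span = target_year - t1
    if span < 0 or span % interval != 0:
        raise ValueError("target year is not a whole number of intervals after T1")
    return span // interval


# Calibration period 2016-2021 (5-year interval), target year 2036.
print(prediction_iterations(2016, 2021, 2036))

# Scenario 14 from the text: 1996-2006 (10-year interval), target year 2036.
print(prediction_iterations(1996, 2006, 2036))
```

Both periods reach 2036 in three iterations, which is why the choice between them rests on interval length and proximity to the prediction year rather than on the iteration count.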

Land cover prediction in 2036

Land cover maps for 2016 and 2021 (Fig. 3) were used as input data to model land cover changes from 2016 to 2021 and then predict land cover for 2036. Each land cover change transition is a transitional sub-model, which is then trained with an MLP neural network. The results of this stage are 42 maps of potential land cover change transitions, which represent spatially the possibility of each land cover class in East Kalimantan changing into another land cover class. This stage also identified the driving variables with the most significant influence on land cover change in East Kalimantan, namely slope, aspect, altitude, rainfall, distance from rivers, distance from transportation infrastructure, and distance from agricultural land.

Fig. 3 a Land cover maps for 2016. b Land cover maps for 2021.

As recommended by Rafaai et al. (2020), only transitional sub-models with an accuracy of more than 80% are used to predict land cover in 2036. For the “mixed crop-forest dryland” sub-model, the accuracy was less than 80%. However, because the land cover change in this sub-model reaches 3,850 km2, its accuracy was optimized by adjusting the training parameters of the MLP neural network: 0.005 for the start learning rate, 0.0005 for the end learning rate, 10 hidden layers, and an increased number of iterations of 12,500. This raised the accuracy of the sub-model from 79.98 to 80.12%. The same approach was applied to the “mixed crop-plantation” and “plantation-mixed crop” sub-models. Although their accuracy improved by up to 7%, the final accuracy could not reach 80%. Given the large area of land cover change being modeled, both sub-models were still included.

Thus, 31 out of 42 sub-models, with an accuracy of more than 80% and an area of more than 1,000 km2, were used to model the land cover in 2036. Each of the resulting optimal transition sub-models is mapped as a potential transition. One of the 31 resulting transition potential maps can be seen in Fig. 4a. The higher the value (red color), the greater the potential for changing the “dryland forest” to a “mixed crop” land cover class. Conversely, the lower the value (dark blue-black), the lower the potential for that change. Each resulting transition potential map shows spatially the probability of transition from one land cover class to another.

Fig. 4 a One of 31 potential transition maps (dryland forest-mixed crops transition). b Prediction results of the 2036 land cover map

In addition to the maps of potential transitions, a transition probability matrix, obtained from the Markov Chain analysis, is also needed to make predictions (Table 5). In this context, the potential transition maps determine the spatial allocation of the land cover change probability, while the MC determines the quantity of land cover area changing from the calibration period to 2036. The prediction result from the 2016–2021 model with 3 prediction iterations was a land cover map for 2036 (Fig. 4b).
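The role of the Markov Chain can be illustrated by projecting class areas forward with a transition probability matrix. The sketch below is a simplification that applies the same one-interval matrix once per iteration; the two-class areas and probabilities are hypothetical and are not the values of Table 5, and the way TerrSet computes the matrix internally is not reproduced here.

```python
def project_areas(areas, transition, iterations):
    """Project land cover areas forward with a Markov transition matrix.

    areas[i] is the current area of class i.
    transition[i][j] is the probability that class i becomes class j
    within one calibration interval (each row sums to 1).
    """
    current = list(areas)
    n = len(current)
    for _ in range(iterations):
        # New area of class j = sum over all classes i of area_i * P(i -> j).
        current = [
            sum(current[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return current


# Hypothetical example with two classes: forest and non-forest.
areas_t1 = [80.0, 20.0]
matrix = [
    [0.9, 0.1],  # forest stays forest (0.9) or becomes non-forest (0.1)
    [0.0, 1.0],  # non-forest stays non-forest
]
projected = project_areas(areas_t1, matrix, iterations=3)
print([round(a, 2) for a in projected])
```

The total area is conserved at every step because each row of the matrix sums to 1; only the distribution among classes changes.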

Table 5 Transition probability matrix

Land cover change 2016–2036

According to Fig. 5a, a slight increase was observed in the dryland forest area from 2016 to 2018. The area then declined, and the decline is predicted to continue until only 65,629 km2 remains in 2036. This decreasing trend was accompanied by an increase in the land cover area for mixed crops, plantations, and built-up areas from 2018 to 2021 (Fig. 5b). The transition from dryland forest to mixed crops was the largest land cover change in 2016–2021, at 5,504.98 km2.

Fig. 5 a Diagram of land cover change 2016–2036 in East Kalimantan Part 1 b Part 2 c Part 3. d Land cover change map 2016–2021

In the 2016–2021 dominant land cover change map, the largest transitions, with an area larger than 1,000 km2, were sporadically distributed across East Kalimantan, covering most of the province (Fig. 5d). The transitions from “dryland forests” to “mixed crops” and vice versa were among the 3 most dominant changes. Furthermore, the greatest land cover change in East Kalimantan mostly occurred in sloping to flat areas, with a slope of less than 4° and an altitude below 100 m (Fig. 6). This was in line with the simulation of the optimal land cover change model, where the topographical factor was the most influential variable in East Kalimantan.

Fig. 6 a Land cover changes in a flat area b at an altitude below 100 m

Based on the optimal model, the driving variables, ranked from most to least influential on land transition, are: land slope, land slope direction, altitude, distance from the river, rainfall, distance from transportation infrastructure, distance from agricultural land, distance from the forest and mangrove edges, distance from the primary road, distance from built-up area, distance from the city center, distance from the business center, surface temperature, and population density.

According to Fig. 5c, forest cover decreased from a total of 79,843 km2 in 2016 to 76,589 km2 in 2021. This downward trend is predicted to continue until 2036, when forest cover is expected to fall to 69,203 km2. Spatially, the dynamics of forest cover changes between 2016 and 2036 are shown in Fig. 7a–e. Harris et al. (2008) predicted deforestation between 2003 and 2013 in East Kalimantan encompassing an area of 230 thousand hectares, or 2,300 km2, with an estimated rate of 230 km2/year. Meanwhile, the total rate reached 651 km2/year in 2016–2021. This showed an increase in the deforestation rate in East Kalimantan from 2003–2013 to 2016–2021.
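The rates quoted here can be checked with simple arithmetic on the forest cover totals. The sketch below assumes the rate is the net change in forest cover divided by the number of years, which reproduces the reported figures; the function name is hypothetical.

```python
def annual_loss_rate(area_start_km2, area_end_km2, year_start, year_end):
    """Return the mean net forest loss per year in km2/year."""
    return (area_start_km2 - area_end_km2) / (year_end - year_start)


# Forest cover totals reported in the text (km2).
rate_2016_2021 = annual_loss_rate(79843, 76589, 2016, 2021)
rate_2021_2036 = annual_loss_rate(76589, 69203, 2021, 2036)

# Harris et al. (2008): 2,300 km2 over 2003-2013.
rate_harris = 2300 / (2013 - 2003)

print(round(rate_2016_2021), round(rate_2021_2036), round(rate_harris))
```

Under this assumption the 2016–2021 rate is about 651 km2/year, and the same calculation for 2021–2036 gives about 492 km2/year.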

Fig. 7 a Forest cover maps for 2016 b 2018 c 2020 d 2021 e Map of change in forest cover f deforestation and reforestation map in 2016–2021

For the 2021–2036 period, the deforestation rate decreased to 492 km2/year, indicating a nonlinear pattern in longer-term predictions. According to Qiang and Lam (2015) and Valencia et al. (2020), the trend of land cover change is often not linear and static, even when the driving variables are assumed not to change. This indicated that topographical factors act as the main driving and inhibiting variables for land cover change. This is consistent with Sandy (1985), where the slope was the main limiting factor for land use. Nugroho et al. (2018) also showed the effect of land slope on the deforestation rate, which tends to decrease in steep areas. Therefore, the relationship between land cover change and topographical factors needs further study, especially to compare the use of static and dynamic variables.

Figure 7f shows the spatial distribution of deforestation and reforestation in 2016–2021. These phenomena occurred mostly in mining and plantation areas. Figure 8a shows deforestation due to land-clearing activities to expand the mining area. It also shows reforestation resulting from reclamation activities within the coal mining areas in Batusopang and Muarasamu Districts, Paser Regency. According to Angi and Wiati (2017), deforestation was also detected within Paser District in 2012–2013, through interviews and literature reviews. Based on the East Kalimantan spatial planning map for 2016–2036, that area was designated as a production forest area, which can be used for open pit mining activities with a borrow-use permit (Regional Spatial Plan 2016, PP 24 of 2010). However, the east side of the mining area is designated for plantations and should not be used for mining activities.

Fig. 8 a Forest cover change map: in the mining area in Paser Regency b in plantation areas in Kutai Regency c and Berau Regency d in a production forest area in the New State Capital area

In Bentian Besar and Siluqngurai Subdistricts, Kutai Regency, deforestation also occurred due to the expansion of the plantation area (Fig. 8b). Figure 8c shows plantation expansion in Segah Subdistrict, Berau Regency. The expansion of both plantations aligned with the allotment of agricultural areas in the 2016–2036 spatial planning map of East Kalimantan Province. Deforestation and reforestation were also observed in the production forest areas around the New National Capital (Fig. 8d). In this area, these phenomena were part of the industrial plantation forest production activities performed in 2016–2021. According to the Minister of Environment and Forestry, Siti Nurbaya Bakar, 36,832 hectares of production forest were released for the development of the New National Capital area (CNN Indonesia 2022).

The possibility of deforestation and reforestation in 2021–2036 was observed in the East Kalimantan map (Fig. 9), although the prediction was not linear. Besides the downward trend of deforestation in 2021–2036, several of the observed case studies also showed that the Regional Spatial Plan (2016–2036) was functioning adequately in the control of land use. A review of this plan is needed to accommodate unsuitable developments, establish a New National Capital strategy, and act as a guide for controlling future land conversion.

Fig. 9 a Forest cover change Map 2021–2036

Conclusions

Multi-temporal land cover maps

Eleven multi-temporal land cover maps for 1997, 1998, 2000, 2002, 2003, 2014, 2015, 2016, 2018, 2020, and 2021 were successfully developed with good accuracy. From the ICP-based accuracy test, the Overall Accuracy and Kappa values ranged from 71.43 to 85.14% and 0.667–0.827, respectively.

Optimal model investigation method

The optimal model investigation was carried out by testing the theory of selecting optimal calibration intervals and periods through scenario-based modeling of calibration intervals. The simulation of 14 scenarios showed that the accuracy of the prediction model decreased over time. Future land cover predictions remained reliable up to the third iteration. The results also showed that maximum accuracy was obtained by starting prediction modeling from the last year of calibration. By maintaining the accuracy of the transition sub-model above 80%, the calibration period could be moved closer to the prediction year. The selected optimal calibration period was therefore 2016–2021, with a five-year calibration interval and three prediction iterations.

Land cover change in East Kalimantan from 2016–2036

In 2016–2021, the land cover changes in East Kalimantan were dominated by the transition from “dryland forest” to “mixed crops”, encompassing an area of 5,504.98 km2. The dominant transition occurred on sloping to flat areas, with a slope of < 4° and at an altitude of < 100 m. The topographical factor was the most influential driving variable in the decrease of forest cover in East Kalimantan, from a total of 79,843 km2 in 2016 to 76,589 km2 in 2021. This area is estimated to fall to 69,203 km2 in 2036.

The total deforestation rate in 2016–2021 reached 651 km2 per year, although it is predicted to decrease to 492 km2 per year between 2021 and 2036. This showed that the rate of forest reduction was not linear in longer-term predictions. Deforestation and reforestation were also found to occur in the mining, plantation, and production forest areas around the New National Capital area. Based on the spatial overlay, the 2016–2036 East Kalimantan Provincial Spatial Plan was still functioning appropriately in controlling land use. A review of this plan is necessary to accommodate unsuitable developments, establish a National Capital development strategy, and act as a guide to controlling future land conversion.