Abstract
Forests play a major role in maintaining the ecological stability of a region. In recent years, rampant tourism and other human activities have reduced the area covered by forests. Tracking the forest land lost through regular land surveys is often difficult. Machine learning classifiers applied to remotely sensed images can map the land cover of a region. The challenge in this experiment is that the classes are imbalanced, so the classifiers tend to be biased toward the class with the most training samples. The novelty of this work is that it handles the imbalance at the training-data level, by training a parameter-tuned Random Forest classifier on area-proportionally sampled training samples. The results show that, once the classifier is tuned, area-proportional allocation of training samples per class achieves the best classification results. The overall accuracy is 90.5% with a kappa of 0.85 for uniform sampling, and 94.6% with a kappa of 0.92 for area-proportional sampling.
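As a rough illustration of the two sampling schemes compared above, the sketch below divides a fixed training-sample budget among classes either uniformly or in proportion to class area, and computes overall accuracy and Cohen's kappa from a confusion matrix. It is not the authors' implementation: the class names, areas, sample budget, and the largest-remainder rounding rule are assumptions chosen for the example, and the confusion matrix is a toy one rather than the study's results.

```python
def allocate_uniform(class_areas, total_samples):
    """Give every class (almost) the same number of training samples."""
    base, extra = divmod(total_samples, len(class_areas))
    # The first `extra` classes absorb the remainder, one sample each.
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(class_areas)}


def allocate_area_proportional(class_areas, total_samples):
    """Split the budget in proportion to class area.

    Rounding uses the largest-remainder rule (an assumption of this
    sketch) so that the counts always sum to total_samples.
    """
    total_area = sum(class_areas.values())
    quotas = {c: total_samples * a / total_area for c, a in class_areas.items()}
    alloc = {c: int(q) for c, q in quotas.items()}
    leftover = total_samples - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical class areas (any consistent unit) and sample budget.
areas = {"forest": 620, "water": 80, "built_up": 300}
uniform = allocate_uniform(areas, 500)
proportional = allocate_area_proportional(areas, 500)
print(uniform)
print(proportional)

# Toy two-class confusion matrix, not data from the study.
acc, kappa = accuracy_and_kappa([[45, 5], [5, 45]])
print(round(acc, 3), round(kappa, 3))
```

Under uniform allocation a small class such as water receives as many samples as forest, whereas area-proportional allocation makes the training set mirror the class shares on the ground, which is the property the abstract credits for the better result.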
Acknowledgements
The authors sincerely thank the management of BNM Institute of Technology for their continuous encouragement during this work. The authors also thank Visvesvaraya Technological University for providing a platform to conduct the research.
Funding
No funding received.
Ethics declarations
Conflict of interest
None.
About this article
Cite this article
Kulkarni, K., Vijaya, P.A. Mapping Forests Using an Imbalanced Dataset. J. Inst. Eng. India Ser. B 103, 1987–1994 (2022). https://doi.org/10.1007/s40031-022-00790-y
DOI: https://doi.org/10.1007/s40031-022-00790-y