Mapping Forests Using an Imbalanced Dataset

  • Original Contribution
Journal of The Institution of Engineers (India): Series B

Abstract

Forests play a major role in maintaining the ecological stability of a region. In recent years, rampant tourism and other human activities have reduced the area covered by forests. Tracking the forest land lost through regular land surveying is often difficult. Machine learning classifiers applied to remotely sensed images can map the land cover of a region instead. The challenge in this experiment is that the classes are imbalanced, so the classifiers tend to be biased toward the class with the greater number of training samples. The novelty of the work is handling this imbalance at the training data level, by training a parameter-tuned Random Forest classifier on area-proportionally sampled training samples. The results show that, once the classifier is tuned, area-proportional allocation of training samples per class achieves the best classification results. The overall accuracy is 90.5% with a kappa of 0.85 for uniform sampling, and 94.6% with a kappa of 0.92 for area-proportional sampling.
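The two sampling schemes and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class names, area figures, and sample budget are hypothetical, and the largest-remainder rounding used to make the counts sum to the budget is an assumption made here for illustration. The second function computes overall accuracy and Cohen's kappa from a confusion matrix using the standard definitions.

```python
from typing import Dict, List, Tuple


def uniform_allocation(class_areas: Dict[str, float], total: int) -> Dict[str, int]:
    """Give every class the same number of training samples."""
    per_class = total // len(class_areas)
    return {name: per_class for name in class_areas}


def area_proportional_allocation(class_areas: Dict[str, float], total: int) -> Dict[str, int]:
    """Give each class a share of the samples proportional to its mapped area."""
    total_area = sum(class_areas.values())
    quotas = {name: total * area / total_area for name, area in class_areas.items()}
    counts = {name: int(quota) for name, quota in quotas.items()}
    # Assumption: samples lost to rounding down go to the classes with the
    # largest fractional parts (largest-remainder rounding).
    leftover = total - sum(counts.values())
    by_fraction = sorted(quotas, key=lambda n: quotas[n] - counts[n], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


def accuracy_and_kappa(confusion: List[List[int]]) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical class areas (any consistent unit) and sample budget.
example_areas = {"forest": 60.0, "water": 10.0, "built_up": 30.0}
print(uniform_allocation(example_areas, 900))
print(area_proportional_allocation(example_areas, 1000))
```

With these hypothetical areas, uniform allocation assigns the same count to every class, while area-proportional allocation gives the largest class the most samples, so the training set mirrors the class proportions on the ground.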

Fig. 1 (map courtesy: https://indiamaps.gov.in/soiapp/)

Acknowledgements

The authors sincerely thank the management of BNM Institute of Technology for their continuous encouragement during the period of the work undertaken. The authors also thank Visvesvaraya Technological University for providing a platform to conduct the research work.

Funding

No funding received.

Author information

Corresponding author

Correspondence to Keerti Kulkarni.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kulkarni, K., Vijaya, P.A. Mapping Forests Using an Imbalanced Dataset. J. Inst. Eng. India Ser. B 103, 1987–1994 (2022). https://doi.org/10.1007/s40031-022-00790-y

  • DOI: https://doi.org/10.1007/s40031-022-00790-y
