Classifying Remote Sensing Data with Support Vector Machines and Imbalanced Training Data

  • Björn Waske
  • Jon Atli Benediktsson
  • Johannes R. Sveinsson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5519)


This paper addresses the classification of remote sensing data with imbalanced training data. The classification accuracy of a supervised method is affected by several factors, such as the classifier algorithm, the input data, and the available training data. An imbalanced training set, i.e., one in which the number of training samples from one class is much smaller than from the other classes, often results in low classification accuracies for the small classes. In the present study, support vector machines (SVM) are trained with imbalanced training data. To handle the imbalance, the training data are resampled (i.e., bagging) and a multiple classifier system with SVM as the base classifier is generated. For comparison, a single SVM is also applied to the data, trained on both the original balanced training set and the imbalanced training set. The results underline that SVM classification is affected by imbalanced data sets, which lead to predominantly lower classification accuracies for classes with fewer training samples. Moreover, a detailed accuracy assessment demonstrates that the proposed approach significantly improves on the class accuracies achieved by a single SVM trained on the whole imbalanced training data set.
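
The resampling-plus-ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the bootstrap replicates are drawn, so the sketch assumes each replicate takes the same number of samples per class (the size of the smallest class, drawn with replacement), and it combines members by simple majority vote. To stay runnable with the standard library only, a nearest-mean classifier stands in for the SVM base classifier; all function and class names here are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap(samples, labels, rng):
    """Draw one bootstrap replicate with equal class sizes.

    Assumption: every class contributes as many samples (with replacement)
    as the smallest class contains; the paper's exact scheme may differ.
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_per_class = min(len(xs) for xs in by_class.values())
    boot_x, boot_y = [], []
    for y, xs in by_class.items():
        for _ in range(n_per_class):
            boot_x.append(rng.choice(xs))
            boot_y.append(y)
    return boot_x, boot_y


class NearestMean:
    """Stand-in base classifier (NOT an SVM): picks the class with the closest mean."""

    def fit(self, samples, labels):
        counts = Counter(labels)
        sums = {}
        for x, y in zip(samples, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.means = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict_one(self, x):
        def sq_dist(mean):
            return sum((a - b) ** 2 for a, b in zip(x, mean))
        return min(self.means, key=lambda y: sq_dist(self.means[y]))


def train_ensemble(samples, labels, n_members=10, seed=0, make_base=NearestMean):
    """Train one base classifier per balanced bootstrap replicate."""
    rng = random.Random(seed)
    return [make_base().fit(*balanced_bootstrap(samples, labels, rng))
            for _ in range(n_members)]


def predict_majority(ensemble, x):
    """Combine the members' outputs by majority vote."""
    votes = Counter(member.predict_one(x) for member in ensemble)
    return votes.most_common(1)[0][0]
```

To move closer to the setting in the abstract, `make_base` could be replaced by a wrapper around an SVM implementation that exposes the same `fit` and `predict_one` methods; that interface is an assumption of this sketch, not part of any particular library.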


land cover classification, multispectral, support vector machines, bagging, imbalanced training data





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Björn Waske (1)
  • Jon Atli Benediktsson (1)
  • Johannes R. Sveinsson (1)
  1. Faculty of Electrical and Computer Engineering, University of Iceland, Reykjavik, Iceland
