Improving SVM Training by Means of NTIL When the Data Sets Are Imbalanced

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4203)

Abstract

This paper deals with the problem of training a discriminative classifier when the data sets are imbalanced. More specifically, it is concerned with classifying a sample as belonging, or not, to a Target Class (TC) when the number of examples from the “Non-Target Class” (NTC) is much higher than that of the TC. The heuristic method called Non Target Incremental Learning (NTIL) extracts, from the pool of NTC representatives, the most discriminant training subset with regard to the TC; its effectiveness has already been shown when an Artificial Neural Network is used as the classifier (ISMIS 2003). In this paper the effectiveness of the method is also shown for Support Vector Machines.
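The abstract does not spell out the NTIL procedure, so the following is only a minimal sketch of one plausible reading of "incremental" NTC selection, not the author's algorithm. It assumes that the subset starts from a few randomly chosen NTC samples and grows by repeatedly adding the unused NTC samples that the current classifier scores closest to the TC. The function names, the parameters `initial`, `step` and `rounds`, the toy linear SVM trainer, and the synthetic 2-D data are all hypothetical.

```python
import random

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Toy linear SVM: Pegasos-style stochastic subgradient descent on hinge loss."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1.0:  # sample violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def decision_value(model, x):
    """Signed score of x; larger means 'looks more like the TC'."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def select_ntc_subset(tc, ntc_pool, initial=5, step=5, rounds=3, seed=0):
    """Hypothetical incremental selection of NTC training samples (assumed scheme)."""
    rng = random.Random(seed)
    remaining = list(range(len(ntc_pool)))
    rng.shuffle(remaining)
    selected, remaining = remaining[:initial], remaining[initial:]

    def fit(indices):
        X = tc + [ntc_pool[i] for i in indices]
        y = [1] * len(tc) + [-1] * len(indices)
        return train_linear_svm(X, y)

    for _ in range(rounds):
        model = fit(selected)
        # Rank unused NTC samples: highest score = most easily confused with the TC.
        remaining.sort(key=lambda i: decision_value(model, ntc_pool[i]), reverse=True)
        selected = selected + remaining[:step]
        remaining = remaining[step:]
    return selected, fit(selected)  # final model trained on the grown subset

# Synthetic imbalanced data (illustrative only): 10 TC samples, 200 NTC samples.
data_rng = random.Random(42)
tc = [[data_rng.gauss(2.0, 0.5), data_rng.gauss(2.0, 0.5)] for _ in range(10)]
ntc_pool = [[data_rng.gauss(0.0, 1.5), data_rng.gauss(0.0, 1.5)] for _ in range(200)]
selected, model = select_ntc_subset(tc, ntc_pool)
print(len(selected), "NTC samples kept out of", len(ntc_pool))
```

Under these assumptions the classifier is trained on the TC samples plus a small, deliberately hard NTC subset rather than on the whole pool, which keeps the two classes closer to balance. The selection criterion, subset sizes, and stopping rule actually used by NTIL should be taken from the paper itself.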

This work has been supported by Ministerio de Ciencia y Tecnología, Spain, under Project TIC2003-08382-C05-03.

References

  1. Batista, G.: A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations 6(1), 20–29 (2004)

  2. Chawla, N.V., Japkowicz, N., Kotcz, A.: Editorial: Special issue on learning from imbalanced data sets. SIGKDD Explorations 6(1), 1–6 (2004)

  3. Cortes, C., Vapnik, V.: Support-vector networks. Machine Learning 20(3), 273–297 (1995)

  4. Farrell, K.R., Mammone, R.J., Assaleh, K.T.: Speaker recognition using neural networks and conventional classifiers. IEEE Transactions on Speech and Audio Processing, part II, 2(1) (1994)

  5. Japkowicz, N., Stephen, S.: The class imbalance problem: A systematic study. Intelligent Data Analysis 6(5), 429–449 (2002)

  6. Juszczak, P., Duin, R.P.W.: Uncertainty sampling methods for one-class classifiers. In: Proc. of the Workshop on Learning from Imbalanced Datasets II, ICML (2003)

  7. Mansfield, A.J., Wayman, J.L.: Best practices in testing and reporting performance of biometric devices, version 2.01. Technical report (2002)

  8. del Brio, B.M., Sanz Molina, A.: Redes Neuronales y Sistemas Borrosos. Ra-Ma (1997)

  9. Osuna, E., Freund, R., Girosi, F.: Support vector machines: Training and applications. Technical report (1997)

  10. Solomonoff, A., Quillen, C., Campbell, W.M.: Channel compensation for SVM speaker recognition. In: Proc. Odyssey 2004, the Speaker and Language Recognition Workshop, May 31 - June 3 (2004)

  11. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)

  12. Vivaracho, C.E., Ortega-Garcia, J., Alonso, L., Moro, Q.I.: Extracting the most discriminant subset from a pool of candidates to optimize discriminant classifier training. In: Zhong, N., Raś, Z.W., Tsumoto, S., Suzuki, E. (eds.) ISMIS 2003. LNCS (LNAI), vol. 2871, pp. 640–645. Springer, Heidelberg (2003)

  13. Vivaracho-Pascual, C., Ortega-Garcia, J., Alonso-Romero, L., Moro-Sancho, Q.: A comparative study of MLP-based artificial neural networks in text-independent speaker verification against GMM-based systems. In: Dalsgaard, B.L.P., Benner, H. (eds.) Proc. of Eurospeech 2001, ISCA, September 3-7, vol. 3, pp. 1753–1756 (2001)

  14. Weiss, G.M.: Mining with rarity: A unifying framework. SIGKDD Explorations 6(1), 7–19 (2004)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vivaracho, C.E. (2006). Improving SVM Training by Means of NTIL When the Data Sets Are Imbalanced. In: Esposito, F., Raś, Z.W., Malerba, D., Semeraro, G. (eds) Foundations of Intelligent Systems. ISMIS 2006. Lecture Notes in Computer Science, vol 4203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875604_14

  • DOI: https://doi.org/10.1007/11875604_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45764-0

  • Online ISBN: 978-3-540-45766-4

  • eBook Packages: Computer Science, Computer Science (R0)
