Improving SVM Training by Means of NTIL When the Data Sets Are Imbalanced

Vivaracho, Carlos E.

doi:10.1007/11875604_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4203)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

This paper deals with the problem of training a discriminative classifier when the data sets are imbalanced. More specifically, it is concerned with classifying a sample as belonging, or not, to a Target Class (TC) when the number of examples from the “Non-Target Class” (NTC) is much higher than the number from the TC. The heuristic method called Non Target Incremental Learning (NTIL) extracts, from the pool of NTC representatives, the most discriminant training subset with regard to the TC; its effectiveness was shown previously for an Artificial Neural Network classifier (ISMIS 2003). In this paper the effectiveness of the method is also shown for Support Vector Machines.

This work has been supported by Ministerio de Ciencia y Tecnología, Spain, under Project TIC2003-08382-C05-03.
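
The abstract does not spell out the steps of NTIL, so the following is only a hedged illustration of the general idea of incrementally picking NTC training examples. It assumes one plausible reading: start from a small NTC subset, train, score the unused NTC candidates, and add those the current model finds most target-like. The function names, parameters, and toy data are hypothetical, and a nearest-centroid scorer stands in for the SVM so that the sketch runs with the standard library alone.

```python
import random

def train_centroid_scorer(tc, ntc):
    """Stand-in classifier (NOT an SVM): returns a scoring function.
    A higher score means the sample looks more like the target class."""
    def centroid(rows):
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_tc, c_ntc = centroid(tc), centroid(ntc)
    return lambda x: sqdist(x, c_ntc) - sqdist(x, c_tc)

def incremental_ntc_selection(tc, ntc_pool, train,
                              n_init=2, n_add=2, n_rounds=3, seed=0):
    """Assumed selection loop: grow the NTC training subset with the
    candidates that the current model scores as most target-like."""
    rng = random.Random(seed)
    remaining = list(range(len(ntc_pool)))
    rng.shuffle(remaining)
    # Random initial NTC subset (an assumption, not taken from the paper).
    selected, remaining = remaining[:n_init], remaining[n_init:]
    for _ in range(n_rounds):
        if not remaining:
            break
        score = train(tc, [ntc_pool[i] for i in selected])
        # Rank unused candidates: most target-like (hardest) first.
        remaining.sort(key=lambda i: score(ntc_pool[i]), reverse=True)
        selected += remaining[:n_add]
        remaining = remaining[n_add:]
    return selected

# Toy data: a few TC samples and a larger pool of NTC candidates.
tc = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
ntc_pool = [[-3.0, -3.0], [-2.5, -3.5], [0.5, 0.6], [0.8, 0.4], [-4.0, 0.0],
            [0.0, -4.0], [0.6, 1.5], [-1.0, -1.0], [3.0, -2.0], [-2.0, 3.0]]
selected = incremental_ntc_selection(tc, ntc_pool, train_centroid_scorer)
print("Selected NTC indices:", selected)
```

Because `train` is passed in as a parameter, an SVM trainer that returns a decision function could replace the stand-in scorer without changing the selection loop.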

References

Batista, G.: A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations 6(1), 20–29 (2004)
Chawla, N.V., Japkowicz, N., Kotcz, A.: Editorial: Special issue on learning from imbalanced data sets. SIGKDD Explorations 6(1), 1–6 (2004)
Cortes, C., Vapnik, V.: Support-vector networks. Machine Learning 20, 273–297 (1995)
Farrell, K.R., Mammone, R.J., Assaleh, K.T.: Speaker recognition using neural networks and conventional classifiers. IEEE Transactions on Speech and Audio Processing, part II, 2(1) (1994)
Japkowicz, N., Stephen, S.: The class imbalance problem: A systematic study. Intelligent Data Analysis 6(5), 429–449 (2002)
Juszczak, P., Duin, R.P.W.: Uncertainty sampling methods for one-class classifiers. In: Proc. of the Workshop on Learning from Imbalanced Datasets II, ICML (2003)
Mansfield, A.J., Wayman, J.L.: Best practices in testing and reporting performance of biometric devices. Version 2.01. Technical report (2002)
del Brio, B.M., Sanz Molina, A.: Redes Neuronales y Sistemas Borrosos. Ra-Ma (1997)
Osuna, E., Freund, R., Girosi, F.: Support vector machines: Training and applications. Technical report (1997)
Solomonoff, A., Quillen, C., Campbell, W.M.: Channel compensation for SVM speaker recognition. In: Proc. Odyssey 2004, the Speaker and Language Recognition Workshop, May 31 - June 3 (2004)
Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
Vivaracho, C.E., Ortega-Garcia, J., Alonso, L., Moro, Q.I.: Extracting the most discriminant subset from a pool of candidates to optimize discriminant classifier training. In: Zhong, N., Raś, Z.W., Tsumoto, S., Suzuki, E. (eds.) ISMIS 2003. LNCS (LNAI), vol. 2871, pp. 640–645. Springer, Heidelberg (2003)
Vivaracho-Pascual, C., Ortega-Garcia, J., Alonso-Romero, L., Moro-Sancho, Q.: A comparative study of MLP-based artificial neural networks in text-independent speaker verification against GMM-based systems. In: Dalsgaard, B.L.P., Benner, H. (eds.) Proc. of Eurospeech 2001, ISCA, September 3-7, vol. 3, pp. 1753–1756 (2001)
Weiss, G.M.: Mining with rarity: A unifying framework. SIGKDD Explorations 6(1), 7–19 (2004)

Author information

Authors and Affiliations

Dep. Informática, U. de Valladolid, Spain
Carlos E. Vivaracho

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari
Floriana Esposito
Department of Computer Science, University of North Carolina, NC 28223, Charlotte, USA
Zbigniew W. Raś
Dipartimento di Informatica, Università degli Studi di Bari, via Orabona, 4, 70126, Bari, Italy
Donato Malerba
Dipartimento di Informatica, Università di Bari, Via E. Orabona, 4, 70125, Bari, Italia
Giovanni Semeraro

About this paper

Cite this paper

Vivaracho, C.E. (2006). Improving SVM Training by Means of NTIL When the Data Sets Are Imbalanced. In: Esposito, F., Raś, Z.W., Malerba, D., Semeraro, G. (eds) Foundations of Intelligent Systems. ISMIS 2006. Lecture Notes in Computer Science (LNAI), vol 4203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875604_14

DOI: https://doi.org/10.1007/11875604_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45764-0
Online ISBN: 978-3-540-45766-4
eBook Packages: Computer Science, Computer Science (R0)
