The Class Imbalance Problem in TLC Image Classification

Sousa, António V.; Mendonça, Ana Maria; Campilho, Aurélio

doi:10.1007/11867661_46

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4142)

Abstract

This paper presents a methodology for addressing the class imbalance problem that arises in the classification of Thin-Layer Chromatography (TLC) images. The methodology is based on re-sampling: the majority class (the normal class) is undersampled, while the minority classes, which contain Lysosomal Storage Disorders (LSD) samples, are oversampled by generating synthetic samples. Two approaches to image classification are presented, one based on a hierarchical classifier and the other on a multiclassifier system; both are trained and tested on balanced data sets. The results demonstrate better performance for the multiclassifier system using the balanced sets.
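The re-sampling idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the synthetic samples are generated, so the sketch assumes interpolation between a minority sample and one of its nearest neighbours; the function names, the `target` class size, and the neighbour count `k` are all hypothetical choices made for illustration, not the authors' procedure.

```python
import numpy as np


def undersample(X, n_keep, rng):
    """Randomly keep n_keep rows of X (sampling without replacement)."""
    idx = rng.choice(len(X), size=n_keep, replace=False)
    return X[idx]


def oversample_interpolate(X, n_new, k, rng):
    """Create n_new synthetic rows, each a random point on the segment
    between a sample and one of its k nearest neighbours.
    Assumption: this interpolation rule is illustrative only."""
    if len(X) < 2:
        raise ValueError("need at least two samples to interpolate")
    k = min(k, len(X) - 1)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[chosen] - X[base])


def balance(X, y, majority_label, target, k=3, seed=0):
    """Undersample the majority class and oversample smaller classes
    so that each ends up with `target` samples."""
    rng = np.random.default_rng(seed)
    parts_X, parts_y = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        if label == majority_label and len(Xc) > target:
            Xc = undersample(Xc, target, rng)
        elif len(Xc) < target:
            synthetic = oversample_interpolate(Xc, target - len(Xc), k, rng)
            Xc = np.vstack([Xc, synthetic])
        parts_X.append(Xc)
        parts_y.append(np.full(len(Xc), label))
    return np.vstack(parts_X), np.concatenate(parts_y)
```

Because each synthetic point is a convex combination of two real samples from the same class, it always lies inside that class's observed feature range. Balancing should be applied to feature vectors after extraction, and the choice of `target` trades information lost from the majority class against the number of synthetic minority samples introduced.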

Author information

Authors and Affiliations

Instituto Superior de Engenharia do Porto, Rua Dr. António Bernardino de Almeida 431, 4200-072, Porto, Portugal
António V. Sousa
Instituto de Engenharia Biomédica, Rua Roberto Frias, 4200-465, Porto, Portugal
António V. Sousa, Ana Maria Mendonça & Aurélio Campilho
Faculdade de Engenharia da Universidade do Porto, Rua Roberto Frias, 4200-465, Porto, Portugal
Ana Maria Mendonça & Aurélio Campilho

Editor information

Editors and Affiliations

Faculty of Engineering, Institute of Biomedical Engineering, Rua Dr. Roberto Frias, University of Porto, 4200-465, Porto, Portugal
Aurélio Campilho
Electrical and Computer Engineering Department, University of Waterloo,
Mohamed Kamel

Cite this paper

Sousa, A.V., Mendonça, A.M., Campilho, A. (2006). The Class Imbalance Problem in TLC Image Classification. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2006. Lecture Notes in Computer Science, vol 4142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11867661_46

DOI: https://doi.org/10.1007/11867661_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44894-5
Online ISBN: 978-3-540-44896-9
eBook Packages: Computer Science, Computer Science (R0)
