
The Class Imbalance Problem in TLC Image Classification

  • António V. Sousa
  • Ana Maria Mendonça
  • Aurélio Campilho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4142)

Abstract

This paper presents a methodology for addressing the class imbalance problem that arises in the classification of Thin-Layer Chromatography (TLC) images. The methodology is based on re-sampling: the majority class (the normal class) is undersampled, while the minority classes, which contain Lysosomal Storage Disorders (LSD) samples, are oversampled by generating synthetic samples. Two approaches to image classification are presented, one based on a hierarchical classifier and the other on a multiclassifier system; both are trained and tested on balanced data sets. The results show better performance for the multiclassifier system using the balanced sets.
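The re-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the majority class is reduced or how synthetic samples are generated, so the sketch assumes random undersampling and SMOTE-style interpolation between a minority sample and one of its nearest neighbours. The function names, the `target` class size, the neighbour count `k`, and the class labels are all hypothetical.

```python
import random
from collections import defaultdict


def undersample(samples, target, rng):
    """Randomly keep at most `target` samples, without replacement."""
    return rng.sample(samples, min(target, len(samples)))


def oversample_synthetic(samples, target, k, rng):
    """Grow a class to `target` samples by interpolating between a sample
    and one of its k nearest neighbours (SMOTE-style; an assumption here)."""
    out = list(samples)
    if len(samples) < 2:
        return out  # nothing to interpolate with
    while len(out) < target:
        i = rng.randrange(len(samples))
        base = samples[i]
        others = samples[:i] + samples[i + 1:]
        # Sort the remaining samples by squared Euclidean distance to `base`.
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        out.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return out


def balance(features, labels, majority_label, target, k=3, seed=0):
    """Return {label: samples} with every class brought towards `target` size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(tuple(x))
    balanced = {}
    for label, samples in by_class.items():
        if label == majority_label:
            balanced[label] = undersample(samples, target, rng)
        else:
            balanced[label] = oversample_synthetic(samples, target, k, rng)
    return balanced


# Toy data: 20 "normal" feature vectors and 3 "lsd" feature vectors (made up).
feats = [(float(i), float(i)) for i in range(20)]
feats += [(100.0, 100.0), (101.0, 102.0), (103.0, 101.0)]
labels = ["normal"] * 20 + ["lsd"] * 3
out = balance(feats, labels, majority_label="normal", target=8)
```

In this toy run both classes end up with eight samples. The synthetic minority points lie on line segments between existing minority samples, so they stay inside the region those samples already occupy.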

Keywords

Minority Class; Synthetic Sample; Lysosomal Storage Disorder; Class Imbalance; Class Imbalance Problem
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • António V. Sousa (1, 2)
  • Ana Maria Mendonça (2, 3)
  • Aurélio Campilho (2, 3)
  1. Instituto Superior de Engenharia do Porto, Porto, Portugal
  2. Instituto de Engenharia Biomédica, Porto, Portugal
  3. Faculdade de Engenharia da Universidade do Porto, Porto, Portugal
