Abstract
This paper presents a methodology for addressing the class imbalance problem that arises in the classification of Thin-Layer Chromatography (TLC) images. The methodology is based on re-sampling: the majority class (the normal class) is undersampled, while the minority classes, which contain Lysosomal Storage Disorders (LSD) samples, are oversampled by generating synthetic samples. Two approaches to image classification are presented, one based on a hierarchical classifier and the other on a multiclassifier system; both classifiers are trained and tested using balanced data sets. The results demonstrate better performance for the multiclassifier system using the balanced sets.
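The re-sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the synthetic samples are generated, so the SMOTE-style interpolation between same-class neighbours used here is an assumption. The function names (`undersample`, `oversample`, `balance`), the class labels, the toy feature tuples, and the parameters `target` and `k` are hypothetical as well.

```python
import math
import random


def undersample(samples, target, rng):
    """Randomly keep `target` samples, drawn without replacement."""
    if len(samples) <= target:
        return list(samples)
    return rng.sample(samples, target)


def oversample(samples, target, rng, k=3):
    """Grow `samples` to `target` items with synthetic points.

    Assumption: each synthetic point is interpolated between a real sample
    and one of its k nearest same-class neighbours (SMOTE-style).
    """
    result = list(samples)
    if len(samples) < 2:
        return result  # no neighbour available to interpolate with
    while len(result) < target:
        base = rng.choice(samples)
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        result.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return result


def balance(data, target, seed=0):
    """Balance `data`, a dict mapping class label -> list of feature tuples."""
    rng = random.Random(seed)
    balanced = {}
    for label, samples in data.items():
        if len(samples) > target:
            balanced[label] = undersample(samples, target, rng)
        else:
            balanced[label] = oversample(samples, target, rng)
    return balanced


# Toy data: one large "normal" class and two small, hypothetical LSD classes.
data = {
    "normal": [(float(i), float(i % 5)) for i in range(20)],
    "lsd_a": [(0.0, 1.0), (1.0, 2.0), (2.0, 1.5)],
    "lsd_b": [(5.0, 5.0), (6.0, 4.0)],
}
balanced = balance(data, target=8)
for label, samples in balanced.items():
    print(label, len(samples))
```

In this sketch every class ends up with `target` samples: the normal class keeps a random subset of its real samples, and each minority class keeps all of its real samples plus interpolated ones. In an evaluation such as the one the abstract describes, synthetic samples would normally be generated only from training data, so that test samples do not influence the points a classifier is trained on.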
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sousa, A.V., Mendonça, A.M., Campilho, A. (2006). The Class Imbalance Problem in TLC Image Classification. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2006. Lecture Notes in Computer Science, vol 4142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11867661_46
DOI: https://doi.org/10.1007/11867661_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44894-5
Online ISBN: 978-3-540-44896-9
eBook Packages: Computer Science, Computer Science (R0)