Abstract
This paper focuses on the use of machine learning algorithms, specifically self-organizing maps (SOM) and support vector machines (SVM), for classifying text documents. The goal is to assign documents effectively and accurately to classes based on their content. We tested self-organizing map classification on the Reuters R-8 data set and compared the results with three other popular machine learning algorithms: k-means clustering, k-nearest neighbor searching, and the Naive Bayes classifier. As an unsupervised method, the self-organizing map yielded the highest accuracies. Furthermore, the accuracy of the self-organizing map improved when it was used together with support vector machines.
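As a rough illustration of how a self-organizing map can act as a document classifier, the following minimal sketch trains a tiny map without labels, names each node by majority vote, and classifies a new document by its best-matching node. Everything in it is an assumption made for illustration: the four-word vocabulary, the toy count vectors, the two-node map, the seeding of nodes from the first and last document, the fixed presentation order, and all function names and parameter values. It is not the configuration used in the paper, and the second stage (training an SVM on the map's output) is not shown.

```python
import math

# Toy term-count vectors over a hypothetical vocabulary
# ["profit", "dividend", "wheat", "corn"]. Illustrative only, not Reuters R-8 data.
TRAIN = [
    ([4, 3, 0, 0], "earn"),
    ([0, 0, 4, 3], "grain"),
    ([3, 4, 0, 0], "earn"),
    ([0, 0, 3, 4], "grain"),
    ([4, 4, 0, 0], "earn"),
    ([0, 0, 4, 4], "grain"),
]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, doc):
    """Index of the map node whose weight vector is closest to the document."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], doc))


def train_som(docs, epochs=20, lr0=0.5, sigma0=0.5):
    """Unsupervised training of a 1-D map with two nodes (labels are never used)."""
    # Assumption: seed the two nodes from the first and last document.
    weights = [[float(v) for v in docs[0]], [float(v) for v in docs[-1]]]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1.0 towards 0
        lr = lr0 * frac                      # learning rate
        sigma = max(sigma0 * frac, 0.1)      # neighbourhood width
        for doc in docs:                     # fixed order, no shuffling
            bmu = best_matching_unit(weights, doc)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid around the winner
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (doc[d] - w[d])
    return weights


def label_nodes(weights, labelled_docs):
    """Give each node the majority label of the training documents mapped to it."""
    votes = [{} for _ in weights]
    for doc, label in labelled_docs:
        node = best_matching_unit(weights, doc)
        votes[node][label] = votes[node].get(label, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]


def classify(weights, node_labels, doc):
    """Predict the label of the document's best-matching node."""
    return node_labels[best_matching_unit(weights, doc)]


weights = train_som([doc for doc, _ in TRAIN])
node_labels = label_nodes(weights, TRAIN)
print(classify(weights, node_labels, [5, 3, 0, 0]))
print(classify(weights, node_labels, [0, 1, 4, 4]))
```

Labels enter only in `label_nodes`, after the map has been trained, which is what allows the map itself to be treated as an unsupervised method. In a hybrid setup, the node assignments or node weights produced here could serve as input to a separate SVM training step.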
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Patil, V., Jadhav, Y., Sirsat, A. (2021). Categorizing Documents by Support Vector Machine Trained Using Self-Organizing Maps Clustering Approach. In: Pawar, P.M., Balasubramaniam, R., Ronge, B.P., Salunkhe, S.B., Vibhute, A.S., Melinamath, B. (eds) Techno-Societal 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-69921-5_2
DOI: https://doi.org/10.1007/978-3-030-69921-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69920-8
Online ISBN: 978-3-030-69921-5
eBook Packages: Engineering, Engineering (R0)