Abstract
Clustering groups data items so that items within a group are similar to one another and differ more from items in other groups. In most institutions, undergraduate thesis titles are stored sequentially in a database rather than grouped by similarity, so research students spend considerable time searching for thesis reports or research papers on similar topics. The low Silhouette coefficient obtained when k-means is used for text clustering motivated the use of the Enhanced K-Strange points clustering algorithm in search of better results. The objective of this paper is to group undergraduate thesis titles using the Enhanced K-Strange points clustering algorithm, with the Silhouette coefficient used to assess cluster quality. The result of the research is a method that processes undergraduate thesis titles and assigns them to groups using a clustering technique.
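As an illustration of the quality measure named in the abstract, the following is a minimal sketch of the Silhouette coefficient in plain Python. It does not implement the Enhanced K-Strange points algorithm; the cluster labels are supplied by hand. The toy 2-D points stand in for whatever numeric vectors the titles are converted to (the abstract does not specify the representation), and the names `silhouette_coefficient`, `points`, `good_labels`, and `bad_labels` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points.

    For each point i: a = mean distance to the other members of its own
    cluster, b = smallest mean distance to the members of any other
    cluster, and s(i) = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (assumption): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_labels = [0, 0, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1]   # mixes the two groups

good_score = silhouette_coefficient(points, good_labels)
bad_score = silhouette_coefficient(points, bad_labels)
print(f"good grouping: {good_score:.3f}")
print(f"bad grouping:  {bad_score:.3f}")
```

Values close to 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned items, which is why a low score for one algorithm can motivate trying another.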
Acknowledgements
I would like to take this opportunity to express my profound gratitude and deep regard to Prof. Teslin Jacob, Computer Engineering Department, Goa College of Engineering, for his guidance, valuable feedback, and constant encouragement.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Madeira, M.A., Jacob, T. (2022). The Implementation of Enhanced K-Strange Points Clustering Method in Classifying Undergraduate Thesis Titles. In: Shakya, S., Balas, V.E., Kamolphiwong, S., Du, KL. (eds) Sentimental Analysis and Deep Learning. Advances in Intelligent Systems and Computing, vol 1408. Springer, Singapore. https://doi.org/10.1007/978-981-16-5157-1_21
DOI: https://doi.org/10.1007/978-981-16-5157-1_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5156-4
Online ISBN: 978-981-16-5157-1
eBook Packages: Intelligent Technologies and Robotics (R0)