Abstract
Word sense disambiguation (WSD) is the task of correctly identifying the sense of an ambiguous word in a given context. It is considered an AI-complete problem in natural language processing. This paper proposes a WSD model based on latent Dirichlet allocation (LDA) which, to the best of our knowledge, is the first to use LDA for disambiguation in Malayalam. We selected three ambiguous words and manually collected sentences containing them from sources including novels, short stories, and web documents. The corpus contains more than 200 instances of each ambiguous word. The model performed well compared to the state-of-the-art WSD systems in Malayalam.
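The abstract does not describe the pipeline itself, so the following is only a minimal sketch of one common way topic models are applied to WSD, not the authors' method: each context of an ambiguous word is treated as a document, LDA is fitted with one topic per assumed sense, and the dominant topic of a context is read as its induced sense. The sampler `lda_gibbs`, the hyperparameters, and the English toy contexts for "bank" (standing in for Malayalam sentences) are all illustrative assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic counts.

    Hypothetical helper written for this sketch, not taken from the paper.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]      # n(d, k)
    topic_word = [dict() for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                    # n(k)
    assign = []                                       # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] = topic_word[z].get(w, 0) + 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] = topic_word[z].get(w, 0) + 1
                topic_total[z] += 1
    return doc_topic


# Toy contexts of the ambiguous word "bank" (assumed data, two assumed senses).
contexts = [
    "river bank water flood shore",
    "bank water river boat shore",
    "bank loan money account interest",
    "money bank account deposit loan",
]
docs = [c.split() for c in contexts]
counts = lda_gibbs(docs, num_topics=2)

# The dominant topic of each context is taken as its induced sense.
senses = [max(range(2), key=row.__getitem__) for row in counts]
print(senses)
```

Topic indices are arbitrary, so mapping an induced topic to a dictionary sense would still need a few labelled instances or manual inspection of each topic's top words.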
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sruthi, S., Balakrishnan, K., Paul, B. (2021). An LDA-Based Approach Towards Word Sense Disambiguation in Malayalam. In: Prateek, M., Singh, T.P., Choudhury, T., Pandey, H.M., Gia Nhu, N. (eds) Proceedings of International Conference on Machine Intelligence and Data Science Applications. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-33-4087-9_39
DOI: https://doi.org/10.1007/978-981-33-4087-9_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4086-2
Online ISBN: 978-981-33-4087-9
eBook Packages: Intelligent Technologies and Robotics (R0)