An LDA-Based Approach Towards Word Sense Disambiguation in Malayalam

  • Conference paper
Proceedings of International Conference on Machine Intelligence and Data Science Applications

Part of the book series: Algorithms for Intelligent Systems ((AIS))

Abstract

Word Sense Disambiguation (WSD) is the task of correctly identifying the sense of an ambiguous word in a given context. It is considered an AI-complete problem in natural language processing. This paper proposes a WSD model based on Latent Dirichlet Allocation (LDA) which, to the best of our knowledge, is the first to use LDA for disambiguation in Malayalam. We selected three ambiguous words and manually collected sentences containing them from sources including novels, short stories, and web documents. The corpus contains more than 200 instances of each ambiguous word. The model performed well compared to the state-of-the-art WSD systems in Malayalam.
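The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea of using LDA to group the contexts of an ambiguous word, not the authors' implementation. It assumes a toy English example (contexts of "bank" with the target word removed) in place of the Malayalam corpus, a small collapsed Gibbs sampler written with the standard library, and hypothetical hyperparameters (`alpha`, `beta`, `iterations`). Each context is treated as a document, and its dominant topic is read as an induced sense label.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document (each row sums to 1).
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


# Toy contexts of the ambiguous word "bank" (illustrative data, not from the paper).
contexts = [
    "deposit money account loan interest".split(),
    "loan interest account cash deposit".split(),
    "river water shore fishing boat".split(),
    "boat shore river mud water".split(),
]

theta = lda_gibbs(contexts, num_topics=2)
# Dominant topic index per context, read here as an induced sense label.
senses = [max(range(2), key=lambda k: row[k]) for row in theta]
print(senses)
```

The topic indices are arbitrary, so mapping each one to a dictionary sense would need a separate step, for example a few hand-labelled contexts per sense.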

Corresponding author

Correspondence to S. Sruthi.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Sruthi, S., Balakrishnan, K., Paul, B. (2021). An LDA-Based Approach Towards Word Sense Disambiguation in Malayalam. In: Prateek, M., Singh, T.P., Choudhury, T., Pandey, H.M., Gia Nhu, N. (eds) Proceedings of International Conference on Machine Intelligence and Data Science Applications. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-33-4087-9_39
