Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications

  • Original Contribution
Journal of The Institution of Engineers (India): Series B

Abstract

This paper reports how a supervised methodology, with necessary modifications, has been adopted for word sense disambiguation in Bangla. Initially, the Naïve Bayes probabilistic model, adopted as the baseline method for sense classification, yields a moderate result of 81% accuracy when applied to a database of the 19 (nineteen) most frequently used ambiguous Bangla words. On an experimental basis, the baseline method is then modified with two extensions: (a) inclusion of a lemmatization process in the system, and (b) bootstrapping of the operational process. As a result, accuracy improves slightly to 84%, a positive signal for the disambiguation process as a whole, since it opens scope for further modification of the existing method for better results. The data sets used for this experiment include the Bangla POS-tagged corpus obtained from the Indian Languages Corpora Initiative and the Bangla WordNet, an online sense inventory developed at the Indian Statistical Institute, Kolkata. The paper also reports the challenges and pitfalls of the work that were closely observed and addressed to achieve the expected level of accuracy.
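As a rough illustration of the baseline idea only, the following minimal sketch shows a Naïve Bayes sense classifier over bag-of-words context features with add-one smoothing. It is not the authors' implementation: the function names `train_naive_bayes` and `classify`, the smoothing choice, and the toy English data for the word "bank" are all assumptions made for this example. The lemmatization and bootstrapping extensions described above are not shown.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Collect sense counts and per-sense context-word counts.

    `examples` is a list of (context_words, sense_label) pairs.
    """
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for context_words, sense in examples:
        sense_counts[sense] += 1
        for word in context_words:
            word_counts[sense][word] += 1
            vocabulary.add(word)
    return sense_counts, word_counts, vocabulary


def classify(context_words, model):
    """Return the sense with the highest smoothed log-posterior."""
    sense_counts, word_counts, vocabulary = model
    total_examples = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    # Iterate in sorted order so that ties are broken deterministically.
    for sense in sorted(sense_counts):
        # Log prior: relative frequency of the sense in the training data.
        score = math.log(sense_counts[sense] / total_examples)
        total_words = sum(word_counts[sense].values())
        for word in context_words:
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing, an assumption of this sketch.
            numerator = word_counts[sense][word] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy training data for the ambiguous English word "bank".
TRAINING = [
    (["money", "deposit", "account"], "finance"),
    (["loan", "money", "interest"], "finance"),
    (["river", "water", "shore"], "river"),
    (["river", "fishing", "mud"], "river"),
]

model = train_naive_bayes(TRAINING)
print(classify(["deposit", "money"], model))
print(classify(["water", "fishing"], model))
```

In a setting like the one the abstract describes, the context words would presumably be reduced to lemmas before counting, so that inflected forms of the same word share statistics; that step is left out here.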

Author information

Corresponding author

Correspondence to Alok Ranjan Pal.

About this article

Cite this article

Pal, A.R., Saha, D., Dash, N.S. et al. Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications. J. Inst. Eng. India Ser. B 99, 519–526 (2018). https://doi.org/10.1007/s40031-018-0337-5

  • DOI: https://doi.org/10.1007/s40031-018-0337-5
