Abstract
This paper reports how a supervised methodology has been adapted, with necessary modifications, for word sense disambiguation in Bangla. In the initial stage, the Naïve Bayes probabilistic model, adopted as the baseline method for sense classification, yields a moderate result of 81% accuracy when applied to a database of the 19 (nineteen) most frequently used ambiguous Bangla words. On an experimental basis, the baseline method is modified with two extensions: (a) inclusion of a lemmatization process in the system, and (b) bootstrapping of the operational process. As a result, the accuracy of the method improves slightly, to 84%, which is a positive signal for the disambiguation process as a whole because it opens scope for further modification of the existing method for better results. The data sets used for this experiment are the Bangla POS-tagged corpus obtained from the Indian Languages Corpora Initiative and the Bangla WordNet, an online sense inventory developed at the Indian Statistical Institute, Kolkata. The paper also reports the challenges and pitfalls of the work that were closely observed and addressed in order to achieve the expected level of accuracy.
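To make the pipeline described in the abstract concrete, the following is a minimal sketch of a Naïve Bayes sense classifier with an optional lemmatization step and a simple bootstrapping (self-training) loop. It is not the authors' implementation: the function names, the add-one smoothing, the log-probability margin used to accept bootstrapped examples, the toy English data, and the suffix-stripping `toy_lemmatize` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples, lemmatize=lambda w: w):
    """Collect counts from (context_words, sense) pairs for Naive Bayes."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            lemma = lemmatize(w)
            word_counts[sense][lemma] += 1
            vocab.add(lemma)
    return sense_counts, word_counts, vocab


def log_posteriors(model, words, lemmatize=lambda w: w):
    """Return an unnormalised log P(sense | context) for every known sense."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    scores = {}
    for sense, n in sense_counts.items():
        # Add-one (Laplace) smoothing; an assumption, not taken from the paper.
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(n / total)
        for w in words:
            lemma = lemmatize(w)
            if lemma in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[sense][lemma] + 1) / denom)
        scores[sense] = score
    return scores


def classify(model, words, lemmatize=lambda w: w):
    scores = log_posteriors(model, words, lemmatize)
    return max(scores, key=scores.get)


def bootstrap(labeled, unlabeled, lemmatize=lambda w: w, margin=1.0, rounds=3):
    """Self-training: label confident unlabeled contexts, then retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model = train(labeled, lemmatize)
        remaining = []
        for words in pool:
            ranked = sorted(log_posteriors(model, words, lemmatize).items(),
                            key=lambda kv: kv[1], reverse=True)
            # Accept only if the best sense beats the runner-up by `margin`.
            if len(ranked) > 1 and ranked[0][1] - ranked[1][1] >= margin:
                labeled.append((words, ranked[0][0]))
            else:
                remaining.append(words)
        if len(remaining) == len(pool):  # nothing new was labeled
            break
        pool = remaining
    return train(labeled, lemmatize)


def toy_lemmatize(word):
    # Hypothetical stand-in for a real Bangla lemmatizer: strips a plural "s".
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


# Toy English data for the ambiguous word "bank" (illustrative only).
labeled = [
    (["river", "water", "boats"], "bank/shore"),
    (["money", "loan", "accounts"], "bank/finance"),
]
unlabeled = [["boat", "river", "fishing"], ["loans", "account", "interest"]]

model = bootstrap(labeled, unlabeled, toy_lemmatize)
print(classify(model, ["fishing", "boat"], toy_lemmatize))
```

In this sketch, lemmatization merges inflected forms such as "boats" and "boat" into one feature, and bootstrapping lets context words that appear only in unlabeled text (here "fishing" and "interest") contribute to later decisions. Both effects are in the spirit of the two extensions named in the abstract, though the paper's actual procedures may differ.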
Cite this article
Pal, A.R., Saha, D., Dash, N.S. et al. Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications. J. Inst. Eng. India Ser. B 99, 519–526 (2018). https://doi.org/10.1007/s40031-018-0337-5
DOI: https://doi.org/10.1007/s40031-018-0337-5