A novel approach to word sense disambiguation in Bengali language using supervised methodology

Sādhanā

Abstract

This paper reports how a supervised methodology was adapted, with necessary modifications, to the task of Word Sense Disambiguation (WSD) in Bengali. First, four commonly used supervised methods, Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Network (ANN) and Naïve Bayes (NB), are implemented as baselines. Each algorithm is applied individually to a data set of the 13 most frequently used ambiguous Bengali words. The baseline strategy is then extended experimentally in two ways: (a) adding a lemmatization step to the system and (b) bootstrapping the operational process. With these extensions, the accuracy of the baseline methods improves slightly, an encouraging result that leaves scope for further refinement of the existing methods. The data sets are prepared from the Bengali corpus developed in the Technology Development for Indian Languages (TDIL) project of the Government of India and from the Bengali WordNet developed at the Indian Statistical Institute, Kolkata. The paper also reports the challenges and pitfalls observed during the experiment.
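As a rough illustration of how the two extensions might fit around one of the baselines, the sketch below wraps a small Naïve Bayes classifier in a lemmatization step and a bootstrapping (self-training) loop. It is not the authors' implementation: the abstract does not specify the features, the lemmatizer or the bootstrapping criterion, so the lemma table `LEMMAS`, the confidence `threshold`, the function names and the English toy contexts are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lemma table standing in for a real Bengali lemmatizer.
LEMMAS = {"rivers": "river", "loans": "loan", "deposits": "deposit", "boats": "boat"}

def features(context):
    """Lowercase, split on whitespace, and map each token to its lemma."""
    return [LEMMAS.get(tok, tok) for tok in context.lower().split()]

class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words, with add-one smoothing."""

    def fit(self, examples):
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for context, sense in examples:
            self.sense_counts[sense] += 1
            for w in features(context):
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def posteriors(self, context):
        total = sum(self.sense_counts.values())
        log_scores = {}
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            score = math.log(n / total)
            for w in features(context):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[sense][w] + 1) / denom)
            log_scores[sense] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        exp = {s: math.exp(v - top) for s, v in log_scores.items()}
        z = sum(exp.values())
        return {s: v / z for s, v in exp.items()}

    def predict(self, context):
        post = self.posteriors(context)
        return max(post, key=post.get)

def bootstrap(labeled, unlabeled, threshold=0.75, max_rounds=5):
    """Self-training: move confidently labeled contexts into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    model = NaiveBayesWSD().fit(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for ctx in pool:
            post = model.posteriors(ctx)
            sense = max(post, key=post.get)
            if post[sense] >= threshold:
                confident.append((ctx, sense))
            else:
                remaining.append(ctx)
        if not confident:
            break  # nothing new to learn from
        labeled += confident
        pool = remaining
        model = NaiveBayesWSD().fit(labeled)
    return model, labeled

# Toy sense-tagged contexts for the ambiguous English word "bank" (illustrative only).
seed = [
    ("money deposit in the bank", "finance"),
    ("the bank approved the loan", "finance"),
    ("the boat reached the river bank", "river"),
    ("fishing on the bank of the river", "river"),
]
untagged = ["loans and deposits at the bank", "boats along the rivers"]

model, grown = bootstrap(seed, untagged)
print(len(grown), model.predict("loans from the bank"))
```

In this sketch lemmatization matters because inflected forms such as "loans" and "rivers" would otherwise be unseen words, and bootstrapping enlarges the small sense-tagged set with contexts the current model labels confidently. A real system would replace the lookup table with a proper lemmatizer and tune the threshold on held-out data.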

Author information

Corresponding author

Correspondence to Alok Ranjan Pal.

Cite this article

Pal, A.R., Saha, D., Dash, N.S. et al. A novel approach to word sense disambiguation in Bengali language using supervised methodology. Sādhanā 44, 181 (2019). https://doi.org/10.1007/s12046-019-1165-2

  • DOI: https://doi.org/10.1007/s12046-019-1165-2
