Abstract
Natural languages are inherently ambiguous. Ambiguity exists at many levels, and word sense ambiguity is one of them. Resolving sense ambiguity is crucial in many Natural Language Processing applications. In this paper, we focus on word sense ambiguity and propose an unsupervised graph-based algorithm for the Hindi word sense disambiguation (WSD) task. The work is motivated by the encouraging results achieved by graph-based WSD algorithms for English and other European languages, and by the lack of a wide-coverage sense-annotated dataset for Hindi. The proposed algorithm creates a weighted graph in which the nodes represent the senses of the words appearing in the context of an ambiguous word and the edges depict relations between them. It uses semantic similarity derived from Hindi WordNet to assign weights to the edges and a random-walk-type algorithm to assign the most appropriate sense to a polysemous word in a given context. The evaluation was carried out on a sense-annotated dataset comprising 20 polysemous nouns. We observed an overall accuracy of 63.39%, which is better than earlier reported work on the same dataset.
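The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: weighted PageRank stands in for the unspecified random-walk-type algorithm, the similarity scores are invented toy values rather than Hindi WordNet measures, and all function names and sense labels (such as `sona#gold`) are hypothetical.

```python
from itertools import combinations


def build_sense_graph(senses_by_word, similarity):
    """Link senses of different words with similarity-weighted, undirected edges."""
    graph = {s: {} for senses in senses_by_word.values() for s in senses}
    for w1, w2 in combinations(senses_by_word, 2):
        for s1 in senses_by_word[w1]:
            for s2 in senses_by_word[w2]:
                weight = similarity(s1, s2)
                if weight > 0:
                    graph[s1][s2] = weight
                    graph[s2][s1] = weight
    return graph


def weighted_pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank, used here as one example of a random-walk scoring scheme."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    # Total edge weight attached to each node, used to normalise outgoing votes.
    strength = {node: sum(nbrs.values()) for node, nbrs in graph.items()}
    for _ in range(iterations):
        new_rank = {}
        for node in graph:
            incoming = sum(
                rank[nbr] * w / strength[nbr] for nbr, w in graph[node].items()
            )
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def disambiguate(senses_by_word, similarity):
    """Pick, for every word, the candidate sense with the highest rank."""
    rank = weighted_pagerank(build_sense_graph(senses_by_word, similarity))
    return {w: max(senses, key=rank.get) for w, senses in senses_by_word.items()}


# Toy data (assumption): "sona" is ambiguous between "gold" and "sleep".
TOY_SIMILARITY = {
    frozenset({"sona#gold", "chandi#silver"}): 0.8,
    frozenset({"sona#gold", "gahana#ornament"}): 0.6,
    frozenset({"sona#sleep", "chandi#silver"}): 0.1,
    frozenset({"chandi#silver", "gahana#ornament"}): 0.5,
}


def toy_similarity(s1, s2):
    return TOY_SIMILARITY.get(frozenset({s1, s2}), 0.0)


senses_by_word = {
    "sona": ["sona#gold", "sona#sleep"],
    "chandi": ["chandi#silver"],
    "gahana": ["gahana#ornament"],
}
result = disambiguate(senses_by_word, toy_similarity)
print(result["sona"])  # sona#gold
```

In this sketch, senses of the same word are never linked to each other, so a sense can only gain rank through its similarity to senses of the surrounding context words.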
Funding
No funding is available for the work reported in this paper.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Research Trends in Computational Intelligence” guest edited by Anshul Verma, Pradeepika Verma, Vivek Kumar Singh and S. Karthikeyan.
Appendix I
अगर | अगली | अगले | अच्छी | अति | अथवा | अधिक | अनुसार | अनेक |
अन्य | अपना | अपनी | अपने | अब | अभी | अलावा | आ | आई |
आएँ | आगे | आती | आदि | आने | आप | आम | आसपास | इतनी |
इतने | इन | इनमे | इन्हीं | इन्हे | इस | इसका | इसकी | इसके |
लिए | इसके | इसमें | इसलिए | इससे | इसी | इसीलिए | इसे | उतनी |
उधर | उन | उनका | उनकी | उनके | उनमें | उनसे | उन्हीं | उन्हे |
उन्हें | उन्होंने | उन्होने | उस | उसका | उसकी | उसके | उससे | उसी |
उसे | ऊपर | एक | एक-एक | एवं | ऐसा | ऐसी | ऐसे | ओर |
कई | कछ | कब | कभी | कभी-कभी | कम | कया | कर | करके |
करता | करती | करते | करना | करनी | करने | करा | कराने | कराया |
करेंगे | करेगा | करेगी | का | काफी | काफी | कि | किंतु | किए |
कितनी | कितने | किन | किया | किये | किस | किसी | की | कुछ |
कुल | के | कारण | कैसे | को | कोई | कौन | क्या | क्यो |
क्योकि | गई | गईं | गए | गया | गयी | गये | चलता | चलने |
चली | चाहती | चाहते | चाहिए | चाहे | चुका | चुकी | चुके | चुके |
छह | छू | जगह | जब | जबकि | जल्द | जल्दी | जहाँ | जहां |
जहां-तहां | जा | जाए | जाएं | जाएंगी | जाएगी | जाएँ | जाकर | जाता |
जाती | जाते | जानना | जाना | जाने | जाये | जारी | जितना | जितनी |
जिनमें | जिन्हें | जिसका | जिससे | जिसे | जी | जैसा | जैसे | जो |
जोर | ज्यादा | ठीक | तक | तथा | तब | तभी | तरफ | तरह |
तहत | ताकि | तीन | तो | तौर | था | थी | थे | थोडा |
दरअसल | दिए | दिखाए | दिया | दी | दूर | दूसरी | दूसरे | दे |
देंगी | देंगे | देकर | देता | देती | देते | देना | देने | दो |
दोनो | द्वारा | न | नई | नए | नया | नहीं | नीचे | ने |
पडता | पडने | पडा | पर | परंत | पहला | पहले | पांच | पाएं |
पाँच | पीछे | पूरी | प्रति | प्रत्येक | फिर | बजाय | बजे | बडी |
बढ़ | बढ़ा | बढ़े | बताया | बन | बनाई | बनाए | बनाना | बनाने |
बनी | बने | बल्कि | बहुत | बाकी | बाद | बार | बार-बार | बारे |
बिना | बीच | बेहद | भी | मगर | मुताबिक | मे | में | यदि |
यद्यपि | यह | यहाँ | यही | या | यानी | ये | रखना | रह |
रहती | रही | रहे | रहेगा | रहेगी | रहो | रोका | लगभग | लगा |
लगाई | लगे | लाकर | लाने | लिए | लिया | लिये | ली | ले |
लेकर | लेकिन | लेगी | लेना | लेने | व | वनाट | वह | वहाँ |
वहीं | वाला | वाली | वाले | वालो | विभिन्न | वे | वैसे | वो |
शायद | सकता | सकती | सकते | सका | सके | सकेगा | सकेगी | सब |
सबकी | सबके | सबसे | सभी | सहज | सही | सा | सात | साथ |
साथ-साथ | साफ | सामने | सारे | सिर्फ | सीधे | से | हाँ | हम |
हमने | हमारी | हमारे | हमें | हर | हां | हांलांकि | ही | हुआ |
हुई | हुए | हूँ | है | हैं | हो | हों | होंगी | होगा |
होगी | होता | होती | होते | होना | होनी | होने |
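The section does not state how this list is used. Assuming it serves as a stop-word list applied to the context of an ambiguous word before any further processing, a minimal sketch of parsing the pipe-delimited rows and filtering tokens might look like this. The function names and the sample sentence are hypothetical, and only two rows of the list are reproduced.

```python
import unicodedata


def parse_stop_words(table_text):
    """Collect the non-empty cells of a pipe-delimited word table into a set."""
    words = set()
    for line in table_text.splitlines():
        for cell in line.split("|"):
            cell = cell.strip()
            if cell:
                # Normalise so visually identical strings compare equal.
                words.add(unicodedata.normalize("NFC", cell))
    return words


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not in the stop-word set."""
    return [
        t for t in tokens
        if unicodedata.normalize("NFC", t) not in stop_words
    ]


# Two rows copied from the list above; the full table parses the same way.
SAMPLE_ROWS = """
बनी | बने | बल्कि | बहुत | बाकी | बाद | बार | बार-बार | बारे |
हुई | हुए | हूँ | है | हैं | हो | हों | होंगी | होगा |
"""

stop_words = parse_stop_words(SAMPLE_ROWS)
tokens = ["सोना", "बहुत", "महंगा", "है"]  # hypothetical context sentence
print(remove_stop_words(tokens, stop_words))
```

Hyphenated entries such as बार-बार are kept as single items because the rows are split only on the pipe character.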
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jha, P., Agarwal, S., Abbas, A. et al. A Novel Unsupervised Graph-Based Algorithm for Hindi Word Sense Disambiguation. SN COMPUT. SCI. 4, 675 (2023). https://doi.org/10.1007/s42979-023-02116-1
DOI: https://doi.org/10.1007/s42979-023-02116-1