A Novel Unsupervised Graph-Based Algorithm for Hindi Word Sense Disambiguation

Jha, Prajna; Agarwal, Shreya; Abbas, Ali; Siddiqui, Tanveer J.

doi:10.1007/s42979-023-02116-1

Original Research
Published: 02 September 2023

Volume 4, article number 675, (2023)
SN Computer Science

Prajna Jha¹ (ORCID: orcid.org/0000-0002-2937-2487), Shreya Agarwal¹, Ali Abbas¹ & Tanveer J. Siddiqui¹

Abstract

Natural languages are inherently ambiguous. Ambiguity exists at many levels, and word sense ambiguity is one of them. Resolving sense ambiguity is crucial in many Natural Language Processing applications. In this paper, we focus on word sense ambiguity and propose an unsupervised graph-based algorithm for the Hindi Word Sense Disambiguation (WSD) task. The work is motivated by the encouraging results achieved by graph-based WSD algorithms for English and other European languages, and by the lack of a wide-coverage sense-annotated dataset for Hindi. The proposed algorithm creates a weighted graph in which the nodes represent the senses of the words appearing in the context of an ambiguous word and the edges depict relations between them. It uses semantic similarity derived from Hindi WordNet to assign weights to the edges, and a random-walk-type algorithm to assign the most appropriate sense to a polysemous word in a given context. The evaluation was carried out on a sense-annotated dataset comprising 20 polysemous nouns. We observed an overall accuracy of 63.39%, which is better than the results reported in earlier work on the same dataset.
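The pipeline outlined in the abstract (sense nodes, similarity-weighted edges, a random-walk ranking) can be illustrated with a minimal sketch. This is not the authors' exact algorithm, which the abstract does not specify in detail. It assumes a generic weighted PageRank as the random-walk step and a Jaccard overlap of gloss words as a stand-in for a Hindi WordNet similarity measure. The toy sense inventory `SENSES`, the sense identifiers, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy sense inventory (transliterated Hindi words, English gloss
# words for readability). A real system would query Hindi WordNet instead.
SENSES = {
    "sona": {
        "sona#gold": {"metal", "yellow", "precious", "jewellery"},
        "sona#sleep": {"rest", "night", "bed", "eyes"},
    },
    "gahna": {"gahna#ornament": {"jewellery", "precious", "wear", "metal"}},
    "chandi": {"chandi#silver": {"metal", "white", "precious", "jewellery"}},
    "bistar": {"bistar#bed": {"bed", "rest", "night", "sleep"}},
}


def similarity(gloss_a, gloss_b):
    """Jaccard overlap of gloss words (assumed stand-in for a WordNet measure)."""
    union = gloss_a | gloss_b
    return len(gloss_a & gloss_b) / len(union) if union else 0.0


def build_graph(words):
    """Nodes are senses; edges link senses of different words, weighted by similarity."""
    nodes = {s: (w, g) for w in words for s, g in SENSES[w].items()}
    graph = {s: {} for s in nodes}
    for a, b in combinations(nodes, 2):
        if nodes[a][0] == nodes[b][0]:
            continue  # senses of the same word compete, so they are not linked
        weight = similarity(nodes[a][1], nodes[b][1])
        if weight > 0:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph


def weighted_pagerank(graph, damping=0.85, iterations=50):
    """Power iteration for PageRank on an undirected weighted graph."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    strength = {v: sum(graph[v].values()) for v in graph}
    for _ in range(iterations):
        rank = {
            v: (1 - damping) / n
            + damping * sum(rank[u] * w / strength[u] for u, w in graph[v].items())
            for v in graph
        }
    return rank


def disambiguate(target, context):
    """Return the highest-ranked sense of `target` given its context words."""
    rank = weighted_pagerank(build_graph([target] + context))
    return max(SENSES[target], key=lambda s: rank[s])


print(disambiguate("sona", ["gahna", "chandi"]))
print(disambiguate("sona", ["bistar"]))
```

In this toy setting, a sense with no similarity to any context sense stays isolated and keeps only the baseline rank, so the sense that is better connected to the context is selected.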

Funding

No funding is available for the work reported in this paper.

Author information

Authors and Affiliations

Department of Electronics and Communication, University of Allahabad, Prayagraj, India
Prajna Jha, Shreya Agarwal, Ali Abbas & Tanveer J. Siddiqui

Corresponding author

Correspondence to Prajna Jha.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Research Trends in Computational Intelligence” guest edited by Anshul Verma, Pradeepika Verma, Vivek Kumar Singh and S. Karthikeyan.

Appendix I

अगर	अगली	अगले	अच्छी	अति	अथवा	अधिक	अनुसार	अनेक
अन्य	अपना	अपनी	अपने	अब	अभी	अलावा	आ	आई
आएँ	आगे	आती	आदि	आने	आप	आम	आसपास	इतनी
इतने	इन	इनमे	इन्हीं	इन्हे	इस	इसका	इसकी	इसके
लिए	इसके	इसमें	इसलिए	इससे	इसी	इसीलिए	इसे	उतनी
उधर	उन	उनका	उनकी	उनके	उनमें	उनसे	उन्हीं	उन्हे
उन्हें	उन्होंने	उन्होने	उस	उसका	उसकी	उसके	उससे	उसी
उसे	ऊपर	एक	एक-एक	एवं	ऐसा	ऐसी	ऐसे	ओर
कई	कछ	कब	कभी	कभी-कभी	कम	कया	कर	करके
करता	करती	करते	करना	करनी	करने	करा	कराने	कराया
करेंगे	करेगा	करेगी	का	काफी	काफी	कि	किंतु	किए
कितनी	कितने	किन	किया	किये	किस	किसी	की	कुछ
कुल	के	कारण	कैसे	को	कोई	कौन	क्या	क्यो
क्योकि	गई	गईं	गए	गया	गयी	गये	चलता	चलने
चली	चाहती	चाहते	चाहिए	चाहे	चुका	चुकी	चुके	चुके
छह	छू	जगह	जब	जबकि	जल्द	जल्दी	जहाँ	जहां
जहां-तहां	जा	जाए	जाएं	जाएंगी	जाएगी	जाएँ	जाकर	जाता
जाती	जाते	जानना	जाना	जाने	जाये	जारी	जितना	जितनी
जिनमें	जिन्हें	जिसका	जिससे	जिसे	जी	जैसा	जैसे	जो
जोर	ज्यादा	ठीक	तक	तथा	तब	तभी	तरफ	तरह
तहत	ताकि	तीन	तो	तौर	था	थी	थे	थोडा
दरअसल	दिए	दिखाए	दिया	दी	दूर	दूसरी	दूसरे	दे
देंगी	देंगे	देकर	देता	देती	देते	देना	देने	दो
दोनो	द्वारा	न	नई	नए	नया	नहीं	नीचे	ने
पडता	पडने	पडा	पर	परंत	पहला	पहले	पांच	पाएं
पाँच	पीछे	पूरी	प्रति	प्रत्येक	फिर	बजाय	बजे	बडी
बढ़	बढ़ा	बढ़े	बताया	बन	बनाई	बनाए	बनाना	बनाने
बनी	बने	बल्कि	बहुत	बाकी	बाद	बार	बार-बार	बारे
बिना	बीच	बेहद	भी	मगर	मुताबिक	मे	में	यदि
यद्यपि	यह	यहाँ	यही	या	यानी	ये	रखना	रह
रहती	रही	रहे	रहेगा	रहेगी	रहो	रोका	लगभग	लगा
लगाई	लगे	लाकर	लाने	लिए	लिया	लिये	ली	ले
लेकर	लेकिन	लेगी	लेना	लेने	व	वनाट	वह	वहाँ
वहीं	वाला	वाली	वाले	वालो	विभिन्न	वे	वैसे	वो
शायद	सकता	सकती	सकते	सका	सके	सकेगा	सकेगी	सब
सबकी	सबके	सबसे	सभी	सहज	सही	सा	सात	साथ
साथ-साथ	साफ	सामने	सारे	सिर्फ	सीधे	से	हाँ	हम
हमने	हमारी	हमारे	हमें	हर	हां	हांलांकि	ही	हुआ
हुई	हुए	हूँ	है	हैं	हो	हों	होंगी	होगा
होगी	होता	होती	होते	होना	होनी	होने
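Assuming the list in Appendix I serves as a stop-word list for preprocessing the context (the preview does not state how it is used), a filter over it might look like the sketch below. Only a small subset of the appendix entries is included, and the function name, the whitespace tokenisation, and the example sentence are hypothetical.

```python
import unicodedata

# A small subset of the Appendix I entries; a full run would load the whole list.
STOP_WORDS = {
    unicodedata.normalize("NFC", w)
    for w in ["का", "की", "के", "को", "ने", "पर", "में", "से", "है", "किया"]
}


def remove_stop_words(sentence):
    """Split on whitespace and drop tokens found in the stop-word set."""
    tokens = unicodedata.normalize("NFC", sentence).split()
    return [t for t in tokens if t not in STOP_WORDS]


# Hypothetical example sentence: "Ram deposited money in the bank."
print(remove_stop_words("राम ने बैंक में पैसा जमा किया"))
```

Normalising both the list and the input to NFC guards against visually identical Devanagari strings that differ in their code point sequences.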

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jha, P., Agarwal, S., Abbas, A. et al. A Novel Unsupervised Graph-Based Algorithm for Hindi Word Sense Disambiguation. SN COMPUT. SCI. 4, 675 (2023). https://doi.org/10.1007/s42979-023-02116-1

Received: 07 March 2023
Accepted: 03 July 2023
Published: 02 September 2023
DOI: https://doi.org/10.1007/s42979-023-02116-1
