Abstract
The versatility and depth of Sanskrit have led some to describe it as a universal syntax. Grammar, the set of rules governing the structural arrangement of a sentence, plays a central role in language translation. Part-of-speech (POS) tagging is the process of assigning the appropriate part of speech to each word in a sentence, taking into account the adjacent and related words in the same phrase or sentence. Morphological analysis segments words and phrases but does not by itself determine the correct meaning; POS tagging considers word sequences to establish the correct interpretation of a word within a given sentence. Efficient POS taggers have been developed for languages such as Russian, English, and Japanese, whereas Indian languages lack comparable tools. POS tagging is primarily performed using rule-based, stochastic, and transformation-based methods. This paper examines the structure and meaning of Sanskrit sentences and uses Lex and Yacc to build a rule-based POS tagger for Sanskrit. The tagger employs a small set of elementary rules to produce sequences of tokens, together with a limited lexicon to identify the potential tags for each word. These rules are stored in a database. The system automatically analyzes the input sentence and assigns the appropriate tag to each word.
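The combination of a small lexicon and a short list of rules described above can be illustrated with a minimal sketch. This is not the paper's Lex and Yacc implementation: it is written in Python for brevity, and the lexicon entries, the suffix rules, the tag names (`NOUN`, `VERB`, `CONJ`, `PART`, `UNK`), and the ASCII transliteration are all assumptions chosen for illustration only.

```python
import re

# Hypothetical mini-lexicon (ASCII transliteration): word -> candidate tags.
# The entries and tag names are illustrative, not taken from the paper.
LEXICON = {
    "ca": ["CONJ"],     # "and"
    "na": ["PART"],     # negation particle
    "ramah": ["NOUN"],
    "vanam": ["NOUN"],
}

# Illustrative suffix rules, tried in order when a word is not in the lexicon.
# Real Sanskrit morphology is far richer; these only show the mechanism.
SUFFIX_RULES = [
    (re.compile(r"(ti|si|mi)$"), "VERB"),  # common present-tense singular endings
    (re.compile(r"(ah|am)$"), "NOUN"),     # common nominal singular endings
]


def tag_word(word):
    """Return one tag for a word: lexicon lookup first, then suffix rules."""
    key = word.lower()
    if key in LEXICON:
        # Take the first candidate; a fuller tagger would use context to choose.
        return LEXICON[key][0]
    for pattern, tag in SUFFIX_RULES:
        if pattern.search(key):
            return tag
    return "UNK"  # no lexicon entry and no rule matched


def tag_sentence(sentence):
    """Split on whitespace and tag each token independently."""
    return [(word, tag_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    for word, tag in tag_sentence("ramah vanam gacchati"):
        print(word, tag)
```

The lexicon is consulted before the rules so that closed-class words such as particles are never misread by a suffix pattern; a context-sensitive tagger would additionally use neighbouring tags to choose among several candidates.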
Data availability
No datasets were generated or analysed during the current study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tapaswi, N. An efficient part-of-speech tagger rule-based approach of Sanskrit language analysis. Int. j. inf. tecnol. 16, 901–908 (2024). https://doi.org/10.1007/s41870-023-01668-y
DOI: https://doi.org/10.1007/s41870-023-01668-y