
An efficient part-of-speech tagger rule-based approach of Sanskrit language analysis

  • Original Research
International Journal of Information Technology

Abstract

The versatility and depth of Sanskrit have led to its designation as a universal syntax. The significance of grammar in language translation cannot be overstated: grammar governs the structural arrangement of a sentence and consists of rules and guidelines. Part-of-speech (POS) tagging is the process of assigning the appropriate part of speech to each word in a sentence, taking into account its relationship to adjacent and related words in the phrase or sentence. Morphological analysis segments phrases but does not by itself determine the correct meaning, whereas POS tagging considers word sequences to ascertain the correct interpretation of a word within a given sentence. Efficient POS taggers have been developed for Russian, English, and Japanese, in contrast to Indian languages. POS tagging is primarily performed using rule-based, stochastic, and transformation-based methods. This paper concentrates on examining the structure and meaning of Sanskrit sentences and uses Lex and Yacc to build a rule-based POS tagger for Sanskrit. The tagger employs a small set of simple rules to produce sequences of tokens, along with a limited lexicon to identify potential tags for each word. These rules are stored in a database. The system automatically analyzes the input sentence and assigns the appropriate tag to each word.
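The lexicon-plus-rules idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's Lex and Yacc implementation: Python regular expressions stand in for Lex patterns, and the lexicon entries, suffix rules, tag names (`NOUN`, `VERB`, `CONJ`, `PART`, `UNK`), and the ASCII Harvard-Kyoto transliteration are all assumptions chosen for illustration. Real Sanskrit morphology is far richer than these two suffix rules suggest.

```python
import re

# Hypothetical mini-lexicon (Harvard-Kyoto transliteration).
# Each word maps to its candidate tags; entries are illustrative only.
LEXICON = {
    "ca": ["CONJ"],     # "and"
    "na": ["PART"],     # negation particle
    "vanam": ["NOUN"],  # "forest"
}

# Hypothetical suffix rules, tried in order; the first match wins.
# They are deliberately crude and only cover the example sentence.
SUFFIX_RULES = [
    (re.compile(r"(nti|ti)$"), "VERB"),  # e.g. present-tense third-person endings
    (re.compile(r"(H|m)$"), "NOUN"),     # e.g. visarga or -m case endings
]


def tokenize(sentence):
    """Split a sentence into whitespace-separated tokens."""
    return sentence.split()


def tag(sentence):
    """Tag each token: lexicon lookup first, then suffix rules, else UNK."""
    tagged = []
    for word in tokenize(sentence):
        candidates = LEXICON.get(word)
        if candidates:
            # Take the first candidate; a fuller tagger would disambiguate
            # between several candidates using the neighbouring words.
            tagged.append((word, candidates[0]))
            continue
        for pattern, label in SUFFIX_RULES:
            if pattern.search(word):
                tagged.append((word, label))
                break
        else:
            tagged.append((word, "UNK"))
    return tagged


if __name__ == "__main__":
    # "rAmaH vanam gacchati": "Rama goes to the forest"
    for word, label in tag("rAmaH vanam gacchati"):
        print(word, label)
```

Checking the lexicon before the suffix rules mirrors the division of labour the abstract describes: the lexicon proposes potential tags for known words, and the rules handle everything else. Any context-sensitive disambiguation would be added as further rules over the tagged sequence.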


Data availability

No datasets were generated or analysed during the current study.


Author information

Corresponding author

Correspondence to Namrata Tapaswi.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Tapaswi, N. An efficient part-of-speech tagger rule-based approach of Sanskrit language analysis. Int. j. inf. tecnol. 16, 901–908 (2024). https://doi.org/10.1007/s41870-023-01668-y


  • DOI: https://doi.org/10.1007/s41870-023-01668-y
