Abstract
In this paper, we present a parts-of-speech (POS) tagger for Marathi, a language with rich inflectional and derivational morphology that is spoken natively in Maharashtra. The tagger follows a statistical approach based on the hidden Markov model (HMM), and we establish a methodology for POS tagging of Marathi using it. The central idea of an HMM tagger is to compute probabilities in order to determine the most likely sequence of tags for an observed sequence of words. We describe the development of the tagger and report its evaluation.
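To make the idea of finding the best tag sequence concrete, the following is a minimal sketch of a bigram HMM tagger decoded with the Viterbi algorithm. It is an illustration under stated assumptions, not the authors' implementation: the function names (`train_hmm`, `log_prob`, `viterbi`), the add-one smoothing, the two-tag tagset, and the toy corpus of placeholder words (which is not Marathi data) are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict

START = "<s>"  # assumed sentence-start pseudo-tag


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(lambda: defaultdict(int))
    emissions = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, sorted(emissions), vocab


def log_prob(counts, context, outcome, num_outcomes):
    """Add-one smoothed log P(outcome | context); smoothing is an assumption."""
    total = sum(counts[context].values())
    return math.log((counts[context][outcome] + 1) / (total + num_outcomes))


def viterbi(words, transitions, emissions, tags, vocab):
    """Return the most probable tag sequence for `words` under the model."""
    if not words:
        return []
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot reserved for unseen words
    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions, START, tag, n_tags)
                 + log_prob(emissions, tag, words[0], n_words))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions, tag, words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][p][0] + log_prob(transitions, p, tag, n_tags) + emit, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


# Toy tagged corpus with placeholder words (N = noun, V = verb).
corpus = [
    [("ram", "N"), ("runs", "V")],
    [("sita", "N"), ("sings", "V")],
    [("ram", "N"), ("sings", "V")],
]
model = train_hmm(corpus)
print(viterbi(["sita", "runs"], *model))  # ['N', 'V']
```

The word pair "sita runs" never occurs in the toy corpus, yet the transition and emission probabilities together still prefer the noun-verb reading. Working in log space avoids numerical underflow on longer sentences.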
© 2014 Springer India
About this paper
Cite this paper
Singh, J., Joshi, N., Mathur, I. (2014). Marathi Parts-of-Speech Tagger Using Supervised Learning. In: Mohapatra, D.P., Patnaik, S. (eds) Intelligent Computing, Networking, and Informatics. Advances in Intelligent Systems and Computing, vol 243. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1665-0_24
DOI: https://doi.org/10.1007/978-81-322-1665-0_24
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1664-3
Online ISBN: 978-81-322-1665-0
eBook Packages: Engineering, Engineering (R0)