Transliterated SVM Based Manipuri POS Tagging

  • Kishorjit Nongmeikapam
  • Lairenlakpam Nonglenjaoba
  • Asem Roshan
  • Tongbram Shenson Singh
  • Thokchom Naongo Singh
  • Sivaji Bandyopadhyay
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 166)


Manipuri is a Scheduled Indian language written in two scripts: a borrowed Bengali script and the original Meitei Mayek script. Manipuri is a resource-poor language, especially for text written in Meitei Mayek. This paper deals with Support Vector Machine (SVM) based Part of Speech (POS) tagging of Bengali-script text, which is then transliterated to Meitei Mayek after POS tagging. So far, POS tagging of Meitei Mayek Manipuri has not been reported, and this could be the first attempt.
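The second stage of the pipeline described above (transliterating already-tagged Bengali-script tokens while carrying their POS tags over) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-letter character map is a toy subset chosen for the example, the function names are hypothetical, the tags are placeholders, and the sample tokens are arbitrary letter sequences rather than real Manipuri words. The SVM tagging stage is assumed to have run already.

```python
# Toy character map: a tiny, illustrative subset of Bengali-script to
# Meitei Mayek correspondences (an assumption, not the paper's rule set).
BENGALI_TO_MEITEI = {
    "\u0995": "\uABC0",  # BENGALI LETTER KA -> MEETEI MAYEK LETTER KOK
    "\u09AE": "\uABC3",  # BENGALI LETTER MA -> MEETEI MAYEK LETTER MIT
    "\u09B2": "\uABC2",  # BENGALI LETTER LA -> MEETEI MAYEK LETTER LAI
}


def transliterate(word: str) -> str:
    """Map each character through the toy table; unknown characters pass through."""
    return "".join(BENGALI_TO_MEITEI.get(ch, ch) for ch in word)


def transliterate_tagged(tagged):
    """Transliterate each token and keep its POS tag unchanged."""
    return [(transliterate(word), tag) for word, tag in tagged]


# Hypothetical tagger output: (Bengali-script token, POS tag) pairs.
tagged_bengali = [("\u0995\u09AE", "NN"), ("\u09B2", "VB")]
tagged_meitei = transliterate_tagged(tagged_bengali)
```

Because tagging happens before transliteration, the tag sequence is untouched by the script conversion; only the surface form of each token changes. A realistic transliterator would need context-sensitive rules rather than a one-to-one character table.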


Keywords: SVM, POS, Transliteration, Features, Manipuri





Copyright information

© Springer-Verlag GmbH Berlin Heidelberg 2012

Authors and Affiliations

  • Kishorjit Nongmeikapam (1)
  • Lairenlakpam Nonglenjaoba (1)
  • Asem Roshan (1)
  • Tongbram Shenson Singh (1)
  • Thokchom Naongo Singh (1)
  • Sivaji Bandyopadhyay (2)
  1. Dept. of Computer Science and Engg., Manipur Institute of Technology, Manipur University, Imphal, India
  2. Dept. of Computer Science and Engg., Jadavpur University, Kolkata, India
