Multiword Expressions (MWE) for Mizo Language: Literature Survey

  • Goutam Majumder
  • Partha Pakray
  • Zoramdinthara Khiangte
  • Alexander Gelbukh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9623)


We examine the formation of multi-word expressions (MWEs) and reduplicated words in the Mizo language, based on a news corpus. Reduplication is the repetition of a linguistic unit, such as a morpheme, affix, word, or clause. To study the structure of reduplication, we follow the lexical and morphological approaches that have been used to study other Indian languages, such as Manipuri, Bengali, Odia, and Marathi. We also show the effect of these phenomena on natural language processing tasks for the Mizo language. To develop an algorithm for identifying reduplicated words in Mizo, we manually identified MWEs and reduplicated words and then studied their structural and semantic properties. The results were verified by linguists who are experts in the Mizo language.
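As a rough illustration of the simplest case described above, complete reduplication of a whole word, the following Python sketch flags a token that is immediately repeated or that is written as two identical hyphen-joined halves. It is not the authors' algorithm. The function names, the tokenization rule, and the input string are assumptions made for this example, and the sample tokens are placeholders rather than Mizo words.

```python
import string


def tokenize(text):
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    tokens = (raw.strip(string.punctuation) for raw in text.lower().split())
    return [t for t in tokens if t]


def find_complete_reduplication(text):
    """Return (token_index, matched_form) pairs for complete reduplication.

    Hypothetical helper: it only covers whole-word repetition and ignores
    partial, echo, and onomatopoeic reduplication.
    """
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        parts = tok.split("-")
        # Case 1: hyphenated form whose two halves are identical ("x-x").
        if len(parts) == 2 and parts[0] and parts[0] == parts[1]:
            hits.append((i, tok))
        # Case 2: the same token written twice in a row ("x x").
        elif i + 1 < len(tokens) and tok == tokens[i + 1]:
            hits.append((i, tok + " " + tokens[i + 1]))
    return hits


# Placeholder tokens, not real Mizo words.
print(find_complete_reduplication("aaa bbb bbb, ccc ddd-ddd."))
# [(1, 'bbb bbb'), (4, 'ddd-ddd')]
```

A surface match like this would only produce candidates. Deciding whether a repeated form is a genuine reduplicated MWE still requires the structural and semantic analysis and the expert verification described above.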


Multi-word expressions · Reduplication · Mizo language · Onomatopoetic sounds · Natural language processing



This work was carried out at NIT Mizoram under research project Grant No. YSS/2015/000988, supported by the Department of Science & Technology (DST) and the Science and Engineering Research Board (SERB), Govt. of India. We would like to acknowledge the National Institute of Technology Mizoram for providing the research environment and sponsorship to carry out this work. We are also thankful to Mr. Jereemi Bentham and Mr. Sunday Lalbiknia, students of the CSE department of NIT Mizoram, for their help.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Goutam Majumder (1)
  • Partha Pakray (1)
  • Zoramdinthara Khiangte (2)
  • Alexander Gelbukh (3)

  1. National Institute of Technology, Aizawl, India
  2. Pachhunga University College, Aizawl, India
  3. CIC, Instituto Politécnico Nacional, Mexico City, Mexico
