Evolution of the Modern Phase of Written Bangla: A Statistical Study

  • Paheli Bhattacharya
  • Arnab Bhattacharya
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8284)

Abstract

Active languages such as Bangla (or Bengali) evolve over time for a variety of reasons. In this paper, we quantitatively analyze changes in the written form of the modern phase of Bangla in terms of character-level, syllable-level, morpheme-level and word-level features. We collect three types of corpora (classical, newspapers and blogs) and test whether the differences in their features are statistically significant. The results suggest that word length, measured in characters, has changed significantly, but that there is little difference in the usage of different characters, syllables and morphemes in a word, or of different words in a sentence. To the best of our knowledge, this is the first work of this kind on Bangla.
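To make the kind of comparison described above concrete, the following is a minimal sketch of testing whether mean word length differs between two corpora. The abstract does not name the statistical test used, so Welch's t-test is an assumption here, as are the helper names `word_lengths` and `welch_t`, whitespace tokenization, and counting length in Unicode code points rather than in Bangla grapheme clusters.

```python
import math
from statistics import mean, variance


def word_lengths(text):
    """Length of each whitespace-separated word, in Unicode code points.

    Assumption: code points stand in for "characters"; a real study of
    Bangla may prefer grapheme clusters or a script-aware tokenizer.
    """
    return [len(word) for word in text.split()]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom.

    Assumption: Welch's test is illustrative only; the source does not
    state which significance test the authors applied.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

Given two corpus strings, say hypothetical variables `classical_text` and `blog_text`, calling `welch_t(word_lengths(classical_text), word_lengths(blog_text))` returns the statistic and degrees of freedom, which can then be compared against a t distribution at the chosen significance level.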


Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Paheli Bhattacharya (1)
  • Arnab Bhattacharya (2)
  1. Govt. College of Engineering and Textile Technology, Serampore, Hooghly, India
  2. Dept. of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India
