A Multi-purpose Bayesian Model for Word-Based Morphology

  • Maciej Janicki
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 537)


This paper introduces a probabilistic model of morphology based on a word-based morphological theory. Morphology is understood here as a system of rules that describe systematic correspondences between full word forms, without decomposing words into any smaller units. The model is formulated in the Bayesian learning framework and can be trained in both supervised and unsupervised settings. It is evaluated on three tasks: generating unseen words, lemmatization, and inflected form production.
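To make the notion of a rule relating full word forms more concrete, the following is a minimal illustrative sketch, not the model described in the paper. It assumes a rule can be written as a pair of string patterns sharing a single variable `X`; the function name `apply_rule`, the pattern notation, and the example rules are hypothetical and chosen only for illustration. No probabilities or learning are involved here.

```python
import re
from typing import Optional


def apply_rule(word: str, source: str, target: str) -> Optional[str]:
    """Apply a whole-word rule such as 'Xen' -> 'geXt' to `word`.

    Hypothetical notation: 'X' stands for an arbitrary non-empty string
    shared by both word forms. Each pattern must contain exactly one 'X'.
    Returns the derived word, or None if `word` does not match `source`.
    """
    # Split both patterns into the literal material around the variable.
    src_prefix, src_suffix = source.split("X")
    tgt_prefix, tgt_suffix = target.split("X")

    # Match the whole word against the source pattern; the variable X
    # becomes a capturing group, the literal parts are escaped.
    pattern = re.escape(src_prefix) + "(.+)" + re.escape(src_suffix)
    match = re.fullmatch(pattern, word)
    if match is None:
        return None

    # Rebuild the related word form around the shared material.
    return tgt_prefix + match.group(1) + tgt_suffix


if __name__ == "__main__":
    print(apply_rule("machen", "Xen", "Xst"))   # suffix change
    print(apply_rule("machen", "Xen", "geXt"))  # prefix and suffix change together
    print(apply_rule("macht", "Xen", "Xst"))    # source pattern does not match
```

The rule relates two complete word forms directly: the second example changes the beginning and the end of the word in one step, without first splitting the word into a stem and affixes.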


Keywords: Word-based morphology; Machine learning; Generative model; Inflection; Lemmatization; Lexicon expansion



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Institute of Computer Science, University of Leipzig, Leipzig, Germany
