Methods and Algorithms for Unsupervised Learning of Morphology

  • Burcu Can
  • Suresh Manandhar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8403)

Abstract

This paper surveys methods and algorithms for unsupervised learning of morphology. We describe the methods and algorithms used for morphological segmentation from a computational linguistics point of view. The survey covers segmentation methods based on MDL (minimum description length), MLE (maximum likelihood estimation), and MAP (maximum a posteriori) estimation, as well as parametric and non-parametric Bayesian approaches. We also review evaluation schemes for unsupervised morphological segmentation and summarize evaluation results from the Morpho Challenge evaluations.

Keywords

unsupervised learning; probabilistic models; morphological segmentation; machine learning of morphology

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Burcu Can (1)
  • Suresh Manandhar (1)
  1. Department of Computer Science, University of York, Heslington, UK