Research on Language and Computation, Volume 8, Issue 2–3, pp 209–238

Investigating the Relationship Between Linguistic Representation and Computation through an Unsupervised Model of Human Morphology Learning

Abstract

We develop an unsupervised algorithm for morphological acquisition to investigate the relationship between linguistic representation, data statistics, and learning algorithms. We model the phenomenon that children acquire the morphological inflections of a language monotonically by introducing an algorithm that uses a bootstrapped, frequency-driven learning procedure to acquire rules monotonically. The algorithm learns a morphological grammar in terms of a Base and Transforms representation, a simple rule-based model of morphology. When tested on corpora of child-directed speech in English from CHILDES (MacWhinney in The CHILDES-Project: Tools for analyzing talk. Erlbaum, Hillsdale, 2000), the algorithm learns the most salient rules of English morphology and the order of acquisition is similar to that of children as observed by Brown (A first language: the early stages. Harvard University Press, Cambridge, 1973). Investigations of statistical distributions in corpora reveal that the algorithm is able to acquire morphological grammars due to its exploitation of Zipfian distributions in morphology through type-frequency statistics. These investigations suggest that the computation and frequency-driven selection of discrete morphological rules may be important factors in children’s acquisition of basic inflectional morphological systems.
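The role of type-frequency statistics described above can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal illustration, assuming that a transform can be approximated as a pair of suffixes (s1, s2) and that a candidate transform is scored by the number of distinct word types it relates. The function name `score_transforms`, the toy word list, and the candidate suffix set are all hypothetical.

```python
from collections import Counter


def score_transforms(word_types, suffixes):
    """Score each suffix pair (s1, s2) by type frequency.

    The score is the number of distinct stems such that both
    stem + s1 and stem + s2 are attested in the lexicon.
    This is an illustrative assumption, not the published model.
    """
    lexicon = set(word_types)
    counts = Counter()
    for s1 in suffixes:
        for s2 in suffixes:
            if s1 == s2:
                continue
            for word in lexicon:
                if not word.endswith(s1):
                    continue
                # Strip s1 to get the candidate stem (s1 may be empty).
                stem = word[: len(word) - len(s1)]
                if stem and stem + s2 in lexicon:
                    counts[(s1, s2)] += 1
    return counts


# Toy lexicon of word types (hypothetical data, not from CHILDES).
words = ["walk", "walks", "walked", "walking",
         "jump", "jumps", "jumped",
         "talk", "talks", "dog", "dogs"]
suffixes = ["", "s", "ed", "ing"]

scores = score_transforms(words, suffixes)
for transform, n_types in scores.most_common(3):
    print(transform, n_types)
```

In this toy setting, a transform that applies to many word types, such as adding -s to a bare form, outranks one attested on only a few types, which is the kind of frequency-driven selection the abstract refers to.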

Keywords

Language acquisition · Morphology · Unsupervised learning · Cognitive modeling

References

  1. Albright, A., & Hayes, B. (2002). Modeling English past tense intuitions with minimal generalization. In Proceedings of the special interest group on computational phonology.
  2. Argamon, S., Akiva, N., Amir, A., & Kapah, O. (2004). Efficient unsupervised recursive word segmentation using minimum description length. In Proceedings of the international conference on computational linguistics.
  3. Baayen R. H., Piepenbrock R., van Rijn H. (1996) The CELEX2 lexical database (CD-ROM). Linguistic Data Consortium, Philadelphia
  4. Bacchin M., Ferro N., Melucci M. (2005) A probabilistic model for stemmer generation. Information Processing and Management 41: 121–137
  5. Baroni, M., & Ueyama, M. (2006). Building general- and special-purpose corpora by web crawling. In Proceedings of the 13th NIJL international symposium, language corpora: Their compilation and application.
  6. Beesley K., Karttunen L. (2003) Finite state morphology. CSLI Publications, Stanford
  7. Biemann, C. (2006). Unsupervised part-of-speech tagging employing efficient graph clustering. In Proceedings of the Association for Computational Linguistics.
  8. Bordag, S. (2007). Elements of knowledge-free and unsupervised lexical acquisition. Dissertation, University of Leipzig.
  9. Brent M., Cartwright T. (1996) Distributional regularity and phonotactic constraints are useful for segmentation. Cognition 61: 93–125
  10. Brown R. (1973) A first language: The early stages. Harvard University Press, Cambridge
  11. Bybee J. L. (1985) Morphology: A study of the relation between meaning and form. John Benjamins, Amsterdam
  12. Can, B., & Manandhar, S. (2009). Unsupervised learning of morphology by using syntactic categories. In Working notes for the cross language evaluation forum (CLEF), MorphoChallenge.
  13. Carreras, X., Chao, I., Padró, L., & Padró, M. (2004). FreeLing: an open-source suite of language analyzers. In Proceedings of the language and resources evaluation conference.
  14. Carlson L. (2005) Inducing a morphological transducer from inflectional paradigms. In: Arppe A., Carlson L., Lindén K., Piitulainen J., Suominen M., Vainio M., Westerlund H., Yli-Jyrä A. (eds) Inquiries into words, constraints and contexts, Festschrift for Kimmo Koskenniemi on his 60th Birthday. CSLI Publications, Stanford, CA
  15. Chan, E. (2008). Structures and distributions in morphology learning. Dissertation, University of Pennsylvania.
  16. Chomsky N., Halle M. (1968) The sound pattern of English. Harper & Row, New York
  17. Clark, A. (2001). Learning morphology with pair hidden Markov models. In Proceedings of the student workshop at the 39th annual meeting of the Association for Computational Linguistics.
  18. Clark, A. (2002). Memory-based learning of morphology with stochastic transducers. In Proceedings of the Association for Computational Linguistics.
  19. Clark, A. (2003). Combining distributional and morphological information for part of speech induction. In Proceedings of the 10th conference of the European chapter of the Association for Computational Linguistics.
  20. Corbett G. G., Fraser N. M. (1993) Network morphology: a DATR account of Russian nominal inflection. Journal of Linguistics 29: 113–142
  21. Creutz, M. (2003). Unsupervised segmentation of words using prior distributions of morph length and frequency. In Proceedings of the Association for Computational Linguistics.
  22. Creutz, M., & Lagus, K. (2004). Induction of a simple morphology for highly-inflecting languages. In Proceedings of the special interest group in computational phonology.
  23. Daelemans, W., Berck, P., & Gillis, S. (1996). Unsupervised discovery of phonological categories through supervised learning of morphological rules. In Proceedings of the 16th international conference on computational linguistics.
  24. Dasgupta, S., & Ng, V. (2007). Unsupervised part-of-speech acquisition for resource-scarce languages. In Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning.
  25. Deerwester S., Dumais S. T., Furnas G. W., Landauer T. K., Harshman R. (1990) Indexing by latent semantic analysis. Journal of the American Society for Information Science 41: 391–407
  26. Demberg, V. (2007). A language-independent unsupervised model for morphological segmentation. In Proceedings of the Association for Computational Linguistics.
  27. Dressler, W. U. (2005). Morphological typology and first language acquisition: some mutual challenges. In G. Booij, E. Guevara, A. Ralli, S. Sgroi, & S. Scalise (Eds.), Morphology and linguistic typology, on-line proceedings of the fourth Mediterranean morphology meeting, Catania, 21–23 September 2003, University of Bologna. http://morbo.lingue.unibo.it/mmm.
  28. Dreyer, M., Smith, J., & Eisner, J. (2008). Latent-variable modeling of string transductions with finite-state methods. In Proceedings of the conference on empirical methods in natural language processing.
  29. Erjavec, T. (2006). The English-Slovene ACQUIS corpus. In Proceedings of the language and resources evaluation conference.
  30. Forsberg, M., & Ranta, A. (2004). Functional morphology. In Proceedings of the international conference on functional programming.
  31. Francis W. N., Kucera H. (1967) Computational analysis of present-day American English. Brown University Press, Providence, RI
  32. Freitag, D. (2004). Toward unsupervised whole-corpus tagging. In Proceedings of the international conference on computational linguistics.
  33. Freitag, D. (2005). Morphology induction from term clusters. In Proceedings of the conference on computational natural language learning.
  34. Gambell, T., & Yang, C. (2004). Statistical learning and universal grammar: modeling word segmentation. In Proceedings of the international conference on computational linguistics.
  35. Gerken L. A. (2006) Decisions, decisions: infant language learning when multiple generalizations are possible. Cognition 98: B67–B74
  36. Gerken L. A., Bollt A. (2008) Three exemplars allow at least some linguistic generalizations: Implications for generalization mechanisms and constraints. Language Learning and Development 4: 228–248
  37. Gildea D., Jurafsky D. (1996) Learning bias and phonological rule induction. Computational Linguistics 22: 497–530
  38. Goldberg A. E. (1995) Constructions: A construction grammar approach to argument structure. University of Chicago Press, Chicago
  39. Golding A. R., Thompson H. S. (1985) A morphology component for language programs. Linguistics 23: 263–284
  40. Goldsmith J. A. (2001) Unsupervised learning of the morphology of a natural language. Computational Linguistics 27: 153–198
  41. Goldsmith J. A. (2006) An algorithm for the unsupervised learning of morphology. Natural Language Engineering 12: 1–19
  42. Goldwater S., Griffiths T. L., Johnson M. (2006) Interpolating between types and tokens by estimating power-law generators. In: Weiss Y., Schölkopf B., Platt J. (eds) Advances in neural information processing systems. The MIT Press, Cambridge
  43. Graff D., Gallegos G. (1999) Spanish newswire text. Linguistic Data Consortium, Philadelphia
  44. Gustafson-Capková S., Hartmann B. (2006) Manual of the Stockholm Umeå Corpus version 2.0. Department of Linguistics, Stockholm University, Stockholm
  45. Hafer M., Weiss S. (1974) Word segmentation by letter successor varieties. Information Storage and Retrieval 10: 371–385
  46. Hajic J. et al (2006) Prague Dependency Treebank 2.0, CDROM, LDC2006T01. Linguistic Data Consortium, Philadelphia
  47. Halle M. (1973) Prolegomena to a theory of word-formation. Linguistic Inquiry 4: 3–16
  48. Harris Z. (1955) From phoneme to morpheme. Language 31: 190–222
  49. Harris Z. (1970) Papers in structural and transformational linguistics. D. Reidel, Dordrecht
  50. Higgins, D. (2003). Unsupervised learning of Bulgarian POS tags. In Workshop on morphological processing of Slavic languages.
  51. Hockett C. F. (1954) Two models of grammatical description. Word 10: 210–231
  52. Hooper, J. B. (1979). Child morphology and morphophonemic change. Linguistics, 17, 21–50. (Also in J. Fisiak (Ed.), Historical morphology (pp. 157–187). The Hague: Mouton).
  53. Hu, Y., Matveeva, I., Goldsmith, J. A., & Sprague, C. (2005a). The SED heuristic for morpheme discovery: a look at Swahili. In Proceedings of the second workshop on psychocomputational models of human language acquisition.
  54. Hu, Y., Matveeva, I., Goldsmith, J. A., & Sprague, C. (2005b). Using morphology and syntax together in unsupervised learning. In Proceedings of the second workshop on psychocomputational models of human language acquisition.
  55. Itai A., Wintner S. (2008) Language resources for Hebrew. Language Resources and Evaluation 42: 77–98
  56. Johnson, M. (1984). A discovery procedure for certain phonological rules. In Proceedings of the international conference on computational linguistics and the Association for Computational Linguistics.
  57. Kaplan R., Kay M. (1994) Regular models of phonological rule systems. Computational Linguistics 20: 331–378
  58. Karttunen, L. (1998). The proper treatment of optimality in computational phonology. In Proceedings of FSMNLP’98. International workshop on finite-state methods in natural language processing.
  59. Karttunen L. (2003) Computing with realizational morphology. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing, vol. 2588 of Lecture Notes in Computer Science. Springer-Verlag, Heidelberg, pp 205–216
  60. Kazakov D., Manandhar S. (2001) Unsupervised learning of word segmentation rules with genetic algorithms and inductive logic programming. Machine Learning 43: 121–162
  61. Klein, D., & Manning, C. (2004). Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of the Association for Computational Linguistics.
  62. Kurimo, M., Virpioja, S., Turunen, V. T., Blackwood, G. W., & Byrne, W. (2009). Overview and results of Morpho Challenge 2009. In Working notes for the CLEF 2009 workshop.
  63. Linguistic Data Consortium (1994). ECI multilingual text. CDROM, LDC94T5. Linguistic Data Consortium, Philadelphia, PA
  64. Lignos, C., Chan, E., Marcus, M. P., & Yang, C. (2009). A rule-based unsupervised morphology learning framework. In Working notes for cross-linguistic evaluation forum, MorphoChallenge.
  65. Lignos, C., Chan, E., Marcus, M. P., & Yang, C. (2010). Evidence for a morphological acquisition model from development data. In Proceedings of the 34th annual Boston University conference on language development.
  66. Lin, Y. (2005). Learning features and segments from waveforms: a statistical model of early phonological acquisition. Dissertation, UCLA.
  67. Ling C. X. (1994) Learning the past tense of English verbs: the symbolic pattern associator versus connectionist models. Journal of Artificial Intelligence Research 1: 202–229
  68. MacWhinney B. (2000) The CHILDES-Project: Tools for analyzing talk. 2nd edn. Erlbaum, Hillsdale
  69. Manandhar, S., Džeroski, S., & Erjavec, T. (1998). Learning multilingual morphology with CLOG. In Proceedings of inductive logic programming (ILP), 8th international conference. Lecture notes in artificial intelligence (Vol. 1446, pp. 135–144). Heidelberg: Springer.
  70. Marcus G. F., Pinker S., Ullman M., Hollander M., Rosen T. J., Xu F., Clahsen H. (1992) Overregularization in language acquisition. Monographs of the Society for Research in Child Development 57(4): 1–182
  71. Màrquez, L., Taulé, M., Marti, A., Garcia, M., Real, F., & Ferrés, D. (2004). Senseval-3: The Catalan lexical sample task. In Proceedings of senseval-3: The third international workshop on the evaluation of systems for the semantic analysis of text.
  72. McClelland J. L., Patterson K. (2002) Rules or connections in past-tense inflections: What does the evidence rule out? Trends in Cognitive Sciences 6: 465–472
  73. Molnar, R. A. (2001). Generalize and sift as a model of inflection acquisition. Masters thesis, Massachusetts Institute of Technology.
  74. Mooney R. J., Califf M. E. (1996) Learning the past tense of English verbs using inductive logic programming. In: Wermter S., Riloff E., Scheler G. (eds) Symbolic, connectionist, and statistical approaches to learning for natural language processing. Springer, Heidelberg
  75. Naradowsky, J., & Goldwater, S. (2009). Improving morphology induction by learning spelling rules. In Proceedings of the international joint conference on artificial intelligence.
  76. Newman M. E. J. (2005) Power laws, Pareto distributions and Zipf’s law. Contemporary Physics 46: 323–351
  77. Ninio A. (2006) Language and the learning curve: A new theory of syntactic development. Oxford University Press, Oxford
  78. Oflazer K., Nirenburg S., McShane M. (2001) Bootstrapping morphological analyzers by combining human elicitation and machine learning. Computational Linguistics 27: 59–85
  79. Papageorgiou, H., Prokopidis, P., Giouli, V., & Piperidis, S. (2000). A unified POS tagging architecture and its application to Greek. In Proceedings of the language and resources evaluation conference.
  80. Parkes, C., Malek, A. M., & Marcus, M. P. (1998). Towards unsupervised extraction of verb paradigms from large corpora. In Proceedings of the sixth workshop on very large corpora.
  81. Pinker S. (1999) Words and rules: The ingredients of language. Harper Collins, New York
  82. Pinker S., Prince A. (1988) On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28: 73–193
  83. Pinker S., Ullman M. T. (2002) The past and future of the past tense. Trends in Cognitive Sciences 6: 456–463
  84. Plisson, J., Lavrac, N., & Mladenic, D. (2004). A rule based approach to word lemmatization. In SiKDD 2004 at multiconference IS-2004, Ljubljana, Slovenia.
  85. Poon, H., Cherry, C., & Toutanova, K. (2009). Unsupervised morphological segmentation with log-linear models. In Proceedings of the North American chapter of the Association for Computational Linguistics, Human Language Technologies Conference.
  86. Prince, A., & Smolensky, P. (1993). Optimality theory: Constraint interaction in generative grammar. Technical Report, Rutgers University Center for Cognitive Science and Computer Science Department, University of Colorado at Boulder. Also published by Blackwell Publishers, 2004.
  87. Redington M., Chater N., Finch S. (1998) Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science 22: 425–469
  88. Rumelhart D. E., McClelland J. L. (1986) On learning the past tenses of English verbs. In: McClelland J. L., Rumelhart D. E. (eds) The PDP research group, Parallel distributed processing: Explorations in the microstructure of cognition 2. The MIT Press, Cambridge
  89. Schone, P., & Jurafsky, D. (2000). Knowledge-free induction of morphology using latent semantic analysis. In Proceedings of the conference on computational natural language learning.
  90. Schone, P., & Jurafsky, D. (2001). Knowledge-free induction of inflectional morphologies. In Proceedings of the North American chapter of the Association for Computational Linguistics.
  91. Schütze, H. (1993). Part-of-speech induction from scratch. In Proceedings of the Association for Computational Linguistics.
  92. Segal, E. (1999). Hebrew morphological analyzer for Hebrew undotted texts. Master’s thesis, Technion, Israel Institute of Technology, Haifa.
  93. Shalonova, K., & Flach, P. (2007). Morphology learning using tree of aligned suffix rules. In Proceedings of the workshop on challenges and applications of grammar induction.
  94. Slobin D. I. (1973) Cognitive prerequisites for the development of grammar. In: Ferguson C. A., Slobin D. I. (eds) Studies of child language development. Holt, Rinehart & Winston, New York
  95. Slobin, D. I. (Ed.) (1985) (2 vols.), (1992), (1997) (2 vols.). The crosslinguistic study of language acquisition. Hillsdale, NJ: Erlbaum.
  96. Snover, M., & Brent, M. (2001). A Bayesian model for morpheme and paradigm identification. In Proceedings of the Association for Computational Linguistics.
  97. Snover, M., Jarosz, G., & Brent, M. (2002). Unsupervised learning of morphology using a novel directed search algorithm: taking the first step. In Proceedings of the special interest group in computational phonology.
  98. Sproat R. (1992) Morphology and computation. The MIT Press, Cambridge
  99. Stroppa, N., & Yvon, F. (2005). An analogical learner for morphological analysis. In Proceedings of the conference on computational natural language learning.
  100. Stump G. T. (2001) Inflectional morphology: A theory of paradigm structure. Cambridge University Press, Cambridge
  101. Theron, P., & Cloete, I. (1997). Automatic acquisition of two-level morphological rules. In Proceedings of the conference on applied natural language processing.
  102. Tomasello M. (2003) Constructing a language: A usage-based theory of language acquisition. Harvard University Press, Cambridge
  103. Weide, R. (1998). The Carnegie Mellon pronouncing dictionary [cmudict. 0.6]. (Carnegie Mellon University: http://www.speech.cs.cmu.edu/cgi-bin/cmudict).
  104. Wicentowski, R. (2002). Modeling and learning multilingual inflectional morphology in a minimally supervised framework. Dissertation, Johns Hopkins University.
  105. Wicentowski, R. (2004). Multilingual noise-robust supervised morphological analysis using the WordFrame model. In Proceedings of special interest group on computational phonology (SIGPHON).
  106. Wothke, K. (1986). Machine learning of morphological rules by generalization and analogy. In Proceedings of the international conference on computational linguistics (COLING).
  107. Yarowsky, D., & Wicentowski, R. (2000). Minimally supervised morphological analysis by multimodal alignment. In Proceedings of the Association for Computational Linguistics.
  108. Yip, K., & Sussman, G. J. (1997). Sparse representations for fast, one-shot learning. A.I. Memo No. 1633, Artificial Intelligence Laboratory, Massachusetts Institute of Technology.
  109. Zajac, R. (2001). Morpholog: constrained and supervised learning of morphology. In Proceedings of the special interest group in computational phonology.
  110. Zipf G. K. (1935) The psycho-biology of language, an introduction to dynamic philology. The Riverside Press, Cambridge
  111. Zipf G. K. (1949) Human behavior and the principle of least effort. Addison-Wesley, Cambridge

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. University of Arizona, Tucson, USA
  2. University of Pennsylvania, Philadelphia, USA
