Research on Language and Computation, Volume 8, Issue 2, pp 209–238

Investigating the Relationship Between Linguistic Representation and Computation through an Unsupervised Model of Human Morphology Learning

Article

DOI: 10.1007/s11168-011-9077-2

Cite this article as:
Chan, E. & Lignos, C. Res on Lang and Comput (2010) 8: 209. doi:10.1007/s11168-011-9077-2

Abstract

We develop an unsupervised algorithm for morphological acquisition to investigate the relationship between linguistic representation, data statistics, and learning algorithms. We model the phenomenon that children acquire the morphological inflections of a language monotonically by introducing an algorithm that uses a bootstrapped, frequency-driven learning procedure to acquire rules monotonically. The algorithm learns a morphological grammar in terms of a Base and Transforms representation, a simple rule-based model of morphology. When tested on corpora of child-directed speech in English from CHILDES (MacWhinney in The CHILDES-Project: Tools for analyzing talk. Erlbaum, Hillsdale, 2000), the algorithm learns the most salient rules of English morphology and the order of acquisition is similar to that of children as observed by Brown (A first language: the early stages. Harvard University Press, Cambridge, 1973). Investigations of statistical distributions in corpora reveal that the algorithm is able to acquire morphological grammars due to its exploitation of Zipfian distributions in morphology through type-frequency statistics. These investigations suggest that the computation and frequency-driven selection of discrete morphological rules may be important factors in children’s acquisition of basic inflectional morphological systems.
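The type-frequency idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only counts, for each candidate suffix transform (s1, s2), how many distinct word types in a lexicon are related by replacing s1 with s2. The function name `count_transform_types`, the `CANDIDATE_SUFFIXES` inventory, and the toy word list are assumptions made for illustration. The sketch also does not decide which member of a pair is the base, which the Base and Transforms model would need to do.

```python
from collections import Counter
from itertools import permutations

# Hypothetical suffix inventory; a real learner would induce candidates from data.
CANDIDATE_SUFFIXES = ("", "s", "ed", "ing")


def count_transform_types(words, suffixes=CANDIDATE_SUFFIXES):
    """Return a Counter mapping (s1, s2) to the number of word types w
    ending in s1 such that replacing s1 with s2 yields another word type."""
    lexicon = set(words)  # types, not tokens
    counts = Counter()
    for s1, s2 in permutations(suffixes, 2):
        for word in lexicon:
            if not word.endswith(s1):
                continue
            stem = word[: len(word) - len(s1)]
            if not stem:
                continue  # do not strip a whole word away
            if stem + s2 in lexicon:
                counts[(s1, s2)] += 1
    return counts


# Toy lexicon (illustrative only, not drawn from CHILDES).
toy_words = [
    "walk", "walks", "walked", "walking",
    "jump", "jumps", "jumped",
    "talk", "talks",
    "play", "playing",
]
toy_counts = count_transform_types(toy_words)
for transform, n_types in toy_counts.most_common(5):
    print(transform, n_types)
```

On this toy lexicon the transform that adds "s" to a bare form is supported by three word types, while adding "ed" or "ing" is supported by two each. A frequency-driven learner could use counts of this kind to rank candidate rules.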

Keywords

Language acquisition, Morphology, Unsupervised learning, Cognitive modeling

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. University of Arizona, Tucson, USA
  2. University of Pennsylvania, Philadelphia, USA