Learning Context-Free Grammars with a Simplicity Bias

  • Pat Langley
  • Sean Stromsten
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1810)


We examine the role of simplicity in directing the induction of context-free grammars from sample sentences. We present a rational reconstruction of Wolff’s SNPR — the Grids system — which incorporates a bias toward grammars that minimize description length. The algorithm alternates between merging existing nonterminal symbols and creating new symbols, using a beam search to move from complex to simpler grammars. Experiments suggest that this approach can induce accurate grammars and that it scales reasonably to more difficult domains.
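The merge half of the search described above can be illustrated with a toy sketch. The code below is not the Grids system itself: it uses a deliberately crude description-length measure (symbol count of the grammar, ignoring the cost of encoding the training sentences) and implements only the nonterminal-merging operator with a beam search from a specific grammar toward simpler ones; the symbol-creation operator is omitted. All names (`grammar_dl`, `merge`, `beam_search_merges`) are illustrative.

```python
from itertools import combinations

def grammar_dl(grammar):
    """Toy description length: total symbol count over all productions.
    (An MDL-based learner would also charge for encoding the data.)"""
    return sum(1 + len(rhs) for rhss in grammar.values() for rhs in rhss)

def merge(grammar, a, b):
    """Merge nonterminal b into a, rewriting every occurrence of b
    and deduplicating the resulting right-hand sides."""
    new = {}
    for lhs, rhss in grammar.items():
        lhs2 = a if lhs == b else lhs
        new.setdefault(lhs2, [])
        for rhs in rhss:
            rewritten = tuple(a if sym == b else sym for sym in rhs)
            if rewritten not in new[lhs2]:
                new[lhs2].append(rewritten)
    return new

def beam_search_merges(grammar, beam_width=3):
    """Beam search from a complex grammar toward simpler ones,
    applying merges as long as description length decreases."""
    beam, best = [grammar], grammar
    while True:
        candidates = []
        for g in beam:
            nonterminals = [n for n in g if n != 'S']  # keep the start symbol
            for a, b in combinations(nonterminals, 2):
                candidates.append(merge(g, a, b))
        candidates = [g for g in candidates if grammar_dl(g) < grammar_dl(best)]
        if not candidates:
            return best
        candidates.sort(key=grammar_dl)
        beam = candidates[:beam_width]
        best = beam[0]
```

For example, starting from a grammar with two redundant noun-phrase nonterminals, `{'S': [('NP1','V'), ('NP2','V')], 'NP1': [('the','dog')], 'NP2': [('the','cat')]}`, merging `NP2` into `NP1` collapses the two `S` productions into one and pools the noun-phrase expansions, lowering the description length. Over-merging is the obvious risk, which is why the paper's actual measure also accounts for how compactly the grammar encodes the sample sentences.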


References

  1. Cook, C. M., Rosenfeld, A., & Aronson, A. (1976). Grammatical inference by hill climbing. Information Sciences, 10, 59–80.
  2. Grünwald, P. (1996). A minimum description length approach to grammar inference. In S. Wermter, E. Riloff, & G. Scheler (Eds.), Connectionist, statistical and symbolic approaches to learning for natural language processing. Lecture Notes in Computer Science, 1040. Berlin: Springer-Verlag.
  3. Stolcke, A. (1994). Bayesian learning of probabilistic language models. Doctoral dissertation, Division of Computer Science, University of California, Berkeley.
  4. VanLehn, K., & Ball, W. (1987). A version space approach to learning context-free grammars. Machine Learning, 2, 39–74.
  5. Wolff, J. G. (1982). Language acquisition, data compression and generalization. Language & Communication, 2, 57–89.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Pat Langley (1)
  • Sean Stromsten (1)

  1. DaimlerChrysler Research and Technology Center, Palo Alto, USA
