Variational Bayesian Grammar Induction for Natural Language

  • Kenichi Kurihara
  • Taisuke Sato
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4201)


This paper presents a new grammar induction algorithm for probabilistic context-free grammars (PCFGs). One approach to PCFG induction is based on parameter estimation. Following this approach, we apply variational Bayes (VB), an approximation of Bayesian learning, to PCFGs. VB has been shown empirically to be less prone to overfitting, and its free energy has been used successfully for model selection. Our algorithm can be seen as a generalization of previously proposed PCFG induction algorithms. In experiments, the induced grammars achieve better parsing results than those of other PCFG induction algorithms. Building on these results, we give examples of recursive grammatical structures found by the proposed algorithm.


Keywords: Noun Phrase, Wall Street Journal, Parse Tree, Training Corpus, Bayesian Learning
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kenichi Kurihara (1)
  • Taisuke Sato (1)
  1. Tokyo Institute of Technology, Tokyo, Japan
