A New General Grammar Formalism for Parsing

  • Gabriel Infante-Lopez
  • Martín Ariel Domínguez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7094)


We introduce Probabilistic Constrained W-grammars (PCW-grammars), a two-level formalism capable of capturing the grammatical frameworks used in three state-of-the-art grammar formalisms: Bilexical Grammars, Markov Rules, and Stochastic Tree Substitution Grammars. For each of them we provide an embedding into PCW-grammars, which allows us to derive properties of their expressive power and consistency, as well as relations among the formalisms studied.


Keywords: Parsing, PCW-grammars, Bilexical Grammars





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gabriel Infante-Lopez (1)
  • Martín Ariel Domínguez (1)
  1. Grupo de Procesamiento de Lenguaje Natural, Universidad Nacional de Córdoba, Argentina
