Language Resources and Evaluation, Volume 42, Issue 1, pp 1–19

LTAG-spinal and the Treebank

A new resource for incremental, dependency and semantic parsing
  • Libin Shen
  • Lucas Champollion
  • Aravind K. Joshi
Article

Abstract

We introduce LTAG-spinal, a novel variant of traditional Lexicalized Tree Adjoining Grammar (LTAG) with desirable linguistic, computational and statistical properties. Unlike in traditional LTAG, subcategorization frames and the argument–adjunct distinction are left underspecified in LTAG-spinal. LTAG-spinal with adjunction constraints is weakly equivalent to LTAG. The LTAG-spinal formalism is used to extract an LTAG-spinal Treebank from the Penn Treebank (PTB) with Propbank annotation. Based on Propbank annotation, predicate coordination and LTAG adjunction structures are successfully extracted. The LTAG-spinal Treebank makes explicit semantic relations that are implicit in or absent from the original PTB. LTAG-spinal provides a desirable resource for statistical LTAG parsing, incremental parsing, dependency parsing, and semantic parsing. This treebank has been successfully used to train an incremental LTAG-spinal parser and a bidirectional LTAG dependency parser.

Keywords

Tree Adjoining Grammar, LTAG-spinal, Treebank, Dependency parsing

Abbreviation

LTAG: Lexicalized Tree Adjoining Grammar

Notes

Acknowledgments

We would like to thank our anonymous reviewers for valuable comments. We are grateful to Ryan Gabbard, who has contributed to the code for the LTAG-spinal API. We also thank Julia Hockenmaier, Mark Johnson, Yudong Liu, Mitch Marcus, Sameer Pradhan, Anoop Sarkar, and the CLRG and XTAG groups at Penn for helpful discussions.

Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  • Libin Shen (1)
  • Lucas Champollion (2)
  • Aravind K. Joshi (3)

  1. BBN Technologies, Cambridge, USA
  2. Department of Linguistics, University of Pennsylvania, Philadelphia, USA
  3. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA
