
Parser Evaluation

Using a Grammatical Relation Annotation Scheme

Chapter in: Treebanks

Part of the book series: Text, Speech and Language Technology (TLTB, volume 20)

Abstract

We describe a recently developed corpus annotation scheme for evaluating parsers that avoids some of the shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.

This work was carried out while the second author was at the University of Sussex.
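The evaluation approach sketched in the abstract scores a parser by comparing the grammatical relations it recovers against those in the annotated corpus, rather than by matching constituent brackets. A minimal illustration of this idea, computing precision, recall, and F1 over sets of (relation, head, dependent) tuples, is given below; the relation names and the scoring function are illustrative assumptions, not the chapter's actual scheme or software.

```python
# Hedged sketch of GR-based parser evaluation: each grammatical relation
# is a (relation_type, head, dependent) tuple, and accuracy is measured
# by set overlap between gold-standard and parser-produced relations.

def gr_scores(gold, test):
    """Return (precision, recall, F1) for two collections of GR tuples."""
    gold, test = set(gold), set(test)
    if not gold or not test:
        return 0.0, 0.0, 0.0
    correct = len(gold & test)          # relations the parser got right
    p = correct / len(test)             # precision: correct / proposed
    r = correct / len(gold)             # recall: correct / gold
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Illustrative gold and parser analyses of "Paul intends to leave IBM"
# (relation labels such as ncsubj/dobj/xcomp are assumed for the example).
gold = {("ncsubj", "intends", "Paul"),
        ("xcomp", "intends", "leave"),
        ("dobj", "leave", "IBM")}
test = {("ncsubj", "intends", "Paul"),
        ("dobj", "leave", "IBM"),
        ("iobj", "intends", "to")}

p, r, f = gr_scores(gold, test)
```

Because relations are unordered tuples rather than spans, this style of measure is less sensitive to differences in constituent bracketing conventions between parsers, which is one motivation the abstract gives for the scheme.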




Copyright information

© 2003 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Carroll, J., Minnen, G., Briscoe, T. (2003). Parser Evaluation. In: Abeillé, A. (eds) Treebanks. Text, Speech and Language Technology, vol 20. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0201-1_17


  • DOI: https://doi.org/10.1007/978-94-010-0201-1_17

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-1335-5

  • Online ISBN: 978-94-010-0201-1

  • eBook Packages: Springer Book Archive
