Syntactic Chunking Across Different Corpora

  • Weiqun Xu
  • Jean Carletta
  • Johanna Moore
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4299)

Abstract

Syntactic chunking has been a well-defined and well-studied task since its introduction as the CoNLL shared task in 2000. Although further effort has gone into improving chunking performance, the experimental data has, with few exceptions, been restricted to (part of) the Wall Street Journal data adopted in the shared task. It remains an open question how these successful chunking technologies can be extended to other data, which may differ in genre/domain and/or amount of annotation. In this paper we first train chunkers with three classifiers on three different data sets and test on four data sets. We also vary the size of the training data systematically to show the data requirements of chunkers. It turns out that there is no significant difference between these state-of-the-art classifiers; that training on plentiful data from the same corpus (Switchboard) yields results comparable to Wall Street Journal chunkers even when the underlying material is spoken; and that the results obtainable from a large amount of unmatched training data can be matched using a very modest amount of matched training data.
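
To make the task concrete: chunking, as defined by the CoNLL-2000 shared task, is usually cast as word-level tagging with B-/I-/O labels over base phrases. The sketch below is only an illustration of that setup, not the authors' chunkers, classifiers, or feature sets; it assumes NLTK and its packaged copy of the CoNLL-2000 (Wall Street Journal) chunking data, and it uses a deliberately simple unigram POS-to-chunk-tag model.

    import nltk
    from nltk.corpus import conll2000

    # Minimal chunker: map each POS tag to its most frequent chunk tag
    # (B-NP / I-NP / O) seen in training, following the standard NLTK recipe.
    class UnigramChunker(nltk.ChunkParserI):
        def __init__(self, train_sents):
            train_data = [
                [(pos, chunk) for _word, pos, chunk in nltk.chunk.tree2conlltags(sent)]
                for sent in train_sents
            ]
            self.tagger = nltk.UnigramTagger(train_data)

        def parse(self, sentence):
            # sentence is a list of (word, POS) pairs
            pos_tags = [pos for _word, pos in sentence]
            chunk_tags = [chunk for _pos, chunk in self.tagger.tag(pos_tags)]
            conlltags = [(word, pos, chunk)
                         for (word, pos), chunk in zip(sentence, chunk_tags)]
            return nltk.chunk.conlltags2tree(conlltags)

    if __name__ == "__main__":
        nltk.download("conll2000")  # WSJ sections 15-18 (train) and 20 (test)
        train_sents = conll2000.chunked_sents("train.txt", chunk_types=["NP"])
        test_sents = conll2000.chunked_sents("test.txt", chunk_types=["NP"])
        chunker = UnigramChunker(train_sents)
        print(chunker.evaluate(test_sents))  # chunk-level precision/recall/F1

State-of-the-art chunkers of the kind compared in the paper replace this unigram model with discriminative sequence classifiers over richer features, but the tagging-style formulation and chunk-level evaluation stay the same.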


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Weiqun Xu (1)
  • Jean Carletta (1)
  • Johanna Moore (1)

  1. HCRC and ICCS, School of Informatics, University of Edinburgh
