Verb Class Discovery from Rich Syntactic Data

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4919)

Abstract

Previous research has shown that syntactic features are the most informative features in automatic verb classification. We investigate their optimal characteristics by comparing a range of feature sets extracted from data where the proportion of verbal arguments and adjuncts is controlled. The data are obtained from different versions of VALEX [1], a large subcategorization frame (SCF) lexicon for English which was acquired automatically from several corpora and the Web. We evaluate the feature sets thoroughly using four supervised classifiers and one unsupervised method. The best performing feature set includes rich syntactic information about both arguments and adjuncts of verbs. When combined with our best performing classifier (a novel Gaussian classifier), it yields a promising accuracy of 64.2% in classifying 204 verbs into 17 Levin (1993) classes. We discuss the impact of our results on the state of the art and propose avenues for future work.
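To make the setup in the abstract concrete, the following is a minimal sketch of classifying verbs from SCF relative frequencies. It is not the paper's novel Gaussian classifier, whose details are not given here; it substitutes a generic diagonal-covariance Gaussian classifier with uniform class priors. The frame names, class labels, counts, and the variance floor are all hypothetical values chosen for illustration.

```python
import math
from collections import defaultdict

# Hypothetical frame inventory; a real SCF lexicon has far more frame types.
FRAMES = ["NP", "NP_PP", "INTRANS"]


def scf_distribution(counts, frames=FRAMES):
    """Turn raw SCF counts for one verb into relative frequencies."""
    total = sum(counts.get(f, 0) for f in frames)
    if total == 0:
        return [0.0] * len(frames)
    return [counts.get(f, 0) / total for f in frames]


def fit_gaussians(vectors, labels, var_floor=1e-3):
    """Estimate a per-class mean and diagonal variance for each feature.

    var_floor is an assumed smoothing constant that avoids zero variances.
    """
    by_class = defaultdict(list)
    for vec, label in zip(vectors, labels):
        by_class[label].append(vec)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[label] = (means, variances)
    return model


def log_likelihood(vec, means, variances):
    """Log density of vec under an axis-aligned Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(vec, means, variances)
    )


def classify(model, vec):
    """Pick the class with the highest likelihood (uniform priors assumed)."""
    return max(model, key=lambda label: log_likelihood(vec, *model[label]))


# Toy training data: two made-up classes with two verbs each.
train_counts = [
    ({"NP": 8, "NP_PP": 2}, "class_A"),
    ({"NP": 7, "NP_PP": 2, "INTRANS": 1}, "class_A"),
    ({"INTRANS": 9, "NP": 1}, "class_B"),
    ({"INTRANS": 8, "NP_PP": 2}, "class_B"),
]
vectors = [scf_distribution(c) for c, _ in train_counts]
labels = [label for _, label in train_counts]
model = fit_gaussians(vectors, labels)

print(classify(model, scf_distribution({"NP": 9, "NP_PP": 1})))
print(classify(model, scf_distribution({"INTRANS": 10})))
```

In this sketch each verb is a point in frame-frequency space, so richer feature sets (for example, frames that also record adjuncts) would simply add dimensions to the vectors.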

References

  1. Korhonen, A., Krymolowski, Y., Briscoe, T.: A large subcategorization lexicon for natural language processing applications. In: Proceedings of LREC (2006)

  2. Merlo, P., Stevenson, S.: Automatic verb classification based on statistical distributions of argument structure. Computational Linguistics 27, 373–408 (2001)

  3. Korhonen, A., Krymolowski, Y., Collier, N.: Automatic classification of verbs in biomedical texts. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the ACL, pp. 345–352 (2006)

  4. Schulte im Walde, S.: Experiments on the automatic induction of German semantic verb classes. Computational Linguistics 32, 159–194 (2006)

  5. Joanis, E., Stevenson, S., James, D.: A general feature space for automatic verb classification. Natural Language Engineering (forthcoming, 2007)

  6. Dorr, B.J.: Large-scale dictionary construction for foreign language tutoring and interlingual machine translation. Machine Translation 12, 271–322 (1997)

  7. Prescher, D., Riezler, S., Rooth, M.: Using a probabilistic class-based lexicon for lexical ambiguity resolution. In: 18th International Conference on Computational Linguistics, Saarbrücken, Germany, pp. 649–655 (2000)

  8. Swier, R., Stevenson, S.: Unsupervised semantic role labelling. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, pp. 95–102 (2004)

  9. Dang, H.T.: Investigations into the Role of Lexical Semantics in Word Sense Disambiguation. PhD thesis, CIS, University of Pennsylvania (2004)

  10. Shi, L., Mihalcea, R.: Putting pieces together: Combining FrameNet, VerbNet and WordNet for robust semantic parsing. In: Proceedings of the Sixth International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico (2005)

  11. Jackendoff, R.: Semantic Structures. MIT Press, Cambridge (1990)

  12. Levin, B.: English Verb Classes and Alternations. University of Chicago Press, Chicago (1993)

  13. Miller, G.A.: WordNet: An on-line lexical database. International Journal of Lexicography 3, 235–312 (1990)

  14. Schulte im Walde, S.: Clustering verbs semantically according to their alternation behaviour. In: Proceedings of COLING, Saarbrücken, Germany, pp. 747–753 (2000)

  15. Kipper, K., Dang, H.T., Palmer, M.: Class-based construction of a verb lexicon. In: AAAI/IAAI, pp. 691–696 (2000)

  16. Briscoe, E.J., Carroll, J.: Automatic extraction of subcategorization from corpora. In: Proceedings of the 5th ACL Conference on Applied Natural Language Processing, Washington DC, pp. 356–363 (1997)

  17. Briscoe, E.J., Carroll, J.: Robust accurate statistical annotation of general text. In: Proceedings of the 3rd LREC, Las Palmas, Gran Canaria, pp. 1499–1504 (2002)

  18. Boguraev, B., Briscoe, T.: Large lexicons for natural language processing: utilising the grammar coding system of LDOCE. Computational Linguistics 13, 203–218 (1987)

  19. Grishman, R., Macleod, C., Meyers, A.: Comlex syntax: building a computational lexicon. In: Proceedings of the 15th Conference on Computational Linguistics, Morristown, NJ, USA, Association for Computational Linguistics, pp. 268–272 (1994)

  20. Vapnik, V.N.: The nature of statistical learning theory. Springer, New York (1995)

  21. Chang, C., Lin, J.: LIBSVM: a library for support vector machines (2001)

  22. Hsu, W., Chang, C., Lin, J.: A practical guide to support vector classification (2003)

  23. Pietra, S.D., Pietra, J.D., Lafferty, J.D.: Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 380–393 (1997)

  24. Zhang, L.: Maximum Entropy Modeling Toolkit for Python and C++ (2004)

  25. Puzicha, J., Hofmann, T., Buhmann, J.M.: A theory of proximity-based clustering: structure detection by optimization. Pattern Recognition 33, 617–634 (2000)

  26. Ando, R.K., Zhang, T.: A high-performance semi-supervised learning method for text chunking. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 1–9 (2005)

Editor information

Alexander Gelbukh

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sun, L., Korhonen, A., Krymolowski, Y. (2008). Verb Class Discovery from Rich Syntactic Data. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_2

  • DOI: https://doi.org/10.1007/978-3-540-78135-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78134-9

  • Online ISBN: 978-3-540-78135-6

  • eBook Packages: Computer Science, Computer Science (R0)
