Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations

  • Conference paper
Natural Language Processing – IJCNLP 2005 (IJCNLP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3651)

Abstract

Recognition of Multi-word Expressions (MWEs) and their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing the non-compositional Verb-Noun (V-N) collocations. In the task of ranking the V-N collocations based on their relative compositionality, we show that the correlation between the ranks computed by the classifier and human ranking is significantly better than the correlation between ranking of individual features and human ranking. We also show that the properties ‘Distributed frequency of object’ (as defined in [27]) and ‘Nearest Mutual Information’ (as adapted from [18]) contribute greatly to the recognition of the non-compositional MWEs of the V-N type and to the ranking of the V-N collocations based on their relative compositionality.
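
The ranking evaluation described in the abstract compares a system's ordering of V-N collocations with a human ordering. The abstract does not name the correlation statistic, so the sketch below assumes Spearman's rank correlation, a common choice for comparing two rankings. The function names and all scores are hypothetical illustrations, not data or code from the paper.

```python
def ranks(scores):
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one ranking is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for six V-N collocations (not taken from the paper).
human_scores = [6, 5, 4, 3, 2, 1]
classifier_scores = [5.5, 6.1, 3.9, 3.0, 1.2, 2.0]
single_feature_scores = [2.0, 6.0, 1.0, 5.0, 3.0, 4.0]

rho_classifier = spearman(human_scores, classifier_scores)
rho_feature = spearman(human_scores, single_feature_scores)
print(f"classifier vs human: {rho_classifier:.3f}")
print(f"single feature vs human: {rho_feature:.3f}")
```

In this made-up example the classifier's ranking agrees with the human ranking more closely than the single feature's does, which is the kind of comparison the abstract reports.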

References

  1. Abeille, A.: Light verb constructions and extraction out of NP in a tree adjoining grammar. In: Papers of the 24th Regional Meeting of the Chicago Linguistics Society (1988)

  2. Akimoto, M.: A Study of Verbo-nominal Structures in English. Shinozaki Shorin (1989)

  3. Baldwin, T., Bannard, C., Tanaka, T., Widdows, D.: An Empirical Model of Multiword Expression. In: Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment (2003)

  4. Bannard, C., Baldwin, T., Lascarides, A.: A Statistical Approach to the Semantics of Verb-Particles. In: Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment (2003)

  5. Bikel, D.M.: A Distributional Analysis of a Lexicalized Statistical Parsing Model. In: Proceedings of EMNLP (2004)

  6. Becker, J.D.: The Phrasal Lexicon. In: Theoretical Issues of NLP, Workshop in CL, Linguistics, Psychology and AI, Cambridge, MA (1975)

  7. Breidt, E.: Extraction of V-N-Collocations from Text Corpora: A Feasibility Study for German. In: CoRR-1996 (1995)

  8. Church, K., Gale, W., Hanks, P., Hindle, D.: Parsing, word associations and typical predicate-argument relations. In: Current Issues in Parsing Technology. Kluwer Academic, Dordrecht (1991)

  9. Church, K., Hanks, P.: Word Association Norms, Mutual Information, and Lexicography. In: Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics 1990 (1989)

  10. Dunning, T.: Accurate Methods for the Statistics of Surprise and Coincidence. In: Computational Linguistics - 1993 (1993)

  11. Evert, S., Krenn, B.: Methods for the Qualitative Evaluation of Lexical Association Measures. In: Proceedings of the ACL - 2001 (2001)

  12. Fillmore, C.: An extremist approach to multi-word expressions. A talk given at IRCS, University of Pennsylvania, 2003 (2003)

  13. Fontenelle, T., Bruls, W., Thomas, L., Vanallemeersch, T., Jansen, J.: Survey of collocation extraction tools. Deliverable D-1a, MLAP-Project 93-19 DECIDE, University of Liege, Belgium (1994)

  14. Diaz-Galiano, M.C., Martin-Valdivia, M.T., Martinez-Santiago, F., Urena-Lopez, L.A.: Multi-word Expressions Recognition with the LVQ Algorithm. In: Proceedings of Methodologies and Evaluation of Multiword Unit in Real-world Applications, LREC 2004 (2004)

  15. Joachims, T.: Making Large-Scale SVM Learning Practical. In: Advances in Kernel Methods - Support Vector Learning (1999)

  16. Joachims, T.: Optimizing Search Engines Using Clickthrough Data. In: Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD). ACM, New York (2002)

  17. Kilgarriff, A., Rosenzweig, J.: Framework and Results for English Senseval. Computers and the Humanities 2000 (2000)

  18. Lin, D.: Automatic Identification of non-compositional phrases. In: Proceedings of ACL-1999, College Park, USA (1999)

  19. McCarthy, D., Keller, B., Carroll, J.: Detecting a Continuum of Compositionality in Phrasal Verbs. In: Proceedings of the ACL-2003 Workshop on Multi-word Expressions: Analysis, Acquisition and Treatment 2003 (2003)

  20. Mitchell, T.: Instance-Based Learning. In: Machine Learning. McGraw-Hill Series in Computer Science, New York (1997)

  21. Moore, A.W., Lee, M.S.: Proceedings of the 11th International Conference on Machine Learning (1994)

  22. Nunberg, G., Sag, I.A., Wasow, T.: Idioms. Language 1994 (1994)

  23. Sag, I.A., Baldwin, T., Bond, F., Copestake, A., Flickinger, D.: Multi-word Expressions: A Pain in the Neck for NLP. In: Proceedings of CICLing 2002 (2002)

  24. Schone, P., Jurafsky, D.: Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem? In: Proceedings of EMNLP 2001 (2001)

  25. Schuler, W., Joshi, A.K.: Relevance of tree rewriting systems for multi-word expressions (2005) (to be published)

  26. Smadja, F.: Retrieving Collocations from Text: Xtract. In: Computational Linguistics - 1993 (1993)

  27. Tapanainen, P., Piitulainen, J., Jarvinen, T.: Idiomatic object usage and support verbs. In: 36th Annual Meeting of the Association for Computational Linguistics (1998)

  28. Venkatapathy, S., Joshi, A.K.: Recognition of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations. In: Proceedings of the International Conference on Natural Language Processing 2004 (2004)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Venkatapathy, S., Joshi, A.K. (2005). Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_49

  • DOI: https://doi.org/10.1007/11562214_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29172-5

  • Online ISBN: 978-3-540-31724-1

  • eBook Packages: Computer Science, Computer Science (R0)
