Multiword Expressions: A Pain in the Neck for NLP

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2276))

Abstract

Multiword expressions are a key problem for the development of large-scale, linguistically sound natural language processing technology. This paper surveys the problem and some currently available analytic techniques. The various kinds of multiword expressions should be analyzed in distinct ways, including listing “words with spaces”, hierarchically organized lexicons, restricted combinatoric rules, lexical selection, “idiomatic constructions” and simple statistical affinity. An adequate comprehensive analysis of multiword expressions must employ both symbolic and statistical techniques.
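As a rough illustration of what "simple statistical affinity" could mean in practice, the sketch below scores adjacent word pairs by pointwise mutual information (PMI). PMI is only one possible affinity measure and is an assumption here, not a method the abstract names; the function name `pmi_scores` and the toy corpus are likewise invented for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return the PMI (in bits) of every adjacent word pair in `tokens`.

    Probabilities are plain relative frequencies with no smoothing, so
    this is only meaningful as a toy demonstration.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus, invented for illustration only.
tokens = (
    "strong tea and strong coffee but powerful computers "
    "and powerful engines and strong tea"
).split()

scores = pmi_scores(tokens)
```

On this toy corpus the pair ("strong", "tea") receives a higher score than ("and", "strong"), even though both pairs occur twice, because "and" is frequent in general. A score of this kind is purely statistical; the symbolic techniques listed in the abstract would have to be modelled separately.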

The research reported here was conducted in part under the auspices of the LinGO project, an international collaboration centered around the LKB system and related resources (see http://lingo.stanford.edu). This research was supported in part by the Research Collaboration between NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation and CSLI, Stanford University. We would like to thank Emily Bender and Tom Wasow for their contributions to our thinking. However, we alone are responsible for any errors that remain.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sag, I.A., Baldwin, T., Bond, F., Copestake, A., Flickinger, D. (2002). Multiword Expressions: A Pain in the Neck for NLP. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_1

  • DOI: https://doi.org/10.1007/3-540-45715-1_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43219-7

  • Online ISBN: 978-3-540-45715-2

  • eBook Packages: Springer Book Archive
