Multiword Expressions: A Pain in the Neck for NLP

  • Ivan A. Sag
  • Timothy Baldwin
  • Francis Bond
  • Ann Copestake
  • Dan Flickinger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2276)


Multiword expressions are a key problem for the development of large-scale, linguistically sound natural language processing technology. This paper surveys the problem and some currently available analytic techniques. The various kinds of multiword expressions should be analyzed in distinct ways, including listing “words with spaces”, hierarchically organized lexicons, restricted combinatoric rules, lexical selection, “idiomatic constructions” and simple statistical affinity. An adequate comprehensive analysis of multiword expressions must employ both symbolic and statistical techniques.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Ivan A. Sag (1)
  • Timothy Baldwin (1)
  • Francis Bond (2)
  • Ann Copestake (3)
  • Dan Flickinger (1)

  1. CSLI, Ventura Hall, Stanford University, Stanford, USA
  2. NTT Communication Science Labs., Kyoto, Japan
  3. Computer Laboratory, University of Cambridge, Cambridge, UK
