On Collocations

Chapter
Part of the Text, Speech and Language Technology book series (TLTB, volume 44)

Abstract

The primary goal of this chapter is to set the stage for the practical investigations described in the remainder of this book by providing an overall picture of the existing theoretical descriptions of the collocation phenomenon. Because the various approaches to collocation (motivated by linguistic, lexicographic, pedagogical or computational considerations) have often made use of vague and inconsistent terminology, we begin by providing an analytic review of the definitions of collocation available in the literature. We then identify the most salient and uncontroversial features of the phenomenon, which will serve as the basis for the discussion throughout the rest of the book. After reviewing the definitions, we present a brief discussion of the main theoretical linguistic frameworks that have addressed collocation phenomena. We then consider characterizations proposed by various researchers in terms of semantic compositionality and morpho-syntactic behaviour. We conclude by giving the definition of collocation that we believe most adequately captures this phenomenon for our present purposes.

Keywords

Lexical Item; Word Combination; Text Cohesion; Lexical Function; Lexical Combination
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Department of Linguistics (Office L706), University of Geneva, Geneva, Switzerland
