Research on Language and Computation, Volume 6, Issue 2, pp 205–238

An Empirical Characterisation of Response Types in German Association Norms

  • Sabine Schulte im Walde
  • Alissa Melinger
  • Michael Roth
  • Andrea Weber
Article

Abstract

This article presents a study to distinguish and quantify the various types of semantic associations provided by humans, to investigate their properties, and to discuss the impact that our analyses may have on NLP tasks. Specifically, we concentrate on two issues related to word properties and word relations: (1) We address the task of modelling word meaning by empirical features in data-intensive lexical semantics. Relying on large-scale corpus-based resources, we identify the contextual categories and functions that are activated by the associates and therefore contribute to the salient meaning components of individual words and across words. As a result, we discuss conceptual roles and present evidence for the usefulness of co-occurrence information in distributional descriptions. (2) We assume that semantic associates provide a means to investigate the range of semantic relations between words and contexts, and we provide insight into which types of semantic relations are treated as important or salient by the speakers of the language.

Keywords

Association norms; Semantic associates; Semantic relations; Data-intensive semantics; Distributional features; Corpus co-occurrence; Lexical resources

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  • Sabine Schulte im Walde (1)
  • Alissa Melinger (2)
  • Michael Roth (3)
  • Andrea Weber (4)

  1. Institute for Natural Language Processing, University of Stuttgart, Stuttgart, Germany
  2. School of Psychology, University of Dundee, Scotland, UK
  3. Computational Linguistics, Saarland University, Saarbrücken, Germany
  4. Psycholinguistics, Saarland University, Saarbrücken, Germany
