Machine Learning

Volume 60, Issue 1–3, pp 251–278

Corpus-based Learning of Analogies and Semantic Relations

Original Article

Abstract

We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the SAT college entrance exam. A verbal analogy has the form A:B::C:D, meaning “A is to B as C is to D”; for example, mason:stone::carpenter:wood. SAT analogy questions provide a word pair, A:B, and the problem is to select the most analogous word pair, C:D, from a set of five choices. The VSM algorithm correctly answers 47% of a collection of 374 college-level analogy questions (random guessing would yield 20% correct; the average college-bound senior high school student answers about 57% correctly). We motivate this research by applying it to a difficult problem in natural language processing, determining semantic relations in noun-modifier pairs. The problem is to classify a noun-modifier pair, such as “laser printer”, according to the semantic relation between the noun (printer) and the modifier (laser). We use a supervised nearest-neighbour algorithm that assigns a class to a given noun-modifier pair by finding the most analogous noun-modifier pair in the training data. With 30 classes of semantic relations, on a collection of 600 labeled noun-modifier pairs, the learning algorithm attains an F value of 26.5% (random guessing: 3.3%). With 5 classes of semantic relations, the F value is 43.2% (random: 20%). The performance is state-of-the-art for both verbal analogies and noun-modifier relations.
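The selection step described in the abstract can be illustrated with a minimal sketch, assuming each word pair has already been mapped to a numeric vector and that pairs are compared by cosine similarity (one of the listed keywords). The abstract does not say how the vectors are built, so the toy vectors, the answer choices other than carpenter:wood, and the function names `cosine` and `most_analogous` are hypothetical and for illustration only.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_analogous(stem_vector, choice_vectors):
    """Return the key of the choice whose vector is closest to the stem."""
    return max(
        choice_vectors,
        key=lambda pair: cosine(stem_vector, choice_vectors[pair]),
    )


# Made-up toy vectors (assumption): in practice each vector would be
# derived from corpus statistics for the word pair.
stem = [3.0, 0.0, 1.0]  # stands in for mason:stone
choices = {
    "carpenter:wood": [2.0, 0.0, 1.0],
    "teacher:chalk": [0.0, 2.0, 1.0],
    "soldier:gun": [1.0, 1.0, 0.0],
}

best = most_analogous(stem, choices)
print(best)
```

The same function could serve the nearest-neighbour setting: if the candidate dictionary holds labeled training pairs, the class of the returned pair would be assigned to the query pair.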

Keywords

analogy, metaphor, semantic relations, vector space model, cosine similarity, noun-modifier pairs

Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  1. Institute for Information Technology, National Research Council Canada, Ottawa, Canada
  2. Department of Computer Science, Rutgers University, Piscataway, USA
