Abstract
We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the SAT college entrance exam. A verbal analogy has the form A:B::C:D, meaning “A is to B as C is to D”; for example, mason:stone::carpenter:wood. SAT analogy questions provide a word pair, A:B, and the problem is to select the most analogous word pair, C:D, from a set of five choices. The VSM algorithm correctly answers 47% of a collection of 374 college-level analogy questions (random guessing would yield 20% correct; the average college-bound senior high school student answers about 57% correctly). We motivate this research by applying it to a difficult problem in natural language processing, determining semantic relations in noun-modifier pairs. The problem is to classify a noun-modifier pair, such as “laser printer”, according to the semantic relation between the noun (printer) and the modifier (laser). We use a supervised nearest-neighbour algorithm that assigns a class to a given noun-modifier pair by finding the most analogous noun-modifier pair in the training data. With 30 classes of semantic relations, on a collection of 600 labeled noun-modifier pairs, the learning algorithm attains an F value of 26.5% (random guessing: 3.3%). With 5 classes of semantic relations, the F value is 43.2% (random: 20%). The performance is state-of-the-art for both verbal analogies and noun-modifier relations.
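As a rough illustration of the selection step described above, the following Python sketch picks the choice pair whose vector has the highest cosine with the stem pair's vector, which is the standard similarity measure in the Vector Space Model. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `cosine` and `most_analogous`, the four-element vectors, and the two distractor pairs are all hypothetical, and the abstract does not say how the vectors are built from text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_analogous(stem_vector, candidates):
    """Return the label of the candidate whose vector has the highest cosine with the stem."""
    return max(candidates, key=lambda label: cosine(stem_vector, candidates[label]))


# Made-up feature vectors for illustration only (assumption: one vector per word pair).
stem = [8.0, 1.0, 0.0, 3.0]  # stands in for mason:stone
choices = {
    "teacher:chalk": [1.0, 6.0, 2.0, 0.0],   # hypothetical distractor
    "carpenter:wood": [7.0, 2.0, 0.0, 4.0],
    "soldier:gun": [2.0, 0.0, 5.0, 1.0],     # hypothetical distractor
}

best = most_analogous(stem, choices)
print(best)
```

The same function could serve the nearest-neighbour classifier mentioned in the abstract: if `candidates` held the labeled training pairs, the class of the returned pair would be assigned to the query pair.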
Editors: Dan Roth and Pascale Fung
Cite this article
Turney, P.D., Littman, M.L. Corpus-based Learning of Analogies and Semantic Relations. Mach Learn 60, 251–278 (2005). https://doi.org/10.1007/s10994-005-0913-1
DOI: https://doi.org/10.1007/s10994-005-0913-1