Semantic Measures: How Similar? How Related?

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9671)

Abstract

Semantic measures (SMs) fall into two main types: similarity and relatedness. Datasets likewise fall into two main types: those intended for evaluating similarity and those intended for evaluating relatedness. Although the two notions are clearly distinct, they are close enough to generate some misconceptions.

Is there confusion between similarity and relatedness within the semantic measures community, among both the designers of SMs and the creators of benchmarks? This is the question that the research presented in this paper tries to answer. We surveyed both SMs and datasets and ran a cross evaluation of those measures against those datasets. The results show that measures differ in how consistent they are with datasets of the same type. This research enabled us not only to conclude that there is indeed some confusion, but also to pinpoint the SMs and benchmarks that are least consistent with their intended type.
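As an illustration only, a cross evaluation of this kind could be sketched as below. The abstract does not say which consistency score was used, so Spearman rank correlation is an assumption here. The names `cross_evaluate`, `toy_similarity`, `similarity_set` and `relatedness_set`, and all the ratings, are hypothetical and made up for the example; they are not taken from the paper or from any real benchmark.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


def cross_evaluate(measures, datasets):
    """Score every measure against every dataset.

    measures: name -> function(word_a, word_b) -> float
    datasets: name -> list of (word_a, word_b, gold_rating)
    """
    table = {}
    for m_name, measure in measures.items():
        for d_name, pairs in datasets.items():
            predicted = [measure(a, b) for a, b, _ in pairs]
            gold = [rating for _, _, rating in pairs]
            table[(m_name, d_name)] = spearman(predicted, gold)
    return table


# Toy, made-up scores and ratings (assumption, for illustration only).
TOY_SCORES = {("car", "automobile"): 0.9, ("car", "road"): 0.3, ("cup", "coffee"): 0.2}


def toy_similarity(a, b):
    return TOY_SCORES[(a, b)]


datasets = {
    "similarity_set": [("car", "automobile", 9.0), ("car", "road", 3.0), ("cup", "coffee", 2.5)],
    "relatedness_set": [("car", "automobile", 9.0), ("car", "road", 8.0), ("cup", "coffee", 8.5)],
}

table = cross_evaluate({"toy_similarity": toy_similarity}, datasets)
for key, score in table.items():
    print(key, round(score, 3))
```

In this toy setting the similarity-style measure ranks the pairs exactly as the similarity-style ratings do, and only partly as the relatedness-style ratings do. A gap of that sort between datasets is the kind of signal a cross evaluation is meant to surface.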

Keywords

Semantic similarity; Semantic relatedness; Semantic measures; Linked data

Notes

Acknowledgments

This work is partially financed by the ERDF European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme and by the FCT within project POCI-01-0145-FEDER-006961 and project “NORTE-01-0145-FEDER-000020” financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement and through the European Regional Development Fund (ERDF).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. CRACS and INESC-Porto LA, Faculty of Sciences, University of Porto, Porto, Portugal
