Correlation of Ontology-Based Semantic Similarity and Human Judgement for a Domain Specific Fashion Ontology

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9671)

Abstract

Evaluating semantic similarity is difficult because semantic similarity values are highly subjective. Several approaches compare automatically computed similarities with values assigned by humans, but they do so for general-purpose terms and for ontologies that contain general-purpose terms. However, ontologies should be as domain specific as possible to capture the maximal amount of semantic knowledge about a domain. To evaluate the semantic knowledge captured by a custom fashion ontology, we conducted a survey and crowdsourced similarity values for fashion terms. In this article, we compare the manually assigned similarities to those computed automatically with several ontology-based similarity measures. We show that our proposed feature-based measure achieves the highest correlation with human judgement, and we give some insight into why this kind of similarity measure most closely resembles human similarity assessments. To evaluate the influence of the ontology on the similarities, we compare the results achieved with our fashion ontology to similarity values computed using a fragment of DBpedia.
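The comparison described above can be illustrated with a minimal sketch. It is not the measure proposed in the paper: the taxonomy, the human ratings, and the function names below are all hypothetical toy values, the feature set of a term is assumed to be the term plus its ancestors, the similarity is a plain Jaccard ratio of shared to total features, and Pearson correlation is assumed as the correlation coefficient because the abstract does not name one.

```python
from math import sqrt

# Hypothetical toy taxonomy (child -> parent); NOT the paper's fashion ontology.
PARENT = {
    "clothing": None,
    "tops": "clothing",
    "bottoms": "clothing",
    "t-shirt": "tops",
    "blouse": "tops",
    "jeans": "bottoms",
    "skirt": "bottoms",
}


def features(term):
    """Return the term together with all of its ancestors as a feature set."""
    feats = set()
    while term is not None:
        feats.add(term)
        term = PARENT[term]
    return feats


def feature_similarity(a, b):
    """Ratio of shared to total features (Jaccard index), a value in [0, 1]."""
    fa, fb = features(a), features(b)
    return len(fa & fb) / len(fa | fb)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up human ratings in [0, 1], for illustration only (not survey data).
HUMAN = {
    ("t-shirt", "blouse"): 0.8,
    ("t-shirt", "jeans"): 0.3,
    ("jeans", "skirt"): 0.6,
    ("blouse", "skirt"): 0.4,
}

# Compute the ontology-based similarity for each rated pair, then correlate.
computed = [feature_similarity(a, b) for a, b in HUMAN]
r = pearson(computed, list(HUMAN.values()))
print(f"correlation with human ratings: {r:.3f}")
```

Swapping `feature_similarity` for an edge-counting or information-content measure while keeping the same rated pairs would allow the resulting correlations to be compared side by side, which is the kind of evaluation the abstract describes.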

Keywords

Feature-based similarity; Semantic similarity; Fashion ontology

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. University of Kassel, Kassel, Germany
