Comparing Vocabulary Term Recommendations Using Association Rules and Learning to Rank: A User Study

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9678)

Abstract

When modeling Linked Open Data (LOD), reusing appropriate vocabulary terms to represent the data is difficult, because there are many vocabularies to choose from. Vocabulary term recommendations could alleviate this situation. We present a user study evaluating a vocabulary term recommendation service that is based on how other data providers have used RDF classes and properties in the LOD cloud. Our study compares the machine learning technique Learning to Rank (L2R), the classical data mining approach Association Rule mining (AR), and a baseline that does not provide any recommendations. Results show that participants using AR-based recommendations needed less time and less effort to model the data, and ultimately produced models of better quality.
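
To make the association-rule idea concrete, the following is a minimal sketch of how co-occurrence of vocabulary terms across data providers could be turned into ranked recommendations. It is not the authors' implementation: the toy term sets, the function name `recommend`, and the support and confidence thresholds are all assumptions chosen for illustration, and only single-term rules of the form {used term} -> {candidate term} are considered.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set lists the vocabulary terms one data
# provider used together. Real input would be mined from the LOD cloud.
TERM_SETS = [
    {"foaf:Person", "foaf:name", "foaf:knows"},
    {"foaf:Person", "foaf:name", "dcterms:creator"},
    {"foaf:Person", "foaf:name"},
    {"bibo:Document", "dcterms:creator", "dcterms:title"},
]


def recommend(used_terms, term_sets, min_support=0.25, min_confidence=0.5):
    """Rank candidate terms by the confidence of the rule {used} -> {candidate}.

    support    = fraction of term sets containing both terms
    confidence = (sets containing both terms) / (sets containing the used term)
    """
    n = len(term_sets)
    single = Counter()  # how many term sets contain each term
    pair = Counter()    # how many term sets contain each unordered pair
    for terms in term_sets:
        single.update(terms)
        pair.update(frozenset(p) for p in combinations(sorted(terms), 2))

    scores = {}
    for used in used_terms:
        if single[used] == 0:
            continue  # term never observed, so no rule can be derived from it
        for candidate in single:
            if candidate in used_terms:
                continue
            together = pair[frozenset((used, candidate))]
            support = together / n
            confidence = together / single[used]
            if support >= min_support and confidence >= min_confidence:
                # Keep the strongest rule that leads to this candidate.
                scores[candidate] = max(scores.get(candidate, 0.0), confidence)

    # Highest confidence first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(recommend({"foaf:Person"}, TERM_SETS))
```

With the toy data above, a modeler who has already chosen `foaf:Person` would be offered `foaf:name`, because the two terms co-occur in every term set that contains `foaf:Person`; the other candidates fall below the assumed confidence threshold.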

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany
  2. Information Sciences Institute, University of Southern California, Los Angeles, USA
  3. ZBW – Leibniz Information Center for Economics, Kiel University, Kiel, Germany
