Using Ontology-Based Data Summarization to Develop Semantics-Aware Recommender Systems

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10843)


In the current information-centric era, recommender systems are gaining momentum as tools able to assist users in daily decision-making tasks. They may exploit users’ past behavior, combined with side/contextual information, to suggest new items or pieces of knowledge the users might be interested in. Within the recommendation process, Linked Data have already been proposed as a valuable source of information to enhance the predictive power of recommender systems, not only in terms of accuracy but also in terms of diversity and novelty of results. In this direction, one of the main open issues in using Linked Data to feed a recommendation engine is feature selection: how to select only the most relevant subset of the original Linked Data, thus avoiding both useless processing of data and the so-called “curse of dimensionality” problem. In this paper, we show how ontology-based (linked) data summarization can drive the selection of properties/features useful to a recommender system. In particular, we compare a fully automated feature selection method based on ontology-based data summaries with more classical ones, and we evaluate the performance of these methods in terms of the accuracy and aggregate diversity of a recommender system exploiting the top-k selected features. We set up an experimental testbed relying on datasets from different knowledge domains. Results show the feasibility of a feature selection process driven by ontology-based data summaries for Linked Data-enabled recommender systems.
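To make the two notions in the abstract concrete, the following is a minimal Python sketch of top-k property selection from a data summary and of aggregate diversity. It is an illustration under stated assumptions, not the paper's method: the abstract does not say how properties are ranked, so ranking by total occurrence count is assumed here, and the summary format, the function names `top_k_properties` and `aggregate_diversity`, and all data values are hypothetical. Aggregate diversity is taken in its common sense, the number of distinct items recommended across all users.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# Assumed summary format (hypothetical): a pattern is a
# (subject type, property, object type) triple mapped to an occurrence count.
Pattern = Tuple[str, str, str]


def top_k_properties(pattern_counts: Dict[Pattern, int], k: int) -> List[str]:
    """Rank properties by total occurrence count over all patterns; keep the top k.

    Frequency-based ranking is an assumption made for this sketch.
    """
    totals: Counter = Counter()
    for (_subject_type, prop, _object_type), count in pattern_counts.items():
        totals[prop] += count
    # Sort by descending count, then by name so ties are resolved deterministically.
    ranked = sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))
    return [prop for prop, _ in ranked[:k]]


def aggregate_diversity(recommendations: Dict[str, Iterable[str]]) -> int:
    """Count the distinct items that appear in any user's recommendation list."""
    distinct: Set[str] = set()
    for items in recommendations.values():
        distinct.update(items)
    return len(distinct)


# Hypothetical toy data, invented purely for illustration.
toy_summary: Dict[Pattern, int] = {
    ("Film", "director", "Person"): 900,
    ("Film", "starring", "Person"): 2500,
    ("Film", "runtime", "Literal"): 400,
    ("Book", "author", "Person"): 1200,
    ("TelevisionShow", "starring", "Person"): 300,
}
toy_recs = {"u1": ["i1", "i2"], "u2": ["i2", "i3"]}

selected = top_k_properties(toy_summary, k=2)
diversity = aggregate_diversity(toy_recs)
```

In a pipeline of this kind, only the selected properties would be used to build item feature vectors, so the value of `k` directly controls the dimensionality of the item representation.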



This research has been supported in part by EU H2020 projects EW-Shopp - Grant n. 732590, and EuBusinessGraph - Grant n. 732003.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Polytechnic University of Bari, Bari, Italy
  2. University of Milano-Bicocca, Milan, Italy
  3. University of Bonn, Bonn, Germany
