Discovering Wikipedia Conventions Using DBpedia Properties

  • Diego Torres
  • Hala Skaf-Molli
  • Pascal Molli
  • Alicia Díaz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9507)


Wikipedia is a public and universal encyclopedia where contributors edit articles collaboratively. Wikipedia infoboxes and categories have been used by semantic technologies to create DBpedia, a knowledge base that semantically describes Wikipedia content and makes it publicly available on the Web. The semantic descriptions in DBpedia can be exploited not only for data retrieval, but also for identifying missing navigational paths in Wikipedia. Existing approaches have demonstrated that supplying these missing navigational paths is useful for the Wikipedia community, but their injection has to respect Wikipedia conventions. In this paper, we present BlueFinder, a collaborative recommender system approach that enhances Wikipedia content with DBpedia properties. BlueFinder implements a supervised learning algorithm to predict the Wikipedia conventions used to represent similar connected pairs of articles; these predictions are used to recommend the best convention(s) to connect disconnected articles. We report on an exhaustive evaluation that shows three notable results: (1) evidence of a relevant information gap between DBpedia and Wikipedia; (2) the behavior and accuracy of the BlueFinder algorithm; and (3) differences in Wikipedia conventions according to the specificity of the articles involved. BlueFinder helps Wikipedia contributors add missing relations between articles and, consequently, improves Wikipedia content.
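
The recommendation idea described above can be illustrated with a small sketch. The abstract only states that a supervised learning algorithm predicts conventions from similar connected pairs; the nearest-neighbour voting, the Jaccard similarity over type sets, the function names, the path notation, and the toy data below are all assumptions made for illustration, not the published BlueFinder implementation.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def recommend_conventions(target_types, connected_pairs, k=3, top_n=2):
    """Recommend conventions for a disconnected pair of articles.

    target_types: set of (hypothetical) type labels describing the pair.
    connected_pairs: list of (types, conventions) for pairs that are
        already connected in Wikipedia.
    """
    # Rank the connected pairs by similarity to the target pair.
    ranked = sorted(
        connected_pairs,
        key=lambda pair: jaccard(target_types, pair[0]),
        reverse=True,
    )
    # Count the conventions used by the k most similar pairs.
    votes = Counter()
    for _types, conventions in ranked[:k]:
        votes.update(conventions)
    # Return the most frequently used conventions first.
    return [conv for conv, _count in votes.most_common(top_n)]


# Toy, made-up training data: type labels and path notation are hypothetical.
connected = [
    ({"City", "Person", "Writer"}, ["#from / Cat:People_from_#from / #to"]),
    ({"City", "Person", "Athlete"}, ["#from / Cat:People_from_#from / #to"]),
    ({"Country", "Person"}, ["#from / #to"]),
    ({"Band", "Album"}, ["#from / Cat:#from_albums / #to"]),
]

print(recommend_conventions({"City", "Person", "Musician"}, connected))
```

In this sketch the category-based path is ranked first because the two most similar connected pairs both use it; a different similarity measure or value of k could change the ranking.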


Keywords: Semantic Web, Social Web, DBpedia, Wikipedia, Collaborative recommender systems



This work is supported by the French National Research Agency (ANR) through the KolFlow project (code: ANR-10-CONTINT-025), part of the CONTINT research program.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Diego Torres (1)
  • Hala Skaf-Molli (2)
  • Pascal Molli (2)
  • Alicia Díaz (1)
  1. LIFIA, Fac. Informática, Universidad Nacional de La Plata, La Plata, Argentina
  2. Nantes University, Nantes, France
