PageRank and Generic Entity Summarization for RDF Knowledge Bases

  • Dennis Diefenbach
  • Andreas Thalhammer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10843)


Ranking and entity summarization are operations that are tightly connected and recurrent in many different domains. Possible application fields include information retrieval, question answering, named entity disambiguation, co-reference resolution, and natural language generation. Still, the use of these techniques is limited because there are few accessible resources. PageRank computations are resource-intensive and entity summarization is a complex research field in itself.

We present two generic and highly re-usable resources for RDF knowledge bases: a component for PageRank-based ranking and a component for entity summarization. The two components, namely PageRankRDF and summaServer, are provided in the form of open source code along with example datasets and deployments. In addition, this work outlines the application of the components for PageRank-based RDF ranking and entity summarization in the question answering project WDAqua.


Keywords: RDF, Ranking, PageRank, Entity summarization, Question answering, Linked data



Parts of this work received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 642795, project: Answering Questions using Web Data (WDAqua).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Université de Lyon, CNRS UMR 5516 Laboratoire Hubert Curien, Lyon, France
  2. Roche Pharma Research and Early Development Informatics, Roche Innovation Center Basel, Basel, Switzerland
