Ranking Entities in the Age of Two Webs, an Application to Semantic Snippets

  • Mazen Alsarem
  • Pierre-Edouard Portier
  • Sylvie Calabretto
  • Harald Kosch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9088)


Advances in the Linked Open Data (LOD) initiative are giving rise to a more structured Web of data, in which a few datasets (e.g., DBpedia) act as hubs connecting many others. These advances have also made possible new Web services for entity detection in plain text (e.g., DBpedia Spotlight), enabling applications that combine the Web of documents with the Web of data. To ease the emergence of such applications, we propose LDRANK, a query-biased algorithm for ranking Web of data resources that have associated textual data. The algorithm combines link analysis with dimensionality reduction. Using crowdsourcing, we build a publicly available and reusable dataset for evaluating the query-biased ranking of Web of data resources detected in Web pages. We show that, on this dataset, LDRANK outperforms the state of the art. Finally, we use the algorithm to construct semantic snippets, whose usefulness we evaluate with a crowdsourcing-based approach.
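As a rough illustration of how link analysis and dimensionality reduction can be combined into a query-biased ranking, the sketch below derives a teleportation prior from a truncated singular value decomposition of a term-by-entity matrix and feeds it to a PageRank-style power iteration. It is not the LDRANK algorithm itself, whose details are not given here. The function names, the cosine-similarity prior, the damping factor of 0.85, and the toy matrices are all assumptions made for the example.

```python
import numpy as np


def query_prior(term_entity, query_vec, rank=2):
    """Hypothetical helper: query-biased prior over entities via truncated SVD.

    term_entity: (n_terms, n_entities) weights of terms in each entity's text.
    query_vec:   (n_terms,) weights of terms in the query.
    """
    U, s, Vt = np.linalg.svd(term_entity, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :rank], s[:rank], Vt[:rank, :]
    # Project entities and query onto the top-`rank` left singular vectors.
    entities = (s_k[:, None] * Vt_k).T      # (n_entities, rank)
    query = U_k.T @ query_vec               # (rank,)
    # Cosine similarity in the latent space, clipped at zero.
    norms = np.linalg.norm(entities, axis=1) * np.linalg.norm(query)
    sims = np.clip(entities @ query / np.where(norms > 0, norms, 1.0), 0.0, None)
    total = sims.sum()
    n = term_entity.shape[1]
    # Fall back to an equiprobable distribution when nothing matches the query.
    return sims / total if total > 0 else np.full(n, 1.0 / n)


def biased_pagerank(adjacency, prior, damping=0.85, tol=1e-10, max_iter=200):
    """Hypothetical helper: PageRank whose random jumps follow `prior`.

    adjacency[i, j] > 0 means entity i links to entity j.
    """
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1)
    safe_degree = np.where(out_degree > 0, out_degree, 1.0)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Score held by entities without outgoing links is spread by the prior.
        dangling = rank[out_degree == 0].sum()
        flow = adjacency.T @ (rank / safe_degree)
        new_rank = damping * (flow + dangling * prior) + (1.0 - damping) * prior
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


# Toy data (invented for illustration): 4 terms, 3 entities, a small link graph.
term_entity = np.array([[2.0, 0.0, 1.0],
                        [0.0, 3.0, 0.0],
                        [1.0, 0.0, 2.0],
                        [0.0, 1.0, 0.0]])
query_vec = np.array([1.0, 0.0, 1.0, 0.0])
links = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

prior = query_prior(term_entity, query_vec, rank=2)
scores = biased_pagerank(links, prior)
print(np.argsort(-scores))  # entity indices, highest score first
```

Biasing the jump distribution rather than the link weights keeps the graph structure untouched, so the same adjacency matrix can be reused across queries and only the prior needs recomputing.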


Keywords: Singular Value Decomposition; Ranking Algorithm; Link Open Data; PageRank Algorithm; Equiprobable Distribution



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Mazen Alsarem (1)
  • Pierre-Edouard Portier (1)
  • Sylvie Calabretto (1)
  • Harald Kosch (2)

  1. Université de Lyon, CNRS INSA de Lyon, LIRIS, UMR5205, Lyon, France
  2. Universität Passau, Passau, Germany
