ExpLOD: Summary-Based Exploration of Interlinking and RDF Usage in the Linked Open Data Cloud

  • Shahan Khatchadourian
  • Mariano P. Consens
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6089)

Abstract

Publishing interlinked RDF datasets on the web, as links between data items identified by dereferenceable URIs, raises a number of issues. A key challenge is to understand the data, the schema, and the interlinks that are actually used both within and across linked datasets. Understanding actual RDF usage is critical in the increasingly common situations where terms from different vocabularies are mixed. This paper describes ExpLOD, a tool that supports exploring summaries of RDF usage and interlinking among datasets from the Linked Open Data cloud. ExpLOD’s summaries are based on a novel mechanism that combines text labels and bisimulation contractions. The labels assigned to RDF graphs are hierarchical, enabling summarization at different granularities. The bisimulation contractions are applied to subgraphs defined via queries, allowing summarization of arbitrarily large or small graph neighbourhoods. ExpLOD can also generate SPARQL queries from a summary. Experimental results, using several collections from the Linked Open Data cloud, compare the two summary creation approaches implemented by ExpLOD (graph-based vs. SPARQL-based).
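To make the idea of a bisimulation contraction over a labeled graph concrete, the following is a minimal sketch and not ExpLOD's implementation. It assumes flat (non-hierarchical) text labels, considers outgoing edges only, and uses a naive refinement loop rather than an efficient partition refinement algorithm. The function name `bisimulation_blocks` and the toy triples and labels are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def bisimulation_blocks(triples, labels):
    """Group nodes that share a label and have equivalent outgoing structure.

    Assumption: forward bisimulation over (subject, predicate, object)
    triples, with one flat text label per node. Naive refinement loop.
    """
    nodes = set(labels)
    out_edges = defaultdict(list)
    for s, p, o in triples:
        nodes.update((s, o))
        out_edges[s].append((p, o))

    # Initial partition: nodes are grouped by their text label alone.
    block = {n: labels.get(n, "") for n in nodes}

    while True:
        # A node's signature is its current block plus the set of
        # (predicate, block of target) pairs over its outgoing edges.
        sig = {
            n: (block[n], frozenset((p, block[o]) for p, o in out_edges[n]))
            for n in nodes
        }
        # Signatures can only split blocks, so an unchanged block count
        # means the partition is stable.
        if len(set(sig.values())) == len(set(block.values())):
            break
        block = sig

    groups = defaultdict(set)
    for n, b in block.items():
        groups[b].add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data, not taken from any Linked Open Data collection.
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "dave"),
    ("erin", "knows", "bob"),
    ("erin", "worksAt", "acme"),
]
labels = {
    "alice": "Person", "bob": "Person", "carol": "Person",
    "dave": "Person", "erin": "Person", "acme": "Org",
}

blocks = bisimulation_blocks(triples, labels)
for b in blocks:
    print(b)
```

Each resulting block would correspond to one node in a summary graph: here `alice` and `carol` collapse together because both are persons who only know a person with no outgoing edges, while `erin` stays separate because of the additional `worksAt` edge.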

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Shahan Khatchadourian (1)
  • Mariano P. Consens (1)
  1. University of Toronto
