Optimising Linked Data Queries in the Presence of Co-reference

  • Xin Wang
  • Thanassis Tiropanis
  • Hugh C. Davis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8465)


Due to the distributed nature of Linked Data, many resources are referred to by more than one URI. This phenomenon, known as co-reference, increases the probability of missing implicit, semantically related results when querying Linked Data. The probability of co-reference increases further for distributed SPARQL queries that span a larger set of distributed datasets. Addressing co-reference in Linked Data queries increases the complexity of query processing, and it also requires changes in how dataset statistics are taken into account. We investigate these two challenges of addressing co-reference in distributed SPARQL queries, and propose two methods to improve query efficiency: 1) a model named Virtual Graph, which transforms a query with co-reference into a normal query with pre-existing bindings; 2) an algorithm named Ψ, which intensively exploits parallelism and dynamically optimises queries using runtime statistics. We deploy both methods in a distributed engine called LHD-d. To evaluate LHD-d, we investigate the distribution of co-reference in the real world and, based on the findings, simulate an experimental RDF network. In this environment we demonstrate the advantages of LHD-d for distributed SPARQL queries in the presence of co-reference.
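The abstract does not spell out how the Virtual Graph model works, so the following is only a minimal sketch of the general idea it describes: turning a triple pattern that mentions a co-referent URI into an ordinary pattern plus pre-existing bindings. The function names (`coreference_classes`, `rewrite_pattern`), the `?coref` variable naming, and the example URIs are all hypothetical and are not taken from the paper or from LHD-d.

```python
from collections import defaultdict


def coreference_classes(same_as_pairs):
    """Group URIs into equivalence classes from (uri, uri) co-reference pairs.

    Assumption: co-reference is treated as an equivalence relation
    (symmetric and transitive), computed here with a small union-find.
    """
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for u in list(parent):
        groups[find(u)].add(u)
    # Map every URI to the full set of URIs it co-refers with.
    return {u: frozenset(members) for members in groups.values() for u in members}


def rewrite_pattern(pattern, classes):
    """Replace constants that have co-referent URIs with fresh variables.

    Returns the rewritten triple pattern and a dict of pre-existing
    bindings (variable -> sorted list of allowed URIs).
    """
    rewritten, bindings = [], {}
    for position, term in enumerate(pattern):
        is_constant = not term.startswith("?")
        if is_constant and len(classes.get(term, ())) > 1:
            var = f"?coref{position}"  # hypothetical naming scheme
            bindings[var] = sorted(classes[term])
            rewritten.append(var)
        else:
            rewritten.append(term)
    return tuple(rewritten), bindings


# Made-up URIs standing in for three datasets that describe the same resource.
pairs = [("a:Soton", "b:Southampton"), ("b:Southampton", "c:SOU")]
classes = coreference_classes(pairs)
rewritten, bindings = rewrite_pattern(("a:Soton", "ex:population", "?pop"), classes)
print(rewritten)
print(bindings)
```

In this sketch the rewritten pattern contains no co-referent constants, so a conventional query planner could process it unchanged, with the bindings supplied up front (for instance through a SPARQL 1.1 `VALUES` clause). Whether LHD-d does anything similar is not stated in the abstract.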


Keywords: #eswc2014Wang, SPARQL, Linked Data, distributed query, dynamic optimisation, co-reference





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Xin Wang (1)
  • Thanassis Tiropanis (1)
  • Hugh C. Davis (1)
  1. Electronics and Computer Science, University of Southampton, UK
