Effects of Network Structure Improvement on Distributed RDF Querying

  • Liaquat Ali
  • Thomas Janson
  • Georg Lausen
  • Christian Schindelhauer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8059)

Abstract

In this paper, we analyze the performance of distributed RDF systems in a peer-to-peer (P2P) environment. We compare P2P networks based on Distributed Hash Tables (DHTs) with search-tree-based networks. Our simulations show a performance improvement by a factor of 2 when using search-tree-based networks. This is achieved by grouping related data, which tend to be accessed together in a query, in branches of the tree; e.g., the data of a university domain is in one branch. We observe a strongly unbalanced data distribution when indexing the RDF triples by subject, predicate, and object, which raises the question of scalability for huge data sets; e.g., the peer responsible for the predicate 'type' is overloaded. However, we show how to exploit this unbalanced data distribution and how to speed up the evaluation of queries dramatically with only a few additional routing links, so-called shortcuts, to these frequently occurring triple components. These routing shortcuts can be established with only a constant increase in the size of each peer's routing table. To cope with load-balancing hotspots, we propose a novel indexing scheme where triples are indexed 'six instead of three times', with only 23% data overhead in experiments and the possibility of more parallelism in query processing. For the experiments, we use the LUBM data set and benchmark queries.
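
The imbalance described in the abstract can be illustrated with a minimal sketch. It is not the system evaluated in the paper: the helper names (peer_for, index_triples), the SHA-1 modulo placement standing in for a DHT lookup, and the toy university-style triples are all assumptions made for illustration. Each triple is stored three times, once under its subject, its predicate, and its object, and the counters show how a frequent component such as the predicate 'type' concentrates load on a single key and therefore on a single peer.

```python
import hashlib
from collections import Counter


def peer_for(key: str, num_peers: int) -> int:
    """Map a key to a peer id with a stable hash (a stand-in for a DHT lookup)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_peers


def index_triples(triples, num_peers: int):
    """Store each triple three times: under its subject, predicate, and object.

    Returns (load per peer, number of stored triples per key).
    """
    peer_load = Counter()
    key_load = Counter()
    for s, p, o in triples:
        for key in (s, p, o):
            peer_load[peer_for(key, num_peers)] += 1
            key_load[key] += 1
    return peer_load, key_load


# Hypothetical toy data: 100 students, each with a type and a department.
triples = [(f"student{i}", "type", "Student") for i in range(100)]
triples += [(f"student{i}", "memberOf", f"dept{i % 10}") for i in range(100)]

peer_load, key_load = index_triples(triples, num_peers=16)

print("triples stored under key 'type':", key_load["type"])
print("triples stored under key 'student0':", key_load["student0"])
print("heaviest peer load:", max(peer_load.values()))
print("average peer load:", sum(peer_load.values()) / 16)
```

In this toy run, every triple with the predicate 'type' lands on the same peer regardless of how many peers exist, which is the scalability concern the abstract raises for subject, predicate, and object indexing.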

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Liaquat Ali (1)
  • Thomas Janson (1)
  • Georg Lausen (1)
  • Christian Schindelhauer (1)
  1. University of Freiburg, Germany
