Abstract
In this paper, we analyze the performance of distributed RDF systems in a peer-to-peer (P2P) environment. We compare P2P networks based on Distributed Hash Tables (DHTs) with search-tree based networks. Our simulations show a performance improvement by a factor of 2 when using search-tree based networks. This is achieved by grouping related data that tend to be accessed together in a query into branches of the tree, e.g. the data of a university domain is in one branch. We observe a strongly unbalanced data distribution when indexing the RDF triples by subject, predicate, and object, which raises the question of scalability for huge data sets, e.g. the peer responsible for the predicate ’type’ is overloaded. However, we show how to exploit this unbalanced data distribution and how to speed up query evaluation dramatically with only a few additional routing links, so-called shortcuts, to these frequently occurring triple components. These routing shortcuts can be established with only a constant increase in the size of each peer’s routing table. To cope with hotspots caused by unbalanced load, we propose a novel indexing scheme where triples are indexed ’six instead of three times’, with only 23% data overhead in our experiments and the possibility of more parallelism in query processing. For the experiments, we use the LUBM data set and its benchmark queries.
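The unbalanced distribution mentioned above can be illustrated with a minimal sketch. It is not the system evaluated in the paper: the toy triples, the network size `NUM_PEERS`, and the helper names `peer_for` and `index_triples` are assumptions made for illustration, and a plain hash modulo the number of peers stands in for a real DHT. Each triple is stored three times, once under its subject, its predicate, and its object, and the number of stored copies per key and per peer is counted.

```python
import hashlib
from collections import Counter

NUM_PEERS = 8  # hypothetical network size, chosen only for this example


def peer_for(key: str, num_peers: int = NUM_PEERS) -> int:
    """Map a key to a peer id by hashing (simplified stand-in for a DHT lookup)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_peers


def index_triples(triples):
    """Store every triple three times: under its subject, predicate, and object."""
    load_per_key = Counter()   # stored copies per index key
    load_per_peer = Counter()  # stored copies per responsible peer
    for s, p, o in triples:
        for key in (s, p, o):
            load_per_key[key] += 1
            load_per_peer[peer_for(key)] += 1
    return load_per_key, load_per_peer


# Toy data loosely modelled on a university domain (hypothetical, not LUBM).
triples = [
    ("student1", "type", "Student"),
    ("student2", "type", "Student"),
    ("prof1", "type", "Professor"),
    ("student1", "memberOf", "dept1"),
    ("student2", "memberOf", "dept1"),
    ("prof1", "worksFor", "dept2"),
]

per_key, per_peer = index_triples(triples)
heaviest_key = max(per_key, key=per_key.get)
print(heaviest_key, per_key[heaviest_key])
print("load on the peer responsible for 'type':", per_peer[peer_for("type")])
```

Even in this tiny data set the key ’type’ collects more copies than any other key, so the peer responsible for it carries at least that load. On realistic data, where almost every resource has a ’type’ triple, the imbalance grows with the size of the data set.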
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ali, L., Janson, T., Lausen, G., Schindelhauer, C. (2013). Effects of Network Structure Improvement on Distributed RDF Querying. In: Hameurlain, A., Rahayu, W., Taniar, D. (eds) Data Management in Cloud, Grid and P2P Systems. Globe 2013. Lecture Notes in Computer Science, vol 8059. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40053-7_6
DOI: https://doi.org/10.1007/978-3-642-40053-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40052-0
Online ISBN: 978-3-642-40053-7
eBook Packages: Computer Science (R0)