A Survey of Structured P2P Systems for RDF Data Storage and Retrieval

  • Imen Filali
  • Francesco Bongiovanni
  • Fabrice Huet
  • Françoise Baude
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6790)

Abstract

The Semantic Web makes it possible to model, create and query resources found on the Web. Realizing the full potential of its technologies at Internet scale requires infrastructures that can cope with scalability challenges while supporting expressive queries. The attractive features of the Peer-to-Peer (P2P) communication model, and more specifically of structured P2P systems, such as decentralization, scalability and fault tolerance, make it a natural candidate for dealing with these challenges. Consequently, combining the Semantic Web and the P2P model is a promising way to harness the strengths of both technologies and arrive at a scalable infrastructure for RDF data storage and retrieval. This survey details the research works that adopt this combination and gives insight into how RDF data is handled at the indexing and querying levels. We also present some works that adopt the publish/subscribe paradigm for processing RDF data in order to support long-standing queries.

Keywords

Semantic Web; Peer-to-Peer (P2P); Distributed Hash Tables (DHTs); Resource Description Framework (RDF); Distributed RDF repository; RDF data indexing; RDF query processing; publish/subscribe (pub/sub); subscription processing

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Imen Filali (1)
  • Francesco Bongiovanni (1)
  • Fabrice Huet (1)
  • Françoise Baude (1)

  1. INRIA Sophia Antipolis CNRS, I3S, University of Nice Sophia Antipolis, France
