Data Sharing in DHT Based P2P Systems

  • Claudia Roncancio
  • María del Pilar Villamil
  • Cyril Labbé
  • Patricia Serrano-Alvarado

Abstract

The evolution of peer-to-peer (P2P) systems has triggered the building of large-scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the “extreme” characteristics of P2P infrastructures: massive distribution, high churn rate, no global control, and potentially untrusted participants. This article focuses on declarative querying support, query optimization, and data privacy in a major class of P2P systems: those based on Distributed Hash Tables (P2P DHT). The usual approaches and algorithms used by classic distributed systems and databases to provide data privacy and querying services are not well suited to P2P DHT systems. A considerable amount of work was required to adapt them to the new challenges such systems present. This article describes the most important solutions found. It also identifies important future research trends in data management in P2P DHT systems.

Keywords

DHT, P2P Systems, Data sharing, Querying in P2P systems, Data privacy

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Claudia Roncancio (1)
  • María del Pilar Villamil (2)
  • Cyril Labbé (1)
  • Patricia Serrano-Alvarado (3)

  1. University of Grenoble, France
  2. University of Los Andes, Bogotá, Colombia
  3. University of Nantes, France
