Enabling High Data Throughput in Desktop Grids through Decentralized Data and Metadata Management: The BlobSeer Approach

  • Bogdan Nicolae
  • Gabriel Antoniu
  • Luc Bougé
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5704)

Abstract

Whereas traditional Desktop Grids rely on centralized servers for data management, some recent progress has been made toward distributing large input data using peer-to-peer (P2P) protocols and Content Distribution Networks (CDN). We go a step further and propose a generic yet efficient data storage service that enables the use of Desktop Grids for applications with high output data requirements, where the access grain and the access patterns may be random. Our solution builds on a blob management service that enables a large number of concurrent clients to efficiently read, write, and append huge data that are fragmented and distributed at a large scale. Scalability under heavy concurrency is achieved thanks to an original metadata scheme using a distributed segment tree built on top of a Distributed Hash Table (DHT). The proposed approach has been implemented, and its benefits have been successfully demonstrated with our BlobSeer prototype on the Grid’5000 testbed.
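
As a rough illustration of the idea of a segment tree stored over a key-value layer, the following minimal sketch may help. It is not the BlobSeer implementation: the names `dht`, `build`, and `lookup` are hypothetical, a plain dictionary stands in for the DHT, pages are identified by index, and the number of pages is assumed to be a power of two.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a DHT: a plain dict keyed by (offset, size).
NodeKey = Tuple[int, int]
dht: Dict[NodeKey, Optional[str]] = {}


def build(offset: int, size: int, providers: List[str]) -> None:
    """Store one tree node per (offset, size) range; a leaf covers one page.

    Assumes `size` is a power of two (an assumption of this sketch).
    """
    if size == 1:
        # Leaf: record which provider holds this page.
        dht[(offset, size)] = providers[offset]
        return
    # Inner node: its children are implied by halving the key's range.
    dht[(offset, size)] = None
    half = size // 2
    build(offset, half, providers)
    build(offset + half, half, providers)


def lookup(root: NodeKey, lo: int, hi: int) -> List[Tuple[int, Optional[str]]]:
    """Return (page index, provider) for every page in [lo, hi)."""
    found: List[Tuple[int, Optional[str]]] = []
    stack = [root]
    while stack:
        offset, size = stack.pop()
        if offset >= hi or offset + size <= lo:
            continue  # this subtree does not intersect the requested range
        if size == 1:
            found.append((offset, dht[(offset, size)]))
            continue
        half = size // 2
        # Push the right child first so the left child is visited first.
        stack.append((offset + half, half))
        stack.append((offset, half))
    return found
```

Because every node is addressed by its own key, nodes can be spread across many machines, and a reader only fetches the nodes whose ranges intersect the pages it needs.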

Keywords

Distributed Hash Table, Data Provider, Desktop Grid, Physical Node, Page Size
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Bogdan Nicolae (1)
  • Gabriel Antoniu (2)
  • Luc Bougé (3)
  1. University of Rennes 1, IRISA, Rennes, France
  2. INRIA, Centre Rennes - Bretagne Atlantique, IRISA, Rennes, France
  3. ENS Cachan/Brittany, IRISA, France
