Journal of Grid Computing, Volume 11, Issue 1, pp 103–127

Data Placement in P2P Data Grids Considering the Availability, Security, Access Performance and Load Balancing

  • Manghui Tu
  • Hui Ma
  • Liangliang Xiao
  • I.-Ling Yen
  • Farokh Bastani
  • Dianxiang Xu


Data dependability is an important issue in data Grids. Replication schemes have been widely used in distributed systems to ensure availability and improve access performance. Alternatively, data partitioning schemes (secret sharing, erasure coding with encryption) can provide availability and, in addition, confidentiality protection. In peer-to-peer data Grids, such confidentiality protection is essential because the nodes hosting the data shares may be untrustworthy or compromised. However, the difficulty of generating new shares and the potential security concerns of share reallocation make a pure data partitioning scheme hard to adapt to dynamic user access patterns. In this paper, we combine replication and data partitioning to assure data availability, confidentiality, load balance, and efficient access for data Grid applications. Data are partitioned and the shares are dispersed; the shares may then be replicated to achieve better performance, load balance, and availability. We develop models for assessing confidentiality, availability, load balance, and communication cost, and use them as the metrics that guide placement decisions. Because these goals conflict, we model the placement decision as a multi-objective problem and use a genetic algorithm to find solutions that approximate the Pareto-optimal placements.
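As a rough illustration of what "Pareto optimal" means for conflicting placement goals, the sketch below filters a handful of candidate placements down to the non-dominated ones. It is not the paper's genetic algorithm or its metric models. The candidate names, the four-value metric tuples, and the functions `dominates` and `pareto_front` are hypothetical, and every objective is assumed to be expressed so that smaller is better.

```python
from typing import Dict, List, Tuple

# Hypothetical metric vector per candidate placement, all "smaller is better":
# (unavailability, confidentiality risk, load imbalance, communication cost).
# The numbers are made up for illustration only.
Metrics = Tuple[float, float, float, float]


def dominates(a: Metrics, b: Metrics) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Dict[str, Metrics]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, metrics in candidates.items()
        if not any(
            dominates(other, metrics)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


candidates = {
    "A": (0.01, 0.20, 0.30, 5.0),
    "B": (0.02, 0.10, 0.25, 6.0),
    "C": (0.02, 0.25, 0.35, 6.5),  # worse than A on every objective
    "D": (0.05, 0.05, 0.40, 4.0),
}
front = pareto_front(candidates)
print(front)
```

Here candidate C is dropped because A beats it on all four objectives, while A, B, and D each win on at least one objective and so remain as trade-off options. A multi-objective genetic algorithm would search a far larger space of placements, but it ranks candidates with the same dominance relation.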


Keywords: P2P data Grids; Availability; Security assurance; Load balance; Access performance





Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Manghui Tu (1)
  • Hui Ma (2)
  • Liangliang Xiao (3)
  • I.-Ling Yen (3)
  • Farokh Bastani (3)
  • Dianxiang Xu (4)

  1. Department of CITG, Purdue University Calumet, Hammond, USA
  2. Cisco Systems, Inc., Austin, USA
  3. Department of Computer Science, University of Texas at Dallas, Dallas, USA
  4. College of Business and Information Systems, Dakota State University, Madison, USA
