Data Driven Infrastructure and Policy Selection to Enhance Scientific Applications in Grid

  • Jose M. Perez
  • Felix Garcia
  • Jesus Carretero
  • Jose D. Garcia
  • Soledad Escolar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3458)


Most work on Grids has taken an approach in which the system is a mixture of clusters and other resources brought together with the help of some services. This solution is simplistic, however, because it tries to grow from the cluster perspective. We think that the Grid model should be different and closer to the peer-to-peer (p2p) model, especially in the I/O field, where the network and the heterogeneity of the infrastructure play an important role. In this paper we present a model for organizing the DataGrid infrastructure that uses concepts such as data phases and a p2p approach in order to select adequate working policies. These concepts allow a clearer definition of our DataGrid architecture than a mere mixture of resources. We present a model relying on these concepts, their implementation in an I/O middleware for Grids called GridExpand, and an evaluation of some of the concepts presented.


Access Pattern, Storage Server, Storage Node, Storage Resource, Policy Selection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jose M. Perez
    • 1
  • Felix Garcia
    • 1
  • Jesus Carretero
    • 1
  • Jose D. Garcia
    • 1
  • Soledad Escolar
    • 1
  1. Computer Architecture Group, Department of Computer Science, University Carlos III de Madrid, Spain
