DataStager: scalable data staging services for petascale applications

Cluster Computing

Abstract

Petascale machines face two known challenges: (1) the cost of I/O for high-performance applications can be substantial, especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible ‘DataStager’ framework for data staging, along with alternative services within it that jointly address (1) and (2). Data staging services, which move output data from compute nodes to staging or I/O nodes prior to storage, reduce the I/O overhead in applications’ total processing times, and explicit management of data staging reduces perturbation when extracting output data from a petascale machine’s compute partition. Experimental evaluations of DataStager on the Cray XT machine at Oak Ridge National Laboratory, using the GTC fusion modeling code and benchmarks running on 1000+ processors, establish both the necessity of intelligent data staging and the high performance of our approach.
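The general idea of staging output before it reaches storage can be illustrated with a small sketch. This is not DataStager's API or implementation, which the abstract does not describe. It is a single-process toy in which a background thread plays the role of a staging node, and the names `StagingArea`, `stage`, `run_simulation`, and `write_delay` are all hypothetical. The compute loop hands each checkpoint to the staging area and continues immediately, while the slow write happens off the compute path.

```python
import queue
import threading
import time


class StagingArea:
    """Hypothetical stand-in for a staging node: buffers output, writes it later."""

    def __init__(self, write_delay=0.05):
        self._pending = queue.Queue()
        self._write_delay = write_delay  # simulated cost of one storage write
        self.stored = []                 # stands in for the file system
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def stage(self, step, data):
        """Called by the compute side: hand off the buffer and return at once."""
        self._pending.put((step, data))

    def _drain(self):
        while True:
            item = self._pending.get()
            if item is None:               # sentinel: no more output will arrive
                break
            time.sleep(self._write_delay)  # slow write, off the compute path
            self.stored.append(item)

    def close(self):
        """Flush everything still buffered, then stop the worker."""
        self._pending.put(None)
        self._worker.join()


def run_simulation(steps, staging):
    """Toy compute loop emitting one checkpoint per step.

    Returns the total time the loop spent blocked on output calls.
    """
    blocked = 0.0
    for step in range(steps):
        checkpoint = [step] * 4          # placeholder for real output data
        start = time.perf_counter()
        staging.stage(step, checkpoint)
        blocked += time.perf_counter() - start
    return blocked


if __name__ == "__main__":
    staging = StagingArea(write_delay=0.05)
    blocked = run_simulation(5, staging)
    staging.close()
    print(f"compute loop blocked for {blocked:.6f} s; "
          f"{len(staging.stored)} checkpoints stored")
```

In this sketch the compute loop pays only for the hand-off, not for the write itself, which is the overhead reduction that item (1) refers to. It does not model item (2): on a real machine the staging side would run on separate nodes and would need to control when data is pulled from the compute partition.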

Corresponding author

Correspondence to Hasan Abbasi.

About this article

Cite this article

Abbasi, H., Wolf, M., Eisenhauer, G. et al. DataStager: scalable data staging services for petascale applications. Cluster Comput 13, 277–290 (2010). https://doi.org/10.1007/s10586-010-0135-6

  • DOI: https://doi.org/10.1007/s10586-010-0135-6
