A New Data Sieving Approach for High Performance I/O

  • Conference paper
Future Information Technology, Application, and Service

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 164)

Abstract

Many scientific computing applications and engineering simulations exhibit noncontiguous I/O access patterns. Data sieving is an important technique for improving the performance of noncontiguous I/O accesses: it combines small, noncontiguous requests into one large, contiguous request. It has proven effective even though it may access more data than was requested. In this study, we propose a new data sieving approach called Performance Model Directed Data Sieving, or PMD data sieving for short. It improves on the existing data sieving approach in two respects: (1) it dynamically determines when data sieving is beneficial; and (2) it dynamically determines how to perform data sieving when it is beneficial. Experimental results verify that it improves the performance of the existing data sieving approach and reduces memory consumption. Given the importance of supporting noncontiguous accesses effectively and of reducing memory pressure in large-scale systems, the proposed PMD data sieving approach holds promise and can have an impact on high performance I/O systems.
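
The idea of data sieving, and of deciding when it pays off, can be illustrated with a minimal Python sketch. This is not the performance model of the paper, which the abstract does not describe. The cost functions below assume a simple "fixed latency per I/O call plus bytes divided by bandwidth" model, and the names `sieve_read`, `direct_cost`, `sieve_cost`, and `should_sieve`, as well as the default latency and bandwidth values, are hypothetical.

```python
import io


def sieve_read(f, requests):
    """Serve noncontiguous (offset, length) reads with one contiguous read.

    The single read covers the span from the lowest requested offset to
    the highest requested end, so the holes between requests are read too.
    """
    start = min(off for off, _ in requests)
    end = max(off + length for off, length in requests)
    f.seek(start)
    buf = f.read(end - start)  # one large read, holes included
    # Copy the requested pieces out of the sieving buffer.
    return [buf[off - start: off - start + length] for off, length in requests]


def direct_cost(requests, latency_s, bandwidth_bps):
    """Assumed cost of one I/O call per request (illustrative model only)."""
    return sum(latency_s + length / bandwidth_bps for _, length in requests)


def sieve_cost(requests, latency_s, bandwidth_bps):
    """Assumed cost of one I/O call covering the whole span, holes included."""
    start = min(off for off, _ in requests)
    end = max(off + length for off, length in requests)
    return latency_s + (end - start) / bandwidth_bps


def should_sieve(requests, latency_s=1e-3, bandwidth_bps=100e6):
    """Sieve only when the assumed model predicts it is cheaper."""
    return (sieve_cost(requests, latency_s, bandwidth_bps)
            < direct_cost(requests, latency_s, bandwidth_bps))


if __name__ == "__main__":
    f = io.BytesIO(bytes(range(64)))
    reqs = [(0, 4), (16, 4), (32, 4)]
    if should_sieve(reqs):
        print(sieve_read(f, reqs))
```

Under this assumed model, requests separated by small holes favor sieving because the saved per-call latency outweighs the extra bytes transferred, while widely separated requests do not. The size of the sieving buffer (`end - start`) also shows why sieving unconditionally can increase memory consumption.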



Corresponding author

Correspondence to Yong Chen.


Copyright information

© 2012 Springer Science+Business Media Dordrecht

About this paper

Cite this paper

Lu, Y., Chen, Y., Amritkar, P., Thakur, R., Zhuang, Y. (2012). A New Data Sieving Approach for High Performance I/O. In: Park, J.J. (Jong Hyuk), Leung, V., Wang, C.-L., Shon, T. (eds) Future Information Technology, Application, and Service. Lecture Notes in Electrical Engineering, vol 164. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-4516-2_12


  • DOI: https://doi.org/10.1007/978-94-007-4516-2_12


  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-007-4515-5

  • Online ISBN: 978-94-007-4516-2

  • eBook Packages: Engineering, Engineering (R0)
