
Discretionary Caching for I/O on Clusters

Published in: Cluster Computing

Abstract

I/O bottlenecks are already a problem in many large-scale applications that manipulate huge datasets. This problem is expected to worsen as applications grow larger and I/O subsystem performance lags behind improvements in processor and memory speed. At the same time, off-the-shelf clusters of workstations are becoming a popular platform for demanding applications because of their cost-effectiveness and widespread deployment. Caching I/O blocks is one effective way of alleviating disk latencies, and there can be multiple levels of caching on a cluster of workstations.

Previous studies have shown the benefits of caching, whether a cache local to a particular node or a global cache shared across the cluster, for certain applications. However, we show that while caching is useful in some situations, it can hurt performance if we are not careful about what to cache and when to bypass the cache. This paper presents compilation techniques and runtime support to address this problem. These techniques are implemented and evaluated on an experimental Linux/Pentium cluster running a parallel file system. Our results with a diverse set of scientific and commercial applications demonstrate the benefits of a discretionary approach to caching in cluster I/O subsystems, saving as much as 48% of overall execution time in some applications compared with indiscriminately caching everything.
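The selective-caching idea in the abstract can be pictured with a minimal sketch: the compiler or runtime attaches a per-request hint indicating whether a block is expected to be reused, and requests without expected reuse bypass the cache so a streaming scan does not evict a hot working set. The class name, `cache_hint` parameter, and LRU policy below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of discretionary (hint-driven) block caching.
# Names and the LRU policy are illustrative, not from the paper.
from collections import OrderedDict

class HintedBlockCache:
    """LRU block cache that individual requests may bypass.

    `disk` is any dict-like mapping block id -> data (a stand-in for
    real disk reads); `capacity` is the number of cached blocks.
    """

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        self.blocks = OrderedDict()   # block id -> data, in LRU order
        self.hits = 0
        self.misses = 0

    def read(self, block_id, cache_hint=True):
        if block_id in self.blocks:
            # Hit: refresh this block's LRU position.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        self.misses += 1
        data = self.disk[block_id]    # simulated disk access
        if cache_hint:
            # Cache only when the hint predicts reuse.
            self.blocks[block_id] = data
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)   # evict LRU block
        return data
```

With this sketch, a long sequential scan issued with `cache_hint=False` leaves the small reused working set resident, whereas caching everything would have evicted it.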



Author information

Corresponding author

Correspondence to Murali Vilayannur.

Additional information

Parts of this paper have appeared in the Proceedings of the 3rd IEEE/ACM Symposium on Cluster Computing and the Grid (CCGrid'03). This paper is an extension of these prior results, and includes a more extensive performance evaluation.

Murali Vilayannur is a Ph.D. student in the Department of Computer Science and Engineering at The Pennsylvania State University. His research interests are in High-Performance Parallel I/O, File Systems, Virtual Memory Algorithms and Operating Systems.

Anand Sivasubramaniam received his B.Tech. in Computer Science from the Indian Institute of Technology, Madras, in 1989, and the M.S. and Ph.D. degrees in Computer Science from the Georgia Institute of Technology in 1991 and 1995, respectively. He has been on the faculty at The Pennsylvania State University since Fall 1995, where he is currently an Associate Professor. Anand's research interests are in computer architecture, operating systems, performance evaluation, and applications for both high-performance computer systems and embedded systems. His research has been funded by NSF through several grants, including the CAREER award, and by industry, including IBM, Microsoft, and Unisys Corp. He has several publications in leading journals and conferences, and is on the editorial boards of IEEE Transactions on Computers and IEEE Transactions on Parallel and Distributed Systems. He is a recipient of the 2002 IBM Faculty Award. Anand is a member of the IEEE, the IEEE Computer Society, and the ACM.

Mahmut Kandemir received the B.Sc. and M.Sc. degrees in control and computer engineering from Istanbul Technical University, Istanbul, Turkey, in 1988 and 1992, respectively. He received the Ph.D. in electrical engineering and computer science from Syracuse University, Syracuse, New York, in 1999. He has been an assistant professor in the Computer Science and Engineering Department at the Pennsylvania State University since August 1999. His main research interests are optimizing compilers, I/O-intensive applications, and power-aware computing. He is a member of the IEEE and the ACM.

Rajeev Thakur is a Computer Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. He received a B.E. from the University of Bombay, India, in 1990, and an M.S. and a Ph.D. from Syracuse University in 1992 and 1995, respectively, all in computer engineering. His research interests are in the area of high-performance computing in general, and high-performance networking and I/O in particular. He was a member of the MPI Forum and participated actively in the definition of the I/O part of the MPI-2 standard. He is the author of ROMIO, a widely used, portable implementation of MPI-IO. He is also a co-author of the book “Using MPI-2: Advanced Features of the Message Passing Interface,” published by MIT Press.

Robert Ross received his Ph.D. in Computer Engineering from Clemson University in 2000. He is now an Assistant Scientist in the Mathematics and Computer Science Division at Argonne National Laboratory. His research interests are in message passing and storage systems for high performance computing environments. He is the primary author and lead developer for the Parallel Virtual File System (PVFS), a parallel file system for Linux clusters. Current projects include the ROMIO MPI-IO implementation, PVFS, PVFS2, and the MPICH2 implementation of the MPI message passing interface.


Cite this article

Vilayannur, M., Sivasubramaniam, A., Kandemir, M. et al. Discretionary Caching for I/O on Clusters. Cluster Comput 9, 29–44 (2006). https://doi.org/10.1007/s10586-006-4895-y
