NWCache: Optimizing disk accesses via an optical network/write cache hybrid
In this paper, we propose a simple extension to the I/O architecture of scalable multiprocessors that significantly optimizes page swap-outs. Specifically, we propose an optical ring network for I/O operations that not only transfers swapped-out pages between the local memories and the disks but also acts as a system-wide write cache. To evaluate our proposal, we use detailed execution-driven simulations of several out-of-core parallel applications running on an 8-node scalable multiprocessor. Our results demonstrate that the NWCache provides consistent performance improvements, coming mostly from faster page swap-outs, victim caching, and reduced contention. Based on these results, our main conclusion is that the NWCache is highly efficient for most out-of-core parallel applications.
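As a rough illustration of the write-cache and victim-caching behavior described in the abstract, the following Python sketch models the ring as a bounded buffer of swapped-out pages. It is a toy under stated assumptions, not the paper's design or simulator: the class name `ToyNetworkWriteCache`, the `capacity_pages` parameter, the oldest-first drain policy, and the dictionary standing in for the disks are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class ToyNetworkWriteCache:
    """Toy model: a bounded buffer standing in for pages held on the ring.

    Assumptions (not taken from the paper): fixed capacity in pages,
    oldest-first draining to disk, and a dict that plays the role of the disks.
    """

    def __init__(self, capacity_pages):
        if capacity_pages <= 0:
            raise ValueError("capacity_pages must be positive")
        self.capacity_pages = capacity_pages
        self.in_flight = OrderedDict()  # page_id -> data, oldest entry first
        self.disk = {}                  # page_id -> data already written to disk
        self.victim_hits = 0            # faults served from the buffer
        self.disk_reads = 0             # faults that had to go to disk

    def swap_out(self, page_id, data):
        """Accept a swapped-out page; the sender does not wait for the disk."""
        # Drop any older buffered copy of the same page.
        self.in_flight.pop(page_id, None)
        # If the buffer is full, push the oldest page to disk to make room.
        if len(self.in_flight) >= self.capacity_pages:
            self.drain_one()
        self.in_flight[page_id] = data

    def drain_one(self):
        """Write the oldest buffered page to disk; return its id, or None."""
        if not self.in_flight:
            return None
        page_id, data = self.in_flight.popitem(last=False)
        self.disk[page_id] = data
        return page_id

    def page_fault(self, page_id):
        """Serve a fault from the buffer if possible, otherwise from disk.

        Raises KeyError if the page was never swapped out.
        """
        if page_id in self.in_flight:
            self.victim_hits += 1
            return self.in_flight.pop(page_id)
        self.disk_reads += 1
        return self.disk[page_id]


if __name__ == "__main__":
    cache = ToyNetworkWriteCache(capacity_pages=2)
    cache.swap_out(1, "a")
    cache.swap_out(2, "b")
    cache.swap_out(3, "c")   # buffer is full, so page 1 is drained to disk
    cache.page_fault(2)      # still buffered: counted as a victim hit
    cache.page_fault(1)      # already drained: counted as a disk read
    print(cache.victim_hits, cache.disk_reads)
```

In this sketch, a swap-out completes as soon as the page enters the buffer, and a fault on a recently swapped-out page is served without a disk read, which loosely mirrors the "faster page swap-outs" and "victim caching" benefits the abstract names. It does not model optical channels, timing, or contention.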