NWCache: Optimizing disk accesses via an optical network/write cache hybrid

  • Enrique V. Carrera
  • Ricardo Bianchini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1586)

Abstract

In this paper we propose a simple extension to the I/O architecture of scalable multiprocessors that optimizes page swap-outs significantly. More specifically, we propose the use of an optical ring network for I/O operations that not only transfers swapped-out pages between the local memories and the disks, but also acts as a system-wide write cache. In order to evaluate our proposal, we use detailed execution-driven simulations of several out-of-core parallel applications running on an 8-node scalable multiprocessor. Our results demonstrate that the NWCache provides consistent performance improvements, coming mostly from faster page swap-outs, victim caching, and reduced contention. Based on these results, our main conclusion is that the NWCache is highly efficient for most out-of-core parallel applications.
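The write-cache behavior described in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator or design; it only assumes that swapped-out pages wait in a bounded buffer (standing in for the optical ring) before reaching disk, and that a page fault can be served from that buffer if the page is still there (victim caching). The class name `RingWriteCache`, its methods, and the `capacity` parameter are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class RingWriteCache:
    """Toy model of a bounded write cache sitting between memory and disk."""

    def __init__(self, capacity):
        self.capacity = capacity      # assumed: number of pages the buffer can hold
        self.pending = OrderedDict()  # page_id -> data, oldest entry first
        self.disk = {}                # stands in for the disks

    def swap_out(self, page_id, data):
        """Buffer a swapped-out page.

        Returns True if the page was buffered without first forcing a
        disk write, False if the oldest buffered page had to be drained.
        """
        self.pending.pop(page_id, None)  # a newer copy replaces an older one
        fast = True
        if len(self.pending) >= self.capacity:
            self.drain_one()
            fast = False
        self.pending[page_id] = data
        return fast

    def drain_one(self):
        """Write the oldest buffered page to disk, if any."""
        if self.pending:
            page_id, data = self.pending.popitem(last=False)
            self.disk[page_id] = data

    def page_fault(self, page_id):
        """Return (data, source); source is 'ring' on a victim-cache hit.

        Raises KeyError if the page is neither buffered nor on disk.
        """
        if page_id in self.pending:
            return self.pending.pop(page_id), "ring"
        return self.disk[page_id], "disk"
```

In this model a swap-out completes as soon as the page is buffered, and a fault on a recently evicted page avoids the disk entirely, which mirrors (in a very simplified way) two of the benefits the abstract names: faster page swap-outs and victim caching.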

Copyright information

© Springer-Verlag 1999

Authors and Affiliations

  • Enrique V. Carrera (1)
  • Ricardo Bianchini (1)
  1. COPPE Systems Engineering, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
