Measuring Process Migration Effects Using an MP Simulator

  • Andrew Ladd
  • Trevor Mudge
  • Oyekunle Olukotun

Abstract

This chapter examines multiprocessors used in throughput mode to multiprogram a number of unrelated sequential programs, which is the most common use for multiprocessors. The programs share no data. Each program forms a process that is scheduled to execute on a processor until it blocks, either because its time slice expires or because it waits for I/O. When its turn to resume comes, the process can migrate among the processors. The degree of process migration can dramatically affect cache behavior and hence the throughput of the multiprocessor. This chapter investigates the cache misses that result from different degrees of process migration. The degree of migration is varied by assigning each process an affinity for a particular processor. An efficient multiprocessor cache simulator used in the study is also described. The study is restricted to shared-bus multiprocessors and two contrasting cache consistency protocols, write-update and write-invalidate.
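
The relationship between affinity and cache misses can be illustrated with a toy model. The sketch below is not the simulator described in the chapter; it is a minimal illustration under stated assumptions: the function name `simulate`, all of its parameters, the direct-mapped cache organization, and the random reference pattern are hypothetical. Time slices are serialized for simplicity, and all references are treated as reads, so stale copies left behind by migration are never invalidated or updated. The consistency protocols the chapter compares are deliberately left out.

```python
import random


def simulate(affinity, n_procs=4, n_cpus=4, cache_lines=64,
             working_set=48, slice_len=200, n_slices=400, seed=0):
    """Return the overall cache miss rate for an affinity in [0, 1].

    Hypothetical model: every CPU has a private direct-mapped cache and
    every process references only its own lines (no data sharing).
    """
    rng = random.Random(seed)
    # caches[c][i] holds the (process, line) tag stored in set i of CPU c.
    caches = [[None] * cache_lines for _ in range(n_cpus)]
    # Each process prefers one "home" CPU.
    home = [p % n_cpus for p in range(n_procs)]
    misses = refs = 0
    for s in range(n_slices):
        proc = s % n_procs  # round-robin: one process runs per time slice
        # With probability `affinity` the process resumes on its home CPU;
        # otherwise it is placed on a CPU chosen uniformly at random
        # (which may happen to be the home CPU).
        if rng.random() < affinity:
            cpu = home[proc]
        else:
            cpu = rng.randrange(n_cpus)
        cache = caches[cpu]
        for _ in range(slice_len):
            line = rng.randrange(working_set)  # random reference pattern
            tag = (proc, line)
            idx = line % cache_lines  # all processes compete for the same sets
            refs += 1
            if cache[idx] != tag:
                misses += 1
                cache[idx] = tag  # fill the line on a miss
    return misses / refs


if __name__ == "__main__":
    for a in (0.0, 0.5, 1.0):
        print(f"affinity={a:.1f}  miss rate={simulate(a):.4f}")
```

With these defaults, full affinity pins each process to its own processor, so only cold misses remain in this toy model, while lower affinity forces a process to reload its working set whenever another process ran on the same processor in the meantime.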

Keywords

Migration, Titan, Coherence, Compaction, Verse

Copyright information

© Springer Science+Business Media New York 1992

Authors and Affiliations

  • Andrew Ladd (1)
  • Trevor Mudge (1)
  • Oyekunle Olukotun (1)
  1. Advanced Computer Architecture Lab, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
