Journal of Signal Processing Systems, Volume 88, Issue 1, pp 83–89

Designs of Low Power Snoop for Multiprocessor System on Chip



In a multiprocessor system on chip, processors can access the expected shared data at different times by using a snoop protocol. However, this approach generates a large number of snoops, many of which consume energy unnecessarily. The main objective of this paper is to reduce the number of snoops in the multiprocessor system by using an energy-saving architecture. The proposed method comprises two designs: 1) a snoop turning point design and 2) a snoop buffer design. The first design centers on the critical section, defined as the region in which data are accessed synchronously by multiple processors. When a processor accesses data in the critical section, the critical section is locked immediately so that other processors cannot access the data. Because the processor in the critical section then has no commonly accessed data with the other processors, all snoops are removed after the snoop turning point. In the second design, buffers are added to the caches and shared buses to label the processors that share common data. With this design, only the processors labeled in the buffers need to be snooped. The experimental results show that the proposed designs achieve the intended energy savings.
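As a rough illustration of the snoop buffer idea, the toy model below counts snoops for a short access trace, first with plain broadcast snooping and then with a buffer that labels which processors have touched each address. It is only a sketch under stated assumptions: the names `SnoopBuffer` and `count_snoops`, the trace, and the counting rule are hypothetical and are not taken from the paper, which describes a hardware design rather than software.

```python
from dataclasses import dataclass, field


@dataclass
class SnoopBuffer:
    """Hypothetical buffer: maps an address to the set of processor IDs
    that have accessed it (the 'labeled' processors)."""
    sharers: dict = field(default_factory=dict)

    def record_access(self, addr, cpu):
        # Label this processor as a holder of the data at addr.
        self.sharers.setdefault(addr, set()).add(cpu)

    def targets(self, addr, requester):
        # Only labeled processors, other than the requester, are snooped.
        return self.sharers.get(addr, set()) - {requester}


def count_snoops(trace, num_cpus, use_buffer):
    """Count snoops for a trace of (cpu, addr) accesses.

    Without the buffer, every access snoops all other processors.
    With the buffer, only processors labeled for that address are snooped.
    """
    buf = SnoopBuffer()
    snoops = 0
    for cpu, addr in trace:
        if use_buffer:
            snoops += len(buf.targets(addr, cpu))
        else:
            snoops += num_cpus - 1
        buf.record_access(addr, cpu)
    return snoops


# Assumed example trace: four processors, three distinct addresses.
trace = [(0, 0x100), (1, 0x100), (2, 0x200), (0, 0x100), (3, 0x300)]
broadcast = count_snoops(trace, num_cpus=4, use_buffer=False)
filtered = count_snoops(trace, num_cpus=4, use_buffer=True)
print(broadcast, filtered)
```

In this toy trace most accesses touch data that no other processor holds, so the labeled lookup sends far fewer snoops than broadcast. The model ignores invalidations, evictions, and the energy cost of the buffer itself.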


Keywords: Multi-processor system on chip; Low power; Synchronous communication; Snoop; Coherence



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Computer and Information Science and Engineering, National University of Tainan, Tainan, Taiwan
