The System-on-a-Chip Lock Cache

Abstract

Lock synchronization overheads may be significant in a shared-memory multiprocessor system-on-a-chip (SoC) implementation. These overheads are observed in terms of lock latency, lock delay and memory bandwidth consumption in the system. There has been much previous work to speed up access to lock variables via specialized caches [1], software queues [2]–[5] and delayed loops, e.g., exponential backoff [2]. However, in the context of SoC, these previously reported techniques all have drawbacks not present in our technique. We present a novel, efficient, small and very simple hardware unit, the SoC Lock Cache (SoCLC), which resolves the critical section (CS) interactions among multiple processors and improves the performance criteria in terms of lock latency, lock delay and bandwidth consumption in a shared-memory multiprocessor SoC. Our mechanism is capable of handling short CSs as well as long CSs. This combined support has been established at both the hardware architecture level and the software architecture level, including the real-time operating system (RTOS) kernel level facilities (such as support for preemptive versus non-preemptive synchronization, scheduling of lock variable accesses, interrupt handling and RTOS initialization). The experimental results of a microbenchmark program, which simulates an application with high-contention critical section accesses under a four-processor platform with shared memory, showed an overall speedup of 55%. Furthermore, a database application example with client–server pairs of tasks, run on the same platform, showed that our mechanism achieved an overall speedup of 27%.
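For readers unfamiliar with the software techniques the abstract contrasts SoCLC against, the following is a minimal sketch of a test-and-set spin lock with exponential backoff, written with C11 atomics. It illustrates the general "delayed loop" idea only; it is not the SoCLC mechanism and is not taken from the paper. All names (`spin_lock_t`, `spin_try_lock`, `next_backoff`, `BACKOFF_MIN`, `BACKOFF_MAX`) and the delay bounds are assumptions chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names and constants throughout; none come from the paper. */
typedef struct {
    atomic_flag held;               /* set while some processor holds the lock */
} spin_lock_t;

#define SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }
#define BACKOFF_MIN 1u              /* assumed initial delay (loop iterations) */
#define BACKOFF_MAX 1024u           /* assumed cap on the delay */

/* One acquisition attempt: returns true if the caller now holds the lock. */
static bool spin_try_lock(spin_lock_t *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Exponential backoff step: double the delay, saturating at BACKOFF_MAX. */
static unsigned next_backoff(unsigned delay) {
    unsigned doubled = delay * 2u;
    return doubled > BACKOFF_MAX ? BACKOFF_MAX : doubled;
}

/* Spin until the lock is acquired, waiting longer after each failed attempt
 * so that contending processors issue fewer memory transactions. */
static void spin_lock_acquire(spin_lock_t *l) {
    unsigned delay = BACKOFF_MIN;
    while (!spin_try_lock(l)) {
        for (volatile unsigned i = 0; i < delay; ++i) {
            /* busy-wait; a real system might use a pause or yield hint */
        }
        delay = next_backoff(delay);
    }
}

/* Leave the critical section and make prior writes visible to the next owner. */
static void spin_lock_release(spin_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In a sketch like this, every failed attempt is still a read-modify-write on shared memory, and the backoff delay adds to the time between a release and the next acquisition. These are the kinds of lock latency, lock delay and bandwidth costs that the abstract says a dedicated hardware unit is meant to reduce.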

References

  1. Ramachandran, U. and J. Lee. Cache-Based Synchronization in Shared Memory Multi-Processors. Journal of Parallel and Distributed Computing, vol. 32, pp. 11–27, 1996.

  2. Anderson, T. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 1, pp. 6–16, January 1990.

  3. Graunke, G. and S. Thakkar. Synchronization Algorithms for Shared-Memory Multi-Processors. IEEE Computer, vol. 23, pp. 60–69, June 1990.

  4. Mellor-Crummey, J. M. and M. L. Scott. Algorithms for Scalable Synchronization on Shared Memory Multiprocessors. ACM Transactions on Computer Systems, vol. 9, no. 1, pp. 21–65, February 1991.

  5. Magnusson, P., A. Landin, and E. Hagersten. Efficient Software Synchronization on Large Cache Coherent Multiprocessors. SICS Research Report T94:07, Swedish Institute of Computer Science, Kista, Sweden, February 1994.

  6. Ramachandran, U. and J. Lee. Processor Initiated Sharing in Multiprocessor Caches. Tech. Rep. GIT-ICS-88/43, Georgia Institute of Technology, November 1988.

  7. Goodman, J. R., M. K. Vernon, and P. J. Woest. Efficient Synchronization Primitives for Large-Scale Cache-Coherent Multiprocessors. In Proc. of the Third International Conference on ASPLOS, April 1989, pp. 64–75.

  8. Ramachandran, U. and J. Lee. Architectural Primitives for a Scalable Memory Multi-Processor. Tech. Rep. GIT-ICS-91/10, Georgia Institute of Technology, February 1991.

  9. Heinrich, J. MIPS R4000 Microprocessor User's Manual (2nd edition). MIPS Technologies, Inc., Mt. View, CA, 1994, pp. 286–291.

  10. Kagi, A., D. Burger, and J. R. Goodman. Efficient Synchronization: Let Them Eat QOLB. In Proceedings of the 24th Annual International Symposium on Computer Architecture, June 1997, pp. 170–180.

  11. Kagi, A. Mechanisms for Efficient Shared-Memory Lock-Based Synchronization. Ph.D. Thesis, Computer Sciences Department, University of Wisconsin, Madison, 1999.

  12. Mooney, V. and G. De Micheli. Hardware/Software Co-Design of Run-Time Schedulers for Real-Time Systems. Design Automation for Embedded Systems, vol. 6, no. 1, pp. 89–144, September 2000.

  14. Woo, S. C., M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 Programs: Characterization and Methodological Considerations. In Proceedings of the 22nd International Symposium on Computer Architecture, Santa Margherita Ligure, Italy, June 1995, pp. 24–36.

  15. Di-Shi, S., D. Blough, and V. Mooney. Atalanta: A New Multiprocessor RTOS Kernel for System-on-a-Chip Applications. Georgia Institute of Technology, College of Computing, Atlanta, GA. Tech. Rep. GIT-CC-02–19, March 2002. Available HTTP: http://www.cc.gatech.edu/tech_reports/.

  16. Labrosse, J. J. MicroC/OS-II: The Real-Time Kernel. R&D Books, Miller Freeman, Inc., Lawrence, KS, 1999.

  18. Olson, M. A. Selecting and Implementing an Embedded Database System. IEEE Computer, pp. 27–34, September 2000.

  19. Stevens, W. R. UNIX Network Programming, Second Edition: Interprocess Communications, vol. 2. Prentice Hall, 1999.

  20. Saglam (Akgul), B. and V. Mooney. System-on-a-Chip Processor Synchronization Support in Hardware. Design, Automation and Test in Europe (DATE 2001), March 2001, pp. 633–639.

  21. Akgul, B., J. Lee, and V. Mooney. A System-on-a-Chip Lock Cache with Task Preemption Support. Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES'01), November 2001, pp. 149–157.

Cite this article

Akgul, B.E.S., Mooney III, V.J. The System-on-a-Chip Lock Cache. Design Automation for Embedded Systems 7, 139–174 (2002). https://doi.org/10.1023/A:1019751632622

Keywords

  • Lock synchronization
  • multiprocessor
  • RTOS
  • shared-memory
  • SoC
  • synchronization