Improving Server Performance on Multi-cores via Selective Off-Loading of OS Functionality

  • David Nellans
  • Kshitij Sudan
  • Erik Brunvand
  • Rajeev Balasubramonian
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6161)


Modern and future server-class processors will incorporate many cores. Some studies have suggested that it may be worthwhile to dedicate some of the many cores for specific tasks such as operating system execution. OS off-loading has two main benefits: improved performance due to better cache utilization and improved power efficiency due to smarter use of heterogeneous cores. However, OS off-loading is a complex process that involves balancing the overheads of off-loading against the potential benefit, which is unknown while making the off-loading decision. In prior work, OS off-loading has been implemented by first profiling system call behavior and then manually instrumenting some OS routines (out of hundreds) to support off-loading. We propose a hardware-based mechanism to help automate the off-load decision-making process, and provide high quality dynamic decisions via performance feedback. Our mechanism dynamically estimates the off-load requirements of the application and relies on a run-length predictor for the upcoming OS system call invocation. The resulting hardware based off-loading policy yields a throughput improvement of up to 18% over a baseline without off-loading, 13% over a static software based policy, and 23% over a dynamic software based policy.


System Call Branch Predictor Operating System Code User Core Dynamic Instrumentation 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    The SPARC Architecture Manual Version 9,
  2. 2.
    Agarwal, A., Hennessy, J., Horowitz, M.: Cache Performance of Operating System and Multiprogramming Workloads. ACM Trans. Comput. Syst. 6(4), 393–431 (1988)CrossRefGoogle Scholar
  3. 3.
    Albayraktaroglu, K., Jaleel, A., Wu, X., Franklin, M., Jacob, B., Tseng, C.W., Yeung, D.: BioBench: A Benchmark Suite of Bioinformatics Applications. In: Proceedings of ISPASS (2005)Google Scholar
  4. 4.
    Anderson, T.E., Levy, H.M., Bershad, B.N., Lazowska, E.D.: The Interaction of Architecture and Operating System Design. In: Proceedings of ASPLOS (1991)Google Scholar
  5. 5.
    Balasubramonian, R., Dwarkadas, S., Albonesi, D.: Dynamically Managing the Communication-Parallelism Trade-Off in Future Clustered Processors. In: Proceedings of ISCA-30, pp. 275–286 (June 2003)Google Scholar
  6. 6.
    Barroso, L., Holzle, U.: The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Morgan & Claypool, San Francisco (2009)Google Scholar
  7. 7.
    Baumann, A., Barham, P., Dagand, P., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schupbach, A., Singhania, A.: The Multikernel: A new OS architecture for scalable multicore systems. In: Proceedings of SOSP (October 2009)Google Scholar
  8. 8.
    Benia, C., et al.: The PARSEC Benchmark Suite: Characterization and Architectural Implications. Tech. rep., Department of Computer Science, Princeton University (2008)Google Scholar
  9. 9.
    Brown, J.A., Tullsen, D.M.: The Shared-Thread Multiprocessor. In: Proceedings of ICS (2008)Google Scholar
  10. 10.
    Chakraborty, K., Wells, P.M., Sohi, G.S.: Computation Spreading: Employing Hardware Migration to Specialize CMP Cores On-the-Fly. In: Proceedings of ASPLOS (2006)Google Scholar
  11. 11.
    Gloy, N., Young, C., Chen, J.B., Smith, M.D.: An Analysis of Dynamic Branch Prediction Schemes on System Workloads. In: Proceedings of ISCA (1996)Google Scholar
  12. 12.
    Henning, J.L.: SPEC CPU2006 Benchmark Descriptions. In: Proceedings of ACM SIGARCH Computer Architecture News (2005)Google Scholar
  13. 13.
    Hunt, G., Larus, J.: Singularity: rethinking the software stack. Operating Systems Review (2007)Google Scholar
  14. 14.
    Li, T., John, L., Sivasubramaniam, A., Vijaykrishnan, N., Rubio, J.: Understanding and Improving Operating System Effects in Control Flow Prediction. Operating Systems Review (December 2002)Google Scholar
  15. 15.
    Li, T., John, L.K.: Operating System Power Minimization through Run-time Processor Resource Adaptation. IEEE Microprocessors and Microsystems 30, 189–198 (2006)CrossRefGoogle Scholar
  16. 16.
    Magnusson, P., Christensson, M., Eskilson, J., Forsgren, D., Hallberg, G., Hogberg, J., Larsson, F., Moestedt, A., Werner, B.: Simics: A Full System Simulation Platform. IEEE Computer 35(2), 50–58 (2002)CrossRefGoogle Scholar
  17. 17.
    Mogul, J., Mudigonda, J., Binkert, N., Ranganathan, P., Talwar, V.: Using Asymmetric Single-ISA CMPs to Save Energy on Operating Systems. IEEE Micro (May-June 2008)Google Scholar
  18. 18.
    Muralimanohar, N., Balasubramonian, R., Jouppi, N.: Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0. In: Proceedings of MICRO (2007)Google Scholar
  19. 19.
    Nellans, D., Balasubramonian, R., Brunvand, E.: A Case for Increased Operating System Support in Chip Multi-Processors. In: Proceedings of the 2nd IBM Watson Conference on Interaction between Architecture, Circuits, and Compilers (September 2005)Google Scholar
  20. 20.
    Nellans, D., Balasubramonian, R., Brunvand, E.: OS Execution on Multi-Cores: Is Out-Sourcing Worthwhile? ACM Operating System Review (April 2009)Google Scholar
  21. 21.
    Redstone, J., Eggers, S.J., Levy, H.M.: An Analysis of Operating System Behavior on a Simultaneous Multithreaded Architecture. In: Proceedings of ASPLOS (2000)Google Scholar
  22. 22.
    Strong, R., Mudigonda, J., Mogul, J., Binkert, N., Tullsen, D.: Fast Switching of Threads Between Cores. Operating Systems Review (April 2009)Google Scholar
  23. 23.
    U.S. Environmental Protection Agency - Energy Star Program: Report To Congress on Server and Data Center Energy Efficiency - Public Law 109-431 (2007)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • David Nellans
    • 1
  • Kshitij Sudan
    • 1
  • Erik Brunvand
    • 1
  • Rajeev Balasubramonian
    • 1
  1. 1.School of ComputingUniversity of UtahUSA

Personalised recommendations