Exploration of the Influence of Program Inputs on CMP Co-scheduling

  • Yunlian Jiang
  • Xipeng Shen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5168)

Abstract

Recent studies have shown the effectiveness of job co-scheduling in alleviating shared-cache contention on Chip Multiprocessors. Although program inputs significantly affect cache usage, and thus cache contention, their influence on co-scheduling remains unexplored. In this work, we measure that influence and show that the ability to adapt to program inputs is important for a co-scheduler to work effectively on Chip Multiprocessors. We then explore how to address that influence by constructing cross-input predictive models for some memory behaviors that are critical to a recently proposed co-scheduler. The exploration compares the effectiveness of linear and non-linear regression techniques in building the models. Finally, we systematically measure the sensitivity of co-scheduling to errors in the predictive behavior models. The results demonstrate the potential of the predictive models in guiding contention-aware co-scheduling.
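
As a rough illustration of the linear end of the regression techniques mentioned in the abstract, the sketch below fits a one-variable least-squares model that predicts a memory behavior from a program input feature. It is only a minimal sketch under stated assumptions: the choice of input size as the feature, distinct cache blocks as the predicted behavior, the sample numbers, and the names `fit_least_squares` and `predict` are all hypothetical and are not taken from the paper.

```python
def fit_least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the input feature.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("input feature values must not all be equal")
    # Sum of cross deviations between feature and observed behavior.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the behavior for an unseen input feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training runs (made-up numbers, for illustration only):
# input size and the number of distinct cache blocks touched in that run.
train_sizes = [1000, 2000, 4000, 8000]
train_blocks = [130, 255, 505, 1005]

model = fit_least_squares(train_sizes, train_blocks)
predicted = predict(model, 16000)  # estimate for an input not seen in training
```

A co-scheduler could, in principle, consume such predictions instead of profiling every new input; how well that works depends on how sensitive the scheduling decisions are to prediction error, which is the question the abstract's final measurement addresses.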

Keywords

Little Mean Square, Program Input, Memory Behavior, Cache Block, Distinct Block
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yunlian Jiang (1)
  • Xipeng Shen (1)
  1. Department of Computer Science, The College of William and Mary, Williamsburg, USA
