Page Classifier and Placer: A Scheme of Managing Hybrid Caches

  • Xin Yu
  • Xuanhua Shi
  • Hai Jin
  • Xiaofei Liao
  • Song Wu
  • Xiaoming Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8707)

Abstract

Hybrid cache architecture (HCA), which uses two or more cache hierarchy designs in a processor, may outperform traditional cache architectures because no single memory technology can deliver optimal power, performance and density at the same time. General HCA schemes have also been proposed to manage cache regions that have different usage patterns. However, previous HCA management schemes control data placement at the cache-set level and are oblivious to how software's power and performance characteristics differ across hardware cache regions. This hardware-only approach may lead to performance loss and may fail to guarantee quality of service. We propose a new HCA approach that makes the OS aware of the underlying hybrid cache architecture and lets it control data placement, at OS page level, onto different cache regions. Our approach employs a lightweight hardware profiler to monitor cache behavior at OS page level and to capture the hot pages. With this knowledge, the OS can dynamically select among cache placement policies to optimize data placement for higher performance, lower power consumption and better quality of service. Our simulation experiments demonstrate that the proposed HCA achieves a 7.8% performance improvement on a dual-core system compared to a traditional SRAM-only cache architecture while also reducing area cost.
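The page-level flow described in the abstract (profile accesses per OS page, capture hot pages, then choose a region for each page) can be illustrated with a minimal sketch. This is not the authors' implementation: the 4 KiB page size, the fixed access-count threshold, the region names "fast" and "dense", and all function names are assumptions made purely for illustration.

```python
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB OS pages


def profile_pages(addresses):
    """Count accesses per OS page (software stand-in for a hardware profiler)."""
    return Counter(addr >> PAGE_SHIFT for addr in addresses)


def classify_hot_pages(page_counts, hot_threshold):
    """Return the pages whose access count reaches the (hypothetical) threshold."""
    return {page for page, count in page_counts.items() if count >= hot_threshold}


def place_pages(page_counts, hot_pages, hot_region="fast", cold_region="dense"):
    """Map each profiled page to a cache region name (names are hypothetical)."""
    return {
        page: (hot_region if page in hot_pages else cold_region)
        for page in page_counts
    }


# Toy address trace: page 1 is touched three times, page 2 once, page 3 twice.
trace = [0x1000, 0x1008, 0x1010, 0x2000, 0x3000, 0x3004]
counts = profile_pages(trace)
hot = classify_hot_pages(counts, hot_threshold=2)
placement = place_pages(counts, hot)
```

In this sketch the placement policy is a single threshold rule; the abstract instead describes the OS selecting dynamically among several placement policies, which would replace `place_pages` with a policy chosen at run time.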

Keywords

hybrid cache, page coloring, multi-core

Copyright information

© IFIP International Federation for Information Processing 2014

Authors and Affiliations

  • Xin Yu (1)
  • Xuanhua Shi (1)
  • Hai Jin (1)
  • Xiaofei Liao (1)
  • Song Wu (1)
  • Xiaoming Li (2)
  1. Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  2. Department of ECE, University of Delaware, Newark, USA
