
Lightweight memory tracing for hot data identification

Cluster Computing

Abstract

The limited capacity of main memory has become a critical issue for system performance. Several memory schemes that combine multiple classes of memory devices are used to mitigate the problem, hiding the small capacity by placing data in the appropriate memory device based on the hotness of the data. Memory tracers can provide such hotness information, but existing tracing tools incur extremely high overhead, and the overhead increases as the problem size of a workload grows. In this paper, we propose Daptrace, a tool built for tracing memory accesses with bounded and light overhead. Its two main techniques, region-based sampling and adaptive region construction, keep the overhead low regardless of the program size. For evaluation, we trace a wide range of workloads (20 in total) and compare the results with a baseline. The results show that Daptrace incurs very small runtime and storage space overhead (1.95% and 5.38 MB on average) while maintaining the tracing quality regardless of the working set size of a workload. A case study on out-of-core memory management also shows the high potential of Daptrace for optimal data management. From the evaluation results, we conclude that Daptrace performs well at identifying hot memory objects.
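
The abstract names region-based sampling and adaptive region construction without describing their mechanics, so the following is only a minimal sketch of how such a scheme might work, not Daptrace's actual implementation. It assumes that the address space is divided into regions, that one randomly chosen page per region is checked per sampling step, and that adjacent regions with similar hit counts are merged while regions are split as long as a region budget allows. All names (Region, sample_regions, merge_similar, split_regions, trace) and all parameter values are hypothetical, and real page-access checks are replaced by a simulated set of accessed pages.

```python
import random
from dataclasses import dataclass, replace


@dataclass
class Region:
    start: int     # first page of the region (inclusive)
    end: int       # end of the region (exclusive)
    hits: int = 0  # sampling checks in this round that found an access


def sample_regions(regions, accessed_pages, rng):
    """Region-based sampling (assumed form): check one random page per region."""
    for r in regions:
        page = rng.randrange(r.start, r.end)
        if page in accessed_pages:
            r.hits += 1


def merge_similar(regions, threshold):
    """Merge adjacent regions whose hit counts differ by at most `threshold`."""
    merged = [replace(regions[0])]  # work on copies, leave the input untouched
    for r in regions[1:]:
        last = merged[-1]
        if last.end == r.start and abs(last.hits - r.hits) <= threshold:
            size_a, size_b = last.end - last.start, r.end - r.start
            # Size-weighted average keeps the merged count representative.
            last.hits = (last.hits * size_a + r.hits * size_b) // (size_a + size_b)
            last.end = r.end
        else:
            merged.append(replace(r))
    return merged


def split_regions(regions, max_regions):
    """Halve regions while the total number stays within `max_regions`."""
    out = []
    budget = max_regions - len(regions)
    for r in regions:
        if budget > 0 and r.end - r.start >= 2:
            mid = (r.start + r.end) // 2
            out += [Region(r.start, mid, r.hits), Region(mid, r.end, r.hits)]
            budget -= 1
        else:
            out.append(r)
    return out


def trace(num_pages, accessed_pages, rounds=20, checks_per_round=50,
          max_regions=16, seed=0):
    """Run the split / sample / merge loop on a simulated workload."""
    rng = random.Random(seed)
    regions = [Region(0, num_pages)]
    for _ in range(rounds):
        regions = split_regions(regions, max_regions)
        for r in regions:
            r.hits = 0
        for _ in range(checks_per_round):
            sample_regions(regions, accessed_pages, rng)
        regions = merge_similar(regions, threshold=checks_per_round // 10)
    return regions


if __name__ == "__main__":
    # Simulated workload: the first 256 of 1024 pages are hot.
    for region in trace(1024, set(range(256))):
        print(region)
```

In this toy setup each round performs at most max_regions × checks_per_round page checks, whatever the value of num_pages, which is one way the overhead could stay bounded as the working set grows. The region boundaries settle on the hot range because merging removes boundaries between equally hot neighbours while splitting keeps probing for new ones.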

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF2015M3C4A7065646 and No. 2017R1A2B4005681).

Author information

Corresponding author

Correspondence to Yoonhee Kim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, Y., Kim, Y. & Yeom, H.Y. Lightweight memory tracing for hot data identification. Cluster Comput 23, 2273–2285 (2020). https://doi.org/10.1007/s10586-020-03130-1

  • DOI: https://doi.org/10.1007/s10586-020-03130-1
