Abstract
Virtualization technologies bring many benefits to cloud computing environments, but as virtual machines provide more features, the middleware layer has become bloated and introduces high overhead. Our ultimate goal is to provide hardware-assisted solutions that improve middleware performance in cloud computing environments. As a starting point, in this paper we design, implement, and evaluate specialized hardware instructions to accelerate garbage collection (GC) operations. We select GC because it is a common component in virtual machine designs and it incurs high performance and energy overheads. We performed a profiling study of various GC algorithms to identify the GC performance hotspots, which contribute more than 50% of the total GC execution time. By moving these hotspot functions into hardware, we achieved an order-of-magnitude speedup and a significant improvement in energy efficiency. In addition, the results of our performance estimation study indicate that the hardware-assisted GC instructions can reduce the GC execution time by half and lead to a 7% improvement in overall execution time.
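The figures quoted in the abstract can be related to one another with Amdahl's law. The following is a back-of-envelope sketch, not the paper's performance estimation model. The function names are hypothetical, and the inputs are illustrative assumptions: a hotspot share of exactly 50% (the abstract says "more than 50%"), a 10x hotspot speedup (one reading of "an order of magnitude"), and a GC share of 14% of total run time (a value chosen for illustration, not reported in the abstract).

```python
def remaining_fraction(accelerated_share: float, speedup: float) -> float:
    """Amdahl's law: fraction of the original time left after speeding up
    `accelerated_share` of the work by a factor of `speedup`."""
    return (1.0 - accelerated_share) + accelerated_share / speedup


def overall_remaining(gc_share: float, gc_remaining: float) -> float:
    """Fraction of total run time left when only the GC portion shrinks
    to `gc_remaining` of its original duration."""
    return 1.0 - gc_share * (1.0 - gc_remaining)


# Assumed inputs (illustrative only, see the lead-in paragraph).
HOTSPOT_SHARE = 0.5     # hotspots as a fraction of GC time
HOTSPOT_SPEEDUP = 10.0  # hardware speedup applied to the hotspots
GC_SHARE = 0.14         # GC as a fraction of total execution time

gc_left = remaining_fraction(HOTSPOT_SHARE, HOTSPOT_SPEEDUP)
total_left = overall_remaining(GC_SHARE, gc_left)

print(f"GC time remaining:    {gc_left:.3f}")
print(f"Total time remaining: {total_left:.3f}")
```

Under these assumptions GC time falls to about 55% of its original value and total execution time to about 94%, which is broadly in line with the "by half" and 7% figures. A hotspot share above 50% would move both numbers closer to the quoted ones.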
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Tang, J., Liu, S., Gu, Z. et al. Achieving middleware execution efficiency: hardware-assisted garbage collection operations. J Supercomput 59, 1101–1119 (2012). https://doi.org/10.1007/s11227-010-0493-0
DOI: https://doi.org/10.1007/s11227-010-0493-0