A Comparative Analysis of Garbage Collectors and Their Suitability for Big Data Workloads

  • Conference paper
Advances in Computing and Network Communications

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 735))

Abstract

Big data applications tend to be memory intensive, and many of them are written in memory-managed languages such as Java and Scala. The efficiency of the garbage collector (GC) therefore plays an important role in the performance of these applications. In this paper, we perform a comparative analysis of Java garbage collectors on three commonly used big data workloads to determine which garbage collector suits each workload. The garbage collectors under scrutiny are Garbage First (G1), Parallel, and Concurrent Mark Sweep (CMS). We show (a) how big data workloads differ from the existing Java workloads that are used to study garbage collectors and (b) how to select the right garbage collector for a given workload. We find that the Garbage First collector gives a performance uplift of up to 15% in certain workloads.
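As a rough illustration of how such a comparison can be set up, the sketch below reports which collectors are active in the running JVM and how much work they have done, using the standard `java.lang.management` API. It is not the authors' measurement harness: the class name `GcReport`, the helper `churn`, and the allocation size are assumptions made for this example. On HotSpot the collector is typically chosen at launch with `-XX:+UseG1GC`, `-XX:+UseParallelGC`, or `-XX:+UseConcMarkSweepGC` (the last only on JDK releases that still ship CMS; it was removed in JDK 14).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {

    /** Returns one summary line per collector registered in the running JVM. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and time are cumulative since JVM start.
            lines.add(String.format("%s: collections=%d, timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    /** Hypothetical stand-in workload: allocates short-lived arrays to create garbage. */
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] block = new byte[64 * 1024]; // becomes garbage after this iteration
            block[0] = (byte) i;
            checksum += block[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        churn(200_000);
        summarize().forEach(System.out::println);
    }
}
```

Running the same program once per collector flag, for example `java -XX:+UseParallelGC GcReport`, and comparing the reported collection counts and times gives a first-order view of the differences; a real study would replace `churn` with the actual workload and also measure end-to-end run time.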

Acknowledgements

The authors wish to thank Dr. Prakash Raghavendra from AMD India Pvt. Ltd. for providing intellectual assistance throughout the study.

Author information

Corresponding author

Correspondence to Aiswarya Sriram.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nair, A., Sriram, A., Simon, A., Kalambur, S., Sitaram, D. (2021). A Comparative Analysis of Garbage Collectors and Their Suitability for Big Data Workloads. In: Thampi, S.M., Gelenbe, E., Atiquzzaman, M., Chaudhary, V., Li, KC. (eds) Advances in Computing and Network Communications. Lecture Notes in Electrical Engineering, vol 735. Springer, Singapore. https://doi.org/10.1007/978-981-33-6977-1_24

  • DOI: https://doi.org/10.1007/978-981-33-6977-1_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-33-6976-4

  • Online ISBN: 978-981-33-6977-1

  • eBook Packages: Engineering, Engineering (R0)
