Abstract
Memory leaks, an important and difficult problem in software development, occur when an object is inadvertently retained longer than necessary. Programming languages provide a variety of dynamic memory management mechanisms to help programmers avoid introducing defects that cause memory leaks. However, programmers cannot yet be completely freed from the work of memory management. Runtime leak detection is time-consuming and usually performed after the fact, while manual code inspection requires extensive developer experience. Understanding the common patterns of memory leaks can help developers stay mindful of leaks or avoid them earlier in the development process, and may inspire future research. In our case study, we identify eight code patterns specific to memory leaks caused by circular references in Python. These patterns explain 91.64% of the memory leaks in the studied projects. Our findings can inform decisions about the feasibility of identifying memory leaks with static code analysis.
Data availability
The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.
Notes
CPython provides a simple and effective way to avoid holding a strong reference to an object: the weak reference. A weak reference does not keep its referent alive, so it does not protect the object from garbage collection. A Python programmer can easily create weak references to objects with the weakref module. See https://docs.python.org/3/library/weakref.html.
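The difference between a strong and a weak back-reference can be illustrated with a minimal sketch. The classes Parent, StrongChild, and WeakChild and the helper parent_survives_del are hypothetical names introduced only for this illustration; they do not come from the studied projects. The sketch assumes CPython, where an object is freed as soon as its reference count reaches zero, and it temporarily disables the cyclic garbage collector so that the effect of the cycle is visible.

```python
import gc
import weakref


class StrongChild:
    """Hypothetical child that holds a strong reference back to its parent."""

    def __init__(self, parent):
        self.parent = parent  # parent -> child -> parent forms a reference cycle


class WeakChild:
    """Hypothetical child that holds only a weak reference back to its parent."""

    def __init__(self, parent):
        self._parent = weakref.ref(parent)  # does not keep the parent alive

    @property
    def parent(self):
        # Returns None once the parent has been reclaimed.
        return self._parent()


class Parent:
    """Hypothetical parent that owns exactly one child."""

    def __init__(self, child_type):
        self.child = child_type(self)


def parent_survives_del(child_type):
    """Return True if the Parent is still alive after its last name is deleted."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running during the check
    try:
        parent = Parent(child_type)
        probe = weakref.ref(parent)  # observes the parent without owning it
        del parent
        alive = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    gc.collect()  # reclaim any cycle left behind by the strong variant
    return alive


if __name__ == "__main__":
    print(parent_survives_del(StrongChild))  # True on CPython: the cycle keeps it alive
    print(parent_survives_del(WeakChild))    # False on CPython: freed immediately
```

With the strong back-reference, the parent stays in memory until the cyclic garbage collector runs; with the weak back-reference, no cycle exists and reference counting alone reclaims the object.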
Funding
This work was supported by the National Science Foundation of China (No. 61702144) and the Zhejiang Provincial National Science Foundation of China (No. LQ17F020003).
Author information
Contributions
Jie Chen, Dongjin Yu, and Haiyang Hu contributed to the conception of the study. Jie Chen performed the data collection and experiment and drafted the manuscript. Dongjin Yu and Haiyang Hu helped perform the analysis with constructive discussions and made important modifications to the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Chen, J., Yu, D. & Hu, H. Towards an understanding of memory leak patterns: an empirical study in Python. Software Qual J 31, 1303–1330 (2023). https://doi.org/10.1007/s11219-023-09641-5
DOI: https://doi.org/10.1007/s11219-023-09641-5