Studying logging practice in test code

Zhang, Haonan; Tang, Yiming; Lamothe, Maxime; Li, Heng; Shang, Weiyi

doi:10.1007/s10664-022-10139-0

Published: 07 April 2022

Volume 27, article number 83 (2022)

Empirical Software Engineering

Abstract

Logging is widely used in modern software development to record run-time information about software systems, and it plays a significant role in software testing. Although logging research has attracted much attention, little of it addresses the practice of test logging (i.e., the logging in test files). To fill this knowledge gap, we conduct an empirical study of test logging practice. The study examines 21 open-source subjects with ~70K logging statements, of which ~48K are production logging statements and ~22K are test logging statements. We organize the study around four research questions. As a result, (1) we report five findings on the differences between test and production logging statements, (2) we report four findings on the differences between the maintenance efforts of test and production logging statements, (3) we identify four reasons why developers use test logging, and (4) we uncover the relationship between test logging and production logging. To the best of our knowledge, this is the first study that quantitatively and qualitatively analyzes logging practices in both test and production code, providing developers and researchers with insight into this topic.

Notes

The scripts and data files used in our research are available online at https://github.com/senseconcordia/TestLoggingPractice
https://www.srcml.org/
In this work, we refer to test outputs as the log messages produced during the execution of the unit tests.
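
To make the terminology concrete, the following is a minimal sketch in Python of a production logging statement, a test logging statement, and the test output as defined in the note above. It is not taken from the study or its subjects: the function `total_price`, the logger names `app.cart` and `tests.cart`, and the helper `run_and_collect_logs` are all hypothetical names chosen for illustration.

```python
import io
import logging
import unittest

# Production code (hypothetical): contains a production logging statement.
prod_logger = logging.getLogger("app.cart")


def total_price(prices):
    prod_logger.debug("Computing total for %d items", len(prices))
    return sum(prices)


# Test code (hypothetical): contains a test logging statement.
test_logger = logging.getLogger("tests.cart")


class TotalPriceTest(unittest.TestCase):
    def test_total(self):
        test_logger.info("Starting test_total with three prices")
        self.assertEqual(total_price([1, 2, 3]), 6)


def run_and_collect_logs():
    """Run the unit test and return (success flag, captured log lines)."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    root = logging.getLogger()
    old_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)
    try:
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalPriceTest)
        # The runner's own report goes to a throwaway buffer; only log
        # messages are collected as the test output.
        result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    finally:
        root.removeHandler(handler)
        root.setLevel(old_level)
    return result.wasSuccessful(), stream.getvalue().splitlines()


if __name__ == "__main__":
    ok, lines = run_and_collect_logs()
    print(ok)
    for line in lines:
        print(line)
```

In this sketch the captured test output contains messages from both loggers, because executing the test also executes the production logging statement it exercises.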

Author information

Authors and Affiliations

Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada
Haonan Zhang, Yiming Tang & Weiyi Shang
Department of Computer Engineering and Software Engineering, Polytechnique Montreal, Montreal, QC, Canada
Maxime Lamothe & Heng Li

Corresponding author

Correspondence to Yiming Tang.

Additional information

Communicated by: Shaukat Ali

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, H., Tang, Y., Lamothe, M. et al. Studying logging practice in test code. Empir Software Eng 27, 83 (2022). https://doi.org/10.1007/s10664-022-10139-0

Accepted: 22 February 2022
Published: 07 April 2022
DOI: https://doi.org/10.1007/s10664-022-10139-0
