Studying logging practice in test code

Empirical Software Engineering

Abstract

Logging is widely used in modern software development to record run-time information about software systems, and it plays a significant role in software testing. Although logging has attracted much research attention, little of it has addressed the practice of test logging (i.e., logging in test files). To fill this knowledge gap, we conduct an empirical study of test logging practice. The study examines 21 open-source subjects with \(\sim\)70K logging statements, of which \(\sim\)48K are production logging statements and \(\sim\)22K are test logging statements. We organize the study around four research questions. As a result, (1) we report five findings that reveal the differences between test and production logging statements, (2) we report four findings on the differences between the maintenance efforts of test and production logging statements, (3) we identify four reasons why developers use test logging, and (4) we uncover the relationship between test logging and production logging. To the best of our knowledge, this is the first study that quantitatively and qualitatively analyzes logging practices in both test and production code, providing developers and researchers with insight into this topic.

Notes

  1. Scripts and data files used in our research are available online and can be found here: https://github.com/senseconcordia/TestLoggingPractice

  2. https://www.srcml.org/

  3. In this work, "test outputs" refers to the log messages produced while the unit tests execute.

Author information

Corresponding author

Correspondence to Yiming Tang.

Additional information

Communicated by: Shaukat Ali

About this article

Cite this article

Zhang, H., Tang, Y., Lamothe, M. et al. Studying logging practice in test code. Empir Software Eng 27, 83 (2022). https://doi.org/10.1007/s10664-022-10139-0

  • DOI: https://doi.org/10.1007/s10664-022-10139-0
