Abstract
Bad code smells have been defined as indicators of potential problems in source code. Techniques to identify and mitigate bad code smells have been proposed and studied. Recently, bad test code smells (test smells for short) have been put forward as a kind of bad code smell specific to tests such as unit tests. What has been missing is empirical investigation into the prevalence and impact of bad test code smells. Two studies aimed at providing this missing empirical data are presented. The first study finds that there is a high diffusion of test smells in both open source and industrial software systems, with 86% of JUnit tests exhibiting at least one test smell and six tests having six distinct test smells. The second study provides evidence that test smells have a strong negative impact on program comprehension and maintenance. Highlights from this second study include the finding that comprehension is 30% better in the absence of test smells.
Notes
None of the Ph.D. students co-authored the paper.
Note that the same refactoring was applied on test smell instances from both systems.
Participants were monitored during the break to ensure that they did not exchange information.
Note that this is possible since the only difference among the four experiments is the set of participants involved.
Note that we were not able to do the same in our previous experiments, since the need to verify our conjecture emerged only after a first analysis of the data collected in the first three experiments.
Additional information
Communicated by: Thomas Zimmermann
Cite this article
Bavota, G., Qusef, A., Oliveto, R. et al. Are test smells really harmful? An empirical study. Empir Software Eng 20, 1052–1094 (2015). https://doi.org/10.1007/s10664-014-9313-0
DOI: https://doi.org/10.1007/s10664-014-9313-0