Abstract
Test smells, detrimental coding practices that hinder the development of high-quality test code, pose a significant challenge in software testing and maintenance. Software refactoring, a powerful approach for addressing code smells and improving software quality without changing functionality, has traditionally focused on production code, leaving test code overlooked. Despite extensive research on test smell refactoring, our understanding of how effective existing refactoring operations are at improving test code quality remains limited. Investigating real-world developer refactoring practices is crucial to bridge this knowledge gap. In this study, we investigate the refactorings developers perform to address test smells, resulting in a comprehensive catalog of test smells and their corresponding test-specific refactorings. We identified two test-specific refactorings closely tied to JUnit 5 and seven version-agnostic refactorings applicable across JUnit versions. While many of these test-specific refactorings are documented in the literature, our analysis unveils new test-specific refactorings aimed at dealing with the “Inappropriate Assertion” test smell. This research provides insights into the challenges developers face and the prevailing practices for effectively refactoring test code, thereby enhancing software testing and maintenance.
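To make the “Inappropriate Assertion” smell concrete, the sketch below shows one commonly cited instance of it: asserting object equality through assertTrue instead of assertEquals. This example is illustrative only and is not taken from the paper’s catalog; it defines minimal local stand-ins for JUnit’s assertions so the snippet compiles without the JUnit dependency (in real test code these come from org.junit.jupiter.api.Assertions).

```java
// Illustrative sketch of the "Inappropriate Assertion" test smell and its
// refactoring. The assertion helpers below are local stand-ins for JUnit's,
// defined only so this example is self-contained.
public class InappropriateAssertionExample {

    // Local stand-in for JUnit's assertTrue.
    static void assertTrue(boolean condition) {
        if (!condition) throw new AssertionError("expected true");
    }

    // Local stand-in for JUnit's assertEquals; on failure it reports both
    // values, which assertTrue(a.equals(b)) cannot do.
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual))
            throw new AssertionError(
                "expected <" + expected + "> but was <" + actual + ">");
    }

    public static void main(String[] args) {
        String expected = "refactored";
        String actual = "refactor" + "ed";

        // Smelly form: if this fails, the report says only "expected true",
        // hiding which values were compared.
        assertTrue(expected.equals(actual));

        // Refactored form: documents intent and yields a diagnostic message
        // showing both values on failure.
        assertEquals(expected, actual);

        System.out.println("both assertions pass");
    }
}
```

The refactoring changes no behavior when the test passes; its value appears on failure, where the specialized assertion produces an actionable diagnostic.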
Notes
Available at: https://www.eclipse.org/jgit/
Available at: https://exubero.com/junit/anti-patterns/
Funding
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, FAPESB grants BOL0188/2020 and PIE0002/2022, and CNPq grants 465614/2014-0, 408356/2018-9 and 403361/2023-0.
Author information
Contributions
Luana Martins: conceptualization, methodology, formal analysis and investigation, writing — original draft preparation, writing — review and editing. Taher Ghaleb: conceptualization, methodology, formal analysis and investigation, writing — review and editing. Heitor Costa: conceptualization, methodology, writing — review and editing, supervision. Ivan Machado: conceptualization, methodology, writing — review and editing, supervision, funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Martins, L., Ghaleb, T.A., Costa, H. et al. A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems. Software Qual J (2024). https://doi.org/10.1007/s11219-024-09663-7