A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems

  • Research
  • Published in: Software Quality Journal

Abstract

Test smells are detrimental coding practices that hinder the development of high-quality test code, posing a significant challenge for software testing and maintenance. Software refactoring, a powerful approach for addressing code smells and improving software quality without changing functionality, has traditionally focused on production code, leaving test code overlooked. Despite extensive research on test smell refactoring, our understanding of how effectively existing refactoring operations improve test code quality remains limited; investigating how developers refactor test code in practice is crucial to bridging this gap. In this study, we investigate refactorings performed by developers to address test smells, resulting in a comprehensive catalog of test smells and their corresponding test-specific refactorings. We identified two test-specific refactorings closely tied to JUnit 5 and seven version-agnostic refactorings applicable across JUnit versions. While many of these test-specific refactorings are documented in the literature, our analysis unveils new test-specific refactorings aimed at dealing with the “Inappropriate Assertion” test smell. This research provides insights into the challenges developers face and the prevailing practices for effectively refactoring test code, thereby enhancing software testing and maintenance.
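To make the kinds of refactorings the abstract mentions concrete, the sketch below illustrates two representative patterns: replacing an inappropriate assertion (`assertTrue` over `equals`) with `assertEquals`, and replacing a JUnit 4-style try/catch-with-`fail` block with the JUnit 5-style `assertThrows`. This is an illustrative sketch only, not an example taken from the paper's catalog; minimal local stand-ins for the JUnit assertion methods are defined so the example runs without a JUnit dependency.

```java
// Illustrative sketch (not from the paper's catalog). The stand-in methods
// below mimic org.junit.jupiter.api.Assertions so the example is self-contained.
public class TestRefactoringSketch {

    // Stand-in for Assertions.assertTrue.
    static void assertTrue(boolean condition) {
        if (!condition) throw new AssertionError("expected condition to be true");
    }

    // Stand-in for Assertions.assertEquals: on failure it reports both values,
    // which is why it is preferred over assertTrue(expected.equals(actual)).
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    // Stand-in for the JUnit 5 Assertions.assertThrows.
    static <T extends Throwable> T assertThrows(Class<T> type, Runnable code) {
        try {
            code.run();
        } catch (Throwable t) {
            if (type.isInstance(t)) return type.cast(t);
            throw new AssertionError("unexpected exception type: " + t.getClass());
        }
        throw new AssertionError("expected " + type.getSimpleName() + " to be thrown");
    }

    public static void main(String[] args) {
        String expected = "refactored";
        String actual = "refactor" + "ed";

        // Smelly: on failure, this only reports "expected true", hiding both values.
        assertTrue(expected.equals(actual));

        // Refactored: on failure, this reports the expected and actual operands.
        assertEquals(expected, actual);

        // Smelly (JUnit 4 style), shown as a comment:
        // try { Integer.parseInt("oops"); fail(); } catch (NumberFormatException e) { }

        // Refactored (JUnit 5 style): the expectation becomes a single assertion
        // that also returns the exception for further checks.
        NumberFormatException e =
                assertThrows(NumberFormatException.class, () -> Integer.parseInt("oops"));
        assertTrue(e.getMessage().contains("oops"));

        System.out.println("ok");
    }
}
```

In a real test suite these calls would come from `org.junit.jupiter.api.Assertions`; the refactored forms fail with more informative messages and express the test's intent in a single assertion.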


[Listings 1–20 and Figs. 1–7 appear in the full article; only their labels are recoverable here.]

Data availability

All data from the test code selection on GitHub and the manual classification are available at Martins et al. (2023). The catalog of test smells and test-specific refactorings is available at Martins et al. (2023).

Notes

  1. Available at: https://www.eclipse.org/jgit/

  2. Available at: https://exubero.com/junit/anti-patterns/

References

  • ACCUMULO. Commit: f10b40 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/accumulo/commit/f10b4073dba6b8095e9934e9ea158eb9f45c6f67

  • Ahmad, A., Leifler, O., & Sandahl, K. (2021). Empirical analysis of practitioners’ perceptions of test flakiness factors. Software Testing, Verification and Reliability, 31(8), 1791. https://doi.org/10.1002/stvr.1791

  • Aljedaani, W., Peruma, A., Aljohani, A., Alotaibi, M., Mkaouer, M. W., Ouni, A., Newman, C. D., Ghallab, A., & Ludi, S. (2021). Test smell detection tools: A systematic mapping study. EASE ’21, pp. 170–180. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3463274.3463335

  • Backport JUnit upgrade from #1562 to 1.10.1. (2021). Retrieved November 8, 2023, from https://github.com/apache/accumulo/commit/d4fd27f32dc2611a23f67b1d3e8dafd8ee05a1cb

  • Bavota, G., Qusef, A., Oliveto, R., Lucia, A., & Binkley, D. (2015). Are test smells really harmful? An empirical study. Empirical Software Engineering, 20(4), 1052–1094. https://doi.org/10.1007/s10664-014-9313-0

  • Bell, J., Legunsen, O., Hilton, M., Eloussi, L., Yung, T., & Marinov, D. (2018). Deflaker: Automatically detecting flaky tests. In: 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 433–444. https://doi.org/10.1145/3180155.3180164

  • Camara, B., Silva, M., Endo, A., & Vergilio, S. (2021). On the use of test smells for prediction of flaky tests. In: Proceedings of the 6th Brazilian Symposium on Systematic and Automated Software Testing. SAST ’21, pp. 46–54. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3482909.3482916

  • CAMEL. Commit: 626196a (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/626196af0baf18a859c55bdf91526b447b367faf

  • CAMEL. Commit: 9dc4dc6 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/9dc4dc6cd2c6cee75892e9a57105d79bfdcc8f5c

  • CAMEL. Commit: f7d1dbb (2020). Retrieved November 8, 2023, from https://github.com/apache/camel/commit/f7d1dbbf736e8b50ac5f17e5d25829a0a6aa5d4e

  • CAMEL. Commit: 7a4363 (2020). Retrieved November 8, 2023, from https://github.com/apache/camel/commit/7a43633b3c3587d949724f580ad0015a6f65ef82

  • CAMEL. Commit: d58c731 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/d58c7318cb81f8faa5f2f4acd28d7a215855450d

  • CAMEL. Commit: c30dea (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/c30deabcaed4726bce4371d76257db63f2eba87c

  • Cedrim, D., Garcia, A., Mongiovi, M., Gheyi, R., Sousa, L., de Mello, R., Fonseca, B., Ribeiro, M., & Chávez, A. (2017). Understanding the impact of refactoring on smells: A longitudinal study of 23 software projects. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. ESEC/FSE 2017, pp. 465–475. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3106237.3106259

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

  • CXF. Commit: 7e11da (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/cxf/commit/7e11da7a566a95adc64143c0575b7ef86e0fbe5a

  • CXF. Commit: 4955ca6 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/cxf/commit/4955ca652f16e781524612383af27c650e10cbdc

  • Deursen, A., Moonen, L. M., Bergh, A., & Kok, G. (2001). Refactoring test code. Technical report, Centre for Mathematics and Computer Science, NLD.

  • Di, Z., Li, B., Li, Z., & Liang, P. (2018). A preliminary investigation of self-admitted refactorings in open source software. In: International Conference on Software Engineering and Knowledge Engineering, vol. 2018, pp. 165–168. KSI Research Inc. and Knowledge Systems Institute Graduate School.

  • Fowler, M. (1999). Refactoring: Improving the design of existing code. USA: Addison-Wesley Longman Publishing Co., Inc.

  • Garousi, V., & Küçük, B. (2018). Smells in software test code: A survey of knowledge in industry and academia. Journal of Systems and Software, 138, 52–81. https://doi.org/10.1016/j.jss.2017.12.013

  • Greiler, M., van Deursen, A., & Storey, M. -A. (2013). Automated detection of test fixture strategies and smells. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, pp. 322–331. https://doi.org/10.1109/ICST.2013.45

  • Junior, N. S., Martins, L. A., Rocha, L., Costa, H. A. X., & Machado, I. (2021). How are test smells treated in the wild? A tale of two empirical studies. Journal of Software Engineering Research and Development, 9, 9:1–9:16. https://doi.org/10.5753/jserd.2021.1802

  • KAFKA. Commit: 56d948 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/kafka/commit/56d9482462c2aa941b151015499fc59485fe7426

  • KAFKA. Commit: f4c203 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/kafka/commit/f4c2030b2006fc0c447a10f8b251579424f39f7b

  • Kim, D. J., Tsantalis, N., Chen, T. -H. P., & Yang, J. (2021). Studying test annotation maintenance in the wild. In: Proceedings of the 43rd International Conference on Software Engineering. ICSE 2021, pp. 62–73. IEEE Computer Society, Los Alamitos, CA, USA. https://doi.org/10.1109/ICSE43902.2021.00019

  • Kim, D. J., Chen, T.-H.P., & Yang, J. (2021). The secret life of test smells-an empirical study on test smell evolution and maintenance. Empirical Software Engineering, 26(5), 1–47. https://doi.org/10.1007/s10664-021-09969-1

  • Kummer, M., Nierstrasz, O., & Lungu, M. (2015). Categorising test smells. Bachelor Thesis. University of Bern.

  • Lacerda, G., Petrillo, F., Pimenta, M., & Guéhéneuc, Y. G. (2020). Code smells and refactoring: A tertiary systematic review of challenges and observations. Journal of Systems and Software, 167, 110610. https://doi.org/10.1016/j.jss.2020.110610

  • Lambiase, S., Cupito, A., Pecorelli, F., De Lucia, A., & Palomba, F. (2020). Just-in-time test smell detection and refactoring: The darts project. In: Proceedings of the 28th International Conference on Program Comprehension. ICPC ’20, pp. 441–445. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3387904.3389296

  • Martinez, M., Etien, A., Ducasse, S., & Fuhrman, C. (2020). RTj: A Java framework for detecting and refactoring rotten green test cases. In: 2020 IEEE/ACM 42nd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 69–72.

  • Martins, L., Ghaleb, T., Costa, H., & Machado, I. (2023). Curated dataset of test-specific refactorings. Figshare. https://figshare.com/s/3cd337c00ba36954854e

  • Martins, L., Ghaleb, T., Costa, H., & Machado, I. (2023). TSR-Catalog: The catalog of test smells refactorings. ReadTheDocs. https://tsr-catalog.readthedocs.io/en/latest/

  • Meszaros, G. (2007). xUnit Test Patterns: Refactoring Test Code. Addison-Wesley Signature Series. Addison-Wesley, Upper Saddle River, NJ.

  • Palomba, F., Di Nucci, D., Panichella, A., Oliveto, R., & De Lucia, A. (2016). On the diffusion of test smells in automatically generated test code: An empirical study. In: Proceedings of the 9th International Workshop on Search-Based Software Testing. SBST ’16, pp. 5–14. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2897010.2897016

  • Panichella, A., Panichella, S., Fraser, G., Sawant, A. A., & Hellendoorn, V. J. (2022). Test smells 20 years later: Detectability, validity, and reliability. Empirical Software Engineering, 27(7), 1–40. https://doi.org/10.1007/s10664-022-10207-5

  • Pantiuchina, J., Zampetti, F., Scalabrino, S., Piantadosi, V., Oliveto, R., Bavota, G., & Penta, M. D. (2020). Why developers refactor source code: A mining-based study. ACM Transactions on Software Engineering and Methodology, 29(4). https://doi.org/10.1145/3408302

  • Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2019). On the distribution of test smells in open source android applications: An exploratory study. In: Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering. CASCON ’19, pp. 193–202. IBM Corp., USA.

  • Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2020). TsDetect: An open source test smells detection tool. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ESEC/FSE 2020, pp. 1650–1654. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3368089.3417921

  • Peruma, A., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2020). An exploratory study on the refactoring of unit test files in android applications. In: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, pp. 350–357. ACM, New York, NY, USA. https://doi.org/10.1145/3387940.3392189

  • Peruma, A., Simmons, S., AlOmar, E. A., Newman, C. D., Mkaouer, M. W., & Ouni, A. (2022). How do I refactor this? An empirical study on refactoring trends and topics in Stack Overflow. Empirical Software Engineering, 27(1), 1–43.

  • Santana, R., Fernandes, D., Campos, D., Soares, L., Maciel, R., & Machado, I. (2021). Understanding practitioners’ strategies to handle test smells: A multi-method study. SBES’21: Brazilian Symposium on Software Engineering, pp. 49–53, New York, NY, USA. Association for Computing Machinery. https://doi.org/10.1145/3474624.3474639

  • Santana, R., Martins, L., Rocha, L., Virgínio, T., Cruz, A., Costa, H., & Machado, I. (2020). Raide: A tool for assertion roulette and duplicate assert identification and refactoring. In: Proceedings of the XXXIV Brazilian Symposium on Software Engineering. SBES ’20, pp. 374–379. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3422392.3422510

  • Soares, E., Ribeiro, M., Amaral, G., Gheyi, R., Fernandes, L., Garcia, A., Fonseca, B., & Santos, A. (2020). Refactoring test smells: A perspective from open-source developers. In: Proceedings of the 5th Brazilian Symposium on Systematic and Automated Software Testing, pp. 50–59. ACM, New York, NY, USA. https://doi.org/10.1145/3425174.3425212

  • Soares, E., Ribeiro, M., Gheyi, R., Amaral, G., & Santos, A. (2023). Refactoring test smells with JUnit 5: Why should developers keep up-to-date? IEEE Transactions on Software Engineering, 49(3), 1152–1170. https://doi.org/10.1109/TSE.2022.3172654

  • Spadini, D., Schvarcbacher, M., Oprescu, A.-M., Bruntink, M., & Bacchelli, A. (2020). Investigating severity thresholds for test smells. In: Proceedings of the 17th International Conference on Mining Software Repositories. MSR ’20, pp. 311–321. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3379597.3387453

  • Tufano, M., Palomba, F., Bavota, G., Di Penta, M., Oliveto, R., De Lucia, A., & Poshyvanyk, D. (2016). An empirical investigation into the nature of test smells. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. ASE ’16, pp. 4–15. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2970276.2970340

  • Tufano, M., Palomba, F., Bavota, G., Oliveto, R., Penta, M. D., De Lucia, A., & Poshyvanyk, D. (2017). When and why your code starts to smell bad (and whether the smells go away). IEEE Transactions on Software Engineering, 43(11), 1063–1088. https://doi.org/10.1109/TSE.2017.2653105

  • Tufano, M., Palomba, F., Bavota, G., Di Penta, M., Oliveto, R., De Lucia, A., & Poshyvanyk, D. (2017). There and back again: Can you compile that snapshot? Journal of Software: Evolution and Process, 29(4), 1838. https://doi.org/10.1002/smr.1838

  • Vidal, S. A., Marcos, C., & Díaz-Pace, J. A. (2016). An approach to prioritize code smells for refactoring. Automated Software Engineering, 23(3), 501–532. https://doi.org/10.1007/s10515-014-0175-x

  • Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., & Machado, I. (2020). Jnose: Java test smell detector. In: Proceedings of the XXXIV Brazilian Symposium on Software Engineering. SBES ’20, pp. 564–569. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3422392.3422499

  • Weißgerber, P., & Diehl, S. (2006). Identifying refactorings from source-code changes. In: 21st IEEE/ACM International Conference on Automated Software Engineering (ASE’06). IEEE, pp. 231–240. https://doi.org/10.1109/ASE.2006.41

  • Weißgerber, P., Biegel, B., & Diehl, S. (2007). Making programmers aware of refactorings. In: WRT, pp. 58–59.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, FAPESB grants BOL0188/2020 and PIE0002/2022, and CNPq grants 465614/2014-0, 408356/2018-9 and 403361/2023-0.

Author information

Authors and Affiliations

Authors

Contributions

Luana Martins: conceptualization, methodology, formal analysis and investigation, writing — original draft preparation, writing — review and editing. Taher Ghaleb: conceptualization, methodology, formal analysis and investigation, writing — review and editing. Heitor Costa: conceptualization, methodology, writing — review and editing, supervision. Ivan Machado: conceptualization, methodology, writing — review and editing, supervision, funding acquisition.

Corresponding author

Correspondence to Luana Martins.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Martins, L., Ghaleb, T.A., Costa, H. et al. A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems. Software Qual J (2024). https://doi.org/10.1007/s11219-024-09663-7
