A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems

  • Research
  • Published in: Software Quality Journal

Abstract

Test smells are detrimental coding practices that hinder the development of high-quality test code, posing a significant challenge for software testing and maintenance. Software refactoring, a powerful approach for addressing code smells and improving software quality without changing functionality, has traditionally focused on production code, leaving test code overlooked. Despite extensive research on test smell refactoring, our understanding of how effectively existing refactoring operations improve test code quality remains limited; investigating how developers refactor test code in practice is crucial to bridging this gap. In this study, we investigate refactorings performed by developers to address test smells, resulting in a comprehensive catalog of test smells and their corresponding test-specific refactorings. We identified two test-specific refactorings closely tied to JUnit 5 and seven version-agnostic refactorings applicable across JUnit versions. While many of these test-specific refactorings are documented in the literature, our analysis unveils new test-specific refactorings aimed at dealing with the “Inappropriate Assertion” test smell. This research provides insights into the challenges developers face and the prevailing practices for effectively refactoring test code, thereby enhancing software testing and maintenance.
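To make the kinds of refactorings the abstract mentions concrete, the sketch below illustrates two representative patterns: replacing an inappropriate assertion (`assertTrue` over `equals`) with `assertEquals`, and replacing a JUnit 4-style try/catch-with-`fail` block with the JUnit 5-style `assertThrows`. This is an illustrative sketch only, not an example taken from the paper's catalog; minimal local stand-ins for the JUnit assertion methods are defined so the example runs without a JUnit dependency.

```java
// Illustrative sketch (not from the paper's catalog). The stand-in methods
// below mimic org.junit.jupiter.api.Assertions so the example is self-contained.
public class TestRefactoringSketch {

    // Stand-in for Assertions.assertTrue.
    static void assertTrue(boolean condition) {
        if (!condition) throw new AssertionError("expected condition to be true");
    }

    // Stand-in for Assertions.assertEquals: on failure it reports both values,
    // which is why it is preferred over assertTrue(expected.equals(actual)).
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    // Stand-in for the JUnit 5 Assertions.assertThrows.
    static <T extends Throwable> T assertThrows(Class<T> type, Runnable code) {
        try {
            code.run();
        } catch (Throwable t) {
            if (type.isInstance(t)) return type.cast(t);
            throw new AssertionError("unexpected exception type: " + t.getClass());
        }
        throw new AssertionError("expected " + type.getSimpleName() + " to be thrown");
    }

    public static void main(String[] args) {
        String expected = "refactored";
        String actual = "refactor" + "ed";

        // Smelly: on failure, this only reports "expected true", hiding both values.
        assertTrue(expected.equals(actual));

        // Refactored: on failure, this reports the expected and actual operands.
        assertEquals(expected, actual);

        // Smelly (JUnit 4 style), shown as a comment:
        // try { Integer.parseInt("oops"); fail(); } catch (NumberFormatException e) { }

        // Refactored (JUnit 5 style): the expectation becomes a single assertion
        // that also returns the exception for further checks.
        NumberFormatException e =
                assertThrows(NumberFormatException.class, () -> Integer.parseInt("oops"));
        assertTrue(e.getMessage().contains("oops"));

        System.out.println("ok");
    }
}
```

In a real test suite these calls would come from `org.junit.jupiter.api.Assertions`; the refactored forms fail with more informative messages and express the test's intent in a single assertion.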


[Listings 1–20 and Figs. 1–7 appear in the full article; only their labels are recoverable here.]

Data availability

All data from the test code selection on GitHub and the manual classification are available at Martins et al. (2023). The catalog of test smells and test-specific refactorings is available at Martins et al. (2023).

Notes

  1. Available at: https://www.eclipse.org/jgit/

  2. Available at: https://exubero.com/junit/anti-patterns/

References

  • ACCUMULO. Commit: f10b40 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/accumulo/commit/f10b4073dba6b8095e9934e9ea158eb9f45c6f67

  • Ahmad, A., Leifler, O., & Sandahl, K. (2021). Empirical analysis of practitioners’ perceptions of test flakiness factors. Software Testing, Verification and Reliability, 31(8), 1791. https://doi.org/10.1002/stvr.1791

  • Aljedaani, W., Peruma, A., Aljohani, A., Alotaibi, M., Mkaouer, M. W., Ouni, A., Newman, C. D., Ghallab, A., & Ludi, S. (2021). Test smell detection tools: A systematic mapping study. EASE ’21, pp. 170–180. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3463274.3463335

  • Backport JUnit upgrade from #1562 to 1.10.1. (2021). Retrieved November 8, 2023, from https://github.com/apache/accumulo/commit/d4fd27f32dc2611a23f67b1d3e8dafd8ee05a1cb

  • Bavota, G., Qusef, A., Oliveto, R., Lucia, A., & Binkley, D. (2015). Are test smells really harmful? An empirical study. Empirical Software Engineering, 20(4), 1052–1094. https://doi.org/10.1007/s10664-014-9313-0

  • Bell, J., Legunsen, O., Hilton, M., Eloussi, L., Yung, T., & Marinov, D. (2018). Deflaker: Automatically detecting flaky tests. In: 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 433–444. https://doi.org/10.1145/3180155.3180164

  • Camara, B., Silva, M., Endo, A., & Vergilio, S. (2021). On the use of test smells for prediction of flaky tests. In: Proceedings of the 6th Brazilian Symposium on Systematic and Automated Software Testing. SAST ’21, pp. 46–54. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3482909.3482916

  • CAMEL. Commit: 626196a (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/626196af0baf18a859c55bdf91526b447b367faf

  • CAMEL. Commit: 9dc4dc6 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/9dc4dc6cd2c6cee75892e9a57105d79bfdcc8f5c

  • CAMEL. Commit: f7d1dbb (2020). Retrieved November 8, 2023, from https://github.com/apache/camel/commit/f7d1dbbf736e8b50ac5f17e5d25829a0a6aa5d4e

  • CAMEL. Commit: 7a4363 (2020). Retrieved November 8, 2023, from https://github.com/apache/camel/commit/7a43633b3c3587d949724f580ad0015a6f65ef82

  • CAMEL. Commit: d58c731 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/d58c7318cb81f8faa5f2f4acd28d7a215855450d

  • CAMEL. Commit: c30dea (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/camel/commit/c30deabcaed4726bce4371d76257db63f2eba87c

  • Cedrim, D., Garcia, A., Mongiovi, M., Gheyi, R., Sousa, L., de Mello, R., Fonseca, B., Ribeiro, M., & Chávez, A. (2017). Understanding the impact of refactoring on smells: A longitudinal study of 23 software projects. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. ESEC/FSE 2017, pp. 465–475. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3106237.3106259

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

  • CXF. Commit: 7e11da (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/cxf/commit/7e11da7a566a95adc64143c0575b7ef86e0fbe5a

  • CXF. Commit: 4955ca6 (2019). GitHub. Retrieved November 8, 2023, from https://github.com/apache/cxf/commit/4955ca652f16e781524612383af27c650e10cbdc

  • Deursen, A., Moonen, L. M., Bergh, A., & Kok, G. (2001). Refactoring test code. Technical report, Centre for Mathematics and Computer Science, NLD.

  • Di, Z., Li, B., Li, Z., & Liang, P. (2018). A preliminary investigation of self-admitted refactorings in open source software. In: International Conference on Software Engineering and Knowledge Engineering, vol. 2018, pp. 165–168. KSI Research Inc. and Knowledge Systems Institute Graduate School.

  • Fowler, M. (1999). Refactoring: Improving the design of existing code. USA: Addison-Wesley Longman Publishing Co., Inc.

  • Garousi, V., & Küçük, B. (2018). Smells in software test code: A survey of knowledge in industry and academia. Journal of Systems and Software, 138, 52–81. https://doi.org/10.1016/j.jss.2017.12.013

  • Greiler, M., van Deursen, A., & Storey, M. -A. (2013). Automated detection of test fixture strategies and smells. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, pp. 322–331. https://doi.org/10.1109/ICST.2013.45

  • Junior, N. S., Martins, L. A., Rocha, L., Costa, H. A. X., & Machado, I. (2021). How are test smells treated in the wild? A tale of two empirical studies. Journal of Software Engineering Research and Development, 9, 9:1–9:16. https://doi.org/10.5753/jserd.2021.1802

  • KAFKA. Commit: 56d948 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/kafka/commit/56d9482462c2aa941b151015499fc59485fe7426

  • KAFKA. Commit: f4c203 (2021). GitHub. Retrieved November 8, 2023, from https://github.com/apache/kafka/commit/f4c2030b2006fc0c447a10f8b251579424f39f7b

  • Kim, D. J., Tsantalis, N., Chen, T. -H. P., & Yang, J. (2021). Studying test annotation maintenance in the wild. In: Proceedings of the 43rd International Conference on Software Engineering. ICSE 2021, pp. 62–73. IEEE Computer Society, Los Alamitos, CA, USA. https://doi.org/10.1109/ICSE43902.2021.00019

  • Kim, D. J., Chen, T.-H.P., & Yang, J. (2021). The secret life of test smells-an empirical study on test smell evolution and maintenance. Empirical Software Engineering, 26(5), 1–47. https://doi.org/10.1007/s10664-021-09969-1

  • Kummer, M., Nierstrasz, O., & Lungu, M. (2015). Categorising test smells. Bachelor Thesis. University of Bern.

  • Lacerda, G., Petrillo, F., Pimenta, M., & Guéhéneuc, Y. G. (2020). Code smells and refactoring: A tertiary systematic review of challenges and observations. Journal of Systems and Software, 167, 110610. https://doi.org/10.1016/j.jss.2020.110610

  • Lambiase, S., Cupito, A., Pecorelli, F., De Lucia, A., & Palomba, F. (2020). Just-in-time test smell detection and refactoring: The darts project. In: Proceedings of the 28th International Conference on Program Comprehension. ICPC ’20, pp. 441–445. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3387904.3389296

  • Martinez, M., Etien, A., Ducasse, S., & Fuhrman, C. (2020). RTj: A Java framework for detecting and refactoring rotten green test cases. In: 2020 IEEE/ACM 42nd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 69–72.

  • Martins, L., Ghaleb, T., Costa, H., & Machado, I. (2023). Curated dataset of test-specific refactorings. Figshare. https://figshare.com/s/3cd337c00ba36954854e

  • Martins, L., Ghaleb, T., Costa, H., & Machado, I. (2023). TSR-Catalog: The catalog of test smells refactorings. ReadTheDocs. https://tsr-catalog.readthedocs.io/en/latest/

  • Meszaros, G. (2007). xUnit Test Patterns: Refactoring Test Code. Addison-Wesley Signature Series. Addison-Wesley, Upper Saddle River, NJ.

  • Palomba, F., Di Nucci, D., Panichella, A., Oliveto, R., & De Lucia, A. (2016). On the diffusion of test smells in automatically generated test code: An empirical study. In: Proceedings of the 9th International Workshop on Search-Based Software Testing. SBST ’16, pp. 5–14. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2897010.2897016

  • Panichella, A., Panichella, S., Fraser, G., Sawant, A. A., & Hellendoorn, V. J. (2022). Test smells 20 years later: Detectability, validity, and reliability. Empirical Software Engineering, 27(7), 1–40. https://doi.org/10.1007/s10664-022-10207-5

  • Pantiuchina, J., Zampetti, F., Scalabrino, S., Piantadosi, V., Oliveto, R., Bavota, G., & Penta, M. D. (2020). Why developers refactor source code: A mining-based study. ACM Transactions on Software Engineering and Methodology, 29(4). https://doi.org/10.1145/3408302

  • Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2019). On the distribution of test smells in open source android applications: An exploratory study. In: Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering. CASCON ’19, pp. 193–202. IBM Corp., USA.

  • Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2020). TsDetect: An open source test smells detection tool. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ESEC/FSE 2020, pp. 1650–1654. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3368089.3417921

  • Peruma, A., Newman, C. D., Mkaouer, M. W., Ouni, A., & Palomba, F. (2020). An exploratory study on the refactoring of unit test files in android applications. In: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, pp. 350–357. ACM, New York, NY, USA. https://doi.org/10.1145/3387940.3392189

  • Peruma, A., Simmons, S., AlOmar, E. A., Newman, C. D., Mkaouer, M. W., & Ouni, A. (2022). How do I refactor this? An empirical study on refactoring trends and topics in Stack Overflow. Empirical Software Engineering, 27(1), 1–43.

  • Santana, R., Fernandes, D., Campos, D., Soares, L., Maciel, R., & Machado, I. (2021). Understanding practitioners’ strategies to handle test smells: A multi-method study. SBES’21: Brazilian Symposium on Software Engineering, pp. 49–53, New York, NY, USA. Association for Computing Machinery. https://doi.org/10.1145/3474624.3474639

  • Santana, R., Martins, L., Rocha, L., Virgínio, T., Cruz, A., Costa, H., & Machado, I. (2020). Raide: A tool for assertion roulette and duplicate assert identification and refactoring. In: Proceedings of the XXXIV Brazilian Symposium on Software Engineering. SBES ’20, pp. 374–379. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3422392.3422510

  • Soares, E., Ribeiro, M., Amaral, G., Gheyi, R., Fernandes, L., Garcia, A., Fonseca, B., & Santos, A. (2020). Refactoring test smells: A perspective from open-source developers. In: Proceedings of the 5th Brazilian Symposium on Systematic and Automated Software Testing, pp. 50–59. ACM, New York, NY, USA. https://doi.org/10.1145/3425174.3425212

  • Soares, E., Ribeiro, M., Gheyi, R., Amaral, G., & Santos, A. (2023). Refactoring test smells with JUnit 5: Why should developers keep up-to-date? IEEE Transactions on Software Engineering, 49(3), 1152–1170. https://doi.org/10.1109/TSE.2022.3172654

  • Spadini, D., Schvarcbacher, M., Oprescu, A.-M., Bruntink, M., & Bacchelli, A. (2020). Investigating severity thresholds for test smells. In: Proceedings of the 17th International Conference on Mining Software Repositories. MSR ’20, pp. 311–321. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3379597.3387453

  • Tufano, M., Palomba, F., Bavota, G., Di Penta, M., Oliveto, R., De Lucia, A., & Poshyvanyk, D. (2016). An empirical investigation into the nature of test smells. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. ASE ’16, pp. 4–15. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2970276.2970340

  • Tufano, M., Palomba, F., Bavota, G., Oliveto, R., Penta, M. D., De Lucia, A., & Poshyvanyk, D. (2017). When and why your code starts to smell bad (and whether the smells go away). IEEE Transactions on Software Engineering, 43(11), 1063–1088. https://doi.org/10.1109/TSE.2017.2653105

  • Tufano, M., Palomba, F., Bavota, G., Di Penta, M., Oliveto, R., De Lucia, A., & Poshyvanyk, D. (2017). There and back again: Can you compile that snapshot? Journal of Software: Evolution and Process, 29(4), 1838. https://doi.org/10.1002/smr.1838

  • Vidal, S. A., Marcos, C., & Díaz-Pace, J. A. (2016). An approach to prioritize code smells for refactoring. Automated Software Engineering, 23(3), 501–532. https://doi.org/10.1007/s10515-014-0175-x

  • Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., & Machado, I. (2020). Jnose: Java test smell detector. In: Proceedings of the XXXIV Brazilian Symposium on Software Engineering. SBES ’20, pp. 564–569. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3422392.3422499

  • Weißgerber, P., & Diehl, S. (2006). Identifying refactorings from source-code changes. In: 21st IEEE/ACM International Conference on Automated Software Engineering (ASE’06). IEEE, pp. 231–240. https://doi.org/10.1109/ASE.2006.41

  • Weißgerber, P., Biegel, B., & Diehl, S. (2007). Making programmers aware of refactorings. In: WRT, pp. 58–59.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, FAPESB grants BOL0188/2020 and PIE0002/2022, and CNPq grants 465614/2014-0, 408356/2018-9 and 403361/2023-0.

Author information

Authors and Affiliations

Authors

Contributions

Luana Martins: conceptualization, methodology, formal analysis and investigation, writing — original draft preparation, writing — review and editing. Taher Ghaleb: conceptualization, methodology, formal analysis and investigation, writing — review and editing. Heitor Costa: conceptualization, methodology, writing — review and editing, supervision. Ivan Machado: conceptualization, methodology, writing — review and editing, supervision, funding acquisition.

Corresponding author

Correspondence to Luana Martins.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Martins, L., Ghaleb, T.A., Costa, H. et al. A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems. Software Qual J (2024). https://doi.org/10.1007/s11219-024-09663-7
