A Cost-effective and Machine-learning-based method to identify and cluster redundant mutants in software mutation testing

The Journal of Supercomputing

Abstract

Mutation testing assesses the quality of software test data by introducing small modifications (mutants) into the original program code. The effectiveness of the test data, known as the mutation score, is the proportion of mutants that the tests detect. A prominent problem in mutation testing is the excessive number of mutants generated for a program. The objectives of this research are to reduce the total number of mutants by consolidating redundant ones, to shorten the overall mutation testing time by producing fewer mutants, and to lower the cost of mutation testing. This research introduces a machine learning-based strategy to recognise and eliminate redundant mutants. The first contribution of this study is a machine learning-based classifier that classifies program instructions according to their rate of error propagation. Next, the remaining instructions of the source code are analysed by a purpose-built parser to generate single-line mutants. Unlike traditional approaches, mutants are not generated as distinct full-fledged programs. Instead, single-line mutants are selectively run using a developed instruction evaluator. A clustering technique then groups single-line mutants that yield identical outcomes, so that only one complete execution is needed per group. On Java benchmarks, the new method reduced the mutant count by 56.33% and the testing time by 56.71% compared with parallel tests using the MuJava and MuClipse tools. Despite the marked decrease in both mutant count and testing time, the mutation score remained consistent. Comparable outcomes were also observed with other mutation testing tools such as Pitest, Jester, Jumble, and JavaLancer.
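The idea of grouping single-line mutants by their outcomes and then computing a mutation score can be illustrated with a minimal Python sketch. This is not the authors' implementation: the mutant names, the lambda-based "instruction evaluator", and the test inputs are all hypothetical, and the sketch assumes a single mutated statement `c = a + b` whose outcome is the value it produces.

```python
from collections import defaultdict

def signature(expr, inputs):
    """Evaluate one single-line expression on every test input."""
    return tuple(expr(*args) for args in inputs)

def cluster_by_outcome(mutants, inputs):
    """Group mutants whose expression yields identical values on all inputs."""
    clusters = defaultdict(list)
    for name, expr in mutants.items():
        clusters[signature(expr, inputs)].append(name)
    return dict(clusters)

def mutation_score(original, mutants, inputs):
    """Proportion of mutants whose outcome differs from the original's."""
    expected = signature(original, inputs)
    killed = sum(1 for expr in mutants.values()
                 if signature(expr, inputs) != expected)
    return killed / len(mutants)

# Hypothetical original statement: c = a + b
original = lambda a, b: a + b

# Hypothetical single-line mutants of that statement.
mutants = {
    "m_sub": lambda a, b: a - b,
    "m_mul": lambda a, b: a * b,
    "m_sub_neg": lambda a, b: a - (-b),   # behaves like the original
    "m_rsub": lambda a, b: -(b - a),      # behaves like m_sub
}

# Hypothetical test inputs (a, b).
inputs = [(2, 2), (3, 0), (1, 5)]

clusters = cluster_by_outcome(mutants, inputs)
# One representative per cluster would receive a complete execution.
representatives = [names[0] for names in clusters.values()]
score = mutation_score(original, mutants, inputs)

print(clusters)
print(representatives)
print(score)
```

In this sketch, four mutants collapse into three clusters, so only three complete executions would be needed instead of four, while the mutation score computed over all four mutants is unchanged by the grouping.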

Data availability

The data relating to the current study are available on Google Drive and can be freely accessed at the following link: https://drive.google.com/drive/folders/1d69XSBZ-ioInjPw9L4qp-3BBe2jlkkfv?usp=share_link.

Author information

Contributions

The proposed method was developed and discretised by B. Arasteh. The designed algorithm was implemented and coded by B. Arasteh. The implemented method was adapted and benchmarked by B. Arasteh. The data and results analysis were performed by B. Arasteh and A. Ghaffari. The manuscript of the paper was written and revised by B. Arasteh and A. Ghaffari.

Corresponding author

Correspondence to Bahman Arasteh.

Ethics declarations

Conflict of interest

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript. The authors have no relevant financial or non-financial conflict of interest.

Ethical approval and informed consent

The data used in this research does not belong to any other person or third party and was prepared and generated by the researchers themselves during the research. The data of this research will be accessible by other researchers.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Arasteh, B., Ghaffari, A. A Cost-effective and Machine-learning-based method to identify and cluster redundant mutants in software mutation testing. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06107-8

  • DOI: https://doi.org/10.1007/s11227-024-06107-8
