Abstract
Mutation testing assesses the quality of software test data by introducing small modifications (mutants) into the original program code. The effectiveness of the test data, known as the mutation score, is the proportion of mutants that the tests detect. A prominent problem in mutation testing is the excessive number of mutants generated for a program. The primary objectives of this research are to reduce the total number of mutants by consolidating redundant ones, to shorten the overall mutation testing time by producing fewer mutants, and to lower the cost of mutation testing. This research introduces a machine learning-based strategy to recognise and eliminate redundant mutants. The first contribution of this study is a machine learning-based classifier that classifies instructions according to their rate of error propagation. Next, the remaining instructions of the source code are analysed by the designed parser to generate single-line mutants. Unlike traditional approaches, mutants are not generated as distinct full-fledged programs. Instead, single-line mutants are selectively run using a developed instruction evaluator. A clustering technique then groups single-line mutants that yield identical outcomes, so that only one complete execution is needed per group. Tests on Java benchmarks show that the new method reduces the mutant count by 56.33% and the testing time by 56.71% compared with parallel tests using the MuJava and MuClipse tools. Despite the marked decrease in both mutant count and testing time, the mutation score remained consistent. Comparable outcomes were also observed with other mutation testing tools such as Pitest, Jester, Jumble, and JavaLancer.
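The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `cluster_by_outcome`, `safe_eval`, and the sample mutants and inputs are hypothetical, and the classifier, parser, and instruction evaluator from the paper are replaced here by plain Python lambdas. The sketch only shows the idea that single-line mutants producing identical outcomes on the same inputs can share one representative for full execution.

```python
from collections import defaultdict


def safe_eval(fn, args):
    """Run one single-line mutant; report an exception as an outcome too."""
    try:
        return fn(*args)
    except Exception as exc:  # e.g. division by zero in a mutated operator
        return ("error", type(exc).__name__)


def cluster_by_outcome(mutants, inputs):
    """Group mutants whose outcomes agree on every sample input.

    mutants: dict mapping a label to a callable standing in for one mutated line.
    inputs:  list of argument tuples fed to every mutant.
    """
    clusters = defaultdict(list)
    for label, fn in mutants.items():
        signature = tuple(safe_eval(fn, args) for args in inputs)
        clusters[signature].append(label)
    return list(clusters.values())


# Hypothetical mutants of the original line `result = a + b`.
mutants = {
    "a - b": lambda a, b: a - b,
    "a * b": lambda a, b: a * b,
    "a + (-b)": lambda a, b: a + (-b),  # same outcomes as "a - b"
    "b * a": lambda a, b: b * a,        # same outcomes as "a * b"
    "a / b": lambda a, b: a / b,
}
inputs = [(2, 3), (5, 0), (-1, 4)]

groups = cluster_by_outcome(mutants, inputs)
representatives = [group[0] for group in groups]
reduction = 1 - len(groups) / len(mutants)

print(groups)
print(representatives)
print(f"reduction: {reduction:.0%}")
```

In this toy case five single-line mutants fall into three groups, so only three full executions would be needed. Agreement on a finite set of inputs does not prove that two mutants are equivalent; it is a heuristic, and the choice of sample inputs is an assumption of this sketch.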
Data availability
The data relating to the current study are available on Google Drive and can be freely accessed at the following link: https://drive.google.com/drive/folders/1d69XSBZ-ioInjPw9L4qp-3BBe2jlkkfv?usp=share_link.
Author information
Contributions
The proposed method was developed and discretised by B. Arasteh. The designed algorithm was implemented and coded by B. Arasteh. The implemented method was adapted and benchmarked by B. Arasteh. The data and results analysis were performed by B. Arasteh and A. Ghaffari. The manuscript of the paper was written and revised by B. Arasteh and A. Ghaffari.
Ethics declarations
Conflict of interest
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript. The authors have no relevant financial or non-financial conflict of interest.
Ethical approval and informed consent
The data used in this research does not belong to any other person or third party and was prepared and generated by the researchers themselves during the research. The data of this research will be accessible by other researchers.
About this article
Cite this article
Arasteh, B., Ghaffari, A. A Cost-effective and Machine-learning-based method to identify and cluster redundant mutants in software mutation testing. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06107-8
DOI: https://doi.org/10.1007/s11227-024-06107-8