Abstract
Regression testing is an essential activity to assure that software code changes do not adversely affect existing functionalities. With the wide adoption of Continuous Integration (CI) in software projects, which increases the frequency of running software builds, running all tests can be time-consuming and resource-intensive. To alleviate that problem, Test case Selection and Prioritization (TSP) techniques have been proposed to improve regression testing by selecting and prioritizing test cases in order to provide early feedback to developers. In recent years, researchers have relied on Machine Learning (ML) techniques to achieve effective TSP (ML-based TSP). Such techniques help combine information about test cases, from partial and imperfect sources, into accurate prediction models. This work conducts a systematic literature review focused on ML-based TSP techniques, aiming to perform an in-depth analysis of the state of the art, thus gaining insights regarding future avenues of research. To that end, we analyze 29 primary studies published from 2006 to 2020, which have been identified through a systematic and documented process. This paper addresses five research questions addressing variations in ML-based TSP techniques and feature sets for training and testing ML models, alternative metrics used for evaluating the techniques, the performance of techniques, and the reproducibility of the published studies. We summarize the results related to our research questions in a high-level summary that can be used as a taxonomy for classifying future TSP studies.
Similar content being viewed by others
Notes
https://www.covidence.org to screen the obtained papers. 1,057 papers were imported for screening and, after removing duplicate articles, 731 papers were included after the title and abstract screening step. After excluding the papers based on the inclusion and exclusion criteria, 70 papers remained for a full-text manual review step.
References
Almaghairbe R, Roper M (2017) Separating passing and failing test executions by clustering anomalies. Softw Qual J 25(3):803–840
Alon U, Zilberstein M, Levy O, Yahav E (2019) code2vec: Learning distributed representations of code. Proc ACM Programm Lang 3(POPL):1–29
Aman H, Amasaki S, Yokogawa T, Kawahara M (2020) A comparative study of vectorization-based static test case prioritization methods. In: 2020 46th Euromicro conference on software engineering and advanced applications (SEAA). IEEE, pp 80–88
Beller M, Gousios G, Zaidman A (2017) Travistorrent: Synthesizing travis ci and github for full-stack research on continuous integration. In: 2017 IEEE/ACM 14th international conference on mining software repositories (MSR). IEEE, pp 447–450
Bertolino A, Guerriero A, Miranda B, Pietrantuono R, Russo S (2020) Learning-to-rank vs ranking-to-learn: Strategies for regression testing in continuous integration. In: In 42nd international conference on software engineering (ICSE)
Busjaeger B, Xie T (2016) Learning for test prioritization: an industrial case study. In: Proceedings of the 2016 24th ACM SIGSOFT International symposium on foundations of software engineering. pp 975–980
Carlson R, Do H, Denton A (2011) A clustering approach to improving test case prioritization: An industrial case study. In: ICSM, vol 11. pp 382–391
Chen S, Chen Z, Zhao Z, Baowen X u, Feng Y (2011) Using semi-supervised clustering to improve regression test selection techniques. In: 2011 Fourth IEEE international conference on software testing, verification and validation. IEEE, pp 1–10
Chen J, Lou Y, Zhang L, Zhou J, Wang X, Hao D, Zhang L u (2018) Optimizing test prioritization via test distribution analysis. In: Proceedings of the 2018 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. pp 656–667
Cliff N (1993) Dominance statistics: Ordinal analyses to answer ordinal questions. Psychol Bull 114(3):494
Clover A (2021) Atlassian Clover. https://bitbucket.org/atlassian/clover. Retrieved March 14, 2021
Dang V, Zarozinski M (2020) Ranklib. https://sourceforge.net/p/lemur/wiki/RankLib/
Do H, Rothermel G (2006) On the use of mutation faults in empirical assessments of test case prioritization techniques. IEEE Trans Softw Eng 32(9):733–752
Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Durelli VHS, Durelli RS, Borges SS, Endo AT, Eler MM, Dias DRC, Guimaraes MP (2019) Machine learning applied to software testing: A systematic mapping study. IEEE Trans Reliab 68(3):1189–1212
EclEmma team (2021) JaCoCo: Java code coverage library. https://github.com/jacoco/jacoco. Retrieved March 14, 2021
Elbaum S, Malishevsky A, Rothermel G (2001) Incorporating varying test costs and fault severities into test case prioritization. In: Proceedings of the 23rd international conference on software engineering, ICSE 2001. IEEE, pp 329–338
Elbaum S, Malishevsky A, Rothermel G (2002) Test case prioritization: A family of empirical studies. IEEE Trans Softw Eng 28(2):159–182
Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D, et al. (2020) CodeBERT: A pre-trained model for programming and natural languages. arXiv:2002.08155
Fowler M, Foemmel M (2006) Continuous integration. http://www.dccia.ua.es/dccia/inf/asignaturas/MADS/2013-14/lecturas/10_Fowler_Continuous_Integration.pdf
Freund Y, Iyer R, Schapire RE, Singer Y (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4(Nov):933–969
Hasnain M, Pasha MF, Lim CH, Ghan I (2019) Recurrent neural network for web services performance forecasting, ranking and regression testing. In: 2019 Asia-pacific signal and information processing association annual summit and conference (APSIPA ASC). IEEE, pp 96–105
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Jahan H, Feng Z, Mahmud SM, Dong P (2019) Version specific test case prioritization approach based on artificial neural network. J Intell Fuzzy Syst 36(6):6181–6194
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 112. Springer, Berlin
Joachims T (2002) Optimizing search engines using clickthrough data. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. pp 133–142
Just R, Jalali D, Ernst MD (2014) Defects4j: A database of existing faults to enable controlled testing studies for java programs. In: Proceedings of the 2014 International Symposium on Software Testing and Analysis. pp 437–440
Kandil P, Moussa S, Badr N (2017) Cluster-based test cases prioritization and selection technique for agile regression testing. J Softw Evol Process 29(6):e1794
Kazmi R, Jawawi DNA, Mohamad R, Ghani I (2017) Effective regression test case selection: A systematic literature review. ACM Comput Surv (CSUR) 50(2):1–32
Khalid Z, Qamar U (2019) Weight and cluster based test case prioritization technique. In: 2019 IEEE 10th annual information technology, electronics and mobile communication conference (IEMCON). IEEE, pp 1013–1022
Khatibsyarbini M, Isa MA, Jawawi DNA, Tumeng R (2018) Test case prioritization approaches in regression testing: A systematic literature review. Inf Softw Technol 93:74–93
Kim J-M, Porter A (2002) A history-based test prioritization technique for regression testing in resource constrained environments. In: Proceedings of the 24th international conference on software engineering. pp 119–129
Kitchenham B (2004) Procedures for performing systematic reviews. Keele, UK, Keele University 33(2004):1–26
Kitchenham B, Brereton OP, Budgen D, Turner M, Bailey J, Linkman S (2009) Systematic literature reviews in software engineering–a systematic literature review. Inf Softw Technol 51(1):7–15
Lachmann R, Schulze S, Nieke M, Seidl C, Schaefer I (2016) System-level test case prioritization using machine learning. In: 2016 15th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 361–368
Li H (2011) Learning to rank for information retrieval and natural language processing. Synth Lect Hum Lang Technol 4(1):1–113
Liem C, Panichella A (2020) Run, forest, run? on randomization and reproducibility in predictive software engineering. arXiv:2012.08387,2020,
Lima JAP, Vergilio SR (2020) Test case prioritization in continuous integration environments A systematic mapping study. Inf Softw Technol 121:106268
Lima JAP, Vergilio SR (2020) Multi-armed bandit test case prioritization in continuous integration environments: A trade-off analysis. In: Proceedings of the 5th Brazilian symposium on systematic and automated software testing. pp 21–30
Lima JAP, Mendonċa WDF, Vergilio SR, Assunċão WKG (2020) Learning-based prioritization of test cases in continuous integration of highly-configurable software. In: Proceedings of the 24th ACM conference on systems and software product line: Volume A-Volume A. pp 1–11
Mahdieh M, Mirian-Hosseinabadi Seyed-Hassan, Etemadi K, Nosrati A, Jalali S (2020) Incorporating fault-proneness estimations into coverage-based test case prioritization methods. Inf Softw Technol 121:106269
Manning C, Schutze H (1999) Foundations of statistical natural language processing. MIT press, Cambridge
Mattis T, Rein P, Dürsch F, Hirschfeld R (2020) Rtptorrent: An open-source dataset for evaluating regression test prioritization. In: Proceedings of the 17th international conference on mining software repositories. pp 385–396
McGuire N, Kernel L (2006) GCOV - tool analysis. https://linuxdevices.org/ldfiles/article062/der_herr_gcov.pdf. Retrieved March 14, 2021
Medhat N, Moussa S, Badr N, Tolba MF (2020) A framework for continuous regression and integration testing in iot systems based on deep learning and search-based techniques. IEEE Access 8:215716–215726
Mirarab S, Tahvildari L (2008) An empirical study on bayesian network-based approach for test case prioritization. In: 2008 1st international conference on software testing, verification, and validation. IEEE, pp 278–287
Noor TB, Hemmati H (2017) Studying test case failure prediction for test case prioritization. In: Proceedings of the 13th international conference on predictive models and data analytics in software engineering. pp 2–11
Nuñez-Varela AS, Pérez-Gonzalez HG, Martínez-Perez FE, Soubervielle-Montalvo C (2017) Source code metrics: A systematic mapping study. J Syst Softw 128:164–197
Palma F, Abdou T, Bener A, Maidens J, Liu S (2018) An improvement to test case failure prediction in the context of test case prioritization. In: Proceedings of the 14th international conference on predictive models and data analytics in software engineering. pp 80–89
do Prado Lima JA, Vergilio SR (2020) A multi-armed bandit approach for test case prioritization in continuous integration environments. IEEE Trans Softw Eng
Qi L, Moran K, Poshyvanyk D, Penta MD (2018) Assessing test case prioritization on real faults and mutants. In: 2018 IEEE international conference on software maintenance and evolution (ICSME). IEEE, pp 240–251
Replication package (2021) https://github.com/uOttawa-Nanda-Lab/ML-based-TSP-SLR
Robbins H (1952) Some aspects of the sequential design of experiments. Bull Am Math Soc 58(5):527–535
Rosenbauer L, Stein A, Maier R, Pätzel D, Hähner J (2020) Xcs as a reinforcement learning approach to automatic test case prioritization. In: Proceedings of the 2020 genetic and evolutionary computation conference companion. pp 1798–1806
Rothermel G, Untch RH, Chu C, Harrold MJ (2001) Prioritizing test cases for regression testing. IEEE Trans Softw Eng 27(10):929–948
Shaheamlung G, Rote K, et al. (2020) A comprehensive review for test case prioritization in software engineering. In: 2020 international conference on intelligent engineering and management (ICIEM). IEEE, pp 331–336
Sharma MM, Agrawal A (2019) Test case design and test case prioritization using machine learning. Int J Eng Adv Technol 9(1):2742–2748
Shi T, Xiao L, Wu K (2020) Reinforcement learning based test case prioritization for enhancing the security of software. In: 2020 IEEE 7th international conference on data science and advanced analytics (DSAA). IEEE, pp 663–672
Singh A, Bhatia RK, Singhrova A (2019) Machine learning based test case prioritization in object oriented testing. Int J Recent Technol Eng 8 (3):700–707
Spieker H, Gotlieb A, Marijan D, Mossige M (2017) Reinforcement learning for automatic test case prioritization and selection in continuous integration. In: Proceedings of the 26th ACM SIGSOFT international symposium on software testing and analysis. pp 12–22
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B (Methodol) 36(2):111–133
Sutton RS, Barto AG (2018) Reinforcement learning: An introduction. MIT press, Cambridge
Thomas SW, Hemmati H, Hassan AE, Blostein D (2014) Static test case prioritization using topic models. Empir Softw Eng 19(1):182–212
Tonella P, Avesani P, Susi A (2006) Using the case-based ranking methodology for test case prioritization. In: 2006 22nd IEEE international conference on software maintenance. pp 123–133
Wang Y, Chen Z, Feng Y, Luo B, Yang Y (2012) Using weighted attributes to improve cluster test selection. In: 2012 IEEE sixth international conference on software security and reliability. IEEE, pp 138–146
Wilks DS (2011) Statistical methods in the atmospheric sciences, vol 100. Academic press, Cambridge
Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175
Xiao Q u, Cohen MB, Woolf KM (2007) Combinatorial interaction regression testing: A study of test case generation and prioritization. In: 2007 IEEE international conference on software maintenance. IEEE, pp 255–264
Yan S, Chen Z, Zhao Z, Zhang C, Zhou Y (2010) A dynamic test cluster sampling strategy by leveraging execution spectra information. In: 2010 third international conference on software testing, verification and validation. IEEE, pp 147–154
Yoo S, Harman M, Tonella P, Susi A (2009) Clustering test cases to achieve effective and scalable prioritisation incorporating expert knowledge. In: Proceedings of the eighteenth international symposium on software testing and analysis. pp 201–212
Yue Y, Finley T, Radlinski F, Joachims T (2007) A support vector method for optimizing average precision. In: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. pp 271–278
Zhang C, Zhang Y, Shi X, Almpanidis G, Fan G, Shen X (2019) On incremental learning for gradient boosting decision trees. Neural Process Lett 50(1):957–987
Acknowledgements
This work was supported by a research grant from Huawei Technologies Canada Co., Ltd, as well as by the Mitacs Accelerate Program, the Canada Research Chair and Discovery Grant programs of the Natural Sciences and Engineering Research Council of Canada (NSERC).
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by: Paolo Tonella
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Pan, R., Bagherzadeh, M., Ghaleb, T.A. et al. Test case selection and prioritization using machine learning: a systematic literature review. Empir Software Eng 27, 29 (2022). https://doi.org/10.1007/s10664-021-10066-6
Accepted:
Published:
DOI: https://doi.org/10.1007/s10664-021-10066-6