Test case selection and prioritization using machine learning: a systematic literature review

Pan, Rongqi; Bagherzadeh, Mojtaba; Ghaleb, Taher A.; Briand, Lionel

doi:10.1007/s10664-021-10066-6

Published: 14 December 2021

Volume 27, article number 29 (2022)

Empirical Software Engineering

Rongqi Pan¹,
Mojtaba Bagherzadeh¹,
Taher A. Ghaleb¹ (ORCID: orcid.org/0000-0001-9336-7298) &
Lionel Briand¹,²


Abstract

Regression testing is an essential activity to ensure that software code changes do not adversely affect existing functionality. With the wide adoption of Continuous Integration (CI) in software projects, which increases the frequency of software builds, running all tests can be time-consuming and resource-intensive. To alleviate that problem, Test case Selection and Prioritization (TSP) techniques have been proposed to improve regression testing by selecting and prioritizing test cases in order to provide early feedback to developers. In recent years, researchers have relied on Machine Learning (ML) techniques to achieve effective TSP (ML-based TSP). Such techniques help combine information about test cases, from partial and imperfect sources, into accurate prediction models. This work conducts a systematic literature review focused on ML-based TSP techniques, aiming to perform an in-depth analysis of the state of the art and thus gain insights regarding future avenues of research. To that end, we analyze 29 primary studies published from 2006 to 2020, which have been identified through a systematic and documented process. This paper addresses five research questions covering variations in ML-based TSP techniques and feature sets for training and testing ML models, alternative metrics used for evaluating the techniques, the performance of techniques, and the reproducibility of the published studies. We condense the results related to our research questions into a high-level summary that can be used as a taxonomy for classifying future TSP studies.
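As a rough illustration of what selection and prioritization mean in practice, the following minimal sketch orders test cases by a predicted failure probability and truncates the ordering to a budget. It is not taken from any of the reviewed studies: the scores stand in for the output of some hypothetical ML model, the function names are invented for this example, and APFD (Average Percentage of Faults Detected) is used only as one commonly used prioritization metric, since the abstract does not name a specific one.

```python
from typing import Dict, List, Set


def prioritize(scores: Dict[str, float]) -> List[str]:
    """Order test cases by descending predicted failure probability."""
    # Ties are broken by test name so the ordering is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


def select(ordering: List[str], budget: int) -> List[str]:
    """Keep only the first `budget` tests of a prioritized ordering."""
    return ordering[:budget]


def apfd(ordering: List[str], faults: Dict[str, Set[str]]) -> float:
    """Average Percentage of Faults Detected for one ordering.

    `faults` maps each fault to the set of tests that reveal it.
    Assumes every fault is revealed by at least one test in `ordering`.
    """
    n, m = len(ordering), len(faults)
    position = {t: i + 1 for i, t in enumerate(ordering)}
    # 1-based position of the first test that reveals each fault.
    first_hits = [
        min(position[t] for t in tests if t in position)
        for tests in faults.values()
    ]
    return 1 - sum(first_hits) / (n * m) + 1 / (2 * n)


if __name__ == "__main__":
    # Hypothetical model output: predicted failure probability per test.
    scores = {"t1": 0.10, "t2": 0.80, "t3": 0.35, "t4": 0.05}
    # Hypothetical ground truth: which tests reveal which fault.
    faults = {"f1": {"t2"}, "f2": {"t3", "t4"}}

    order = prioritize(scores)
    print(order)                                    # ['t2', 't3', 't1', 't4']
    print(select(order, 2))                         # ['t2', 't3']
    print(apfd(order, faults))                      # 0.75
    print(apfd(["t1", "t2", "t3", "t4"], faults))   # 0.5
```

In this toy example the prioritized ordering reaches both faults within the first two tests, which is why its APFD (0.75) is higher than that of the original ordering (0.5).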

Notes

Covidence (https://www.covidence.org) was used to screen the obtained papers. 1,057 papers were imported for screening and, after removing duplicate articles, 731 papers were included after the title and abstract screening step. After excluding the papers based on the inclusion and exclusion criteria, 70 papers remained for a full-text manual review step.
https://www.travis-ci.com
https://sir.csc.ncsu.edu
https://bitbucket.org/HelgeS/atcs-data
https://sir.csc.ncsu.edu

Acknowledgements

This work was supported by a research grant from Huawei Technologies Canada Co., Ltd, as well as by the Mitacs Accelerate Program, the Canada Research Chair and Discovery Grant programs of the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science (EECS), University of Ottawa, Ottawa, Canada
Rongqi Pan, Mojtaba Bagherzadeh, Taher A. Ghaleb & Lionel Briand
SnT Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
Lionel Briand

Corresponding author

Correspondence to Taher A. Ghaleb.

Additional information

Communicated by: Paolo Tonella

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pan, R., Bagherzadeh, M., Ghaleb, T.A. et al. Test case selection and prioritization using machine learning: a systematic literature review. Empir Software Eng 27, 29 (2022). https://doi.org/10.1007/s10664-021-10066-6

Accepted: 05 October 2021
Published: 14 December 2021
DOI: https://doi.org/10.1007/s10664-021-10066-6
