Comparative study of machine learning test case prioritization for continuous integration testing

  • Research
Software Quality Journal

Abstract

A growing body of research indicates that machine learning can help tackle complex software testing challenges. One such challenge is continuous integration testing, which is highly time-constrained and generates a large amount of data from iterative code commits and test runs. In such a setting, plentiful test data can be used to train machine learning predictors that identify the test cases most likely to speed up the detection of regression bugs introduced during code integration. However, the fault prediction performance of different machine learning models can vary with the context and parameters of continuous integration testing, for example, the variable time budget available for continuous integration cycles or the size of the test execution history used for learning to prioritize failing test cases. Existing studies on test case prioritization rarely examine both of these factors, which are essential for continuous integration practice. In this study, we perform a comprehensive comparison of the fault prediction performance of the machine learning approaches that have shown the best performance on test case prioritization tasks in the literature. We evaluate the accuracy of the classifiers in predicting fault-detecting tests for different values of the continuous integration time budget and for different lengths of the test history used to train the classifiers. For the evaluation, we use real-world and augmented industrial datasets from a continuous integration practice. The results show that the performance of the machine learning models varies with the size of the test history used for model training and with the time budget available for test case execution. Our results imply that machine learning approaches for test prioritization in continuous integration testing should be carefully configured to achieve optimal performance.
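
To make the two factors named in the abstract concrete, the following is a minimal sketch of how a history window and a time budget interact when prioritizing tests. It is not the study's method: a simple failure-frequency ranking stands in for the trained classifiers, and the data format, function names, and example values are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

# One CI cycle: test name -> (failed?, duration in seconds).
# This record format is an assumption made for the sketch.
Cycle = Dict[str, Tuple[bool, float]]


def failure_rates(history: List[Cycle], window: int) -> Dict[str, float]:
    """Fraction of the last `window` cycles in which each test failed."""
    recent = history[-window:] if window > 0 else []
    runs: Dict[str, int] = {}
    fails: Dict[str, int] = {}
    for cycle in recent:
        for test, (failed, _duration) in cycle.items():
            runs[test] = runs.get(test, 0) + 1
            fails[test] = fails.get(test, 0) + int(failed)
    return {test: fails[test] / runs[test] for test in runs}


def select_within_budget(
    scores: Dict[str, float], durations: Dict[str, float], budget: float
) -> List[str]:
    """Rank tests by score (highest first) and keep those that fit the budget."""
    ranked = sorted(durations, key=lambda t: (-scores.get(t, 0.0), t))
    selected: List[str] = []
    used = 0.0
    for test in ranked:
        if used + durations[test] <= budget:
            selected.append(test)
            used += durations[test]
    return selected


def detected_fraction(selected: List[str], next_cycle: Cycle) -> float:
    """Share of the next cycle's failing tests that were selected."""
    failing = {t for t, (failed, _) in next_cycle.items() if failed}
    if not failing:
        return 1.0
    return len(failing & set(selected)) / len(failing)


# Hypothetical toy history: three past cycles and the cycle to be prioritized.
history: List[Cycle] = [
    {"t_a": (True, 4.0), "t_b": (False, 2.0), "t_c": (False, 3.0)},
    {"t_a": (False, 4.0), "t_b": (False, 2.0), "t_c": (True, 3.0)},
    {"t_a": (False, 4.0), "t_b": (False, 2.0), "t_c": (True, 3.0)},
]
next_cycle: Cycle = {"t_a": (True, 4.0), "t_b": (False, 2.0), "t_c": (True, 3.0)}
durations = {test: duration for test, (_, duration) in next_cycle.items()}

# Vary both factors: history window length and time budget.
for window in (2, 3):
    for budget in (5.0, 7.0):
        scores = failure_rates(history, window)
        chosen = select_within_budget(scores, durations, budget)
        print(window, budget, chosen, detected_fraction(chosen, next_cycle))
```

In a comparison of the kind the abstract describes, the ranking step would be replaced by each machine learning model under evaluation, while the loop over window lengths and budgets would stay the same.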

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon request.

Notes

  1. https://bitbucket.org/HelgeS/atcs-data/src/master/

  2. https://code.google.com/archive/p/google-shared-dataset-of-test-suite-results/

  3. https://code.google.com/p/googletest/

Acknowledgements

This work is supported by the Research Council of Norway, grant no. 287329.

Author information

Contributions

Dusica Marijan made the conceptual design, performed the experiments, and wrote the manuscript.

Corresponding author

Correspondence to Dusica Marijan.

Ethics declarations

Conflict of interest

The author declares no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Marijan, D. Comparative study of machine learning test case prioritization for continuous integration testing. Software Qual J 31, 1415–1438 (2023). https://doi.org/10.1007/s11219-023-09646-0

  • DOI: https://doi.org/10.1007/s11219-023-09646-0
