Static test case prioritization using topic models

Thomas, Stephen W.; Hemmati, Hadi; Hassan, Ahmed E.; Blostein, Dorothea

doi:10.1007/s10664-012-9219-7

Published: 28 July 2012

Empirical Software Engineering, Volume 19, pages 182–212 (2014)

Abstract

Software development teams use test suites to test changes to their source code. In many situations, the test suites are so large that executing every test for every source code change is infeasible, due to time and resource constraints. Development teams need to prioritize their test suite so that as many distinct faults as possible are detected early in the execution of the test suite. We consider the problem of static black-box test case prioritization (TCP), where test suites are prioritized without the availability of the source code of the system under test (SUT). We propose a new static black-box TCP technique that represents test cases using a previously unused data source in the test suite: the linguistic data of the test cases, i.e., their identifier names, comments, and string literals. Our technique applies a text analysis algorithm called topic modeling to the linguistic data to approximate the functionality of each test case, allowing our technique to give high priority to test cases that test different functionalities of the SUT. We compare our proposed technique with existing static black-box TCP techniques in a case study of multiple real-world open source systems: several versions of Apache Ant and Apache Derby. We find that our static black-box TCP technique outperforms existing static black-box TCP techniques, and has comparable or better performance than two existing execution-based TCP techniques. Static black-box TCP methods are widely applicable because the only input they require is the source code of the test cases themselves. This contrasts with other TCP techniques which require access to the SUT runtime behavior, to the SUT specification models, or to the SUT source code.
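The prioritization idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each test case has already been reduced to a topic-membership vector by some topic model, and it assumes Manhattan distance and a greedy farthest-first ordering as the way to favor test cases that cover different functionality. The function names and the example test names and vectors are hypothetical.

```python
from typing import Dict, List


def manhattan(p: List[float], q: List[float]) -> float:
    """Manhattan (L1) distance between two topic-membership vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def prioritize(topics: Dict[str, List[float]]) -> List[str]:
    """Order test cases so that dissimilar ones run early.

    `topics` maps a test name to its topic-membership vector
    (assumed to come from a topic model run on the test's
    identifier names, comments, and string literals).
    """
    names = sorted(topics)
    if not names:
        return []
    # Seed with the test that is, in total, farthest from all the others.
    first = max(
        names,
        key=lambda n: sum(manhattan(topics[n], topics[m]) for m in names),
    )
    ordered = [first]
    remaining = [n for n in names if n != first]
    while remaining:
        # Next comes the test whose nearest already-chosen test is farthest away.
        nxt = max(
            remaining,
            key=lambda n: min(manhattan(topics[n], topics[c]) for c in ordered),
        )
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered


# Hypothetical topic vectors for four test cases over three topics.
example = {
    "testLogin": [0.9, 0.1, 0.0],
    "testLogout": [0.7, 0.3, 0.0],
    "testQuery": [0.0, 0.2, 0.8],
    "testParse": [0.1, 0.8, 0.1],
}
print(prioritize(example))
```

In this example testLogin and testLogout have similar topic vectors, so the ordering places one of them early and pushes the other to the end, after the test cases that exercise other topics.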

Author information

Authors and Affiliations

School of Computing, Queen’s University, 156 Barrie Street, Kingston, ON, Canada
Stephen W. Thomas, Hadi Hemmati, Ahmed E. Hassan & Dorothea Blostein

Corresponding author

Correspondence to Stephen W. Thomas.

Additional information

Editor: Gregg Rothermel

About this article

Cite this article

Thomas, S.W., Hemmati, H., Hassan, A.E. et al. Static test case prioritization using topic models. Empir Software Eng 19, 182–212 (2014). https://doi.org/10.1007/s10664-012-9219-7

Issue Date: February 2014
DOI: https://doi.org/10.1007/s10664-012-9219-7
