Abstract
The current generation of software analytics tools consists mostly of prediction algorithms (e.g., support vector machines, naive Bayes, logistic regression). While prediction is useful, after prediction comes planning about what actions to take in order to improve quality. This research seeks methods that generate demonstrably useful guidance on “what to do” within the context of a specific software project. Specifically, we propose XTREE (for within-project planning) and BELLTREE (for cross-project planning) to generate plans that can improve software quality. Each such plan has the property that, if followed, it reduces the expected number of future defect reports. To find this expected number, planning was first applied to data from release x. Next, we looked for changes in release x + 1 that conformed to our plans. This procedure was applied using a range of planners from the literature, as well as XTREE. In 10 open-source Java systems, the number of defects was reduced by several hundred in sections of the code that conformed to XTREE’s plans. Further, when compared to other planners, XTREE’s plans were found to be easier to implement (since they were shorter) and more effective at reducing the expected number of defects.
Notes
According to the Oxford English Dictionary, the bellwether is the leading sheep of a flock, with a bell on its neck.
Recall from Section 1.2 that these versions were given less formal names: older, newer, and latest.
Acknowledgments
The work is partially funded by NSF awards #1506586 and #1302169.
Additional information
Communicated by: Sarah Nadi
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Krishna, R., Menzies, T. Learning actionable analytics from multiple software projects. Empir Software Eng 25, 3468–3500 (2020). https://doi.org/10.1007/s10664-020-09843-6
DOI: https://doi.org/10.1007/s10664-020-09843-6