Search-Based Predictive Modelling for Software Engineering: How Far Have We Gone?

Sarro, Federica

doi:10.1007/978-3-030-27455-9_1

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11664)

Included in the following conference series:

International Symposium on Search Based Software Engineering

Abstract

In this keynote I introduce the use of Predictive Analytics for Software Engineering (SE) and then focus on the use of search-based heuristics to tackle long-standing SE prediction problems including (but not limited to) software development effort estimation and software defect prediction. I review recent research in Search-Based Predictive Modelling for SE in order to assess the maturity of the field and point out promising research directions. I conclude my keynote by discussing best practices for a rigorous and realistic empirical evaluation of search-based predictive models, a condicio sine qua non to facilitate the adoption of prediction models in software industry practices.

This paper provides an outline of the keynote talk given by Dr. Federica Sarro at SSBSE 2019, with pointers to the literature for details of the results covered.

References

Arcuri, A., Briand, L.C.: A hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. STVR 24(3), 219–250 (2014)
Canfora, G., De Lucia, A., Di Penta, M., Oliveto, R., Panichella, A., Panichella, S.: Multi-objective cross-project defect prediction. In: Proceedings of the IEEE 6th International Conference on Software Testing, Verification and Validation, ICST 2013, pp. 252–261 (2013). https://doi.org/10.1109/ICST.2013.38
Corazza, A., Di Martino, S., Ferrucci, F., Gravino, C., Sarro, F., Mendes, E.: How effective is Tabu search to configure support vector regression for effort estimation? In: Proceedings of the International Conference on Predictive Models in Software Engineering, PROMISE 2010, pp. 4:1–4:10 (2010). https://doi.org/10.1145/1868328.1868335
Corazza, A., Di Martino, S., Ferrucci, F., Gravino, C., Sarro, F., Mendes, E.: Using tabu search to configure support vector regression for effort estimation. Empir. Softw. Eng. 18(3), 506–546 (2013). https://doi.org/10.1007/s10664-011-9187-3
Di Martino, S., Ferrucci, F., Gravino, C., Sarro, F.: A genetic algorithm to configure support vector machines for predicting fault-prone components. In: Caivano, D., Oivo, M., Baldassarre, M.T., Visaggio, G. (eds.) PROFES 2011. LNCS, vol. 6759, pp. 247–261. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21843-9_20
Ferrucci, F., Gravino, C., Oliveto, R., Sarro, F.: Using Tabu search to estimate software development effort. In: Abran, A., Braungarten, R., Dumke, R.R., Cuadrado-Gallego, J.J., Brunekreef, J. (eds.) IWSM 2009. LNCS, vol. 5891, pp. 307–320. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-05415-0_22
Ferrucci, F., Gravino, C., Oliveto, R., Sarro, F.: Genetic programming for effort estimation: an analysis of the impact of different fitness functions. In: Proceedings of the 2nd International Symposium on Search Based Software Engineering, SSBSE 2010, pp. 89–98 (2010). https://doi.org/10.1109/SSBSE.2010.20
Ferrucci, F., Harman, M., Sarro, F.: Search-based software project management. In: Ruhe, G., Wohlin, C. (eds.) Software Project Management in a Changing World, pp. 373–399. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-55035-5_15
Ferrucci, F., Salza, P., Sarro, F.: Using Hadoop MapReduce for parallel genetic algorithms: a comparison of the global, grid and island models. Evol. Comput. 26, 1–33 (2017). https://doi.org/10.1162/evco_a_00213
Ferrucci, F., Gravino, C., Oliveto, R., Sarro, F., Mendes, E.: Investigating Tabu search for web effort estimation. In: Proceedings of EUROMICRO Conference on Software Engineering and Advanced Applications, SEAA 2010, pp. 350–357 (2010)
Ferrucci, F., Mendes, E., Sarro, F.: Web effort estimation: the value of cross-company data set compared to single-company data set. In: Proceedings of the 8th International Conference on Predictive Models in Software Engineering, pp. 29–38. ACM (2012)
Hall, T., Beecham, S., Bowes, D., Gray, D., Counsell, S.: A systematic literature review on fault prediction performance in software engineering. IEEE Trans. Softw. Eng. 38(6), 1276–1304 (2012). https://doi.org/10.1109/TSE.2011.103
Harman, M., Islam, S., Jia, Y., Minku, L.L., Sarro, F., Srivisut, K.: Less is more: temporal fault predictive performance over multiple Hadoop releases. In: Le Goues, C., Yoo, S. (eds.) SSBSE 2014. LNCS, vol. 8636, pp. 240–246. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09940-8_19
Harman, M.: The relationship between search based software engineering and predictive modeling. In: Proceedings of the 6th International Conference on Predictive Models in Software Engineering, PROMISE 2010, pp. 1:1–1:13 (2010). https://doi.org/10.1145/1868328.1868330
Jimenez, M., Rwemalika, R., Papadakis, M., Sarro, F., Le Traon, Y., Harman, M.: The importance of accounting for real-world labelling when predicting software vulnerabilities. In: Proceedings of the 27th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, ESEC/FSE 2019 (2019)
Langdon, W.B., Dolado, J.J., Sarro, F., Harman, M.: Exact mean absolute error of baseline predictor, MARP0. Inf. Softw. Technol. 73, 16–18 (2016). https://doi.org/10.1016/j.infsof.2016.01.003
Lanza, M., Mocci, A., Ponzanelli, L.: The tragedy of defect prediction, prince of empirical software engineering research. IEEE Softw. 33(6), 102–105 (2016). https://doi.org/10.1109/MS.2016.156
Menzies, T., Zimmermann, T.: Software analytics: so what? IEEE Softw. 30(4), 31–37 (2013). https://doi.org/10.1109/MS.2013.86
Najafi, A., Rigby, P., Shang, W.: Bisecting commits and modeling commit risk during testing. In: Proceedings of the 27th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, ESEC/FSE 2019 (2019)
Braga, P.L., Oliveira, A.L.I., Meira, S.R.L.: A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation. In: Proceedings of the ACM Symposium on Applied Computing, SAC 2008, pp. 1788–1792 (2008)
Malhotra, R., Khanna, M., Raje, R.R.: On the application of search-based techniques for software engineering predictive modeling: a systematic review and future directions. Swarm Evol. Comput. 32, 85–109 (2017)
Russo, B.: A proposed method to evaluate and compare fault predictions across studies. In: Proceedings of the 10th International Conference on Predictive Models in Software Engineering, PROMISE 2014, pp. 2–11. ACM (2014). https://doi.org/10.1145/2639490.2639504
Salza, P., Ferrucci, F., Sarro, F.: Elephant56: design and implementation of a parallel genetic algorithms framework on Hadoop MapReduce. In: Proceedings of the 2016 on Genetic and Evolutionary Computation Conference, GECCO 2016, pp. 1315–1322 (2016). https://doi.org/10.1145/2908961.2931722
Sarro, F., Di Martino, S., Ferrucci, F., Gravino, C.: A further analysis on the use of genetic algorithm to configure support vector machines for inter-release fault prediction. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC 2012, pp. 1215–1220 (2012). https://doi.org/10.1145/2245276.2231967
Sarro, F., Petrozziello, A., Harman, M.: Multi-objective software effort estimation. In: Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, pp. 619–630 (2016). https://doi.org/10.1145/2884781.2884830
Sarro, F.: Search-based approaches for software development effort estimation. In: Proceedings of the 12th International Conference on Product Focused Software Development and Process Improvement, PROFES 2011, pp. 38–43 (2011). https://doi.org/10.1145/2181101.2181111
Sarro, F.: Predictive analytics for software testing: keynote paper. In: Proceedings of the 11th International Workshop on Search-Based Software Testing, SBST 2018, p. 1 (2018). https://doi.org/10.1145/3194718.3194730
Sarro, F., Ferrucci, F., Gravino, C.: Single and multi objective genetic programming for software development effort estimation. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC 2012, pp. 1221–1226 (2012). https://doi.org/10.1145/2245276.2231968
Sarro, F., Harman, M., Jia, Y., Zhang, Y.: Customer rating reactions can be predicted purely using app features. In: Proceedings of 26th IEEE International Requirements Engineering Conference, RE 2018, pp. 76–87 (2018). https://doi.org/10.1109/RE.2018.00018
Sarro, F., Petrozziello, A.: Linear programming as a baseline for software effort estimation. ACM Trans. Softw. Eng. Methodol. 27(3), 12:1–12:28 (2018). https://doi.org/10.1145/3234940
Shepperd, M.J., MacDonell, S.G.: Evaluating prediction systems in software project estimation. Inf. Softw. Technol. 54(8), 820–827 (2012). https://doi.org/10.1016/j.infsof.2011.12.008
Sigweni, B., Shepperd, M., Turchi, T.: Realistic assessment of software effort estimation models. In: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016, pp. 41:1–41:6. ACM (2016). https://doi.org/10.1145/2915970.2916005
Xia, X., Shihab, E., Kamei, Y., Lo, D., Wang, X.: Predicting crashing releases of mobile applications. In: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2016, pp. 29:1–29:10 (2016). https://doi.org/10.1145/2961111.2962606

Author information

Authors and Affiliations

Department of Computer Science, University College London, London, UK
Federica Sarro

Corresponding author

Correspondence to Federica Sarro.

Editor information

Editors and Affiliations

SnT/University of Luxembourg, Luxembourg, Luxembourg
Shiva Nejati
University of South Carolina, Columbia, SC, USA
Gregory Gay

About this paper

Cite this paper

Sarro, F. (2019). Search-Based Predictive Modelling for Software Engineering: How Far Have We Gone? In: Nejati, S., Gay, G. (eds) Search-Based Software Engineering. SSBSE 2019. Lecture Notes in Computer Science, vol 11664. Springer, Cham. https://doi.org/10.1007/978-3-030-27455-9_1

DOI: https://doi.org/10.1007/978-3-030-27455-9_1
Published: 03 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27454-2
Online ISBN: 978-3-030-27455-9
eBook Packages: Computer Science, Computer Science (R0)
