Abstract
We investigate search-based fault prediction over time, using 8 consecutive Hadoop versions to analyse the impact of chronology on fault prediction performance. Our results confound the assumption, implicit in previous work, that additional information from historical versions improves prediction: although G-mean tends to improve, Recall can be reduced.
Author order is alphabetical.
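The trade-off between G-mean and Recall mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used definitions (Recall = TP / (TP + FN), and G-mean as the geometric mean of Recall and specificity); the abstract does not state the exact formulas used in the paper. The confusion-matrix counts below are invented for illustration only and are not results from the study.

```python
import math

def recall(tp, fn):
    """Fraction of truly faulty modules that are predicted faulty."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """Fraction of truly non-faulty modules that are predicted non-faulty."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def g_mean(tp, fn, tn, fp):
    """Geometric mean of recall and specificity (assumed definition)."""
    return math.sqrt(recall(tp, fn) * specificity(tn, fp))

# Hypothetical counts for two predictors evaluated on the same release.
# model_a finds more faults but raises many false alarms;
# model_b misses more faults but raises far fewer false alarms.
model_a = {"tp": 18, "fn": 2, "tn": 40, "fp": 40}
model_b = {"tp": 14, "fn": 6, "tn": 72, "fp": 8}

for name, m in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          "recall=%.3f" % recall(m["tp"], m["fn"]),
          "g_mean=%.3f" % g_mean(**m))
```

With these made-up counts, model_b has the higher G-mean but the lower Recall, which is the kind of divergence between the two measures that the abstract describes.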
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Harman, M., Islam, S., Jia, Y., Minku, L.L., Sarro, F., Srivisut, K. (2014). Less is More: Temporal Fault Predictive Performance over Multiple Hadoop Releases. In: Le Goues, C., Yoo, S. (eds) Search-Based Software Engineering. SSBSE 2014. Lecture Notes in Computer Science, vol 8636. Springer, Cham. https://doi.org/10.1007/978-3-319-09940-8_19
DOI: https://doi.org/10.1007/978-3-319-09940-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09939-2
Online ISBN: 978-3-319-09940-8
eBook Packages: Computer Science, Computer Science (R0)