Abstract
Severity, i.e. the impact, extent and effect of a bug on software, is a decisive attribute that determines how quickly the bug should be fixed. Predicting the severity of software bugs is important for improving the bug triaging and resolution process. Many techniques and methods have been used in past research to reduce the effort and time required for manual severity assessment of newly reported bugs. To help software developers utilize their resources efficiently, this study evaluates a number of machine learning techniques for predicting the severity of software bugs at the system and component levels. The techniques are evaluated on thirteen Apache projects whose bug reports were automatically extracted using the Bug Report Collection System tool. Severity is predicted from the most frequent terms extracted from bug summaries using text mining. Performance metrics such as precision, recall and accuracy are used to interpret the results obtained from the various techniques. The results of the study indicate that Boosting (an ensemble learner) outperforms the other machine learning techniques applied in previous research, such as Bayesian learners, decision trees and support vector machines.
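The two mechanical steps named in the abstract, selecting the most frequent terms from bug summaries and scoring predictions with precision, recall and accuracy, can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list, the function names (`tokenize`, `top_terms`, `to_features`, `evaluate`), the binary term-presence encoding and the sample summaries are all assumptions made for illustration, and no classifier such as Boosting is trained here.

```python
import re
from collections import Counter

# Assumed stop-word list; the study's actual preprocessing is not described in the abstract.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "to", "is", "when", "and", "for", "with"}


def tokenize(summary):
    """Lowercase a bug summary and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", summary.lower()) if t not in STOP_WORDS]


def top_terms(summaries, k):
    """Return the k most frequent terms across all summaries."""
    counts = Counter()
    for summary in summaries:
        counts.update(tokenize(summary))
    return [term for term, _ in counts.most_common(k)]


def to_features(summary, vocabulary):
    """Encode a summary as a binary term-presence vector over the vocabulary (an assumed encoding)."""
    tokens = set(tokenize(summary))
    return [1 if term in tokens else 0 for term in vocabulary]


def evaluate(actual, predicted, positive):
    """Compute precision and recall for one severity label, plus overall accuracy."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    # Made-up summaries, used only to exercise the functions.
    summaries = ["Crash on startup", "Crash when saving file", "Typo in startup dialog"]
    vocabulary = top_terms(summaries, 2)
    print(vocabulary)
    print(to_features("Crash in dialog", vocabulary))
    print(evaluate(["severe", "severe", "minor", "minor"],
                   ["severe", "minor", "minor", "severe"], "severe"))
```

In a full experiment the feature vectors would be passed to a learner, and the metrics would be computed per severity level on held-out bug reports.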
Cite this article
Kaur, A., Jindal, S.G. Text analytics based severity prediction of software bugs for apache projects. Int J Syst Assur Eng Manag 10, 765–782 (2019). https://doi.org/10.1007/s13198-019-00807-8
DOI: https://doi.org/10.1007/s13198-019-00807-8