Text analytics based severity prediction of software bugs for apache projects

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

Severity, that is, the impact, extent and effect of a bug on the software, is a decisive attribute that determines how quickly the bug should be fixed. Predicting the severity of software bugs is important for improving the bug triaging and resolution process. Many techniques have been proposed in past research to reduce the effort and time required to assess the severity of newly reported bugs manually. To help software developers use their resources efficiently, this study evaluates a number of machine learning techniques for predicting the severity of software bugs at the system and component levels. The techniques are evaluated on bug reports from thirteen Apache projects, extracted automatically using the Bug Report Collection System tool. Severity is predicted from the most frequent terms extracted from bug summaries using text mining. Performance metrics such as precision, recall and accuracy are used to interpret the results obtained from the various techniques. The results indicate that boosting, an ensemble learning technique, outperforms the other machine learning techniques applied in previous research, such as Bayesian learners, decision trees and support vector machines.
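
The feature-extraction and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the preprocessing, so the stop-word list, the binary term-presence encoding, the function names and the sample summaries are all assumptions made for illustration. The classifier itself (for example, a boosting learner trained on these vectors) is deliberately left out.

```python
from collections import Counter
import re

# Hypothetical stop-word list; the study's actual list is not given in the abstract.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "to", "is", "when", "for", "and", "with"}

def tokenize(summary):
    """Lowercase a bug summary and split it into alphabetic tokens, dropping stop words."""
    return [t for t in re.findall(r"[a-z]+", summary.lower()) if t not in STOP_WORDS]

def top_terms(summaries, k):
    """Return the k most frequent terms across all summaries (ties broken alphabetically)."""
    counts = Counter(t for s in summaries for t in tokenize(s))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def vectorize(summary, vocabulary):
    """Binary feature vector: 1 if the vocabulary term occurs in the summary, else 0."""
    present = set(tokenize(summary))
    return [1 if term in present else 0 for term in vocabulary]

def evaluate(y_true, y_pred, positive):
    """Accuracy over all labels, plus precision and recall for one severity class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision, "recall": recall}

# Made-up bug summaries, for illustration only.
summaries = [
    "Server crash when loading config",
    "Crash on null pointer in parser",
    "Typo in documentation page",
    "Parser crash with empty input",
]
vocabulary = top_terms(summaries, 3)
features = [vectorize(s, vocabulary) for s in summaries]

# Made-up true and predicted severity labels, standing in for a classifier's output.
y_true = ["critical", "critical", "minor", "critical"]
y_pred = ["critical", "minor", "minor", "critical"]
metrics = evaluate(y_true, y_pred, positive="critical")
```

In this sketch each bug summary becomes a fixed-length vector over the most frequent terms, which is the kind of input any of the compared learners could consume. Precision and recall are computed per severity class, so a multi-class evaluation would call `evaluate` once for each class.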

Author information

Corresponding author

Correspondence to Shubhra Goyal Jindal.

Cite this article

Kaur, A., Jindal, S.G. Text analytics based severity prediction of software bugs for apache projects. Int J Syst Assur Eng Manag 10, 765–782 (2019). https://doi.org/10.1007/s13198-019-00807-8
