Text analytics based severity prediction of software bugs for apache projects

Kaur, Arvinder; Jindal, Shubhra Goyal

doi:10.1007/s13198-019-00807-8

Original Article
Published: 03 June 2019

Volume 10, pages 765–782, (2019)

International Journal of System Assurance Engineering and Management

Abstract

Severity, i.e., the impact, extent and effect of a bug on the software, is a decisive attribute that determines how urgently the bug should be fixed. Predicting the severity of software bugs is important for improving the bug triaging and resolution process. Many techniques have been proposed in past research to reduce the effort and time required to manually assess the severity of newly reported bugs. To help software developers use their resources efficiently, this study evaluates a number of machine learning techniques for predicting the severity of software bugs at the system and component levels. The techniques are evaluated on bug reports from thirteen Apache projects, extracted automatically using the Bug Report Collection System tool. Severity is predicted from the most frequent terms extracted from bug summaries using text mining. Performance metrics such as precision, recall and accuracy are used to interpret the results obtained from the various techniques. The results indicate that Boosting (an ensemble learner) outperforms other machine learning techniques applied in previous research, such as Bayesian learners, decision trees and support vector machines.
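
The pipeline outlined in the abstract (frequent terms from bug summaries as features, then precision, recall and accuracy as evaluation metrics) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the stop-word list, the binary term-presence encoding, the function names and the toy data. The classifier itself, such as the Boosting learner the study evaluates, is deliberately left out.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not describe the preprocessing used.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "when", "with"}


def tokenize(summary):
    """Lower-case a bug summary and return its alphabetic tokens without stop words."""
    return [t for t in re.findall(r"[a-z]+", summary.lower()) if t not in STOP_WORDS]


def top_terms(summaries, k):
    """Return the k most frequent terms over all summaries, most frequent first."""
    counts = Counter()
    for summary in summaries:
        counts.update(tokenize(summary))
    return [term for term, _ in counts.most_common(k)]


def to_features(summary, vocabulary):
    """Encode a summary as a binary vector: 1 if the vocabulary term occurs, else 0."""
    tokens = set(tokenize(summary))
    return [1 if term in tokens else 0 for term in vocabulary]


def evaluate(actual, predicted, positive):
    """Return (precision, recall) for the class `positive` and the overall accuracy."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    # Toy summaries, invented for this example only.
    summaries = [
        "Crash on startup",
        "Crash when saving file",
        "Typo in startup dialog",
        "Crash in parser",
    ]
    vocabulary = top_terms(summaries, 2)
    print(vocabulary)
    print(to_features("Parser crash", vocabulary))
    print(evaluate(["major", "minor", "major", "minor"],
                   ["major", "major", "major", "minor"],
                   positive="major"))
```

In a fuller experiment the feature vectors would be passed to a learner and the metrics computed per severity level; the per-class form of `evaluate` above is one common convention, not necessarily the one used in the study.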

Author information

Authors and Affiliations

University School of Information and Communication Technology, Guru Gobind Singh Indraprastha University, Dwarka, Delhi, India
Arvinder Kaur & Shubhra Goyal Jindal

Corresponding author

Correspondence to Shubhra Goyal Jindal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kaur, A., Jindal, S.G. Text analytics based severity prediction of software bugs for apache projects. Int J Syst Assur Eng Manag 10, 765–782 (2019). https://doi.org/10.1007/s13198-019-00807-8

Received: 22 August 2018
Revised: 12 April 2019
Published: 03 June 2019
Issue Date: August 2019
DOI: https://doi.org/10.1007/s13198-019-00807-8
