Abstract
Bug reports (BRs) play a major role in the software maintenance process: they alert developers to bugs discovered by end-users. Software projects use bug tracking systems (BTS) to manage submitted bug reports. Recent studies have shown that the majority of BRs in BTS are assigned the default severity category, which does not represent their actual severity. In this paper, we propose an approach that automatically classifies default-severity bug reports as severe or non-severe. We curated a dataset based on the history of bug report logs. We then used the Support Vector Machine (SVM) algorithm with the Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction method to perform the classification. The results show that building customized models for default-severity bug reports provides better and more reliable results than training one model for all severity levels. Overall, the proposed Log model outperformed three approaches from the literature; it achieved an F-measure improvement of up to ~4% over the others and, in some projects, an improvement of 11.2%. Moreover, we investigated the impact of sentiment analysis on default bug severity prediction; the results show no noticeable influence.
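The abstract gives no implementation details, so the following is only a minimal, dependency-free sketch of how TF-IDF features can be paired with a linear SVM to separate severe from non-severe reports. The toy bug reports, the function names, the hyperparameters (epochs, lr, lam), the omission of a bias term, and the plain subgradient-descent training loop are all assumptions made for illustration; they are not the authors' dataset, preprocessing, or model configuration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(text, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def dot(w, x):
    return sum(w.get(t, 0.0) * v for t, v in x.items())


def train_linear_svm(vectors, labels, epochs=100, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss (no bias term)."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * dot(w, x)
            for term in w:                      # regularisation: shrink weights
                w[term] *= 1.0 - lr * lam
            if margin < 1.0:                    # inside the margin: hinge update
                for term, value in x.items():
                    w[term] = w.get(term, 0.0) + lr * y * value
    return w


def classify(text, idf, w):
    return "severe" if dot(w, tfidf(text, idf)) > 0 else "non-severe"


# Hypothetical toy reports: +1 = severe, -1 = non-severe.
reports = [
    "application crashes on startup with data loss",
    "server crash causes memory corruption",
    "login fails and system hangs",
    "typo in settings dialog label",
    "button color slightly off in dark theme",
    "tooltip text misaligned in toolbar",
]
labels = [1, 1, 1, -1, -1, -1]

idf = fit_idf(reports)
w = train_linear_svm([tfidf(r, idf) for r in reports], labels)

for text in ("crash on startup", "typo in toolbar label"):
    print(text, "->", classify(text, idf, w))
```

In practice, a library implementation such as scikit-learn's TfidfVectorizer combined with LinearSVC would normally replace the hand-written vectorizer and training loop; the sketch only makes the two stages of the pipeline explicit.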
Acknowledgements
The authors acknowledge the support of King Fahd University of Petroleum and Minerals in the development of this work.
About this article
Cite this article
Aburakhia, A., Alshayeb, M. A Machine Learning Approach for Classifying the Default Bug Severity Level. Arab J Sci Eng (2024). https://doi.org/10.1007/s13369-024-09081-8
DOI: https://doi.org/10.1007/s13369-024-09081-8