Abstract
Assigning a severity level to reported bugs is a critical part of software maintenance, as it helps ensure an efficient resolution process. In many bug trackers, e.g. Bugzilla, this is a time-consuming process because bug reporters must manually assign one of seven severity levels to each bug. In addition, some bug types may be reported more often than others, leading to a disproportionate distribution of severity labels. Machine learning techniques can be used to predict the label of a newly reported bug automatically. However, learning from imbalanced data in a multi-class task remains one of the major difficulties for machine learning classifiers. In this paper, we propose a hierarchical classification approach that exploits class imbalance in the training data to reduce classification bias. Specifically, we designed a classification tree that consists of multiple binary classifiers organised hierarchically, such that instances from the most dominant class are trained against the remaining classes but are not used for training the next level of the classification tree. We used the FastText classifier to compare the hierarchical and standard classification approaches. Based on 93,051 bug reports from 38 Eclipse open-source products, the hierarchical approach performed relatively well, achieving a Micro F-Score of \(65\%\) and a Macro F-Score of \(45\%\).
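The hierarchical one-vs.-remainder scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`order_by_dominance`, `train_hierarchy`, `predict`, `fit_keyword`) are hypothetical, the toy bug reports are made up for illustration, and a simple keyword-count scorer stands in for the FastText binary classifiers used in the paper.

```python
from collections import Counter


def order_by_dominance(labels):
    """Return class labels sorted from most to least frequent."""
    return [label for label, _ in Counter(labels).most_common()]


def fit_keyword(texts, is_positive):
    """Toy binary classifier (an assumption, standing in for FastText).

    Scores a text by how much more often its words occurred in
    positive than in negative training examples.
    """
    pos, neg = Counter(), Counter()
    for text, flag in zip(texts, is_positive):
        (pos if flag else neg).update(text.lower().split())

    def classify(text):
        return sum(pos[w] - neg[w] for w in text.lower().split()) > 0

    return classify


def train_hierarchy(texts, labels, fit_binary):
    """Train one binary classifier per level of the tree.

    At each level the most dominant remaining class is trained against
    all other remaining classes; its instances are then dropped before
    the next level is trained.
    """
    order = order_by_dominance(labels)
    data = list(zip(texts, labels))
    levels = []
    for dominant in order[:-1]:
        xs = [t for t, _ in data]
        ys = [y == dominant for _, y in data]
        levels.append((dominant, fit_binary(xs, ys)))
        data = [(t, y) for t, y in data if y != dominant]
    # The least frequent class needs no classifier of its own.
    return levels, order[-1]


def predict(levels, fallback, text):
    """Walk down the tree; the first classifier that fires decides."""
    for label, classify in levels:
        if classify(text):
            return label
    return fallback


# Made-up, imbalanced toy data (4 normal, 2 critical, 1 enhancement).
texts = [
    "ui glitch in toolbar", "typo in dialog label",
    "slow refresh in editor", "wrong icon in toolbar",
    "crash on startup", "crash when saving file",
    "add dark theme option",
]
labels = ["normal"] * 4 + ["critical"] * 2 + ["enhancement"]

levels, fallback = train_hierarchy(texts, labels, fit_keyword)
print(predict(levels, fallback, "crash on exit"))
```

With three classes the tree has two levels: the first separates the largest class from everything else, and the second is trained only on the instances that remain, so the dominant class cannot bias the lower-level decision.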
Notes
- 1.
- 2.
- 3. Bug reports were downloaded from 38 Eclipse-related products.
- 4. Dominance refers to size: a dominant class contains more instances than another.
- 5.
Acknowledgments
This research work is part of the CROSSMINER Project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 732223.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Nnamoko, N., Cabrera-Diego, L.A., Campbell, D., Korkontzelos, Y. (2019). Bug Severity Prediction Using a Hierarchical One-vs.-Remainder Approach. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science(), vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_20
DOI: https://doi.org/10.1007/978-3-030-23281-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23280-1
Online ISBN: 978-3-030-23281-8
eBook Packages: Computer Science, Computer Science (R0)