Enhancing Bug Report Assignment with an Optimized Reduction of Training Set

  • Miaomiao Wei
  • Shikai Guo
  • Rong Chen
  • Jian Gao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11062)


Despite its great potential to reduce developers' labor costs, automated bug triaging, treated as a text classification problem, has not been thoroughly investigated on long bug descriptions, which are informative but often noisy. This paper proposes an optimized bug triage technique that builds a high-quality set of bug data by removing noisy and non-informative bug reports while maximizing the accuracy of bug triaging under weights and binary constraints. The technique is built upon three feature selection algorithms and four instance selection algorithms, with the aim of recommending and automatically assigning bugs more accurately even when the descriptions are noisy. Experimental results show that the training sets reduced by the proposed approach achieve better accuracy in several cases, about 4% better on average than the original training sets.
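The general idea of reducing a training set along both dimensions, words and bug reports, can be illustrated with a minimal sketch. The abstract does not name the three feature selection or four instance selection algorithms, so the sketch swaps in two common stand-ins as assumptions: a chi-square term score for feature selection and an edited-nearest-neighbor filter (with Jaccard distance on term sets) for instance selection. All function names, parameters, and the toy reports are hypothetical, and the weighting and binary-constraint optimization mentioned above is not modeled.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for illustration)."""
    return text.lower().split()


def chi2_scores(docs, labels):
    """Chi-square score of each term, maximized over developer classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_docs = Counter()        # term -> number of reports containing it
    term_class_docs = Counter()  # (term, label) -> number of such reports
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_docs[term] += 1
            term_class_docs[(term, label)] += 1
    scores = {}
    for term, df in term_docs.items():
        best = 0.0
        for label, nc in class_count.items():
            a = term_class_docs[(term, label)]  # term present, this class
            b = df - a                          # term present, other classes
            c = nc - a                          # term absent, this class
            d = n - df - c                      # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def jaccard_distance(a, b):
    """Distance between two term sets; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def edited_nearest_neighbor(vectors, labels, k=3):
    """Keep the indices whose label matches the majority of k neighbors."""
    keep = []
    for i, (vec, label) in enumerate(zip(vectors, labels)):
        neighbors = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: (jaccard_distance(vec, vectors[j]), j),
        )[:k]
        if not neighbors:
            keep.append(i)  # a single report has no neighbors to contradict it
            continue
        votes = Counter(labels[j] for j in neighbors)
        if votes.most_common(1)[0][0] == label:
            keep.append(i)
    return keep


def reduce_training_set(reports, developers, n_terms=4, k=3):
    """Feature selection first, then instance selection on the reduced terms."""
    docs = [tokenize(r) for r in reports]
    scores = chi2_scores(docs, developers)
    # Top-scoring terms; ties are broken alphabetically for determinism.
    vocab = set(sorted(scores, key=lambda t: (-scores[t], t))[:n_terms])
    vectors = [set(d) & vocab for d in docs]
    keep = edited_nearest_neighbor(vectors, developers, k)
    return vocab, keep


if __name__ == "__main__":
    # Toy data: the last report resembles alice's reports but is labeled bob.
    reports = [
        "crash in ui render", "ui render glitch", "ui render slow",
        "db query timeout", "db query error", "db query deadlock",
        "ui render flicker",
    ]
    developers = ["alice", "alice", "alice", "bob", "bob", "bob", "bob"]
    vocab, keep = reduce_training_set(reports, developers)
    print(sorted(vocab), keep)
```

In this toy run the last report is filtered out as noisy because its nearest neighbors all belong to a different developer. The order of the two steps (terms first, then reports) is itself a design choice of the sketch, not something the abstract specifies.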


Keywords: Bug triaging, Bug reports, Bug assignment, Machine learning, Text classification, Industrial scale



This work is supported by the National Natural Science Foundation of China (Nos. 61672122, 61602077), the Public Welfare Funds for Scientific Research of Liaoning Province of China (No. 20170005), the Natural Science Foundation of Liaoning Province of China (No. 20170540097), and the Fundamental Research Funds for the Central Universities (No. 3132016348).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Dalian Maritime University, Dalian, China
