Empirical Software Engineering, Volume 20, Issue 5, pp 1354–1383

Automated prediction of bug report priority using multi-factor analysis

  • Yuan Tian
  • David Lo
  • Xin Xia
  • Chengnian Sun


Bugs are prevalent. To improve software quality, developers often allow users to report the bugs they find through a bug tracking system such as Bugzilla. Users specify, among other things, a description of the bug, the component affected by the bug, and the severity of the bug. Based on this information, bug triagers then assign a priority level to the reported bug. As resources are limited, bug reports are investigated according to their priority levels. This priority assignment process, however, is a manual one. Could we do better? In this paper, we propose an automated approach based on machine learning that recommends a priority level based on information available in bug reports. Our approach considers multiple factors (temporal, textual, author, related-report, severity, and product) that potentially affect the priority level of a bug report. These factors are extracted as features, which are then used to train a discriminative model via a new classification algorithm that handles ordinal class labels and imbalanced data. Experiments on more than a hundred thousand bug reports from Eclipse show that we can outperform baseline approaches in terms of average F-measure by a relative improvement of up to 209%.
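The headline result above is stated as a relative improvement in average F-measure. The following is a minimal sketch of how such a figure could be computed, assuming the average is the unweighted mean of the per-priority-level F-measures (the abstract does not state the averaging scheme). The function names and the example labels `P1` to `P3` are illustrative assumptions, not taken from the paper.

```python
def f_measure_per_class(y_true, y_pred, labels):
    """Return {label: F-measure}, treating each label one-vs-rest."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0  # label never predicted correctly
    return scores


def average_f_measure(y_true, y_pred, labels):
    """Unweighted mean of per-class F-measures (assumed averaging scheme)."""
    scores = f_measure_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


def relative_improvement(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    return (new_score - baseline_score) / baseline_score * 100.0


if __name__ == "__main__":
    # Hypothetical toy data, not results from the paper.
    labels = ["P1", "P2", "P3"]
    y_true = ["P1", "P2", "P3", "P3"]
    y_pred = ["P1", "P3", "P3", "P3"]
    print(f_measure_per_class(y_true, y_pred, labels))
    print(average_f_measure(y_true, y_pred, labels))
```

Under this unweighted averaging, a classifier that always predicts the most common priority level scores zero on every other level, which is one reason imbalanced data matters for this metric.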


Keywords: Bug report management; Priority prediction; Multi-factor analysis



We would like to thank Serge Demeyer and Foutse Khomh for their comments and advice during our ICSM’13 paper presentation and in the subsequent email exchanges. Their comments and advice motivated us to consider three additional scenarios: “Assigned”, “First”, and “No-P3”. We would also like to acknowledge Kun Mei, Shaowei Wang, Yang Feng, Lingfeng Bao, and Wenchao Xu for their help in collecting the status histories of the bug reports that we analyze in this study.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. School of Information Systems, Singapore Management University, Singapore, Singapore
  2. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  3. University of California at Davis, Davis, USA
