Automated prediction of bug report priority using multi-factor analysis

Tian, Yuan; Lo, David; Xia, Xin; Sun, Chengnian

doi:10.1007/s10664-014-9331-y

Published: 03 August 2014

Volume 20, pages 1354–1383, (2015)

Empirical Software Engineering

Yuan Tian¹, David Lo¹, Xin Xia² & Chengnian Sun³

Abstract

Bugs are prevalent. To improve software quality, developers often allow users to report the bugs they find through a bug tracking system such as Bugzilla. Users specify, among other things, a description of the bug, the component affected by the bug, and the severity of the bug. Based on this information, bug triagers then assign a priority level to the reported bug. As resources are limited, bug reports are investigated based on their priority levels. This priority assignment process, however, is manual. Could we do better? In this paper, we propose an automated approach based on machine learning that recommends a priority level based on information available in bug reports. Our approach considers multiple factors (temporal, textual, author, related-report, severity, and product) that potentially affect the priority level of a bug report. These factors are extracted as features, which are then used to train a discriminative model via a new classification algorithm that handles ordinal class labels and imbalanced data. Experiments on more than a hundred thousand bug reports from Eclipse show that our approach outperforms baseline approaches in terms of average F-measure, with a relative improvement of up to 209%.
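The evaluation measure named in the abstract can be illustrated with a short sketch. The abstract does not define "average F-measure" precisely, so the code below assumes it is the unweighted mean of the per-class F-measures over the priority levels, and that relative improvement is the gain over a baseline expressed as a percentage of the baseline score. The function names, the labels, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_f_measure(actual, predicted, labels):
    """Return {label: F-measure}, treating each label one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for a, p in zip(actual, predicted):
        if a == p:
            tp[a] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[a] += 1  # actual label a was missed
    scores = {}
    for label in labels:
        predicted_count = tp[label] + fp[label]
        actual_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / actual_count if actual_count else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def average_f_measure(actual, predicted, labels):
    """Unweighted mean of per-class F-measures (assumed definition)."""
    scores = per_class_f_measure(actual, predicted, labels)
    return sum(scores.values()) / len(labels)


def relative_improvement(ours, baseline):
    """Gain over the baseline as a percentage of the baseline score."""
    return (ours - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    labels = ["P1", "P2", "P3"]
    actual = ["P1", "P2", "P3", "P3"]
    predicted = ["P1", "P3", "P3", "P3"]
    print(per_class_f_measure(actual, predicted, labels))
    print(average_f_measure(actual, predicted, labels))
```

Under this reading, an unweighted average gives rare priority levels the same weight as common ones, which matters when the class distribution is imbalanced. A relative improvement of 209% would correspond to a score a little over three times the baseline's.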

Notes

https://bugs.eclipse.org/bugs/

Acknowledgments

We would like to thank Serge Demeyer and Foutse Khomh for their comments and advice during our ICSM’13 paper presentation and in the subsequent email exchanges. Their comments and advice motivated us to consider three additional scenarios: “Assigned”, “First”, and “No-P3”. We would also like to acknowledge Kun Mei, Shaowei Wang, Yang Feng, Lingfeng Bao, and Wenchao Xu for their help in collecting the status histories of the bug reports that we analyze in this study.

Author information

Authors and Affiliations

School of Information Systems, Singapore Management University, Singapore, Singapore
Yuan Tian & David Lo
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Xin Xia
University of California at Davis, Davis, CA, 95616, USA
Chengnian Sun

Corresponding author

Correspondence to Yuan Tian.

Additional information

Communicated by: Yann-Gaël Guéhéneuc and Tom Mens

About this article

Cite this article

Tian, Y., Lo, D., Xia, X. et al. Automated prediction of bug report priority using multi-factor analysis. Empir Software Eng 20, 1354–1383 (2015). https://doi.org/10.1007/s10664-014-9331-y

Issue Date: October 2015
