Revisiting reopened bugs in open source software systems

Empirical Software Engineering

Abstract

Reopened bugs can degrade the overall quality of a software system since they require unnecessary rework by developers. Moreover, reopened bugs erode end-users’ trust in the quality of the software. Thus, predicting which bugs might be reopened could help software developers avoid rework. Prior studies on reopened bug prediction draw their insights from only three open source projects (i.e., Apache, Eclipse, and OpenOffice). We observe that one of the three projects (i.e., Apache) has a data leak issue: the reopened bug status was included in the training data used to predict reopened bugs. In addition, prior studies used an outdated prediction model pipeline (i.e., with old techniques for constructing a prediction model). Therefore, we revisit the reopened bugs study on a large-scale dataset consisting of 47 projects tracked by JIRA, using modern techniques such as SMOTE and permutation importance together with 7 different machine learning models. We study reopened bugs using a mixed methods approach (i.e., both a quantitative and a qualitative study). We find that: 1) With an updated reopened bug prediction model pipeline, only 34% of the projects achieve an acceptable performance with AUC \(\geqslant\) 0.7. 2) There are four major reasons for a bug getting reopened: technical (i.e., patch/integration issues), documentation, human (i.e., due to incorrect bug assessment), and reasons not shown in the bug reports. 3) In projects with an acceptable AUC, 94% of the reopened bugs are due to patch issues (i.e., the usage of an incorrect patch) identified before bug reopening. Our study revisits reopened bugs and provides new insights into developers’ bug reopening activities.
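To make two of the techniques named in the abstract concrete, the following is a minimal, standard-library-only sketch of SMOTE-style oversampling and of the AUC \(\geqslant\) 0.7 acceptance check. It is an illustration under stated assumptions, not the pipeline used in the study: the function names `smote` and `auc`, the toy feature vectors, and the parameter defaults are all hypothetical, and a real pipeline would more likely rely on a library such as imbalanced-learn. Oversampling would normally be applied to the training split only, so that synthetic samples do not leak into evaluation.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def auc(scores, labels):
    """AUC as the probability that a random positive is scored above a
    random negative (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy feature vectors for reopened (minority class) bugs in a training split.
reopened_train = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
extra = smote(reopened_train, n_new=4, k=2)

# Acceptance threshold from the abstract: AUC >= 0.7.
acceptable = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]) >= 0.7
```

Each synthetic sample lies on a line segment between two existing minority samples, which is why it stays inside the region the minority class already occupies.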

Notes

  1. https://issues.apache.org/jira/browse/XW-140

  2. https://github.com/xin-xia/reopenBug

  3. https://www.atlassian.com/software/jira

  4. https://developer.atlassian.com/cloud/jira/platform/rest/v2/intro/

  5. https://pydriller.readthedocs.io/en/latest/

  6. https://projects.apache.org/projects.html?number

  7. https://zenodo.org/record/6378876#.YjrqlxDMJQI

  8. https://www.nltk.org/

  9. https://rdrr.io/github/software-analytics/Rnalytica/man/AutoSpearman.html

  10. https://imbalanced-learn.org/stable/references/generated/imblearn.over_sampling.SMOTE.html

  11. https://issues.apache.org/jira/browse/ACCUMULO-4112

  12. https://issues.apache.org/jira/browse/FLEX-34059

  13. https://issues.apache.org/jira/browse/AIRAVATA-741

  14. https://issues.apache.org/jira/browse/ACCUMULO-4640

  15. https://issues.apache.org/jira/browse/OFBIZ-2197

  16. https://issues.apache.org/jira/browse/IGNITE-11429

  17. https://issues.apache.org/jira/browse/NETBEANS-3598

  18. https://issues.apache.org/jira/browse/FLINK-5206

  19. https://whatis.techtarget.com/definition/flaky-test

  20. https://issues.apache.org/jira/browse/DAFFODIL-346

  21. https://issues.apache.org/jira/browse/HAWQ-480

  22. https://issues.apache.org/jira/browse/FELIX-1999

  23. https://issues.apache.org/jira/browse/SPARK-3598

  24. https://issues.apache.org/jira/browse/THRIFT-2570

  25. https://issues.apache.org/jira/browse/SPARK-17024

  26. https://issues.apache.org/jira/browse/HAWQ-1152

  27. https://issues.apache.org/jira/browse/CASSANDRA-2626

  28. https://issues.apache.org/jira/browse/SPARK-15519

  29. https://issues.apache.org/jira/browse/THRIFT-4531

  30. https://issues.apache.org/jira/browse/HBASE-15977

  31. https://issues.apache.org/jira/browse/GEODE-1716

  32. https://issues.apache.org/jira/browse/ZEPPELIN-5011

  33. https://issues.apache.org/jira/browse/HAWQ-889

  34. https://github.com/xin-xia/reopenBug

Acknowledgments

We would like to thank the anonymous reviewers for their insightful comments. The findings and opinions in this paper belong solely to the authors and are not necessarily those of Huawei. Moreover, our results do not in any way reflect the quality of Huawei software products.

Author information

Corresponding author

Correspondence to Haoxiang Zhang.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Andy Zaidman

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tagra, A., Zhang, H., Rajbahadur, G.K. et al. Revisiting reopened bugs in open source software systems. Empir Software Eng 27, 92 (2022). https://doi.org/10.1007/s10664-022-10133-6

  • DOI: https://doi.org/10.1007/s10664-022-10133-6
