Automated Software Engineering, Volume 24, Issue 2, pp 455–498

An effective change recommendation approach for supplementary bug fixes

Abstract

Bug fixing is one of the most important activities during software development and maintenance. A substantial number of bugs are fixed more than once because incomplete initial fixes must be followed by supplementary fixes. Automatically recommending relevant change locations for supplementary bug fixes can help developers improve their productivity. It also helps improve the reliability of systems by highlighting locations that a developer potentially needs to change to completely remove a bug. Unfortunately, a recent study by Park et al. shows that many change recommendation techniques do not work for supplementary bug fixes. In this paper, to advance the capabilities of existing change recommendation techniques, we propose an effective approach named SupLocator to recommend relevant locations (i.e., methods) that need to be changed for supplementary bug fixes. Based on various relationships among methods, classes, and packages in the source code (such as containment, inheritance, and historical co-change), SupLocator extracts six change relationship graphs. Next, SupLocator performs a random walk on each of the six graphs, and for each graph it outputs a ranked list of candidate change locations. Finally, SupLocator combines these six ranked lists by leveraging a genetic algorithm. To investigate the benefits of SupLocator, we perform experiments on three projects: Eclipse JDT, Eclipse SWT, and Equinox p2. The experimental results show that, on average across the three projects, SupLocator achieves top-1, top-5, and top-10 accuracies, mean reciprocal rank (MRR), and mean average precision (MAP) of 0.51, 0.65, 0.67, 0.58, and 0.32, which improve on the best variants of the approach proposed by Park et al. by 1523.09, 639.70, 550.62, 919.41, and 1478.44 %, respectively. It also improves on the approach proposed by Saul et al. in terms of top-1, top-5, and top-10 accuracies, MRR, and MAP by 71.81, 29.54, 18.30, 47.24, and 56.60 %, respectively. Statistical tests show that the improvements are statistically significant.
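The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common information-retrieval definitions: a top-k hit means at least one truly changed method appears among the first k recommendations, and average precision is normalised by the number of relevant methods. The abstract does not spell out these definitions, so they are assumptions, and the function names and sample method names below are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def top_k_hit(ranked: Sequence[str], relevant: Set[str], k: int) -> bool:
    """True if at least one relevant item appears among the first k entries."""
    return any(item in relevant for item in ranked[:k])


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / (1-based position of the first relevant item), or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@i over the positions i that hold a relevant item.

    Assumption: the sum is divided by the total number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant) if relevant else 0.0


def evaluate(
    results: List[Tuple[Sequence[str], Set[str]]],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[str, float]:
    """Average each measure over all (ranked list, relevant set) pairs."""
    if not results:
        raise ValueError("results must contain at least one ranked list")
    n = len(results)
    summary = {
        f"top-{k}": sum(top_k_hit(r, rel, k) for r, rel in results) / n
        for k in ks
    }
    summary["MRR"] = sum(reciprocal_rank(r, rel) for r, rel in results) / n
    summary["MAP"] = sum(average_precision(r, rel) for r, rel in results) / n
    return summary


# Hypothetical data: two supplementary fixes, each with a ranked list of
# recommended methods and the set of methods that were actually changed.
results = [
    (["A.foo", "A.bar", "B.baz"], {"A.foo"}),
    (["C.run", "C.stop", "D.init"], {"C.stop", "D.init"}),
]
summary = evaluate(results)
print(summary)
```

In this toy input the first list places a changed method at rank 1 and the second at rank 2, so the top-1 accuracy is 0.5 and the MRR is 0.75.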

Keywords

Change recommendation, Supplementary bug fixes, Random walk, Genetic algorithm

References

  1. Abdi, H.: Bonferroni and Šidák corrections for multiple comparisons. In: Salkind, N.J. (ed.) Encyclopedia of Measurement and Statistics. https://www.utdallas.edu/~herve/Abdi-Bonferroni2007-pretty.pdf (2007). Accessed 12 Aug 2016
  2. An, L., Khomh, F., Adams, B.: Supplementary bug fixes vs. re-opened bugs. In: 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation (SCAM), pp. 205–214. IEEE (2014)
  3. Arcuri, A., Briand, L.: A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: 2011 33rd International Conference on Software Engineering (ICSE), pp. 1–10. IEEE (2011)
  4. Baeza-Yates, R., Ribeiro-Neto, B., et al.: Modern Information Retrieval, vol. 463. ACM Press, New York (1999)
  5. Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., Hullender, G.: Learning to rank using gradient descent. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 89–96. ACM, New York (2005)
  6. Canfora, G., De Lucia, A., Di Penta, M., Oliveto, R., Panichella, A., Panichella, S.: Multi-objective cross-project defect prediction. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation (ICST), pp. 252–261. IEEE (2013)
  7. Dagenais, B., Hendren, L.: Enabling static analysis for partial Java programs. ACM Sigplan Notices 43, 313–328 (2008)
  8. Dai, N., Shokouhi, M., Davison, B.D.: Learning to rank for freshness and relevance. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 95–104. ACM, New York (2013)
  9. Deb, K.: Multi-objective Optimization Using Evolutionary Algorithms, vol. 16. Wiley, Hoboken (2001)
  10. Freund, Y., Iyer, R., Schapire, R.E., Singer, Y.: An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933–969 (2003)
  11. Goldberg, D.E., Holland, J.H.: Genetic algorithms and machine learning. Mach. Learn. 3(2), 95–99 (1988)
  12. Harman, M., Jones, B.F.: Search-based software engineering. Inf. Softw. Technol. 43(14), 833–839 (2001)
  13. Harman, M., Mansouri, S.A., Zhang, Y.: Search-based software engineering: trends, techniques and applications. ACM Comput. Surv. (CSUR) 45(1), 11 (2012)
  14. Hassan, A.E., Holt, R.C.: Predicting change propagation in software systems. In: 20th IEEE International Conference on Software Maintenance, 2004. Proceedings, pp. 284–293. IEEE, Washington, DC (2004)
  15. Hassan, A.E., Holt, R.C.: Replaying development history to assess the effectiveness of change propagation tools. Empir. Softw. Eng. 11(3), 335–367 (2006)
  16. Herzig, K., Zeller, A.: Mining cause–effect-chains from version histories. In: 2011 IEEE 22nd International Symposium on Software Reliability Engineering (ISSRE), pp. 60–69. IEEE, Washington, DC (2011)
  17. Herzig, K., Zeller, A.: The impact of tangled code changes. In: 2013 10th IEEE Working Conference on Mining Software Repositories (MSR), pp. 121–130. IEEE (2013)
  18. Hopkins, W.G.: A New View of Statistics. Will G. Hopkins (1997)
  19. Joachims, T.: Optimizing search engines using clickthrough data. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 133–142. ACM, New York (2002)
  20. Kamiya, T., Kusumoto, S., Inoue, K.: CCFinder: a multilinguistic token-based code clone detection system for large scale source code. IEEE Trans. Softw. Eng. 28(7), 654–670 (2002)
  21. Kawrykow, D., Robillard, M.P.: Non-essential changes in version histories. In: Proceedings of the 33rd International Conference on Software Engineering, pp. 351–360. ACM, New York (2011)
  22. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
  23. Le Goues, C., Nguyen, T., Forrest, S., Weimer, W.: GenProg: a generic method for automatic software repair. IEEE Trans. Softw. Eng. 38(1), 54–72 (2012)
  24. Liu, T.Y.: Learning to rank for information retrieval. Found. Trends Inf. Retr. 3(3), 225–331 (2009)
  25. Liu, Y., Khoshgoftaar, T.M., Seliya, N.: Evolutionary optimization of software quality modeling with multiple repositories. IEEE Trans. Softw. Eng. 36(6), 852–864 (2010)
  26. Lohar, S., Amornborvornwong, S., Zisman, A., Cleland-Huang, J.: Improving trace accuracy through data-driven configuration and composition of tracing features. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pp. 378–388. ACM, Saint Petersburg (2013)
  27. Malik, H., Hassan, A.E.: Supporting software evolution using adaptive change propagation heuristics. In: IEEE International Conference on Software Maintenance, 2008. ICSM 2008, pp. 177–186. IEEE (2008)
  28. Meffert, K., Rotstan, N., Knowles, C., Sangiorgi, U.: JGAP: Java genetic algorithms and genetic programming package. http://jgap.sourceforge.net/ (2011). Accessed 12 Aug 2016
  29. Nguyen, T.T., Nguyen, H.A., Pham, N.H., Al-Kofahi, J., Nguyen, T.N.: Recurring bug fixes in object-oriented programs. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, vol. 1, pp. 315–324. ACM (2010)
  30. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web (1999)
  31. Panichella, A., Dit, B., Oliveto, R., Di Penta, M., Poshyvanyk, D., De Lucia, A.: How to effectively use topic models for software engineering tasks? An approach based on genetic algorithms. In: Proceedings of the 2013 International Conference on Software Engineering, pp. 522–531. IEEE Press, Piscataway (2013)
  32. Panichella, A., Oliveto, R., De Lucia, A.: Cross-project defect prediction models: L’union fait la force. In: IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE), 2014 Software Evolution Week, pp. 164–173. IEEE (2014)
  33. Park, J., Kim, M., Ray, B., Bae, D.H.: An empirical study of supplementary bug fixes. In: Proceedings of the 9th IEEE Working Conference on Mining Software Repositories, pp. 40–49. IEEE Press, Washington, DC (2012)
  34. Park, J., Kim, M., Bae, D.H.: An empirical study on reducing omission errors in practice. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, pp. 121–126. ACM (2014)
  35. Poshyvanyk, D., Marcus, A., Ferenc, R., Gyimóthy, T.: Using information retrieval based coupling measures for impact analysis. Empir. Softw. Eng. 14(1), 5–32 (2009)
  36. Rao, S., Kak, A.: Retrieval from software libraries for bug localization: a comparative study of generic and composite text models. In: Proceedings of the 8th Working Conference on Mining Software Repositories, pp. 43–52. ACM, New York (2011)
  37. Robillard, M.P.: Automatic generation of suggestions for program investigation. ACM SIGSOFT Softw. Eng. Notes 30, 11–20 (2005)
  38. Robillard, M.P., Murphy, G.C.: Concern graphs: finding and describing concerns using structural program dependencies. In: Proceedings of the 24th International Conference on Software Engineering, pp. 406–416. ACM, New York (2002)
  39. Saha, R.K., Lease, M., Khurshid, S., Perry, D.E.: Improving bug localization using structured information retrieval. In: 2013 IEEE/ACM 28th International Conference on Automated Software Engineering (ASE), pp. 345–355. IEEE (2013)
  40. Saul, Z.M., Filkov, V., Devanbu, P., Bird, C.: Recommending random walks. In: Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, pp. 15–24. ACM, New York (2007)
  41. Shihab, E., Ihara, A., Kamei, Y., Ibrahim, W.M., Ohira, M., Adams, B., Hassan, A.E., Ki, M.: Studying re-opened bugs in open source software. Empir. Softw. Eng. 18(5), 1005–1042 (2013)
  42. Sivanandam, S., Deepa, S.: Introduction to Genetic Algorithms. Springer, Berlin (2007)
  43. Tamrawi, A., Nguyen, T.T., Al-Kofahi, J.M., Nguyen, T.N.: Fuzzy set and cache-based approach for bug triaging. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, pp. 365–375. ACM, New York (2011)
  44. Tao, Y., Dang, Y., Xie, T., Zhang, D., Kim, S.: How do software engineers understand code changes? An exploratory study in industry. In: Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, p. 51. ACM, New York (2012)
  45. Thongtanunam, P., Tantithamthavorn, C., Kula, R.G., Yoshida, N., Iida, H., Matsumoto, K.i.: Who should review my code? A file location-based code-reviewer recommendation approach for modern code review. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 141–150. IEEE (2015)
  46. Tsai, M.F., Liu, T.Y., Qin, T., Chen, H.H., Ma, W.Y.: FRank: a ranking method with fidelity loss. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 383–390. ACM, New York (2007)
  47. Wang, T., Harman, M., Jia, Y., Krinke, J.: Searching for better configurations: a rigorous approach to clone evaluation. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pp. 455–465. ACM, New York (2013)
  48. West, D.B., et al.: Introduction to Graph Theory, vol. 2. Prentice Hall, Upper Saddle River (2001)
  49. Wilcoxon, F.: Individual comparisons by ranking methods. Biom. Bull. 1, 80–83 (1945)
  50. Xia, X., Feng, Y., Lo, D., Chen, Z., Wang, X.: Towards more accurate multi-label software behavior learning. In: IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE), 2014 Software Evolution Week, pp. 134–143. IEEE (2014a)
  51. Xia, X., Lo, D., Shihab, E., Wang, X., Zhou, B.: Automatic, high accuracy prediction of reopened bugs. Autom. Softw. Eng. 22(1), 75–109 (2014b)
  52. Xia, X., Lo, D., Wang, X., Zhang, C., Wang, X.: Cross-language bug localization. In: Proceedings of the 22nd International Conference on Program Comprehension, pp. 275–278. ACM (2014c)
  53. Xing, Z., Stroulia, E.: UMLDiff: an algorithm for object-oriented design differencing. In: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, pp. 54–65. ACM, New York (2005)
  54. Ying, A.T., Murphy, G.C., Ng, R., Chu-Carroll, M.C.: Predicting source code changes by mining change history. IEEE Trans. Softw. Eng. 30(9), 574–586 (2004)
  55. Zhang, Q., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
  56. Zhou, Z.H.: Ensemble Methods: Foundations and Algorithms. CRC Press, Boca Raton (2012)
  57. Zhou, J., Zhang, H., Lo, D.: Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In: 2012 34th International Conference on Software Engineering (ICSE), pp. 14–24. IEEE (2012)
  58. Zimmermann, T., Zeller, A., Weissgerber, P., Diehl, S.: Mining version histories to guide software changes. IEEE Trans. Softw. Eng. 31(6), 429–445 (2005)
  59. Zimmermann, T., Nagappan, N., Guo, P.J., Murphy, B.: Characterizing and predicting which bugs get reopened. In: 2012 34th International Conference on Software Engineering (ICSE), pp. 1074–1083. IEEE, Piscataway (2012)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  2. School of Information Systems, Singapore Management University, Singapore, Singapore
