Software Quality Journal, Volume 21, Issue 1, pp 39–66

The bug report duplication problem: an exploratory study

  • Yguaratã Cerqueira Cavalcanti
  • Paulo Anselmo da Mota Silveira Neto
  • Daniel Lucrédio
  • Tassio Vale
  • Eduardo Santana de Almeida
  • Silvio Romero de Lemos Meira
Abstract

Duplicate bug report entries in bug trackers have a negative impact on software maintenance and evolution. This is due, among other factors, to the increased time spent on report analysis and validation, which in some cases takes over 20 minutes per report. Therefore, a considerable amount of time is lost in duplicate bug report analysis. In order to understand the possible factors that cause bug report duplication and its impact on software development, this paper presents an exploratory study in which bug tracking data from private and open source projects were analyzed. The results show, for example, that all projects we investigated had duplicate bug reports and that a considerable amount of time was wasted on this duplication. Furthermore, features such as project lifetime, staff size, and the number of bug reports do not seem to be significant factors for duplication, while others, such as the submitters' profile and the number of submitters, do seem to influence bug report duplication.

Keywords

Bug reports; Bug tracker; Bug reports duplication; Exploratory study; Software configuration management

Notes

Acknowledgments

This work was partially supported by the National Institute of Science and Technology for Software Engineering (INES, http://www.ines.org.br), funded by CNPq and FACEPE, grants 573964/2008-4 and APQ-1037-1.03/08, and by CNPq grants 305968/2010-6, 559997/2010-8, and 474766/2010-1.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Yguaratã Cerqueira Cavalcanti (1, 4), corresponding author
  • Paulo Anselmo da Mota Silveira Neto (4)
  • Daniel Lucrédio (3, 4)
  • Tassio Vale (1, 4)
  • Eduardo Santana de Almeida (2, 4)
  • Silvio Romero de Lemos Meira (1, 4)

  1. Center for Informatics, Federal University of Pernambuco (CIn/UFPE), Pernambuco, Brazil
  2. Computer Science Department, Federal University of Bahia (DCC/UFBA), Bahia, Brazil
  3. Computing Department, Federal University of São Carlos (DC/UFSCar), São Carlos, Brazil
  4. Reuse in Software Engineering (RiSE), Recife, Brazil