Are Popular Classes More Defect Prone?

  • Alberto Bacchelli
  • Marco D’Ambros
  • Michele Lanza
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6013)

Abstract

Traces of the evolution of software systems are left in a number of different repositories, such as configuration management systems, bug tracking systems, and mailing lists. Developers use e-mails to discuss issues ranging from low-level concerns (bug fixes, refactorings) to high-level resolutions (future planning, design decisions). Thus, e-mail archives constitute a valuable asset for understanding the evolutionary dynamics of a system.

We introduce metrics that measure the “popularity” of source code artifacts, i.e. the amount of discussion they generate in e-mail archives, and investigate whether the information contained in e-mail archives is correlated to the defects found in the system. Our hypothesis is that developers discuss problematic entities more than unproblematic ones. We also study whether the precision of existing techniques for defect prediction can be improved using our popularity metrics.
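The core idea, counting how much discussion each class generates in an e-mail archive, can be sketched as naive name matching. This is only an illustrative simplification of a popularity metric (real linking of e-mails to code is more refined), and all class names and e-mail bodies below are hypothetical:

```python
from collections import Counter

def popularity(class_names, emails):
    """Count, for each class name, the number of e-mails that mention it.

    Illustrative sketch only: plain substring matching on e-mail bodies,
    assuming one string per e-mail.
    """
    counts = Counter()
    for body in emails:
        for name in class_names:
            if name in body:
                counts[name] += 1
    return counts

# Hypothetical archive: Parser is discussed in two e-mails, Lexer in one.
emails = [
    "The Parser crashes on nested comments, see bug 42.",
    "Refactored Parser and Lexer to share a token buffer.",
    "Release planning for version 2.0.",
]
pop = popularity(["Parser", "Lexer"], emails)
```

Such per-class counts could then be correlated (e.g. with a rank correlation) against per-class defect counts to test whether more-discussed classes are more defect prone.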

Keywords

Source Code · Mailing List · Defect Prediction · Software Defect · Change Metrics

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Alberto Bacchelli (1)
  • Marco D’Ambros (1)
  • Michele Lanza (1)
  1. REVEAL @ Faculty of Informatics, University of Lugano, Switzerland
