Comparing four approaches for technical debt identification

Abstract

Software systems accumulate technical debt (TD) when long-term goals in software development are traded away for short-term ones (e.g., a quick-and-dirty implementation to reach a release date versus a well-refactored implementation that supports the long-term health of the project). Some forms of TD accumulate over time as source code that is difficult to work with and exhibits a variety of anomalies. A number of source code analysis techniques and tools have been proposed to identify the code-level debt accumulated in a system. What has not yet been studied is whether using multiple tools to detect TD brings benefits, that is, whether different tools flag the same or different source code components. Further, it has not been investigated which symptoms of TD “interest” the debt found by these techniques leads to. To address this latter gap, we also investigated whether TD, as identified by the source code analysis techniques, correlates with interest payments in the form of increased defect- and change-proneness.

Our objective was to compare the results of different TD identification approaches, to understand their commonalities and differences, and to evaluate their relationship to indicators of future TD “interest.” We selected four different TD identification techniques (code smells, automatic static analysis issues, grime buildup, and Modularity violations) and applied them to 13 versions of the Apache Hadoop open source software project. We collected and aggregated statistical measures to investigate whether the different techniques identified TD indicators in the same or different classes and whether those classes in turn exhibited high interest (in the form of a large number of defects and higher change-proneness).

The outputs of the four approaches have very little overlap and therefore point to different problems in the source code. Dispersed Coupling and Modularity violations were co-located in classes with higher defect-proneness. We also observed a strong relationship between Modularity violations and change-proneness.

Our main contribution is an initial overview of the TD landscape, showing that different TD techniques are loosely coupled and therefore indicate problems in different locations of the source code. Moreover, our proxy interest indicators (change- and defect-proneness) correlate with only a small subset of TD indicators.
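The two questions posed in the abstract (do the techniques flag the same classes, and do flagged classes show more "interest"?) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class names and sets are made up, and the Jaccard index and phi coefficient are stand-in measures, not necessarily the statistics used in the study.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Share of classes flagged by both techniques among those flagged by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(flagged):
    """Jaccard overlap for every pair of techniques (keys in sorted order)."""
    return {
        (x, y): jaccard(flagged[x], flagged[y])
        for x, y in combinations(sorted(flagged), 2)
    }


def phi_coefficient(flagged, defective, all_classes):
    """Association between 'flagged' and 'defective' over a 2x2 contingency table."""
    n11 = len(flagged & defective)                # flagged and defective
    n10 = len(flagged - defective)                # flagged only
    n01 = len(defective - flagged)                # defective only
    n00 = len(all_classes - flagged - defective)  # neither
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Hypothetical data: classes flagged by each technique in one version.
flagged_by = {
    "code_smells": {"A", "B", "C"},
    "asa_issues": {"C", "D"},
    "grime": {"E"},
    "modularity_violations": {"B", "F"},
}
all_classes = set("ABCDEFGH")
defective = {"B", "F", "G"}  # hypothetical defect-prone classes

overlap = pairwise_overlap(flagged_by)
phi = phi_coefficient(flagged_by["modularity_violations"], defective, all_classes)
```

Low pairwise overlap values would correspond to techniques pointing at different classes, while a phi value near 1 for a given technique would indicate that its flagged classes coincide with the defect-prone ones.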

Notes

  1. This approach is necessary since versions can overlap (in time) in the SVN repository and single revisions cannot always be clearly assigned to a single version.

  2. Each bug pattern is assigned a priority and category by the FindBugs authors. Some categories are biased toward single priorities: for example, correctness is considered more often to be of high priority.


Author information

Corresponding author

Correspondence to Nico Zazworka.

About this article

Cite this article

Zazworka, N., Vetro’, A., Izurieta, C. et al. Comparing four approaches for technical debt identification. Software Qual J 22, 403–426 (2014). https://doi.org/10.1007/s11219-013-9200-8

  • DOI: https://doi.org/10.1007/s11219-013-9200-8

Keywords

  • Technical debt
  • Software maintenance
  • Software quality
  • Source code analysis
  • Modularity violations
  • Grime
  • Code smells
  • ASA