Empirical Software Engineering, Volume 23, Issue 1, pp 418–451

Identifying self-admitted technical debt in open source projects using text mining

  • Qiao Huang
  • Emad Shihab
  • Xin Xia
  • David Lo
  • Shanping Li


Technical debt is a metaphor for the situation in which long-term code quality is traded for short-term goals in software projects. Recently, the concept of self-admitted technical debt (SATD) was proposed, which considers debt that is intentionally introduced, e.g., in the form of quick or temporary fixes. Prior work on SATD has shown that source code comments can be used to successfully detect SATD; however, most current state-of-the-art approaches to classifying SATD rely on manual inspection of the source code comments. In this paper, we propose an automated approach to detect SATD in source code comments using text mining. In our approach, we use feature selection to select useful features for classifier training, and we combine multiple classifiers from different source projects to build a composite classifier that identifies SATD comments in a target project. We investigate the performance of our approach on 8 open source projects that contain 212,413 comments. Our experimental results show that, on every target project, our approach outperforms the state-of-the-art and the baseline approaches in terms of F1-score. The F1-score achieved by our approach ranges from 0.518 to 0.841, with an average of 0.737, which improves over the state-of-the-art approach proposed by Potdar and Shihab by 499.19%. When compared with the text mining-based baseline approaches, our approach significantly improves the average F1-score by at least 58.49%. When compared with a natural language processing-based baseline, our approach also significantly improves the F1-score by 27.95%. Our proposed approach can be used by project personnel to effectively identify SATD with minimal manual effort.
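The pipeline outlined in the abstract (feature selection, one classifier per source project, and a composite classifier applied to a target project) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy comments and project names, the document-frequency selection criterion, the choice of a multinomial Naive Bayes learner, the majority-vote rule, and all function names.

```python
import math
import re
from collections import Counter


def tokenize(comment):
    """Lowercase a comment and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", comment.lower())


def select_features(comments, labels, k):
    """Keep the k words whose document frequency differs most between classes.

    Assumption: a simple stand-in criterion, not the paper's selection method.
    """
    satd_df, other_df = Counter(), Counter()
    for text, label in zip(comments, labels):
        (satd_df if label else other_df).update(set(tokenize(text)))
    n_satd = max(1, sum(labels))
    n_other = max(1, len(labels) - sum(labels))

    def score(word):
        # Cross-multiplied difference of class-wise document frequencies.
        return abs(satd_df[word] * n_other - other_df[word] * n_satd)

    vocab = set(satd_df) | set(other_df)
    return set(sorted(vocab, key=lambda w: (-score(w), w))[:k])


class NaiveBayes:
    """Multinomial Naive Bayes over the selected words, with Laplace smoothing."""

    def fit(self, comments, labels, k=6):
        self.vocab = select_features(comments, labels, k)
        self.counts = {True: Counter(), False: Counter()}
        self.docs = Counter()
        for text, label in zip(comments, labels):
            label = bool(label)
            self.docs[label] += 1
            self.counts[label].update(t for t in tokenize(text) if t in self.vocab)
        return self

    def predict(self, comment):
        tokens = [t for t in tokenize(comment) if t in self.vocab]
        scores = {}
        for label in (True, False):
            total = sum(self.counts[label].values()) + len(self.vocab)
            prior = (self.docs[label] + 1) / (sum(self.docs.values()) + 2)
            scores[label] = math.log(prior) + sum(
                math.log((self.counts[label][t] + 1) / total) for t in tokens)
        return scores[True] > scores[False]


def composite_predict(classifiers, comment):
    """Flag a comment as SATD when more than half of the classifiers agree."""
    votes = sum(clf.predict(comment) for clf in classifiers)
    return votes * 2 > len(classifiers)


def f1_score(predicted, actual):
    """Harmonic mean of precision and recall for the SATD (True) class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labelled comments from three source projects (True = SATD).
source_projects = {
    "project_a": (["TODO fix this hack", "hack to fix the bug",
                   "TODO remove this hack", "return the user name",
                   "open the file", "return the file size"],
                  [True, True, True, False, False, False]),
    "project_b": (["FIXME ugly hack for now", "temporary hack, fix later",
                   "FIXME this is a workaround", "compute the total",
                   "close the stream", "return the total count"],
                  [True, True, True, False, False, False]),
    "project_c": (["TODO: this is a hack", "XXX hack, needs cleanup",
                   "TODO fix this properly", "parse the input file",
                   "return the parsed value", "the default value"],
                  [True, True, True, False, False, False]),
}

# One classifier per source project; the target project is never trained on.
classifiers = [NaiveBayes().fit(c, l) for c, l in source_projects.values()]
```

A target-project comment is then classified with `composite_predict(classifiers, comment)`, and `f1_score` compares the predictions with manually assigned labels. Training one model per source project, rather than pooling all comments, keeps each project's vocabulary separate and lets the vote absorb project-specific wording.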


  • Technical debt
  • Source code comments
  • Text mining



The authors thank all the developers who participated in this study. This research was supported by the NSFC Program (Nos. 61602403 and 61572426) and the National Key Technology R&D Program of the Ministry of Science and Technology of China (No. 2015BAH17F01).


  1. Arisholm E, Briand LC, Fuglerud M (2007) Data mining techniques for building fault-proneness models in telecom java software. In: The 18th IEEE international symposium on software reliability (ISSRE’07). IEEE, pp 215–224
  2. Bavota G, Russo B (2016) A large-scale empirical study on self-admitted technical debt. In: Proceedings of the 13th international conference on mining software repositories, MSR ’16, pp 315–326
  3. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
  4. Brown N, Cai Y, Guo Y, Kazman R, Kim M, Kruchten P, Lim E, MacCormack A, Nord R, Ozkaya I et al (2010) Managing technical debt in software-reliant systems. In: Proceedings of the FSE/SDP workshop on Future of software engineering research. ACM, pp 47–52
  5. Cohen J (1968) Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull 70(4):213
  6. Cunningham W (1993) The wycash portfolio management system. ACM SIGPLAN OOPS Messenger 4(2):29–30
  7. Fluri B, Wursch M, Gall HC (2007) Do code and comments co-evolve? On the relation between source code and comment changes. In: 14th working conference on reverse engineering (WCRE 2007). IEEE, pp 70–79
  8. Guo Y, Seaman C, Gomes R, Cavalcanti A, Tonin G, Da Silva FQ, Santos AL, Siebra C (2011) Tracking technical debt: an exploratory case study. In: 2011 27th IEEE international conference on software maintenance (ICSM). IEEE, pp 528–531
  9. Haiduc S, Aponte J, Marcus A (2010) Supporting program comprehension with source code summarization. In: Proceedings of the 32nd ACM/IEEE international conference on software engineering-volume 2. ACM, pp 223–226
  10. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. ACM SIGKDD Explorations Newsletter 11(1):10–18
  11. Hall MA (1999) Correlation-based feature selection for machine learning. PhD thesis, The University of Waikato
  12. Han J, Kamber M, Pei J (2006) Data mining: concepts and techniques. Morgan Kaufmann
  13. He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284
  14. Jiang T, Tan L, Kim S (2013) Personalized defect prediction. In: 2013 IEEE/ACM 28th international conference on automated software engineering (ASE). IEEE, pp 279–289
  15. Khamis N, Witte R, Rilling J (2010) Automatic quality assessment of source code comments: the javadocminer. In: International conference on application of natural language to information systems. Springer, pp 68–79
  16. Kruchten P, Nord RL, Ozkaya I, Falessi D (2013) Technical debt: towards a crisper definition report on the 4th international workshop on managing technical debt. ACM SIGSOFT Softw Eng Notes 38(5):51–54
  17. Lim E, Taksande N, Seaman C (2012) A balancing act: what software practitioners have to say about technical debt. Softw IEEE 29(6):22–27
  18. Maldonado E, Shihab E (2015) Detecting and quantifying different types of self-admitted technical debt. In: Proceedings of the 7th IEEE international workshop on managing technical debt (MTD’15), pp 9–15
  19. Maldonado E, Shihab E, Tsantalis N (2017) Using natural language processing to automatically detect self-admitted technical debt. IEEE Transactions on Software Engineering
  20. Malik H, Chowdhury I, Tsou HM, Jiang ZM, Hassan AE (2008) Understanding the rationale for updating a functions comment. In: IEEE international conference on software maintenance, 2008. ICSM 2008. IEEE, pp 167–176
  21. Marcus A, Maletic JI (2003) Recovering documentation-to-source-code traceability links using latent semantic indexing. In: Proceedings of the 25th international conference on software engineering, 2003. IEEE, pp 125–135
  22. Marinescu R (2004) Detection strategies: Metrics-based rules for detecting design flaws. In: Proceedings of the 20th IEEE international conference on software maintenance, 2004. IEEE, pp 350–359
  23. Marinescu R, Ganea G, Verebi I (2010) incode: Continuous quality assessment and improvement. In: 2010 14th European conference on software maintenance and reengineering (CSMR). IEEE, pp 274–275
  24. McCallum A, Nigam K et al (1998) A comparison of event models for naive bayes text classification. In: AAAI-98 Workshop on learning for text categorization. Citeseer, vol 752, pp 41–48
  25. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119
  26. Nguyen AT, Nguyen TT, Nguyen TN, Lo D, Sun C (2012) Duplicate bug report detection with a combination of information retrieval and topic modeling. In: Proceedings of the 27th IEEE/ACM international conference on automated software engineering (ASE), 2012. IEEE, pp 70–79
  27. Padioleau Y, Tan L, Zhou Y (2009) Listening to programmers: taxonomies and characteristics of comments in operating system code. In: Proceedings of the 31st international conference on software engineering. IEEE Computer Society, pp 331–341
  28. Potdar A, Shihab E (2014) An exploratory study on self-admitted technical debt. In: 2014 IEEE international conference on software maintenance and evolution (ICSME). IEEE, pp 91–100
  29. Rahman F, Posnett D, Devanbu P (2012) Recalling the imprecision of cross-project defect prediction. In: Proceedings of the ACM SIGSOFT 20th international symposium on the foundations of software engineering. ACM, p 61
  30. Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620
  31. Seaman C, Guo Y, Izurieta C, Cai Y, Zazworka N, Shull F, Vetrò A (2012) Using technical debt data in decision making: Potential decision approaches. In: Proceedings of the 3rd international workshop on managing technical debt. IEEE Press, pp 45–48
  32. Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv (CSUR) 34(1):1–47
  33. Shihab E, Ihara A, Kamei Y, Ibrahim WM, Ohira M, Adams B, Hassan AE, Ki M (2013) Studying re-opened bugs in open source software. Empir Softw Eng 18(5):1005–1042
  34. Storey MA, Ryall J, Bull RI, Myers D, Singer J (2008) Todo or to bug. In: 2008 ACM/IEEE 30th international conference on software engineering. ICSE08. IEEE, pp 251–260
  35. Sun C, Lo D, Wang X, Jiang J, Khoo SC (2010) A discriminative model approach for accurate duplicate bug report retrieval. In: Proceedings of the 32nd ACM/IEEE international conference on software engineering. ACM, vol 1, pp 45–54
  36. Sun C, Lo D, Khoo SC, Jiang J (2011) Towards more accurate retrieval of duplicate bug reports. In: Proceedings of the 2011 26th IEEE/ACM international conference on automated software engineering. IEEE Computer Society, pp 253–262
  37. Tan L, Yuan D, Krishna G, Zhou Y (2007) /* icomment: Bugs or bad comments? */. In: ACM SIGOPS operating systems review. ACM, vol 41, pp 145–158
  38. Tan SH, Marinov D, Tan L, Leavens GT (2012) @ Tcomment: Testing javadoc comments to detect comment-code inconsistencies. In: 2012 IEEE fifth international conference on software testing, verification and validation. IEEE, pp 260–269
  39. Tian Y, Lawall J, Lo D (2012) Identifying linux bug fixing patches. In: 2012 34th international conference on software engineering (ICSE). IEEE, pp 386–396
  40. Valdivia Garcia H, Shihab E (2014) Characterizing and predicting blocking bugs in open source projects. In: Proceedings of the 11th working conference on mining software repositories. ACM, pp 72–81
  41. Vassallo C, Zampetti F, Romano D, Beller M, Panichella A, Penta MD, Zaidman A (2016) Continuous delivery practices in a large financial organization. In: Proceedings of the international conference on software maintenance and evolution (ICSME), ICSME ’16, to appear
  42. Wehaibi S, Shihab E, Guerrouj L (2016) Examining the impact of self-admitted technical debt on software quality. In: Proceedings of the 23rd IEEE international conference on software analysis, evolution, and reengineering (SANER’16)
  43. Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83
  44. Xia X, Lo D, Wang X, Zhou B (2013) Tag recommendation in software information sites. In: Proceedings of the 10th working conference on mining software repositories. IEEE Press, pp 287–296
  45. Xia X, Lo D, Qiu W, Wang X, Zhou B (2014) Automated configuration bug report prediction using text mining. In: 2014 IEEE 38th annual computer software and applications conference (COMPSAC). IEEE, pp 107–116
  46. Xia X, Lo D, Shihab E, Wang X, Yang X (2015a) Elblocker: Predicting blocking bugs with ensemble imbalance learning. Inf Softw Technol 61:93–106
  47. Xia X, Lo D, Shihab E, Wang X, Zhou B (2015b) Automatic, high accuracy prediction of reopened bugs. Autom Softw Eng 22(1):75–109
  48. Xia X, Lo D, Wang X, Yang X (2015c) Who should review this change?: Putting text and file location analyses together for more accurate recommendations. In: 2015 IEEE international conference on software maintenance and evolution (ICSME). IEEE, pp 261–270
  49. Xia X, Lo D, Wang X, Zhou B (2015d) Dual analysis for recommending developers to resolve bugs. J Softw Evol Process 27(3):195–220
  50. Xia X, Lo D, Pan SJ, Nagappan N, Wang X (2016a) Hydra: Massively compositional model for cross-project defect prediction. IEEE Trans Softw Eng 42(10):977–998
  51. Xia X, Lo D, Wang X, Yang X (2016b) Collective personalized change classification with multiobjective search. IEEE Trans Reliab 65(4):1810–1829
  52. Xia X, Shihab E, Kamei Y, Lo D, Wang X (2016c) Predicting crashing releases of mobile applications. In: Proceedings of the 10th ACM/IEEE international symposium on empirical software engineering and measurement. ACM, p 29
  53. Xia X, Lo D, Ding Y, Al-Kofahi JM, Nguyen TN, Wang X (2017) Improving automated bug triaging with specialized topic model. IEEE Trans Softw Eng 43(3):272–297
  54. Xu B, Ye D, Xing Z, Xia X, Chen G, Li S (2016) Predicting semantically linkable knowledge in developer online forums via convolutional neural network. In: Proceedings of the 31st IEEE/ACM international conference on automated software engineering. ACM, pp 51–62
  55. Yang X, Lo D, Xia X, Bao L, Sun J (2016) Combining word embedding with information retrieval to recommend similar bug reports. In: 2016 IEEE 27th international symposium on software reliability engineering (ISSRE). IEEE, pp 127–137
  56. Yang XL, Lo D, Xia X, Huang Q, Sun JL (2017) High-impact bug report identification with imbalanced learning strategies. J Comput Sci Technol 32:1
  57. Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: ICML, vol 97, pp 412–420
  58. Zazworka N, Shaw MA, Shull F, Seaman C (2011) Investigating the impact of design debt on software quality. In: Proceedings of the 2nd workshop on managing technical debt. ACM, pp 17–23
  59. Zazworka N, Spínola RO, Vetro A, Shull F, Seaman C (2013) A case study on effectively identifying technical debt. In: Proceedings of the 17th international conference on evaluation and assessment in software engineering. ACM, pp 42–47
  60. Zhang Y, Lo D, Xia X, Le TDB, Scanniello G, Sun J (2016) Inferring links between concerns and methods with multi-abstraction vector space model. In: 2016 IEEE international conference on software maintenance and evolution (ICSME). IEEE, pp 110–121
  61. Zhou J, Zhang H, Lo D (2012) Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In: 2012 34th international conference on software engineering (ICSE). IEEE, pp 14–24

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Qiao Huang (1)
  • Emad Shihab (2)
  • Xin Xia (1, 3)
  • David Lo (4)
  • Shanping Li (1)

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  2. Data-driven Analysis of Software (DAS) Lab at the Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
  3. Department of Computer Science, University of British Columbia, Vancouver, Canada
  4. School of Information Systems, Singapore Management University, Singapore, Singapore
