Identifying the severity of technical debt issues based on semantic and structural information

  • Research
  • Software Quality Journal

Abstract

Technical debt (TD) arises when developers adopt a compromise solution for short-term benefit during design or architecture selection. TD-related issues, such as code smells, can critically affect important non-functional requirements. TD issues of different severity call for different remediation measures. Existing studies mainly focus on detecting TD in software projects through source code or comments and usually ignore the severity of the TD issues they find. Identifying the severity of TD issues is nevertheless essential for deciding which TD should be prioritized. In this paper, we propose an approach that combines the semantic and structural information of code snippets to identify the severity of TD issues at the method level. We first transform each method affected by TD issues into an abstract syntax tree (AST) and use the paths in the AST to represent its semantic information. We then extract code metrics that measure the size, coupling, and complexity of the affected methods to represent their structural information. Finally, we build a stacking ensemble model that identifies the severity of TD issues, using Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) as the base classifiers and Support Vector Machine (SVM) as the meta-classifier. Evaluation on a real-world dataset shows that our approach achieves, on average, 65.77% precision, 68.18% recall, and a 65.84% F1-score. The experimental results also demonstrate that combining the semantic and structural information of code snippets improves the effectiveness of our approach.
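To make the two feature families concrete, the following minimal sketch extracts leaf-to-leaf AST paths (semantic information) and a few simple counts (structural information) from a single method. It is an illustration under stated assumptions, not the authors' implementation: Python's standard `ast` module stands in for a Java parser, the path notation (`^` for moving up, `_` for moving down) is invented here, and the three metrics are rough proxies for size, complexity, and coupling rather than the metric set used in the paper. All function names (`collect_leaves`, `path_contexts`, `structural_metrics`) are hypothetical.

```python
import ast
from itertools import combinations

# Node types treated as terminals (leaves); this choice is an assumption of the sketch.
TERMINALS = (ast.Name, ast.arg, ast.Constant)
# Node types counted as decision points for a rough complexity proxy.
BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def token_of(leaf):
    """Return the source token carried by a terminal node."""
    if isinstance(leaf, ast.Name):
        return leaf.id
    if isinstance(leaf, ast.arg):
        return leaf.arg
    return repr(leaf.value)


def collect_leaves(node, chain=()):
    """Return one root-to-leaf chain of nodes for every terminal below `node`."""
    chain = chain + (node,)
    if isinstance(node, TERMINALS):
        return [chain]
    leaves = []
    for child in ast.iter_child_nodes(node):
        leaves.extend(collect_leaves(child, chain))
    return leaves


def path_contexts(func, max_nodes=8):
    """Return (token, path, token) triples for pairs of leaves in `func`."""
    contexts = []
    for a, b in combinations(collect_leaves(func), 2):
        k = 0
        while a[k] is b[k]:  # walk down the shared prefix of both chains
            k += 1
        up = [type(n).__name__ for n in reversed(a[k:])]
        down = [type(n).__name__ for n in b[k:]]
        lca = type(a[k - 1]).__name__  # lowest common ancestor of the two leaves
        if len(up) + 1 + len(down) <= max_nodes:
            path = "^".join(up + [lca]) + "_" + "_".join(down)
            contexts.append((token_of(a[-1]), path, token_of(b[-1])))
    return contexts


def structural_metrics(func):
    """Return rough proxies for the size, complexity, and coupling of `func`."""
    nodes = list(ast.walk(func))
    return {
        "loc": func.end_lineno - func.lineno + 1,
        "complexity": 1 + sum(isinstance(n, BRANCHES) for n in nodes),
        "calls": sum(isinstance(n, ast.Call) for n in nodes),
    }


SOURCE = '''
def total(prices, rate):
    result = 0
    for p in prices:
        if p > 0:
            result = result + p * rate
    return round(result)
'''

func = ast.parse(SOURCE).body[0]
print(structural_metrics(func))
print(path_contexts(func)[:3])
```

In a pipeline of the kind the abstract describes, the path triples would typically be embedded into a fixed-length vector and concatenated with the metric values before being passed to the classifiers; that step is omitted here.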

Data availability

The data for this study are openly available on GitHub at https://github.com/HduDBSI/SQJ-TD-Severity.

Notes

  1. https://github.com/HduDBSI/SQJ-TD-Severity.

  2. https://sourceforge.net.

  3. https://github.com.

  4. https://www.sonarqube.org.

  5. https://pmd.github.io.

  6. https://javaparser.org.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 62372145 and 61902096, the Natural Science Foundation of Zhejiang Province under Grant LY21F020020, and the Key Research and Development Program of Zhejiang Province under Grants 2023C03200 and 2023C03179.

Author information

Contributions

Dongjin Yu: conceptualization, methodology. Sicheng Li: data curation, methodology, writing (original draft), software. Xin Chen: validation, reviewing. Tian Sun: data curation, investigation.

Corresponding author

Correspondence to Dongjin Yu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, D., Li, S., Chen, X. et al. Identifying the severity of technical debt issues based on semantic and structural information. Software Qual J 31, 1499–1526 (2023). https://doi.org/10.1007/s11219-023-09651-3

  • DOI: https://doi.org/10.1007/s11219-023-09651-3
