Identifying the severity of technical debt issues based on semantic and structural information

Yu, Dongjin; Li, Sicheng; Chen, Xin; Sun, Tian

doi:10.1007/s11219-023-09651-3

Research
Published: 10 October 2023

Software Quality Journal, volume 31, pages 1499–1526 (2023)

Abstract

Technical debt (TD) arises when developers, favoring short-term benefit, settle for a compromise solution during design or architecture selection. TD-related issues, such as code smells, can critically affect important non-functional requirements, and issues of different severity call for different remediation measures. Existing studies mainly focus on detecting TD in software projects through source code or comments, but usually ignore the severity of TD issues, even though identifying severity is essential for deciding which TD should be prioritized. In this paper, we propose an approach that combines the semantic and structural information of code snippets to identify the severity of TD issues at the method level. We first transform each method affected by TD issues into an abstract syntax tree (AST) and use the paths in the AST to represent its semantic information. We then extract code metrics that measure the size, coupling, and complexity of these methods to represent their structural information. Finally, we build a stacking ensemble model that identifies the severity of TD issues, using Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) as the base classifiers and a Support Vector Machine (SVM) as the meta-classifier. Evaluation on a real-world dataset shows that our approach achieves, on average, a precision of 65.77%, a recall of 68.18%, and an F1-score of 65.84%. The experimental results also demonstrate that combining the semantic and structural information of code snippets improves the effectiveness of the approach.
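
To make the two kinds of features in the abstract concrete, the following minimal sketch extracts leaf-to-leaf AST paths (semantic information) and three simple metrics (structural information) from one method. It rests on several assumptions and is not the authors' implementation: Python's standard `ast` module stands in for the parser, the path notation (`^` for a step up, `_` for a step down) is illustrative, and the names `SOURCE`, `terminals`, `path_between`, `path_contexts`, and `structural_metrics` are hypothetical. The metric definitions (line count, branch count plus one, number of distinct callee names) are simplified proxies for size, complexity, and coupling.

```python
import ast
from itertools import combinations

# Hypothetical sample method; any function definition would do.
SOURCE = '''
def total_price(items, tax_rate):
    total = 0
    for item in items:
        if item.price > 0:
            total += round(item.price * item.qty, 2)
    return apply_tax(total, tax_rate)
'''

# Node types counted as decision points (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def terminals(node, trail=()):
    """Yield (token, root-to-leaf trail) for each identifier, parameter, or literal."""
    trail = trail + (node,)
    if isinstance(node, ast.Name):
        yield node.id, trail
    elif isinstance(node, ast.arg):
        yield node.arg, trail
    elif isinstance(node, ast.Constant):
        yield repr(node.value), trail
    else:
        for child in ast.iter_child_nodes(node):
            yield from terminals(child, trail)


def path_between(trail_a, trail_b):
    """Describe the path from leaf a up to the lowest common ancestor, then down to leaf b."""
    shared = 0
    while (shared < min(len(trail_a), len(trail_b))
           and trail_a[shared] is trail_b[shared]):
        shared += 1
    up = [type(n).__name__ for n in reversed(trail_a[shared:])]
    top = type(trail_a[shared - 1]).__name__  # lowest common ancestor
    down = [type(n).__name__ for n in trail_b[shared:]]
    return "^".join(up + [top]) + "".join("_" + name for name in down)


def path_contexts(func):
    """Return (start token, path, end token) for every pair of terminals."""
    leaves = list(terminals(func))
    return [(tok_a, path_between(tr_a, tr_b), tok_b)
            for (tok_a, tr_a), (tok_b, tr_b) in combinations(leaves, 2)]


def structural_metrics(func):
    """Compute simplified size, complexity, and coupling proxies for one method."""
    branches, callees = 0, set()
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            branches += 1
        elif isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Name):
                callees.add(target.id)
            else:
                callees.add(getattr(target, "attr", "<dynamic>"))
    return {
        "size_loc": func.end_lineno - func.lineno + 1,
        "complexity": 1 + branches,
        "coupling": len(callees),
    }


func = ast.parse(SOURCE).body[0]
contexts = path_contexts(func)
metrics = structural_metrics(func)
print(len(contexts), "path contexts, e.g.", contexts[0])
print(metrics)
```

In a pipeline of the kind the abstract describes, the path contexts and the metric vector would then be encoded and passed to the classifiers; that step is not shown here.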

Data availability

The data for this study are openly available on GitHub at https://github.com/HduDBSI/SQJ-TD-Severity.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 62372145 and 61902096, the Natural Science Foundation of Zhejiang Province under Grant LY21F020020, and the Key Research and Development Program of Zhejiang Province under Grants 2023C03200 and 2023C03179.

Author information

Authors and Affiliations

School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Dongjin Yu, Xin Chen & Tian Sun
HDU-ITMO Joint Institute, Hangzhou Dianzi University, Hangzhou, China
Sicheng Li

Contributions

Dongjin Yu: conceptualization, methodology. Sicheng Li: data curation, methodology, writing (original draft), software. Xin Chen: validation, reviewing. Tian Sun: data curation, investigation.

Corresponding author

Correspondence to Dongjin Yu.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Yu, D., Li, S., Chen, X. et al. Identifying the severity of technical debt issues based on semantic and structural information. Software Qual J 31, 1499–1526 (2023). https://doi.org/10.1007/s11219-023-09651-3

Accepted: 09 September 2023
Published: 10 October 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11219-023-09651-3
