CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection

Alazba, Amal; Aljamaan, Hamoud; Alshayeb, Mohammad

doi:10.1007/s10664-024-10445-9

Published: 08 April 2024

Volume 29, article number 59 (2024)

Empirical Software Engineering

Abstract

Context

Code smell detection is the process of identifying poorly designed and implemented pieces of code. Machine learning-based approaches require enormous amounts of manually labeled data, which is costly to produce and difficult to scale. Unsupervised semantic feature learning, that is, learning without manual annotation, is vital for making effective use of the enormous amount of unlabeled data that is available.

Objective

The objective of this study is to propose a new code smell detection approach that utilizes self-supervised learning to learn intermediate representations without the need for labels and then fine-tune these representations on multiple tasks.

Method

We propose Code Representation with Transformers (CoRT), which learns the semantic and structural features of source code by training transformers to predict reserved words that have been masked in the input code. We empirically demonstrate that this proxy task provides a powerful method for learning semantic and structural features. We exhaustively evaluate our approach on four downstream tasks: detection of the Data Class, God Class, Feature Envy, and Long Method code smells. Moreover, we compare our results with those of two paradigms: supervised learning and a feature-based approach. Finally, we conduct a cross-project experiment to evaluate the generalizability of our method to unseen labeled data.
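
The proxy task described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the reserved-word list is an illustrative subset of Java keywords, and the `[MASK]` token, the regex tokenizer, and the `mask_prob` parameter are all assumptions introduced here, since the abstract does not specify them. The function hides reserved words in a snippet and records the hidden words as prediction targets, which is the kind of (input, label) pair a transformer could be trained on without manual annotation.

```python
import random
import re

# Illustrative subset of Java reserved words (assumption: the exact
# vocabulary used in the paper is not stated in the abstract).
JAVA_RESERVED = {
    "abstract", "boolean", "break", "case", "catch", "class", "continue",
    "do", "else", "extends", "final", "for", "if", "implements", "import",
    "int", "new", "private", "protected", "public", "return", "static",
    "this", "throw", "try", "void", "while",
}

MASK = "[MASK]"  # hypothetical mask token

# Matches an identifier/keyword, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\S")


def mask_reserved_words(code, mask_prob=1.0, seed=0):
    """Replace reserved words with MASK and return (masked_tokens, labels).

    labels[i] holds the original reserved word where a mask was applied
    and None everywhere else, so only masked positions are predicted.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in TOKEN_RE.findall(code):
        if tok in JAVA_RESERVED and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


snippet = "public int getAge() { return this.age; }"
masked, labels = mask_reserved_words(snippet)
print(" ".join(masked))
print([label for label in labels if label is not None])
```

With `mask_prob=1.0` every reserved word in the snippet is hidden; a lower value would mask only a random share of them, which is a design choice this sketch leaves open.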

Results

The results indicate that the proposed method achieves high detection performance for code smells. For instance, on Data Class, CoRT achieved an F1 score of 88.08–99.4, an Area Under the Curve (AUC) of 89.62–99.88, and a Matthews Correlation Coefficient (MCC) of 75.28–98.8, while on God Class it achieved an F1 score of 86.32–99.03, an AUC of 92.1–99.85, and an MCC of 76.15–98.09. Compared with the baseline model and the feature-based approach, CoRT achieved better detection performance and showed a strong ability to detect code smells in unseen datasets.
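
For readers less familiar with the three metrics named above, the following Python sketch computes them from their standard definitions for a binary smell detector. The labels, scores, and the 0.5 decision threshold are made-up toy values, not data from the study, and the functions return values on a 0 to 1 scale (the ranges quoted above appear to be on a 0 to 100 scale, which is an assumption on our part).

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from confusion counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = smelly, 0 = clean (hypothetical values).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

print(f1_score(tp, fp, fn), auc(labels, scores), mcc(tp, tn, fp, fn))
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside F1 when the smelly and clean classes are imbalanced.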

Conclusions

The proposed method has been shown to be effective in detecting class-level and method-level code smells.

Data Availability

The datasets generated and/or analyzed in the current study are available upon acceptance at the GitHub repository: https://github.com/amalazba/CoRT-Transformer-based-Code-Representations-with-Self-supervision-for-Code-Smell-Detection.

Acknowledgements

The authors acknowledge the support of the King Fahd University of Petroleum and Minerals in the development of this work.

Author information

Authors and Affiliations

Information and Computer Science Department, King Fahd University of Petroleum and Minerals, 31261, Dhahran, Saudi Arabia
Amal Alazba, Hamoud Aljamaan & Mohammad Alshayeb
Department of Information Systems, King Saud University, 11362, Riyadh, Saudi Arabia
Amal Alazba
Interdisciplinary Research Center for Finance and Digital Economy, 31261, Dhahran, Saudi Arabia
Hamoud Aljamaan
Interdisciplinary Research Center for Intelligent Secure Systems, 31261, Dhahran, Saudi Arabia
Mohammad Alshayeb

Authors

Amal Alazba
Hamoud Aljamaan
Mohammad Alshayeb

Corresponding author

Correspondence to Amal Alazba.

Ethics declarations

Conflicts of Interests/Competing Interests

The authors declare no conflict of interest relevant to the content of this article.

Additional information

Communicated by: Andrea De Lucia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1: Cross-project Performance Heatmap (Accuracy, Precision, Recall)

See Figs. 12, 13, 14 and 15.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alazba, A., Aljamaan, H. & Alshayeb, M. CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection. Empir Software Eng 29, 59 (2024). https://doi.org/10.1007/s10664-024-10445-9

Accepted: 08 January 2024
Published: 08 April 2024
DOI: https://doi.org/10.1007/s10664-024-10445-9
