CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection

Empirical Software Engineering

Abstract

Context

Code smell detection is the process of identifying poorly designed and implemented code pieces. Machine learning-based approaches require enormous amounts of manually labeled data, which are costly and difficult to scale. Unsupervised semantic feature learning, or learning without manual annotation, is vital for effectively harvesting an enormous amount of available data.

Objective

The objective of this study is to propose a new code smell detection approach that utilizes self-supervised learning to learn intermediate representations without the need for labels and then fine-tune these representations on multiple tasks.

Method

We propose Code Representation with Transformers (CoRT), which learns the semantic and structural features of source code by training transformers to predict the reserved words that have been masked in the input code. We empirically demonstrated that this proxy task provides a powerful method for learning semantic and structural features. We exhaustively evaluated our approach on four downstream tasks: detection of the Data Class, God Class, Feature Envy, and Long Method code smells. Moreover, we compared our results with those of two paradigms: supervised learning and a feature-based approach. Finally, we conducted a cross-project experiment to evaluate the generalizability of our method to unseen labeled data.
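The proxy task described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the `[MASK]` token, the short reserved-word list, and the choice to mask every reserved word (rather than a random subset) are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Illustrative subset of Java reserved words (assumption: the paper's exact
# vocabulary is not given in the abstract).
RESERVED = {"public", "private", "class", "static", "void", "int", "return",
            "if", "else", "for", "while", "new", "this"}

MASK = "[MASK]"  # hypothetical mask token


def tokenize(code):
    """Split code into identifiers/keywords, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def mask_reserved_words(tokens):
    """Replace each reserved word with MASK and record what was hidden.

    Returns (masked_tokens, labels), where labels maps a token position to
    the original reserved word that a model would be trained to predict.
    """
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if tok in RESERVED:
            masked.append(MASK)
            labels[i] = tok
        else:
            masked.append(tok)
    return masked, labels


snippet = "public int getX() { return this.x; }"
tokens = tokenize(snippet)
masked, labels = mask_reserved_words(tokens)
print(" ".join(masked))
print(labels)
```

The masked sequence would be the model input and `labels` the prediction targets, so no manual annotation is needed to build the training signal.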

Results

The results indicate that the proposed method achieves high detection performance for code smells. For instance, on Data Class, CoRT achieved an F1 score of 88.08–99.4, an Area Under Curve (AUC) of 89.62–99.88, and a Matthews Correlation Coefficient (MCC) of 75.28–98.8; on God Class, it achieved an F1 score of 86.32–99.03, an AUC of 92.1–99.85, and an MCC of 76.15–98.09. Compared with the baseline model and the feature-based approach, CoRT achieved better detection performance and showed a high capability to detect code smells in unseen datasets.
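For readers unfamiliar with the reported metrics, the following sketch shows the standard definitions of F1 and MCC computed from binary confusion-matrix counts. The counts in the example are hypothetical and are not taken from the paper; the sketch returns values on a 0–1 scale, whereas the figures above appear to be expressed as percentages (an assumption). AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a smell detector evaluated on 100 classes.
tp, tn, fp, fn = 45, 40, 10, 5
print(f"F1  = {f1_score(tp, fp, fn):.4f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike F1, MCC also accounts for true negatives, which is why it is often reported alongside F1 on imbalanced smell datasets.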

Conclusions

The proposed method is effective in detecting both class-level and method-level code smells.

Data Availability

The datasets generated and/or analyzed in the current study are available upon acceptance at the GitHub repository: https://github.com/amalazba/CoRT-Transformer-based-Code-Representations-with-Self-supervision-for-Code-Smell-Detection.

Notes

  1. https://scale.com/pricing

  2. https://github.com/amalazba/CoRT-Transformer-based-Code-Representations-with-Self-supervision-for-Code-Smell-Detection

  3. https://zenodo.org/record/3992730#.YvsvfuxBxEI

  4. https://zenodo.org/record/6555241#.YvssBuxByi4

  5. https://zenodo.org/record/4103861#.Yvsyu-xByi4

  6. https://www.tensorflow.org/

Acknowledgements

The authors acknowledge the support of the King Fahd University of Petroleum and Minerals in the development of this work.

Author information

Corresponding author

Correspondence to Amal Alazba.

Ethics declarations

Conflicts of Interests/Competing Interests

The authors declare no conflict of interest relevant to the content of this article.

Additional information

Communicated by: Andrea De Lucia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1: Cross-project Performance Heatmap (Accuracy, Precision, Recall)

See Figs. 12, 13, 14 and 15.

Fig. 12 Cross-project performance heatmap for Data Class

Fig. 13 Cross-project performance heatmap for God Class

Fig. 14 Cross-project performance heatmap for Feature Envy

Fig. 15 Cross-project performance heatmap for Long Method

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alazba, A., Aljamaan, H. & Alshayeb, M. CoRT: Transformer-based code representations with self-supervision by predicting reserved words for code smell detection. Empir Software Eng 29, 59 (2024). https://doi.org/10.1007/s10664-024-10445-9

  • DOI: https://doi.org/10.1007/s10664-024-10445-9
