Automated detection of class diagram smells using self-supervised learning

Alazba, Amal; Aljamaan, Hamoud; Alshayeb, Mohammad

doi:10.1007/s10515-024-00429-w

Published: 24 March 2024

Automated Software Engineering, Volume 31, article number 29 (2024)

Amal Alazba^1,2,
Hamoud Aljamaan^1,3 &
Mohammad Alshayeb^1,4

Abstract

Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.
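The masked-keyword proxy task mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword vocabulary, the `[MASK]` token, the whitespace-tokenized diagram text, and the `mask_keywords` helper are all assumptions made for illustration, since the abstract does not specify how class diagrams are serialized or which keywords are masked. The sketch only shows how (masked input, label) training pairs might be built without manual annotation.

```python
import random

# Hypothetical keyword vocabulary (an assumption; the abstract does not list the keywords).
KEYWORDS = {"class", "attribute", "operation", "extends", "association"}
MASK = "[MASK]"  # assumed mask token, in the style of BERT-like models


def mask_keywords(tokens, mask_prob=0.5, seed=0):
    """Replace keyword tokens with MASK with probability mask_prob.

    Returns (masked_tokens, labels). labels[i] is the original keyword
    at each masked position and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if tok in KEYWORDS and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# A made-up textual rendering of a tiny class diagram.
diagram = "class Order extends Entity attribute total operation submit".split()
masked, labels = mask_keywords(diagram, mask_prob=1.0)
print(masked)
print(labels)
```

A model trained on such pairs would be asked to predict the label at each masked position. The labels come from the diagram text itself, which is what makes the task self-supervised.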

Acknowledgements

The authors acknowledge the support of King Fahd University of Petroleum and Minerals in the development of this work.

Author information

Authors and Affiliations

Information and Computer Science Department, King Fahd University of Petroleum and Minerals, 31261, Dhahran, Saudi Arabia
Amal Alazba, Hamoud Aljamaan & Mohammad Alshayeb
Department of Information Systems, King Saud University, 11362, Riyadh, Saudi Arabia
Amal Alazba
Interdisciplinary Research Center for Finance and Digital Economy, 31261, Dhahran, Saudi Arabia
Hamoud Aljamaan
Interdisciplinary Research Center for Intelligent Secure Systems, 31261, Dhahran, Saudi Arabia
Mohammad Alshayeb

Contributions

AA wrote the main manuscript text. HA and MA edited and reviewed the manuscript.

Corresponding author

Correspondence to Mohammad Alshayeb.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alazba, A., Aljamaan, H. & Alshayeb, M. Automated detection of class diagram smells using self-supervised learning. Autom Softw Eng 31, 29 (2024). https://doi.org/10.1007/s10515-024-00429-w

Received: 16 September 2023
Accepted: 02 March 2024
Published: 24 March 2024
DOI: https://doi.org/10.1007/s10515-024-00429-w
