Abstract
Large-scale software systems must be decomposed into modular units to reduce maintenance effort. Software Architecture Recovery (SAR) approaches have been introduced to analyze dependencies among software modules and automatically cluster them to achieve high modularity. These approaches employ various types of clustering algorithms. In this paper, we discuss design decisions and variations in existing genetic algorithms devised for SAR. We present a novel hybrid genetic algorithm that differs from these algorithms in three major ways. First, it employs a greedy heuristic algorithm to automatically determine the number of clusters and to enrich the randomly generated initial population. Second, it uses a different solution representation that facilitates an arithmetic crossover operator. Third, it is hybridized with a heuristic that improves solutions in each iteration. We present an empirical evaluation with seven real systems as experimental objects. We compare the effectiveness of our algorithm against a baseline and state-of-the-art hybrid genetic algorithms. Our algorithm outperforms the others in maximizing the modularity of the obtained clusters.
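The abstract mentions a solution representation that facilitates an arithmetic crossover operator. As a rough illustration only, the sketch below shows the textbook form of arithmetic crossover on real-valued chromosomes, where each child gene is a weighted average of the parents' genes. The real-valued encoding, the `decode` helper, and all function and parameter names are assumptions made for this example; they are not taken from the paper and may differ from the representation the authors use.

```python
from typing import List


def arithmetic_crossover(parent_a: List[float],
                         parent_b: List[float],
                         alpha: float) -> List[float]:
    """Textbook arithmetic crossover: child[i] = alpha*a[i] + (1 - alpha)*b[i]."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]


def decode(chromosome: List[float], num_clusters: int) -> List[int]:
    """Hypothetical decoding (an assumption, not the paper's scheme):
    map each gene in [0, 1] to a cluster index in 0..num_clusters-1."""
    return [min(int(gene * num_clusters), num_clusters - 1) for gene in chromosome]


# One gene per software module; the value encodes its cluster assignment.
parent_a = [0.0, 1.0, 0.5]
parent_b = [1.0, 0.0, 0.5]
child = arithmetic_crossover(parent_a, parent_b, alpha=0.5)
labels = decode(child, num_clusters=3)
```

Because the child is a convex combination of its parents, every gene stays within the range spanned by the parents' genes, so a decoding step of this kind always yields a valid cluster index.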
Funding
This work is supported by the Scientific and Research Council of Turkey under Grant No. 120E488.
Author information
Contributions
ME, AE, OOO and HS contributed to the conceptualization of the approach and development of the methodology. ME, AS and MES implemented the algorithms, conducted the experiments, and collected the results. All authors contributed to the writing of the original draft, and they all reviewed the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Elyasi, M., Simitcioğlu, M.E., Saydemir, A. et al. Genetic algorithms and heuristics hybridized for software architecture recovery. Autom Softw Eng 30, 19 (2023). https://doi.org/10.1007/s10515-023-00384-y