Abstract
The first part of this paper describes an automatic reverse engineering process that infers subsystem abstractions useful for a variety of software maintenance activities. The process clusters the graph of modules and module-level dependencies found in the source code into subsystems, abstract structures that do not appear in the source code itself. The clustering process uses evolutionary algorithms to search the enormous set of possible graph partitions, guided by a fitness function designed to measure the quality of individual partitions. The second part of the paper evaluates the results produced by our clustering technique. Our previous research has shown, through both qualitative and quantitative studies, that the technique produces good results quickly and consistently. Here we study the underlying structure of the search space of several open source systems. We also report findings that our analysis uncovered by comparing random graphs to graphs representing real software systems.
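To make the idea of searching graph partitions under a fitness function concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple cohesion-versus-coupling fitness (each cluster scores 2*intra / (2*intra + inter), which may differ from the function used in the paper) and swaps in a plain hill climber in place of the evolutionary algorithms the abstract mentions. The names `fitness` and `hill_climb` are hypothetical.

```python
import random


def fitness(edges, assignment):
    """Assumed fitness: sum over clusters of 2*intra / (2*intra + inter).

    `edges` is a list of (source, target) module dependencies and
    `assignment` maps each module to a cluster id. Clusters with no
    internal edges contribute 0.
    """
    intra = {}
    inter = {}
    for src, dst in edges:
        a, b = assignment[src], assignment[dst]
        if a == b:
            intra[a] = intra.get(a, 0) + 1
        else:
            # An edge crossing clusters penalizes both of them.
            inter[a] = inter.get(a, 0) + 1
            inter[b] = inter.get(b, 0) + 1
    return sum(2 * mu / (2 * mu + inter.get(c, 0)) for c, mu in intra.items())


def hill_climb(modules, edges, seed=0):
    """Move one module at a time to the cluster that most improves fitness.

    Stops at a local optimum, which need not be the best partition.
    """
    rng = random.Random(seed)
    n = len(modules)
    # Random starting partition; at most n clusters are ever needed.
    assignment = {m: rng.randrange(n) for m in modules}
    best = fitness(edges, assignment)
    improved = True
    while improved:
        improved = False
        for m in modules:
            keep = assignment[m]
            for c in range(n):
                if c == keep:
                    continue
                assignment[m] = c
                score = fitness(edges, assignment)
                if score > best + 1e-12:
                    best, keep, improved = score, c, True
            assignment[m] = keep  # settle on the best cluster found for m
    return assignment, best


if __name__ == "__main__":
    # Two triangles of modules joined by a single dependency (c -> d).
    modules = ["a", "b", "c", "d", "e", "f"]
    edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
    partition, score = hill_climb(modules, edges)
    print(partition, round(score, 3))
```

Under this assumed fitness, grouping the two triangles separately scores 12/7, while placing all six modules in one cluster scores 1, so the function rewards the intuitive two-subsystem decomposition. A hill climber can still stop at a weaker local optimum, which is one reason to study the structure of the search space.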
About this article
Cite this article
Mitchell, B.S., Mancoridis, S. On the evaluation of the Bunch search-based software modularization algorithm. Soft Comput 12, 77–93 (2008). https://doi.org/10.1007/s00500-007-0218-3
DOI: https://doi.org/10.1007/s00500-007-0218-3