Software Remodularization by Estimating Structural and Conceptual Relations Among Classes and Using Hierarchical Clustering

  • Amit Rathee
  • Jitender Kumar Chhabra
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 712)


In this paper, we present a technique for software remodularization that estimates the conceptual similarity among software elements (classes). The proposed technique combines structural and semantic coupling measurements to obtain more accurate coupling measures. For the semantic side, it uses lexical information extracted from six main parts of a class's source code: comments, class names, attribute names, method signatures, parameter names, and the statements in method bodies. For the structural side, it counts the member functions of other classes that a given class uses. Structural coupling among classes is measured using the information-flow-based coupling (ICP) metric, and conceptual coupling is measured by tokenizing the source code and calculating cosine similarity. Classes are then grouped using Hierarchical Agglomerative Clustering (HAC). The proposed technique is tested on three standard open-source Java software systems. The results encourage remodularization, showing higher accuracy when measured against the corresponding gold standard for each system.
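The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the structural coupling values are supplied as already-normalized numbers rather than computed with ICP, the two measures are combined by a weighted sum with a hypothetical weight `alpha`, the clustering uses average linkage with a hypothetical stopping `threshold`, and stop-word removal and stemming are omitted. The toy class names and texts are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, breaking camelCase identifiers."""
    return [w.lower() for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(x, y, lexicon, structural, alpha=0.5):
    """Weighted sum of structural and conceptual coupling (assumed combination)."""
    conceptual = cosine_similarity(lexicon[x], lexicon[y])
    struct = structural.get(frozenset((x, y)), 0.0)
    return alpha * struct + (1 - alpha) * conceptual


def cluster(names, similarity, threshold):
    """Average-linkage agglomerative clustering; stops when no pair reaches `threshold`."""
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        total = sum(similarity(x, y) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Invented toy input: lexical text per class (names, comments, identifiers).
source_text = {
    "Account": "class Account balance deposit withdraw getBalance",
    "Savings": "class SavingsAccount balance interest deposit getBalance",
    "Parser": "class XmlParser parse token readToken",
    "Lexer": "class Lexer token nextToken readToken",
}
lexicon = {name: tokenize(text) for name, text in source_text.items()}

# Assumed structural coupling, already normalized to [0, 1] (stand-in for ICP).
structural = {
    frozenset(("Account", "Savings")): 0.8,
    frozenset(("Parser", "Lexer")): 0.9,
}

modules = cluster(
    list(lexicon),
    lambda x, y: combined_similarity(x, y, lexicon, structural),
    threshold=0.3,
)
print(modules)
```

On this toy input the two banking classes end up in one module and the two parsing classes in another, because each pair shares both vocabulary and structural dependencies while the cross-pair similarity stays below the threshold. In practice the weight and the threshold would need tuning per system.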



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Computer Engineering, National Institute of Technology, Kurukshetra, India
