Search-Based Cost-Effective Software Remodularization

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Software modularization is a technique used to divide a software system into independent modules (packages) that are expected to be cohesive and loosely coupled. However, as software systems evolve over time to meet new requirements, their modularizations become complex and gradually lose their quality. Automatically optimizing the distribution of classes among packages, also known as remodularization, is therefore challenging. To address this challenge, we introduce a new approach that optimizes software modularization by moving classes to more suitable packages. In addition to improving design quality and preserving semantic coherence, our approach treats refactoring effort as an objective in its own right. We adapt the Elitist Non-dominated Sorting Genetic Algorithm (NSGA-II) of Deb et al. to find the best sequence of refactorings that 1) maximizes structural quality, 2) maximizes the semantic cohesiveness of packages (evaluated by a semantic measure based on WordNet), and 3) minimizes the refactoring effort. We report the results of an evaluation of our approach on open-source projects, and we show that our proposal is able to produce a coherent and useful sequence of recommended refactorings, both in terms of quality metrics and from the developers’ point of view.
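To make the three-objective formulation concrete, the following is a minimal sketch of the Pareto-dominance test and first-front extraction on which NSGA-II-style selection relies. It is not the implementation evaluated in this paper: the `Candidate` type, the objective scales, the use of a move count as the effort measure, and all sample scores are assumptions chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical remodularization candidate with its three objective scores."""
    name: str
    structural_quality: float  # higher is better (assumed scale 0..1)
    semantic_cohesion: float   # higher is better (assumed scale 0..1)
    effort: int                # lower is better (assumed: number of class moves)


def as_minimization(c: Candidate) -> tuple:
    # Negate the maximized objectives so every component is "lower is better".
    return (-c.structural_quality, -c.semantic_cohesion, c.effort)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on all objectives and strictly better on one."""
    fa, fb = as_minimization(a), as_minimization(b)
    no_worse = all(x <= y for x, y in zip(fa, fb))
    strictly_better = any(x < y for x, y in zip(fa, fb))
    return no_worse and strictly_better


def first_front(pop: List[Candidate]) -> List[Candidate]:
    """Candidates that no other candidate dominates (the first Pareto front)."""
    return [c for c in pop if not any(dominates(o, c) for o in pop if o is not c)]


# Illustrative scores only; they are not results from the paper.
population = [
    Candidate("A", 0.80, 0.60, 12),
    Candidate("B", 0.70, 0.70, 5),
    Candidate("C", 0.65, 0.55, 9),   # worse than B on every objective
    Candidate("D", 0.80, 0.60, 15),  # same quality as A, but more effort
]
front_names = [c.name for c in first_front(population)]
print(front_names)
```

In this sketch, candidates A and B are mutually non-dominated, since each wins on at least one objective, so both would be offered to the developer as trade-offs between quality and effort. A full NSGA-II additionally ranks the remaining fronts and applies crowding-distance selection, which is omitted here.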

References

  1. Lehman M M. On understanding laws, evolution, and conservation in the large-program life cycle. Journal of Systems and Software, 1984, 1: 213-221.

  2. Eick S G, Graves T L, Karr A F, Marron J S, Mockus A. Does code decay? Assessing the evidence from change management data. IEEE Transactions on Software Engineering, 2001, 27(1): 1-12.

  3. Lanza M, Marinescu R. Object-oriented Metrics in Practice: Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-Oriented Systems. Springer-Verlag Berlin Heidelberg, 2006.

  4. Fowler M, Beck K, Brant J, Opdyke W, Roberts D. Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, 1999.

  5. Harman M, Hierons R M, Proctor M. A new representation and crossover operator for search-based optimization of software modularization. In Proc. the 4th Annual Conference on Genetic and Evolutionary Computation, July 2002, pp.1351-1358.

  6. Mitchell B S, Mancoridis S. On the automatic modularization of software systems using the bunch tool. IEEE Transactions on Software Engineering, 2006, 32(3): 193-208.

  7. Seng O, Bauer M, Biehl M, Pache G. Search-based improvement of subsystem decompositions. In Proc. the 7th Annual Conference on Genetic and Evolutionary Computation, June 2005, pp.1045-1051.

  8. Bavota G, de Lucia A, Marcus A, Oliveto R. Software remodularization based on structural and semantic metrics. In Proc. the 17th Working Conference on Reverse Engineering, October 2010, pp.195-204.

  9. Harman M, Tratt L. Pareto optimal search based refactoring at the design level. In Proc. the 9th Annual Conference on Genetic and Evolutionary Computation, July 2007, pp.1106-1113.

  10. Bavota G, Carnevale F, de Lucia A, di Penta M, Oliveto R. Putting the developer in-the-loop: An interactive GA for software re-modularization. In Proc. the 4th International Symposium on Search Based Software Engineering, September 2012, pp.75-89.

  11. Bavota G, de Lucia A, Marcus A, Oliveto R. Using structural and semantic measures to improve software modularization. Empirical Software Engineering, 2013, 18(5): 901-932.

  12. Bavota G, Gethers M, Oliveto R, Poshyvanyk D, de Lucia A. Improving software modularization via automated analysis of latent topics and dependencies. ACM Transactions on Software Engineering and Methodology, 2014, 23(1): Article No. 4.

  13. Mkaouer M W, Kessentini M, Shaout A, Koligheu P, Bechikh S, Deb K, Ouni A. Many-objective software remodularization using NSGA-III. ACM Transactions on Software Engineering and Methodology, 2015, 24(3): Article No. 17.

  14. Abdeen H, Ducasse S, Sahraoui H, Alloui I. Automatic package coupling and cycle minimization. In Proc. the 16th Working Conference on Reverse Engineering, October 2009, pp.103-112.

  15. Palomba F, Tufano M, Bavota G, Oliveto R, Marcus A, Poshyvanyk D, de Lucia A. Extract package refactoring in ARIES. In Proc. the 37th IEEE/ACM International Conference on Software Engineering, Volume 2, May 2015, pp.669-672.

  16. Doval D, Mancoridis S, Mitchell B S. Automatic clustering of software systems using a genetic algorithm. In Proc. the 9th International Workshop on Software Technology and Engineering Practice, September 1999, pp.73-81.

  17. Paixao M, Harman M, Zhang Y, Yu Y. An empirical study of cohesion and coupling: Balancing optimization and disruption. IEEE Transactions on Evolutionary Computation, 2018, 22(3): 394-414.

  18. Ouni A, Kessentini M, Sahraoui H, Inoue K, Deb K. Multicriteria code refactoring using search-based software engineering: An industrial case study. ACM Transactions on Software Engineering and Methodology, 2016, 25(3): Article No. 23.

  19. Maqbool O, Babri H. Hierarchical clustering for software architecture recovery. IEEE Transactions on Software Engineering, 2007, 33(11): 759-780.

  20. Candela I, Bavota G, Russo B, Oliveto R. Using cohesion and coupling for software remodularization: Is it enough? ACM Transactions on Software Engineering and Methodology, 2016, 25(3): Article No. 24.

  21. Corazza A, di Martino S, Maggio V, Scanniello G. Investigating the use of lexical information for software system clustering. In Proc. the 15th European Conference on Software Maintenance and Reengineering, March 2011, pp.35-44.

  22. Hall M, Khojaye M A, Walkinshaw N, McMinn P. Establishing the source code disruption caused by automated remodularisation tools. In Proc. the IEEE International Conference on Software Maintenance and Evolution, September 2014, pp.466-470.

  23. Abdeen H, Sahraoui H, Shata O, Anquetil N, Ducasse S. Towards automatically improving package structure while respecting original design decisions. In Proc. the 20th Working Conference on Reverse Engineering, October 2013, pp.212-221.

  24. Ouni A, Kessentini M, Sahraoui H, Boukadoum M. Maintainability defects detection and correction: A multiobjective approach. Automated Software Engineering, 2013, 20(1): 47-79.

  25. Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 2002, 6(2): 182-197.

  26. Praditwong K, Harman M, Yao X. Software module clustering as a multi-objective search problem. IEEE Transactions on Software Engineering, 2011, 37(2): 264-282.

  27. Vallée-Rai R, Gagnon E, Hendren L, Lam P, Pominville P, Sundaresan V. Optimizing Java bytecode using the soot framework: Is it feasible? In Proc. the 9th International Conference on Compiler Construction, March 2000, pp.18-34.

  28. Farrugia A. Vertex-partitioning into fixed additive induced-hereditary properties is NP-hard. The Electronic Journal of Combinatorics, 2004, 11(1): R46.

  29. Jiang J J, Conrath D W. Semantic similarity based on corpus statistics and lexical taxonomy. In Proc. the 10th International Conference Research on Computational Linguistics, March 1997, pp.19-33.

  30. Brooks R. Towards a theory of the comprehension of computer programs. International Journal of Man-Machine Studies, 1983, 18(6): 543-554.

  31. Merlo E, McAdam I, de Mori R. Feed-forward and recurrent neural networks for source code informal information analysis. Journal of Software Maintenance: Research and Practice, 2003, 15(4): 205-244.

  32. Caprile C, Tonella P. Nomen est omen: Analyzing the language of function identifiers. In Proc. the 6th Working Conference on Reverse Engineering, October 1999, pp.112-122.

  33. Lawrie D, Morrell C, Feild H, Binkley D. What’s in a name? A study of identifiers. In Proc. the 14th IEEE International Conference on Program Comprehension, June 2006, pp.3-12.

  34. Poshyvanyk D, Marcus A. The conceptual coupling metrics for object-oriented systems. In Proc. the 22nd IEEE International Conference on Software Maintenance, September 2006, pp.469-478.

  35. Gethers M, Poshyvanyk D. Using relational topic models to capture coupling among classes in object-oriented software systems. In Proc. the 26th IEEE International Conference on Software Maintenance, September 2010, pp.1-10.

  36. Arnaoudova V, Eshkevari L M, di Penta M, Oliveto R, Antoniol G, Guéhéneuc Y G. REPENT: Analyzing the nature of identifier renamings. IEEE Transactions on Software Engineering, 2014, 40(5): 502-532.

  37. Arnaoudova V, di Penta M, Antoniol G. Linguistic antipatterns: What they are and how developers perceive them. Empirical Software Engineering, 2016, 21(1): 104-158.

  38. Seco N, Veale T, Hayes J. An intrinsic information content metric for semantic similarity in WordNet. In Proc. the 16th European Conference on Artificial Intelligence, August 2004, pp.1089-1090.

  39. Budanitsky A, Hirst G. Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures. In Proc. Workshop on WordNet and Other Lexical Resources, Second Meeting of the North American Chapter of the Association for Computational Linguistics, Volume 2, June 2001, pp.24-29.

  40. Lin D. An information-theoretic definition of similarity. In Proc. the 15th International Conference on Machine Learning, July 1998, pp.296-304.

  41. Resnik P. Using information content to evaluate semantic similarity in a taxonomy. In Proc. the 14th International Joint Conference on Artificial Intelligence, August 1995, pp.448-453.

  42. Deb K, Jain H. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 2014, 18(4): 577-601.

  43. Wen Z, Tzerpos V. An effectiveness measure for software clustering algorithms. In Proc. the 12th IEEE International Workshop on Program Comprehension, July 2004, pp.194-203.

  44. Kuhn A, Ducasse S, Gîrba T. Semantic clustering: Identifying topics in source code. Information & Software Technology, 2007, 49(3): 230-243.

  45. Sahraoui H A, Godin R, Miceli T. Can metrics help to bridge the gap between the improvement of OO design quality and its automation? In Proc. the 8th International Conference on Software Maintenance, October 2000, pp.154-162.

  46. Kessentini M, Mahaouachi R, Ghedira K. What you like in design use to correct bad-smells. Software Quality Journal, 2013, 21(4): 551-571.

  47. Bavota G, Oliveto R, Gethers M, Poshyvanyk D, de Lucia A. Methodbook: Recommending move method refactorings via relational topic models. IEEE Transactions on Software Engineering, 2014, 40(7): 671-694.

  48. Tsantalis N, Chatzigeorgiou A. Identification of move method refactoring opportunities. IEEE Transactions on Software Engineering, 2009, 35(3): 347-367.

  49. Oliveto R, Gethers M, Bavota G, Poshyvanyk D, de Lucia A. Identifying method friendships to remove the feature envy bad smell: NIER track. In Proc. the 33rd International Conference on Software Engineering, May 2011, pp.820-823.

  50. Lee J, Lee D, Kim D K, Park S. A semantic-based approach for detecting and decomposing god classes. arXiv: 1204.1967, 2012. https://arxiv.org/pdf/1204.1967.pdf, Sept. 2018.

Author information

Corresponding author

Correspondence to Rim Mahouachi.

Cite this article

Mahouachi, R. Search-Based Cost-Effective Software Remodularization. J. Comput. Sci. Technol. 33, 1320–1336 (2018). https://doi.org/10.1007/s11390-018-1892-6

  • DOI: https://doi.org/10.1007/s11390-018-1892-6
