Computational Optimization and Applications, Volume 64, Issue 3, pp 843–864

Column generation approaches for the software clustering problem

  • Hugo Harry Kramer
  • Eduardo Uchoa
  • Marcia Fampa
  • Viviane Köhler
  • François Vanderbeck


This work applies branch-and-price approaches to the automatic version of the Software Clustering Problem. To tackle this problem, we apply the Dantzig–Wolfe decomposition to a formulation from the literature. We then present two Column Generation (CG) approaches for solving the linear programming relaxation of the resulting reformulation: the standard CG approach and a new approach, which we call Staged Column Generation (SCG). We also propose a modification to the pricing subproblem that allows multiple columns to be added at each CG iteration. We test our algorithms on a set of 45 instances from the literature. The proposed approaches improved on the results in the literature, solving all of these instances to optimality. Furthermore, as the size of the instances grows, the SCG approach showed a considerable performance improvement over the standard CG in computational time, number of iterations, and number of generated columns.


Software Clustering Problem; Column Generation; Branch-and-Price



H. H. Kramer was financially supported by CNPq/CsF Grant No. 246661/2012-7 and by CAPES.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Hugo Harry Kramer (1)
  • Eduardo Uchoa (1)
  • Marcia Fampa (2), Email author
  • Viviane Köhler (3)
  • François Vanderbeck (4)

  1. Departamento de Engenharia de Produção, Universidade Federal Fluminense, Niterói, Brazil
  2. Instituto de Matemática and PESC/COPPE, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
  3. CTISM, Universidade Federal de Santa Maria, Santa Maria, Brazil
  4. Institut de Mathématiques de Bordeaux, Université de Bordeaux & Inria Bordeaux Sud-Ouest, Talence Cedex, France
