Mixed-Integer Linear Programming Formulations for the Software Clustering Problem

Computational Optimization and Applications

Abstract

The clustering problem has an important application in software engineering, which usually deals with large software systems with complex structures. To facilitate the work of software maintainers, the components of a system are divided into groups so that highly interdependent modules are placed in the same group and independent modules are placed in different groups. The measure used to assess the quality of a system partition is called Modularization Quality (MQ). Designers represent the software system as a graph in which modules are represented by nodes and relationships between modules by edges. This graph is referred to in the literature as the Module Dependency Graph (MDG). The Software Clustering Problem (SCP) consists of finding the partition of the MDG that maximizes the MQ.
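To make the objective concrete, the following is a minimal sketch of evaluating MQ for a partition and finding the best partition of a tiny MDG by exhaustive enumeration. The abstract does not state the MQ formula, so the sketch assumes the commonly used cluster-factor definition, in which each cluster with at least one internal edge contributes 2μ / (2μ + ε), where μ counts its internal edges and ε counts the edges crossing its boundary in either direction. The toy graph and the function names are hypothetical, and the enumeration is only practical for very small graphs.

```python
from collections import defaultdict

# Hypothetical toy MDG: two triangles of modules joined by one dependency.
MODULES = ["a", "b", "c", "d", "e", "f"]
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]


def modularization_quality(edges, cluster_of):
    """Sum of cluster factors 2*mu / (2*mu + eps) (assumed MQ definition)."""
    intra = defaultdict(int)  # mu: edges with both endpoints in the cluster
    inter = defaultdict(int)  # eps: edges entering or leaving the cluster
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            intra[cu] += 1
        else:
            inter[cu] += 1
            inter[cv] += 1
    mq = 0.0
    for c in set(cluster_of.values()):
        if intra[c] > 0:  # clusters without internal edges contribute 0
            mq += 2 * intra[c] / (2 * intra[c] + inter[c])
    return mq


def all_partitions(n):
    """Yield every set partition of n items as a list of cluster labels."""
    def extend(labels, used):
        if len(labels) == n:
            yield list(labels)
            return
        # An item may join any existing cluster or open exactly one new one.
        for c in range(used + 1):
            labels.append(c)
            yield from extend(labels, max(used, c + 1))
            labels.pop()
    yield from extend([], 0)


def best_partition(modules, edges):
    """Exhaustively search all partitions for the one with the largest MQ."""
    best_labels, best_mq = None, float("-inf")
    for labels in all_partitions(len(modules)):
        mq = modularization_quality(edges, dict(zip(modules, labels)))
        if mq > best_mq:
            best_labels, best_mq = labels, mq
    return best_labels, best_mq
```

Calling `best_partition(MODULES, EDGES)` on this toy graph separates the two triangles, each contributing 6/7 to the MQ, whereas placing all modules in a single cluster gives an MQ of 1. The number of partitions grows very quickly with the number of modules, which is why exact formulations and heuristics are of interest for larger systems.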

In this paper we present three new mathematical programming formulations for the SCP. First, we formulate the SCP as the problem of maximizing a sum of linear fractional functions; we then apply two different linearization procedures to reformulate it as Mixed-Integer Linear Programming (MILP) problems. We discuss a preprocessing technique that reduces the size of the original problem and develop valid inequalities that have been shown to be very effective in tightening the formulations. We present numerical results on benchmark problems that compare the proposed formulations with one another and with the solutions obtained by the exhaustive algorithm of the freely available Bunch clustering tool.
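The abstract does not say which two linearization procedures are used, so the following is only a generic textbook sketch of how a single linear fractional term in binary variables can be turned into mixed-integer linear form. All symbols (a_j, b_j, y, z_j, y^L, y^U) are illustrative assumptions and are not taken from the paper.

```latex
% Generic single-ratio sketch; all symbols are illustrative assumptions.
\begin{align*}
\max_{x \in \{0,1\}^n} \quad
  & \frac{a_0 + \sum_{j=1}^{n} a_j x_j}{b_0 + \sum_{j=1}^{n} b_j x_j},
  \qquad b_0 + \sum_{j=1}^{n} b_j x_j > 0 \ \text{for all feasible } x. \\[4pt]
\text{Introduce} \quad
  & y = \frac{1}{b_0 + \sum_{j=1}^{n} b_j x_j}, \qquad z_j = x_j\, y, \qquad
    0 < y^{L} \le y \le y^{U}. \\[4pt]
\max \quad
  & a_0\, y + \sum_{j=1}^{n} a_j z_j \\
\text{s.t.} \quad
  & b_0\, y + \sum_{j=1}^{n} b_j z_j = 1, \\
  & y^{L} x_j \le z_j \le y^{U} x_j, && j = 1, \dots, n, \\
  & y - y^{U} (1 - x_j) \le z_j \le y - y^{L} (1 - x_j), && j = 1, \dots, n, \\
  & x_j \in \{0,1\}, && j = 1, \dots, n.
\end{align*}
```

When x_j = 0 the bound constraints force z_j = 0, and when x_j = 1 they force z_j = y, so the product z_j = x_j y is represented exactly. For a sum of ratios, one such auxiliary variable y would be introduced per fractional term, and the tightness of the bounds y^L and y^U affects the strength of the linear relaxation.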

Acknowledgements

This research has been supported in part by CNPq and FAPERJ (Brazil). The authors thank Ali S. Mamaghani and Brian S. Mitchell for providing the instances used in the computational experiments, and thank the anonymous reviewers, whose comments greatly improved the structure of the presentation.

Author information

Corresponding author

Correspondence to Marcia Fampa.

About this article

Cite this article

Köhler, V., Fampa, M. & Araújo, O. Mixed-Integer Linear Programming Formulations for the Software Clustering Problem. Comput Optim Appl 55, 113–135 (2013). https://doi.org/10.1007/s10589-012-9512-9

  • DOI: https://doi.org/10.1007/s10589-012-9512-9
