Co-change Clusters: Extraction and Application on Assessing Software Modularity

  • Luciana Lourdes Silva
  • Marco Tulio Valente
  • Marcelo de A. Maia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8989)


The traditional modular structure defined by the package hierarchy suffers from the dominant decomposition problem, and it is widely accepted that alternative forms of modularization are needed to increase developers' productivity. In this paper, we propose an alternative way to understand and assess package modularity based on co-change clusters: sets of classes that are highly inter-related by co-change relations, i.e., that frequently changed together in the version history. We evaluate how co-change clusters relate to the package decomposition of four real-world systems. The results show that the projection of co-change clusters onto packages follows a different pattern in each system. We therefore claim that modular views based on co-change clusters can improve developers' understanding of how well modularized their systems are, given that modularity is the ability to confine changes and to evolve components in parallel.
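The core idea can be illustrated with a small sketch: build a co-change graph whose edge weights count how often two classes changed in the same commit, then group the classes with a clustering step. The commit data, the weight threshold, and the naive single-linkage merging below are all illustrative assumptions, not the paper's actual extraction pipeline.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history: each commit is the set of classes changed together.
commits = [
    {"A", "B"}, {"A", "B"}, {"A", "B", "C"},
    {"C", "D"}, {"D", "E"}, {"D", "E"},
]

# Co-change graph: edge weight = number of commits in which
# the two classes changed together.
edges = Counter()
for commit in commits:
    for u, v in combinations(sorted(commit), 2):
        edges[(u, v)] += 1

def weight(u, v):
    """Co-change count between two classes (0 if they never co-changed)."""
    return edges.get((min(u, v), max(u, v)), 0)

# Naive single-linkage agglomerative clustering: repeatedly merge the
# pair of clusters joined by the heaviest remaining co-change edge,
# stopping when no pair co-changed at least `threshold` times.
threshold = 2
clusters = [{c} for c in {c for commit in commits for c in commit}]
while True:
    best = None
    for i, j in combinations(range(len(clusters)), 2):
        w = max(weight(u, v) for u in clusters[i] for v in clusters[j])
        if w >= threshold and (best is None or w > best[0]):
            best = (w, i, j)
    if best is None:
        break
    _, i, j = best
    clusters[i] |= clusters[j]
    del clusters[j]

print(sorted(sorted(c) for c in clusters))  # → [['A', 'B'], ['C'], ['D', 'E']]
```

On this toy history, A and B co-changed three times and D and E twice, so they form clusters, while C never crosses the threshold with anyone and stays alone; projecting such clusters onto the package hierarchy is what the paper uses to assess modularity.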


Keywords: Modularity · Software changes · Version control systems · Co-change graphs · Co-change clusters · Agglomerative hierarchical clustering



This work is supported by FAPEMIG, CAPES, and CNPq.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Luciana Lourdes Silva (1, 3)
  • Marco Tulio Valente (1)
  • Marcelo de A. Maia (2)

  1. Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil
  2. Faculty of Computing, Federal University of Uberlândia, Uberlândia, Brazil
  3. Federal Institute of the Triângulo Mineiro, Uberaba, Brazil
