Hierarchical Clustering of Metamodels for Comparative Analysis and Visualization

  • Önder Babur
  • Loek Cleophas
  • Mark van den Brand
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9764)


Many applications in Model-Driven Engineering involve processing multiple models or metamodels. A good example is the comparison and merging of metamodel variants into a common metamodel in domain model recovery. Although there are many sophisticated techniques for processing the input dataset, little attention has been given to the initial data analysis, visualization and filtering activities. These are hard to ignore, especially for a large dataset that may contain outliers and sub-groupings. In this paper we present a generic approach for metamodel comparison, analysis and visualization as an exploratory first step for domain model recovery. We propose representing metamodels in a vector space model and applying hierarchical clustering techniques to compare and visualize them as a tree structure. We demonstrate our approach on two Ecore datasets: a collection of 50 state machine metamodels extracted from GitHub as top search results, and approximately 100 metamodels from 16 different domains, obtained from the AtlanMod Metamodel Zoo.
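The general idea of a vector space representation followed by hierarchical clustering can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy metamodels and their element names are hypothetical, each metamodel is reduced to raw term counts over lowercased element names, distance is cosine distance, and clusters are merged bottom-up with average linkage. The paper's actual feature extraction, weighting and clustering settings may differ.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each metamodel is reduced to the names of its elements.
metamodels = {
    "fsm_a": ["StateMachine", "State", "Transition", "Event"],
    "fsm_b": ["Machine", "State", "Transition", "Guard"],
    "lib_a": ["Library", "Book", "Author", "Loan"],
    "lib_b": ["Library", "Book", "Member", "Loan"],
}


def to_vector(names, vocabulary):
    """Map a list of element names to term counts over a shared vocabulary."""
    counts = Counter(name.lower() for name in names)
    return [counts[term] for term in vocabulary]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def cluster(models):
    """Average-linkage agglomerative clustering; returns a nested tuple tree."""
    vocabulary = sorted({n.lower() for names in models.values() for n in names})
    vectors = {key: to_vector(names, vocabulary) for key, names in models.items()}
    dist = {
        frozenset((a, b)): cosine_distance(vectors[a], vectors[b])
        for a, b in combinations(vectors, 2)
    }
    # Each cluster is a pair: (subtree, list of member names).
    clusters = [(name, [name]) for name in sorted(vectors)]

    def average_linkage(i, j):
        left, right = clusters[i][1], clusters[j][1]
        total = sum(dist[frozenset((x, y))] for x in left for y in right)
        return total / (len(left) * len(right))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(*pair),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


tree = cluster(metamodels)
print(tree)
```

In this toy input the two state machine variants share two element names and the two library variants share three, so each pair ends up in its own subtree before the two subtrees are joined at the root. The nested tuple plays the role of the tree structure mentioned above; a real analysis would typically render it as a dendrogram.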


Model-Driven Engineering, Model comparison, Vector space model, Hierarchical clustering



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Önder Babur (1)
  • Loek Cleophas (1, 2)
  • Mark van den Brand (1)
  1. Eindhoven University of Technology, Eindhoven, The Netherlands
  2. Stellenbosch University, Matieland, South Africa
