Towards clone detection in UML domain models

  • Special Section Paper
Software & Systems Modeling

Abstract

Code clones (i.e., duplicate fragments of code) have long been studied, and there is strong evidence that they are a major source of software faults. Anecdotal evidence suggests that the phenomenon occurs similarly in models, which would make model clones as detrimental to model quality as code clones are to code quality. However, programming language code and visual models differ significantly, which makes it difficult to transfer notions and algorithms from the code clone arena directly to model clones. In this article, we develop a definition of the notion of “model clone” based on a thorough analysis of practical scenarios. We formalize this definition, specify a clone detection algorithm for UML domain models, and implement it as a prototype. We investigate different similarity heuristics for use in the algorithm and report the performance of our approach. While we believe that our approach advances the state of the art significantly, it is restricted to UML models, its results leave room for improvement, and it has not been validated by field studies.



Author information

Corresponding author

Correspondence to Harald Störrle.

Additional information

Communicated by Dr. Muhammad Ali Babar, Flavio Oquendo, and Ian Gorton.

About this article

Cite this article

Störrle, H. Towards clone detection in UML domain models. Softw Syst Model 12, 307–329 (2013). https://doi.org/10.1007/s10270-011-0217-9

  • DOI: https://doi.org/10.1007/s10270-011-0217-9
