Empirical Software Engineering, Volume 21, Issue 4, pp 1476–1508

Code clones and developer behavior: results of two surveys of the clone research community

  • Debarshi Chatterji
  • Jeffrey C. Carver (corresponding author)
  • Nicholas A. Kraft


The literature presents conflicting claims regarding the effects of clones on software maintainability. For a community to progress, it is important to identify and address those areas of disagreement. Many claims, such as those related to developer behavior, either lack human-based empirical validation or are contradicted by other studies. This paper describes the results of two surveys that evaluated the level of agreement among clone researchers regarding claims that have not yet been validated through human-based empirical study. The surveys covered three key clone-related research topics: general information, developer behavior, and evolution. Survey 1 focused on high-level information about all three topics, whereas Survey 2 focused specifically on developer behavior. Approximately 20 clone researchers responded to each survey. The survey responses showed a lack of agreement on some major clone-related topics. First, the respondents disagreed about the definitions of clone types, with some indicating the need for a taxonomy based upon developer intent. Second, the respondents were uncertain whether the ratio of cloned to non-cloned code affected system quality. Finally, the respondents disagreed about the usefulness of various detection, analysis, evolution, and visualization tools for clone management tasks such as tracking and refactoring of clones. The overall results indicate the need for more focused, human-based empirical research regarding the effects of clones during maintenance. The paper proposes a strategy for future research regarding developer behavior and code clones in order to bridge the gap between clone research and the application of that research in clone maintenance.


Keywords: Code clones, Clone evolution, Clone management, Software maintenance, Developer behavior, Community survey



We thank the survey respondents. We acknowledge support from NSF grant CCF-0915559.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Debarshi Chatterji (1)
  • Jeffrey C. Carver (1), corresponding author
  • Nicholas A. Kraft (2)

  1. University of Alabama, Tuscaloosa, USA
  2. ABB Corporate Research, Raleigh, USA
