A Survey on Code Clone, Its Behavior and Applications

  • Aakanshi Gupta
  • Bharti Suri
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 4)


Code Clones are separate fragments of code that are very similar to another piece of code in content or in functionality. Cloning is a type of Bad Smell that increases project size and maintenance cost. Existing research describes several detection techniques, but the available data is still insufficient to reach a firm conclusion. The aim of this survey is to investigate all detection techniques and to analyze Code Clone behavior and the motivation behind cloning. The paper summarizes 16 techniques for detecting clones and presents a detailed analysis of 76 research papers. The research identified that various tools are available for detecting Code Clones. We also investigate the approaches followed in these tools and summarize the Code Clone patterns that are used for qualitative analysis. Overall, our findings indicate that the management of Clones should start as early as possible.


Code Clone, Bad Smell, Patterns of cloning, Case study, Detection technique



We would like to thank Prof. B.P. Singh, Prof. Rekha Agrawal and the Amity School of Engineering and Technology, Delhi, for providing a good environment and support for our research work.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. University School of ICT, GGS Indraprastha University, Delhi, India
