A Survey on Code Clone, Its Behavior and Applications

  • Conference paper
Networking Communication and Data Knowledge Engineering

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 4)

Abstract

Code clones are separate fragments of code that closely resemble another piece of code, either in content or in functionality. Cloning is a type of bad smell that increases project size and maintenance cost. Existing research describes several detection techniques, but the available data is still insufficient to reach firm conclusions. The aim of this survey is to investigate all detection techniques and to analyze code clone behavior and the motivation behind cloning. The paper summarizes 16 techniques for detecting clones and presents a detailed analysis of 76 research papers. It identifies the various tools available for detecting code clones, investigates the approaches those tools follow, and summarizes the code clone patterns used for qualitative analysis. Overall, our findings indicate that clone management should start as early as possible.

Acknowledgement

We would like to thank Prof. B.P. Singh, Prof. Rekha Agrawal and the Amity School of Engineering and Technology, Delhi, for providing a supportive environment for our research work.

Author information

Corresponding author

Correspondence to Aakanshi Gupta.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gupta, A., Suri, B. (2018). A Survey on Code Clone, Its Behavior and Applications. In: Perez, G., Mishra, K., Tiwari, S., Trivedi, M. (eds) Networking Communication and Data Knowledge Engineering. Lecture Notes on Data Engineering and Communications Technologies, vol 4. Springer, Singapore. https://doi.org/10.1007/978-981-10-4600-1_3

  • DOI: https://doi.org/10.1007/978-981-10-4600-1_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-4599-8

  • Online ISBN: 978-981-10-4600-1

  • eBook Packages: Engineering, Engineering (R0)
