On the Structural Code Clone Detection Problem: A Survey and Software Metric Based Approach

  • Conference paper
Computational Science and Its Applications – ICCSA 2014 (ICCSA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8583)

Abstract

Unnecessarily repeated code (clones) is often poorly documented and difficult to maintain. Code clones can become a significant problem in the software development cycle, because any change to cloned code must be applied to every occurrence. This significantly increases software maintenance costs and the effort and time required to understand the code. Over the years, many techniques have been proposed to minimize or prevent code cloning problems. The main focus of these techniques is the detection of clones. In such studies, code cloning is examined under two main categories: simple and structural. A simple clone is a similarity that arises from the repetition of a code snippet in the software. A structural clone is a similarity in software structure (i.e., design patterns and object-oriented class relations). Simple clone detection techniques cannot determine whether code repetition is due to design, because they do not examine repetitive code snippets from a wider perspective. In this study, we survey the existing structural clone detection approaches. We also introduce an approach that uses software quality metrics to detect structural code clones.
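The abstract does not specify which metrics the approach uses or how they are compared, so the following is only a minimal sketch of the general idea behind metric-based structural clone detection. Everything in it is an assumption made for illustration: the class names, the four metrics, their values, the Euclidean distance, and the threshold are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt

# Hypothetical per-class metric vectors (illustrative values, not from the paper).
# Assumed order: (methods, fields, inheritance depth, coupled classes).
CLASS_METRICS = {
    "OrderService": (12, 4, 2, 5),
    "InvoiceService": (12, 4, 2, 6),
    "StringUtils": (30, 0, 1, 1),
}


def euclidean_distance(a, b):
    """Distance between two metric vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def candidate_structural_clones(metrics, threshold=1.5):
    """Return class pairs whose metric vectors lie within `threshold`.

    The threshold is an arbitrary illustrative value; a real tool would
    need to normalize the metrics and calibrate it empirically.
    """
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(metrics.items()), 2):
        if euclidean_distance(vec_a, vec_b) <= threshold:
            pairs.append((name_a, name_b))
    return pairs


print(candidate_structural_clones(CLASS_METRICS))
```

Pairs reported this way are only candidates: two classes with similar metric profiles may still be unrelated, so a follow-up inspection of their structure would be needed before treating them as structural clones.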

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kapdan, M., Aktas, M., Yigit, M. (2014). On the Structural Code Clone Detection Problem: A Survey and Software Metric Based Approach. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8583. Springer, Cham. https://doi.org/10.1007/978-3-319-09156-3_35

  • DOI: https://doi.org/10.1007/978-3-319-09156-3_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09155-6

  • Online ISBN: 978-3-319-09156-3

  • eBook Packages: Computer Science, Computer Science (R0)