Tree slicing: Finding intertwined and gapped clones in one simple step

Akhin, M.; Itsykson, V.

doi:10.3103/S0146411613070171

Published: 06 February 2014

Volume 47, pages 427–432, (2013)

Automatic Control and Computer Sciences

M. Akhin¹ & V. Itsykson¹

Abstract

Most software nowadays contains code duplication, which leads to serious problems in software maintenance. Many clone detection approaches have been proposed over the years to deal with this problem, but almost none of them consider the semantic properties of the source code.

We propose to reinforce the traditional tree-based clone detection algorithm with additional information about variable slices. This makes it possible to find intertwined and gapped clones on variables; a preliminary evaluation confirms the applicability of our approach to real-world software.
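
The role of variable slices can be pictured with a minimal Python sketch. This is not the authors' algorithm: it assumes a crude syntactic notion of a slice (all top-level statements that mention a variable) and compares slices by exact equality after renaming. The names `variable_slices`, `matching_slices`, `VAR`, and `OTHER` are hypothetical and exist only for this illustration.

```python
import ast
import copy


class _Rename(ast.NodeTransformer):
    """Rename the sliced variable to VAR and every other name to OTHER."""

    def __init__(self, target: str) -> None:
        self.target = target

    def visit_Name(self, node: ast.Name) -> ast.Name:
        new_id = "VAR" if node.id == self.target else "OTHER"
        return ast.copy_location(ast.Name(id=new_id, ctx=node.ctx), node)


def variable_slices(source: str) -> dict[str, list[str]]:
    """Map each name to the normalized top-level statements that mention it.

    Assumption: this is a purely syntactic stand-in for a real slice,
    with no data-flow or control-flow analysis.
    """
    slices: dict[str, list[str]] = {}
    for stmt in ast.parse(source).body:
        names = {n.id for n in ast.walk(stmt) if isinstance(n, ast.Name)}
        for name in names:
            renamed = _Rename(name).visit(copy.deepcopy(stmt))
            slices.setdefault(name, []).append(ast.unparse(renamed))
    return slices


def matching_slices(src_a: str, src_b: str, min_len: int = 2) -> list[tuple[str, str]]:
    """Return variable pairs whose normalized slices are identical."""
    slices_a = variable_slices(src_a)
    slices_b = variable_slices(src_b)
    return sorted(
        (a, b)
        for a, seq_a in slices_a.items()
        for b, seq_b in slices_b.items()
        if len(seq_a) >= min_len and seq_a == seq_b
    )


FRAGMENT_A = """\
total = 0
count = 0
total = total + price
count = count + 1
"""

# Same two computations, interleaved in a different order, with an extra statement.
FRAGMENT_B = """\
n = 0
s = 0
log("start")
n = n + 1
s = s + amount
"""

print(matching_slices(FRAGMENT_A, FRAGMENT_B))
# → [('count', 'n'), ('total', 's')]
```

Comparing per-variable slices instead of contiguous statement runs is what lets the sketch ignore both the reordering and the extra `log` call; a realistic detector would need approximate tree matching and proper dependence analysis rather than string equality.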

Author information

Authors and Affiliations

Saint-Petersburg State Polytechnical University, Saint-Petersburg, Russia
M. Akhin & V. Itsykson

Corresponding author

Correspondence to M. Akhin.

Additional information

The article is published in the original.

Gapped clones: similar code fragments with some differing code portions between the similar fragments. Intertwined clones: similar code fragments with intertwined statements.
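
The two definitions can be illustrated on statement lists. The sketch below is a simplification for illustration only: it treats statements as plain strings, assumes the two intertwined fragments share no statements, and uses the hypothetical helpers `is_gapped_copy` and `is_intertwined`.

```python
def is_subsequence(needle: list[str], haystack: list[str]) -> bool:
    """True if every line of needle occurs in haystack, in the same order."""
    remaining = iter(haystack)
    return all(line in remaining for line in needle)


def is_gapped_copy(original: list[str], candidate: list[str]) -> bool:
    """candidate repeats original in order, with extra statements in between."""
    return len(candidate) > len(original) and is_subsequence(original, candidate)


def is_intertwined(first: list[str], second: list[str], merged: list[str]) -> bool:
    """merged interleaves first and second (assumed to share no statements)."""
    return (
        len(merged) == len(first) + len(second)
        and [line for line in merged if line in first] == first
        and [line for line in merged if line in second] == second
    )


SUM = ["s = 0", "s += a", "s += b"]
PROD = ["p = 1", "p *= a", "p *= b"]

# Gapped: the summation with one unrelated statement inserted.
GAPPED = ["s = 0", "log(a)", "s += a", "s += b"]

# Intertwined: the statements of both computations alternate.
INTERTWINED = ["s = 0", "p = 1", "s += a", "p *= a", "s += b", "p *= b"]

print(is_gapped_copy(SUM, GAPPED))             # → True
print(is_intertwined(SUM, PROD, INTERTWINED))  # → True
```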

About this article

Cite this article

Akhin, M., Itsykson, V. Tree slicing: Finding intertwined and gapped clones in one simple step. Aut. Control Comp. Sci. 47, 427–432 (2013). https://doi.org/10.3103/S0146411613070171

Published: 06 February 2014
Issue Date: December 2013
DOI: https://doi.org/10.3103/S0146411613070171
