STVsm: Similar Structural Code Detection Based on AST and VSM

  • Ning Li
  • Mingda Shen
  • Sinan Li
  • Lijun Zhang
  • Zhanhuai Li
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 340)

Abstract

Most potential software defects derive from the frequent changes made during the development life cycle. It is therefore helpful to inform developers of the related code affected by the change they are currently making. In this paper, we propose a new approach, STVsm, to detect structurally similar code related to a given software change. STVsm is based on the abstract syntax tree (AST) and the vector space model (VSM). Experimental results show that STVsm achieves significant accuracy in detecting structurally similar code in the C programming language, including exact clones, code with changed formatting, renamed code, reordered code, and code with added redundancy.
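
The abstract does not spell out how STVsm combines the two models, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each code fragment is reduced to a vector of AST node-type counts and that two fragments are compared by cosine similarity. Python's built-in `ast` module stands in for a C parser, and the names `node_type_vector`, `cosine_similarity`, and `structural_similarity` are hypothetical.

```python
import ast
import math
from collections import Counter


def node_type_vector(source: str) -> Counter:
    """Map source code to a bag of AST node-type counts.

    Assumption: identifier names and layout are ignored, so only
    the syntactic structure contributes to the vector.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def structural_similarity(src_a: str, src_b: str) -> float:
    """Compare two fragments by their AST node-type vectors."""
    return cosine_similarity(node_type_vector(src_a), node_type_vector(src_b))


ORIGINAL = """
def total(values):
    s = 0
    for v in values:
        s = s + v
    return s
"""

# Same structure, different identifiers and formatting.
RENAMED = """
def accumulate(items):
    acc = 0

    for item in items:
        acc = acc + item
    return acc
"""

# Structurally unrelated fragment.
DIFFERENT = """
def greet(name):
    print("hello", name)
"""

if __name__ == "__main__":
    print(structural_similarity(ORIGINAL, RENAMED))
    print(structural_similarity(ORIGINAL, DIFFERENT))
```

Because the vector counts node types only, renaming identifiers or changing whitespace leaves it unchanged, and reordering statements preserves the counts as well. A real detector would likely need richer features (for example subtree shapes) to avoid matching unrelated fragments that happen to share node counts.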

Keywords

Clone Code Detection · Similar Structural · Change Related Code · AST · VSM

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ning Li
    • 1
  • Mingda Shen
    • 1
  • Sinan Li
    • 1
  • Lijun Zhang
    • 1
  • Zhanhuai Li
    • 1
  1. School of Computer Science and Technology, Northwestern Polytechnical University, Xi’an, China
