Efficient Algorithm for Detecting Parameterized Multiple Clones in a Large Software System

  • Rajesh Prasad
  • Suneeta Agarwal
  • Anuj Kumar Sharma
  • Alok Singh
  • Sanjay Misra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6786)

Abstract

Two code fragments are said to be similar if they are similar in their program text or in their functionality. The first kind of similarity can be detected with parameterized string matching. In this type of matching, a pattern P is said to match a substring t of the text T if there exists a bijection between the symbols of P and the symbols of t. Fredriksson and Mozgovoy solved the parameterized string matching problem efficiently with the parameterized shift-or (PSO) algorithm. The drawback of this algorithm is that it cannot handle patterns longer than the word length (w) of the computer. In this paper, we solve this word-length problem in bit-parallel parameterized matching by extending the BLIM algorithm for exact string matching. The extended algorithm is also suitable for searching multiple patterns simultaneously. Experimentally, we observe that our algorithm is comparable with PSO for pattern length ≤ w and can also handle longer patterns efficiently.
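The matching condition in the abstract can be illustrated with a small brute-force sketch. This is not the bit-parallel algorithm of the paper, nor PSO or BLIM; it only checks the bijection definition window by window. The function names are hypothetical, and the sketch assumes every symbol is a renamable parameter, whereas practical clone detectors usually keep some tokens (such as keywords) fixed.

```python
from typing import Dict, List


def is_parameterized_match(p: str, t: str) -> bool:
    """Return True if a bijection maps p onto t position by position.

    Assumption for illustration: every symbol is treated as a parameter.
    """
    if len(p) != len(t):
        return False
    forward: Dict[str, str] = {}   # symbol of p -> symbol of t
    backward: Dict[str, str] = {}  # symbol of t -> symbol of p
    for a, b in zip(p, t):
        # setdefault records the first pairing and returns it afterwards,
        # so any later conflicting pairing is detected here.
        if forward.setdefault(a, b) != b:
            return False
        if backward.setdefault(b, a) != a:
            return False
    return True


def find_parameterized_matches(pattern: str, text: str) -> List[int]:
    """Naive O(n*m) scan: test every window of the text against the pattern."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if is_parameterized_match(pattern, text[i:i + m])
    ]


if __name__ == "__main__":
    print(find_parameterized_matches("aab", "xxyzzw"))
```

The two dictionaries enforce the mapping in both directions, which is what makes it a bijection rather than a one-way renaming. A bit-parallel method replaces this per-window check with word-sized bit operations, which is where the word-length limit w discussed in the abstract comes from.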

Keywords

Parameterized string matching, bit-parallelism, BLIM, software maintenance, clone detection, multiple patterns

References

  1. Roy, C.K., Cordy, J.R.: A survey on clone detection research. Technical Report No. 2007-541, School of Computing, Queen’s University at Kingston, Ontario, Canada (2007)
  2. Baeza-Yates, R.A., Gonnet, G.H.: A new approach to text searching. Communications of the ACM 35(10), 74–82 (1992)
  3. Baker, B.S.: Parameterized duplication in string: algorithm and application in software maintenance. SIAM J. Computing 26(5), 1343–1362 (1997)
  4. Baker, B.S.: Parameterized diff. In: 10th Symposium on Discrete Algorithm (SODA), pp. 854–855 (1999)
  5. Boyer, R.S., Moore, J.S.: A fast string-searching algorithm. Communications of the ACM 20(10), 762–772 (1977)
  6. Fredriksson, K., Mozgovoy, M.: Efficient parameterized string matching. Information Processing Letters (IPL) 100(3), 91–96 (2006)
  7. Horspool, R.N.: Practical fast searching in strings. Software - Practice & Experience 10(6), 501–506 (1980)
  8. Prasad, R., Agarwal, S.: A new parameterized string matching algorithm by combining bit-parallelism and suffix automata. In: 8th IEEE International Conference on Computer and Information Technology, Sydney, Australia, pp. 778–783. IEEE Press, Los Alamitos (2008)
  9. Raita, T.: Tuning the Boyer-Moore-Horspool string searching algorithm. Software - Practice & Experience 22(10), 879–884 (1992)
  10. Salmela, L., Tarhio, J.: Fast Parameterized Matching with q-grams. Journal of Discrete Algorithms 6(3), 408–419 (2008)
  11. Smith, P.D.: Experiments with a very fast substring search algorithm. Software - Practice & Experience 21(10), 1065–1074 (1991)
  12. Sunday, D.M.: A very fast substring search algorithm. Communications of the ACM 33(8), 132–142 (1990)
  13. Wu, S., Manber, U.: Fast text searching allowing errors. Communications of the ACM 35(10), 83–91 (1992)
  14. Kulekci, M.O.: BLIM: A New Bit-Parallel Pattern Matching Algorithm Overcoming Computer Word Size Limitation. Mathematics in Computer Science 3(4), 407–420 (2010)
  15. Navarro, G., Raffinot, M.: Fast and Flexible String Matching by Combining Bit-parallelism and Suffix automata. ACM Journal of Experimental Algorithms 5(4) (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rajesh Prasad (1)
  • Suneeta Agarwal (1)
  • Anuj Kumar Sharma (1)
  • Alok Singh (1)
  • Sanjay Misra (2)

  1. Motilal Nehru National Institute of Technology, Allahabad, India
  2. Department of Computer Engineering, Federal University of Technology, Minna, Nigeria
