Clone Detection

Binary Code Fingerprinting for Cybersecurity

Abstract

Clone detection techniques identify the already-known parts of a piece of code so that the same portions need not be analyzed again. Existing methods are neither robust enough to accommodate the mutations introduced by compilers nor scalable enough to query large modern code bases. To address these limitations, this chapter presents BinSequence, a two-step clone detection engine. Its fine-grained fuzzy matching engine compares code accurately, which avoids false correlations to irrelevant code. Its fingerprint-based engine efficiently prunes the search space without notably compromising accuracy.
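The general shape of such a two-step search can be illustrated with a minimal sketch. It is not BinSequence's actual algorithm: the n-gram fingerprint, the Jaccard filter, the use of Python's `difflib.SequenceMatcher` for the fuzzy step, the thresholds, and the toy mnemonic sequences are all assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher


def ngrams(mnemonics, n=2):
    """Return the set of mnemonic n-grams, used here as a crude fingerprint."""
    return {tuple(mnemonics[i:i + n]) for i in range(len(mnemonics) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_clones(target, repository, filter_threshold=0.3, top_k=3):
    """Two-step search: cheap fingerprint filter, then fuzzy sequence matching.

    `target` is a list of mnemonics; `repository` maps a function name to
    its list of mnemonics. The thresholds are hypothetical defaults.
    """
    target_fp = ngrams(target)
    # Step 1: prune candidates whose fingerprint overlaps too little.
    candidates = [
        name for name, body in repository.items()
        if jaccard(target_fp, ngrams(body)) >= filter_threshold
    ]
    # Step 2: score the survivors with a more expensive fuzzy comparison.
    scored = [
        (SequenceMatcher(None, target, repository[name], autojunk=False).ratio(), name)
        for name in candidates
    ]
    scored.sort(reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


# Hypothetical toy data: each function is reduced to its mnemonic sequence.
REPOSITORY = {
    "copy_loop": ["push", "mov", "mov", "cmp", "jz", "mov", "inc", "jmp", "pop", "ret"],
    "copy_loop_v2": ["push", "mov", "mov", "test", "jz", "mov", "inc", "jmp", "pop", "ret"],
    "checksum": ["xor", "add", "shl", "add", "ror", "sub", "ret"],
}
TARGET = ["push", "mov", "mov", "cmp", "jz", "mov", "inc", "jmp", "pop", "ret"]

if __name__ == "__main__":
    for name, score in find_clones(TARGET, REPOSITORY):
        print(name, score)
```

In this toy run the unrelated `checksum` function is discarded by the filter before any fuzzy comparison takes place, and the two `copy_loop` variants are ranked by sequence similarity. A real engine would work on disassembled functions and control-flow structure rather than flat mnemonic lists.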

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Alrabaee, S. et al. (2020). Clone Detection. In: Binary Code Fingerprinting for Cybersecurity. Advances in Information Security, vol 78. Springer, Cham. https://doi.org/10.1007/978-3-030-34238-8_8

  • DOI: https://doi.org/10.1007/978-3-030-34238-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34237-1

  • Online ISBN: 978-3-030-34238-8

  • eBook Packages: Computer Science, Computer Science (R0)
