Abstract
Clone detection techniques can identify the parts of a piece of code that are already known, so that the same code portions need not be analyzed again. Existing methods are neither robust enough to tolerate the mutations introduced by compilers nor scalable enough to query large modern code bases. To address these limitations, this chapter presents BinSequence, a two-step clone detection engine. Its fine-grained fuzzy matching engine compares code accurately, which avoids false correlations with irrelevant code. Its fingerprint-based engine efficiently prunes the search space without notably compromising accuracy.
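The two-step idea described in the abstract (a cheap fingerprint filter followed by a more expensive fuzzy comparison) can be illustrated with a minimal sketch. This is not BinSequence's actual algorithm: the fingerprint used here (Jaccard similarity over mnemonic 2-grams), the fuzzy matcher (Python's difflib.SequenceMatcher), the thresholds, and the sample function names are all assumptions chosen only to show the shape of such a pipeline.

```python
from difflib import SequenceMatcher


def ngrams(seq, n=2):
    """Return the set of n-grams over a sequence of instruction mnemonics."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_clones(target, repository, prune_at=0.3, match_at=0.7):
    """Two-step search: fingerprint filter first, fuzzy matching second.

    `prune_at` and `match_at` are illustrative thresholds, not values
    taken from the chapter.
    """
    target_fp = ngrams(target)

    # Step 1: coarse pruning. Discard repository functions whose
    # fingerprint overlaps too little with the target's fingerprint.
    candidates = [
        name for name, seq in repository.items()
        if jaccard(target_fp, ngrams(seq)) >= prune_at
    ]

    # Step 2: fine-grained fuzzy comparison on the surviving candidates only.
    results = []
    for name in candidates:
        matcher = SequenceMatcher(None, target, repository[name], autojunk=False)
        score = matcher.ratio()
        if score >= match_at:
            results.append((name, score))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical repository of normalized mnemonic sequences.
repository = {
    "copy_loop": ["push", "mov", "mov", "cmp", "jz", "mov", "inc", "jmp", "pop", "ret"],
    "unrelated": ["xor", "call", "test", "jnz", "lea", "call", "ret"],
}

# A target that differs from "copy_loop" by one substituted instruction,
# as might happen under a different compiler or optimization level.
target = ["push", "mov", "mov", "cmp", "jz", "mov", "add", "jmp", "pop", "ret"]

matches = find_clones(target, repository)
```

In this sketch the filter eliminates "unrelated" before any sequence alignment is attempted, so the costlier comparison runs only on plausible candidates. A real engine would work on disassembled and normalized basic blocks rather than flat mnemonic lists.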
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Alrabaee, S. et al. (2020). Clone Detection. In: Binary Code Fingerprinting for Cybersecurity. Advances in Information Security, vol 78. Springer, Cham. https://doi.org/10.1007/978-3-030-34238-8_8
DOI: https://doi.org/10.1007/978-3-030-34238-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34237-1
Online ISBN: 978-3-030-34238-8
eBook Packages: Computer Science, Computer Science (R0)