Programming and Computer Software, Volume 40, Issue 5, pp 276–287

Methods and software tools to support combined binary code analysis

  • V. A. Padaryan
  • A. I. Getman
  • M. A. Solovyev
  • M. G. Bakulin
  • A. I. Borzilov
  • V. V. Kaushan
  • I. N. Ledovskikh
  • Yu. V. Markin
  • S. S. Panasenko

Abstract

This paper considers methods and tools for binary code analysis developed at the Institute for System Programming of the Russian Academy of Sciences, and their application to recovering algorithms and data formats. Executable code for various general-purpose CPU architectures is analyzed. The analysis requires no source code or debugging information and imposes no requirements on the OS version. The approach comprises collecting a detailed execution trace at the machine instruction level, successively raising the representation level, and extracting the code of the algorithm of interest, followed by structuring both that code and the data formats it processes. Two main results are reported: an intermediate representation that allows most preliminary processing and algorithm code extraction tasks to be carried out without attending to the specifics of a particular machine, and a method and software tool for automated recovery of network message and file formats. The tools are integrated into a unified analysis platform that supports their combined use. The architecture of the platform is described, and examples of its application to real programs are given.
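
To make the idea of extracting an algorithm's code from an execution trace more concrete, the following is a minimal sketch of backward dynamic slicing over a trace whose steps have already been lifted to a machine-independent form. It is an illustration only and not the authors' tool or intermediate representation: the `Step` record, the location names such as `mem:0x2000`, and the sample trace are all hypothetical, and the slice follows data dependencies only (flags and control dependencies are ignored).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Step:
    """One trace entry in a hypothetical machine-independent form."""
    index: int              # position of the instruction in the trace
    text: str               # original instruction text, kept for display only
    reads: FrozenSet[str]   # locations read (registers, memory cells)
    writes: FrozenSet[str]  # locations written


def backward_slice(trace: List[Step], criterion: Set[str]) -> List[Step]:
    """Return the steps that the criterion locations depend on, in trace order."""
    wanted = set(criterion)
    selected = []
    for step in reversed(trace):
        if step.writes & wanted:
            selected.append(step)
            # The written locations are now explained by this step;
            # its inputs become the new locations of interest.
            wanted -= step.writes
            wanted |= step.reads
    selected.reverse()
    return selected


def s(index, text, reads, writes):
    """Small helper to build a Step from plain iterables."""
    return Step(index, text, frozenset(reads), frozenset(writes))


# Hypothetical five-instruction trace (x86-like text is for readability only).
TRACE = [
    s(0, "mov eax, [0x1000]", {"mem:0x1000"}, {"eax"}),
    s(1, "mov ecx, 5", set(), {"ecx"}),
    s(2, "add eax, ebx", {"eax", "ebx"}, {"eax"}),
    s(3, "mov [0x2000], eax", {"eax"}, {"mem:0x2000"}),
    s(4, "inc ecx", {"ecx"}, {"ecx"}),
]

if __name__ == "__main__":
    for step in backward_slice(TRACE, {"mem:0x2000"}):
        print(step.index, step.text)
```

Because the slicer sees only sets of read and written locations, the same routine would work for a trace from any CPU once each instruction has been described in this form, which is the kind of machine independence the abstract attributes to an intermediate representation.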

Keywords

Binary Code, Address Space, Representation Level, Symbolic Execution, Intermediate Representation
These keywords were added by machine and not by the authors.

Copyright information

© Pleiades Publishing, Ltd. 2014

Authors and Affiliations

  • V. A. Padaryan (1)
  • A. I. Getman (1)
  • M. A. Solovyev (1)
  • M. G. Bakulin (1)
  • A. I. Borzilov (1)
  • V. V. Kaushan (1)
  • I. N. Ledovskikh (1)
  • Yu. V. Markin (1)
  • S. S. Panasenko (1)

  1. Institute for System Programming, Russian Academy of Sciences, Moscow, Russia
