Advertisement

Cross-Architecture Lifter Synthesis

  • Rijnard van TonderEmail author
  • Claire Le GouesEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10886)

Abstract

Code translation is a staple component of program analysis. A lifter is a code translation unit that translates low-level code to a higher-level intermediate representation (IR). Lifters thus enable a host of static and dynamic analyses for such low-level code. However, writing a lifter is a tedious manual process which must be repeated for every architecture an analysis aims to support. We introduce cross-architecture lifter synthesis, a novel approach that automatically synthesizes lifters for previously unsupported architectures. Our insight is that lifter synthesis can be bootstrapped with existing IR sketches that exploit the shared semantic properties of heterogeneous architecture instruction sets. We show that our approach automates a significant amount of translation effort for a previously unsupported instruction set, and that it enables discovery of new bugs on new architecture targets through reuse of an existing IR-based analysis.

Notes

Acknowledgments

This work is partially supported under NSF grant number CCF-1563797. All statements are those of the authors, and do not necessarily reflect the views of the funding agency. The authors would like to thank the BAP Team for their continued open source development and support for the BAP project.

References

  1. 1.
    BAP IR Operational Semantics (2018). https://github.com/BinaryAnalysisPlatform/bil/releases/download/v0.1/bil.pdf. Accessed 23 Apr 2018
  2. 2.
    BAP Warn Unused Analysis (2018). https://opam.ocaml.org/packages/bap-warn-unused/bap-warn-unused.1.3.0/. Accessed 23 Apr 2018
  3. 3.
    Binary Analysis Platform (2018). https://github.com/BinaryAnalysisPlatform/bap. Accessed 23 Apr 2018
  4. 4.
    Alur, R., Bodík, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: Formal Methods in Computer-Aided Design, pp. 1–8 (2013)Google Scholar
  5. 5.
    Balakrishnan, G., Reps, T.: Analyzing memory accesses in x86 executables. In: Compiler Construction, pp. 2732–2733 (2004)Google Scholar
  6. 6.
    Bornholt, J., Torlak, E.: Scaling program synthesis by exploiting existing code. Machine Learning for Programming Languages (2015)Google Scholar
  7. 7.
    Bornholt, J., Torlak, E., Grossman, D., Ceze, L.: Optimizing synthesis with metasketches. In: POPL 2016, pp. 775–788 (2016)Google Scholar
  8. 8.
    Dullien, T., Porst, S.: REIL: a platform-independent intermediate representation of disassembled code for static code analysis. In: CanSecWest 2009 (2009)Google Scholar
  9. 9.
    Federico, A.D., Payer, M., Agosta, G.: rev.ng: a unified binary analysis framework to recover CFGs and function boundaries. In: CC 2017, pp. 131–141 (2017)Google Scholar
  10. 10.
    Godefroid, P., Taly, A.: Automated synthesis of symbolic instruction encodings from I/O samples. In: PLDI 2012, pp. 441–452 (2012)Google Scholar
  11. 11.
    Gotovchits, I., van Tonder, R., Brumley, D.: Saluki: finding taint-style vulnerabilities with static property checking. In: BAR 2018 (2018)Google Scholar
  12. 12.
    Hasabnis, N., Qiao, R., Sekar, R.: Checking correctness of code generator architecture specifications. In: CGO 2015, pp. 167–178 (2015)Google Scholar
  13. 13.
    Hasabnis, N., Sekar, R.: Extracting instruction semantics via symbolic execution of code generators. In: FSE 2016, pp. 301–313 (2016)Google Scholar
  14. 14.
    Hasabnis, N., Sekar, R.: Lifting assembly to intermediate representation: a novel approach leveraging compilers. In: ASPLOS 2016, pp. 311–324 (2016)Google Scholar
  15. 15.
    Heule, S., Schkufza, E., Sharma, R., Aiken, A.: Stratified synthesis: automatically learning the x86–64 instruction set. In: PLDI 2016, pp. 237–250 (2016)Google Scholar
  16. 16.
    Hindle, A., Barr, E.T., Gabel, M., Su, Z., Devanbu, P.T.: On the naturalness of software. Commun. ACM 59(5), 122–131 (2016)CrossRefGoogle Scholar
  17. 17.
    Kim, S., Faerevaag, M., Junk, M., Jung, S., Oh, D., Lee, J., Cha, S.K.: Testing intermediate representations for binary analysis. In: ASE 2017 (2017)Google Scholar
  18. 18.
    Kinder, J., Veith, H.: Precise static analysis of untrusted driver binaries. In: FMCAD 2010, pp. 43–50 (2010)Google Scholar
  19. 19.
    Lattner, C., Adve, V.S.: LLVM: a compilation framework for lifelong program analysis & transformation. In: CGO 2004, pp. 75–88 (2004)Google Scholar
  20. 20.
    Le, V., Sun, C., Su, Z.: Randomized stress-testing of link-time optimizers. In: ISSTA 2015, pp. 327–337 (2015)Google Scholar
  21. 21.
    Molnar, D., Li, X.C., Wagner, D.A.: Dynamic test generation to find integer bugs in x86 binary linux programs. In: USENIX Security Symposium 2009 (2009)Google Scholar
  22. 22.
    Schwartz, E.J., Avgerinos, T., Brumley, D.: All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In: IEEE Security and Privacy, pp. 317–331 (2010)Google Scholar
  23. 23.
    Sun, C., Le, V., Zhang, Q., Su, Z.: Toward understanding compiler bugs in GCC and LLVM. In: ISSTA 2016, pp. 294–305 (2016)Google Scholar
  24. 24.
    Warren, H.S.: Hacker’s Delight. Pearson Education, London (2013)Google Scholar

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Carnegie Mellon UniversityPittsburghUSA

Personalised recommendations