JTR: A Binary Solution for Switch-Case Recovery

  • Lucian Cojocar
  • Taddeus Kroes
  • Herbert Bos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10379)

Abstract

Most security solutions that rely on binary rewriting assume a clean separation between code and data. Unfortunately, jump tables violate this assumption. In particular, switch statements in binary code often appear as indirect jumps with jump tables that interleave with executable code—especially on ARM architectures. Most existing rewriters and disassemblers handle jump tables in a crude manner, by means of pattern matching. However, any deviation from the pattern (e.g. slightly different instructions) leads to a mismatch.

Instead, we propose a complementary approach to “solve” jump tables and automatically find the right target addresses of the indirect jump by means of a tailored Value Set Analysis (VSA). Our approach is generic and applies to binary code without any need for source code, debug symbols, or compiler-generated patterns.
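To make the idea of "solving" a jump table concrete, the following minimal sketch shows how a bounded value set for the switch index can be turned into a set of jump targets. It is not the authors' implementation: the strided-interval representation follows the general VSA literature, and the class `StridedInterval`, the function `resolve_jump_table`, the memory image, and all addresses are hypothetical and chosen for illustration only. It assumes a table of 4-byte little-endian absolute addresses.

```python
import struct
from dataclasses import dataclass

ENTRY_SIZE = 4  # assumption: each table entry is a 32-bit little-endian address


@dataclass(frozen=True)
class StridedInterval:
    """Value set {lo, lo + stride, ..., hi}; a common VSA abstraction."""
    stride: int
    lo: int
    hi: int

    def values(self):
        # A stride of 0 denotes a single constant value.
        if self.stride == 0:
            return [self.lo]
        return list(range(self.lo, self.hi + 1, self.stride))

    def scale(self, k):
        # Multiply every member by a positive constant k.
        return StridedInterval(self.stride * k, self.lo * k, self.hi * k)

    def add_const(self, c):
        # Add a constant to every member.
        return StridedInterval(self.stride, self.lo + c, self.hi + c)


def resolve_jump_table(memory, mem_base, index, table_addr):
    """Enumerate targets of an indirect jump `pc = load32(table_addr + 4 * index)`."""
    addresses = index.scale(ENTRY_SIZE).add_const(table_addr)
    targets = []
    for addr in addresses.values():
        (target,) = struct.unpack_from("<I", memory, addr - mem_base)
        targets.append(target)
    return targets


# Hypothetical image loaded at 0x8000 with a 4-entry jump table at 0x8010.
MEM_BASE = 0x8000
image = bytearray(0x20)
struct.pack_into("<4I", image, 0x10, 0x8100, 0x8120, 0x8140, 0x8100)

# Assume a preceding bounds check constrains the index to [0, 3].
index = StridedInterval(stride=1, lo=0, hi=3)
targets = resolve_jump_table(bytes(image), MEM_BASE, index, 0x8010)
```

The point of the sketch is that the targets follow from the value set of the index and the table contents, not from recognising a particular instruction sequence, so a compiler emitting slightly different instructions would not change the result.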

We benchmark our technique on a large corpus of ARM binaries, including malware and firmware. For gcc binaries, our results approach those of IDA Pro when IDA has symbols (which is generally not the case), while for clang binaries we outperform IDA Pro with debug symbols by orders of magnitude: IDA finds 11 of 828 switch statements implemented as jump tables in SPEC, while we find 763.

Notes

Acknowledgments

We thank the anonymous reviewers for their feedback. This work was supported by the Netherlands Organisation for Scientific Research through the grant NWO 628.001.005 CYBSEC “OpenSesame” and through the grant NWO 639.023.309 VICI “Dowsing”.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Vrije Universiteit Amsterdam, Amsterdam, Netherlands
