Control Flow Obfuscation Using Neural Network to Fight Concolic Testing

  • Haoyu Ma
  • Xinjie Ma
  • Weijie Liu
  • Zhipeng Huang
  • Debin Gao
  • Chunfu Jia
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 152)


Concolic testing is widely regarded as the state-of-the-art technique in dynamic discovering and analyzing trigger-based behavior in software programs. It uses symbolic execution and an automatic theorem prover to generate new concrete test cases to maximize code coverage for scenarios like software verification and malware analysis. While malicious developers usually try their best to hide malicious executions, there are also circumstances in which legitimate reasons are presented for a program to conceal trigger-based conditions and the corresponding behavior, which leads to the demand of control flow obfuscation techniques. We propose a novel control flow obfuscation design based on the incomprehensibility of artificial neural networks to fight against reverse engineering tools including concolic testing. By training neural networks to simulate conditional behaviors of a program, we manage to precisely replace essential points of a program’s control flow with neural network computations. Evaluations show that since the complexity of extracting rules from trained neural networks easily goes beyond the capability of program analysis tools, it is infeasible to apply concolic testing on code obfuscated with our method. Our method also incorporates only basic integer operations and simple loops, thus can be hard to be distinguished from regular programs.


Software obfuscation Malware analysis Reverse engineering Concolic testing Neural network 


  1. 1.
    Lee, G., Morris, J., Parker, K., Bundell, G.A., Lam, P.: Using symbolic execution to guide test generation. Softw. Test. Verif. Rel. 15(1), 41–61 (2005)CrossRefGoogle Scholar
  2. 2.
    Molnar, D., Li, X.C., Wagner, D.A.: Dynamic test generation to find integer bugs in x86 binary linux programs. In: Proceedings of the 18th Conference on USENIX Security Symposium (USENIX Security), pp. 67–82 (2009)Google Scholar
  3. 3.
    Moser, A., Kruegel, C., Kirda, E.: Exploring multiple execution paths for malware analysis. In: Proceedings of the 2007 IEEE Symposium on Security and Privacy (S&P), pp. 231–245 (2007)Google Scholar
  4. 4.
    Sen, K., Marinov, D., Agha, G.: Cute: A concolic unit testing engine for c. In: Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE), pp. 263–272 (2005)Google Scholar
  5. 5.
    Sen, K., Agha, G.: CUTE and jCUTE: concolic unit testing and explicit path model-checking tools. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 419–423. Springer, Heidelberg (2006) CrossRefGoogle Scholar
  6. 6.
    Godefroid, P., Klarlund, N., Sen, K.: Dart: directed automated random testing. In: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (2005) 213–223Google Scholar
  7. 7.
    Cadar, C., Ganesh, V., Pawlowski, P.M., Dill, D.L., Engler, D.R.: Exe: automatically generating inputs of death. ACM Trans. Inf. Syst. Secur. (TISSEC) 12(2), 10:1–10:38 (2008)CrossRefGoogle Scholar
  8. 8.
    Williams, N., Marre, B., Mouy, P.: On-the-fly generation of k-path tests for c functions. In: Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE), pp. 290–293 (2004)Google Scholar
  9. 9.
    Cadar, C., Dunbar, D., Engler, D.R.: Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 209–224 (2008)Google Scholar
  10. 10.
    Falcarin, P., Collberg, C., Atallah, M., Jakubowski, M.: Guest editors’ introduction: software protection. IEEE Softw. 28(2), 24–27 (2011)CrossRefGoogle Scholar
  11. 11.
    Wang, Z., Ming, J., Jia, C., Gao, D.: Linear obfuscation to combat symbolic execution. In: Atluri, V., Diaz, C. (eds.) ESORICS 2011. LNCS, vol. 6879, pp. 210–226. Springer, Heidelberg (2011) CrossRefGoogle Scholar
  12. 12.
    Falcarin, P., Carlo, S.D., Cabutto, A., Garazzino, N., Barberis, D.: Exploiting code mobility for dynamic binary obfuscation. In: Proceedings of the 2011 World Congress on Internet Security (WorldCIS), pp. 114–120 (2011)Google Scholar
  13. 13.
    Wang, Z., Jia, C., Liu, M., Yu, X.: Branch obfuscation using code mobility and signal. In: Proceedings of the 36th Annual Computer Software and Applications Conference Workshops (COMPSACW), pp. 16–20 (2012)Google Scholar
  14. 14.
    Sharif, M., Lanzi, A., Giffin, J., Lee, W.: Impeding malware analysis using conditional code obfuscation. In: Proceedings of the 16th Annual Network & Distributed System Security Symposium (NDSS)(2008)Google Scholar
  15. 15.
    Popov, I., Debray, S., Andrews, G.: Binary obfuscation using signals. In: Proceedings of the 16th Conference on USENIX Security Symposium (USENIX Security), pp. 275–290 (2007)Google Scholar
  16. 16.
    Schrittwieser, S., Katzenbeisser, S.: Code obfuscation against static and dynamic reverse engineering. In: Filler, T., Pevný, T., Craver, S., Ker, A. (eds.) IH 2011. LNCS, vol. 6958, pp. 270–284. Springer, Heidelberg (2011) CrossRefGoogle Scholar
  17. 17.
    Tickle, A.B., Andrews, R., Golea, M., Diederich, J.: The truth will come to light: directions and challenges in extracting the knowledge embedded within trained artificial neural networks. IEEE Trans. Neural Netw. 9(6), 1057–1068 (1998)CrossRefGoogle Scholar
  18. 18.
    Golea, M.: On the complexity of rule extraction from neural networks and network querying. In: Proceedings of the Rule Extraction From Trained Artificial Neural Networks Workshop, Society For the Study of Artificial Intelligence and Simulation of Behavior Workshop Series (AISB), pp. 51–59 (1996)Google Scholar
  19. 19.
    Towell, G.G., Shavlik, J.W.: The extraction of refined rules from knowledge based neural networks. Mach. Learn. 13(1), 71–101 (1993)Google Scholar
  20. 20.
    Song, D., Brumley, D., Yin, H., Caballero, J., Jager, I., Kang, M.G., Liang, Z., Newsome, J., Poosankam, P., Saxena, P.: BitBlaze: a new approach to computer security via binary analysis. In: Sekar, R., Pujari, A.K. (eds.) ICISS 2008. LNCS, vol. 5352, pp. 1–25. Springer, Heidelberg (2008) CrossRefGoogle Scholar
  21. 21.
    Qu, X., Robinson, B.: A case study of concolic testing tools and their limitations. In: Proceedings of 2011 International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 117–126 (2011)Google Scholar
  22. 22.
    Collberg, C., Thomborson, C., Low, D.: A taxonomy of obfuscation transformations. Technical report 148, Department of Computer Science, The University of Auckland (1997)Google Scholar
  23. 23.
    Cybenko, G.: Approximations by superpositions of sigmoidal functions. Math. Control Signals Syst. 2(4), 303–314 (1989)MathSciNetCrossRefzbMATHGoogle Scholar
  24. 24.
    Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2), 251–257 (1991)CrossRefGoogle Scholar
  25. 25.
    Johansson, C., Lansner, A.: Implementing plastic weights in neural networks using low precision arithmetic. Neurocomputing 72(4–6), 968–972 (2009)CrossRefGoogle Scholar
  26. 26.
    Tang, C., Kwan, H.K.: Multilayer feedforward neural networks with single powers-of-two weights. IEEE Trans. Signal Process. 41(8), 2724–2727 (1993)CrossRefzbMATHGoogle Scholar
  27. 27.
    Draghici, S.: On the capabilities of neural networks using limited precision weights. Neural Networks 15(3), 395–414 (2002)CrossRefGoogle Scholar
  28. 28.
    Moerland, P., Fiesler, E.: Hardware-friendly learning algorithms for neural networks: an overview. In: Proceedings of 5th International Conference on Microelectronics for Neural Networks, pp. 117–124 (1996)Google Scholar
  29. 29.
    Ramalingam, G.: The undecidability of aliasing. ACM Trans. Program. Lang. Syst. (TOPLAS) 16(5), 1467–1471 (1994)CrossRefGoogle Scholar
  30. 30.
    Ghiya, R., Hendren, L.J.: Is it a tree, a dag, or a cyclic graph? a shape analysis for heap-directed pointers in c. In: Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pp. 1–15 (1996)Google Scholar
  31. 31.
    Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 Congress on Evolutionary Computation (CEC), pp. 81–86 (2001)Google Scholar
  32. 32.
    Zhang, J.R., Zhang, J., Lok, T.M., Lyu, M.R.: A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training. Appl. Math. Comput. 185(2), 1026–1037 (2007)CrossRefzbMATHGoogle Scholar

Copyright information

© Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2015

Authors and Affiliations

  • Haoyu Ma
    • 1
  • Xinjie Ma
    • 1
  • Weijie Liu
    • 1
  • Zhipeng Huang
    • 1
  • Debin Gao
    • 2
  • Chunfu Jia
    • 1
  1. 1.College of Computer and Control EngineeringNankai UniversityTianjinChina
  2. 2.School of Information SystemsSingapore Management UniversitySingaporeSingapore

Personalised recommendations