Program Equivalence Using Neural Networks

  • Tiago M. Nascimento
  • Charles B. Prado
  • Davidson R. Boccardo
  • Luiz F. R. C. Carmo
  • Raphael C. S. Machado
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 87)

Abstract

Program equivalence refers to the mapping between equivalent codes written in different languages, including high-level and low-level languages. In the present work, we propose a novel approach for correlating program codes of different languages using artificial neural networks fed with logical flow characteristics derived from control flow graphs and call graphs. Our evaluation using real code examples shows a typical correspondence rate between 62% and 100%, with a low false-positive rate of 4%.
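As an illustration only, the kind of logical flow characteristics mentioned in the abstract could be computed as in the sketch below. The abstract does not list the features the authors actually use, so the feature set (block count, edge count, branch count, cyclomatic complexity, distinct callees), the function name `flow_features`, and the dictionary-based graph format are all assumptions made for this example. The neural network itself is not shown.

```python
def flow_features(cfg, callees):
    """Return a small feature vector for one function.

    cfg     -- hypothetical format: dict mapping a basic block name to
               the list of its successor blocks
    callees -- names of the functions called by this function
               (its outgoing call-graph edges)
    """
    # Collect every block, including ones that only appear as successors.
    blocks = set(cfg)
    for successors in cfg.values():
        blocks.update(successors)

    n_blocks = len(blocks)
    n_edges = sum(len(successors) for successors in cfg.values())
    # A block with more than one successor is a conditional branch.
    n_branches = sum(1 for successors in cfg.values() if len(successors) > 1)
    # McCabe cyclomatic complexity for a single connected graph: E - N + 2.
    cyclomatic = n_edges - n_blocks + 2
    n_callees = len(set(callees))

    return [n_blocks, n_edges, n_branches, cyclomatic, n_callees]


# Assumed CFG of a source-level function: a loop containing an if/else.
source_cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "exit": [],
}

# Assumed CFG recovered from the compiled binary of the same function;
# block names differ, but the control structure is preserved.
binary_cfg = {
    "b0": ["b1"],
    "b1": ["b2", "b5"],
    "b2": ["b3", "b4"],
    "b3": ["b1"],
    "b4": ["b1"],
    "b5": [],
}

source_vec = flow_features(source_cfg, ["printf", "scanf", "printf"])
binary_vec = flow_features(binary_cfg, ["printf", "scanf"])
print(source_vec, binary_vec)
```

In this toy case both vectors coincide because the compiler is assumed to preserve the control structure. In practice optimizations change the graphs, which is one reason a learned model rather than exact matching would be used to correlate source and binary code.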

Keywords

Neural Network, Source Code, Binary Code, Software Defect, True Association

Copyright information

© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2012

Authors and Affiliations

  • Tiago M. Nascimento (1, 2)
  • Charles B. Prado (1)
  • Davidson R. Boccardo (1)
  • Luiz F. R. C. Carmo (1)
  • Raphael C. S. Machado (1)
  1. Normalization and Industrial Quality, INMETRO - National Institute of Metrology, Rio de Janeiro, Brazil
  2. UFRJ – Federal University of Rio de Janeiro, Brazil
