Empirical Software Engineering, Volume 19, Issue 4, pp 1040–1074

A family of experiments to assess the effectiveness and efficiency of source code obfuscation techniques

  • Mariano Ceccato
  • Massimiliano Di Penta
  • Paolo Falcarin
  • Filippo Ricca
  • Marco Torchiano
  • Paolo Tonella


Context: Code obfuscation is intended to obstruct code understanding and thereby to delay malicious code changes, ultimately rendering them uneconomical. Although code understanding cannot be completely impeded, code obfuscation makes it more laborious and troublesome, so as to discourage or slow down code tampering. Despite the extensive adoption of obfuscation, its assessment has been addressed only indirectly, either by using internal metrics or by taking the point of view of code analysis, e.g., considering the associated computational complexity. To the best of our knowledge, there is no publicly available user study that measures the cost of understanding obfuscated code from the point of view of a human attacker. Aim: This paper experimentally assesses the impact of code obfuscation on the capability of human subjects to understand and change source code. In particular, it considers code protected with two well-known code obfuscation techniques, i.e., identifier renaming and opaque predicates. Method: We conducted a family of five controlled experiments, involving undergraduate and graduate students from four universities. During the experiments, subjects had to perform comprehension or attack tasks on decompiled clients of two Java network-based applications, either obfuscated using one of the two techniques or not obfuscated. To assess and compare the obfuscation techniques, we measured the correctness and the efficiency of the performed tasks. Results: At least for the tasks we considered, simpler techniques (i.e., identifier renaming) prove to be more effective than more complex ones (i.e., opaque predicates) in preventing subjects from completing attack tasks.
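As a rough illustration of the two techniques named in the abstract, the following minimal Java sketch shows one small method in its original form, after identifier renaming, and guarded by an opaque predicate. The class name, method names, and fee computation are hypothetical and are not taken from the applications used in the experiments. The arithmetic predicate is only one simple example of an opaque predicate, chosen here for brevity.

```java
// Illustrative sketch only: all names and the fee logic are hypothetical,
// not taken from the applications studied in the paper.
class ObfuscationSketch {

    // Original, readable version: price times quantity plus a 10% fee
    // (integer arithmetic, so the fee is truncated).
    static int totalWithFee(int price, int quantity) {
        int subtotal = price * quantity;
        return subtotal + subtotal / 10;
    }

    // Identifier renaming: identical logic, but every name that carried
    // meaning has been replaced by a meaningless one.
    static int a(int b, int c) {
        int d = b * c;
        return d + d / 10;
    }

    // Opaque predicate: x * (x + 1) is a product of two consecutive integers,
    // so it is always even (also after int overflow, which keeps the low bit).
    // The condition is therefore always true and the else branch is dead code,
    // but a reader has to work that out before discarding the bogus branch.
    static int totalWithFeeOpaque(int price, int quantity, int x) {
        int subtotal = price * quantity;
        if ((x * (x + 1)) % 2 == 0) {
            return subtotal + subtotal / 10;
        } else {
            return subtotal - quantity; // bogus computation, never executed
        }
    }

    public static void main(String[] args) {
        System.out.println(totalWithFee(20, 3));
        System.out.println(a(20, 3));
        System.out.println(totalWithFeeOpaque(20, 3, 7));
    }
}
```

All three methods compute the same value for the same price and quantity; only the effort needed to read them differs, which is the property the experiments set out to measure with human subjects.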


Keywords: Empirical studies, Software obfuscation, Program comprehension



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Mariano Ceccato (1), corresponding author
  • Massimiliano Di Penta (2)
  • Paolo Falcarin (3)
  • Filippo Ricca (4)
  • Marco Torchiano (5)
  • Paolo Tonella (1)

  1. Cit, Fondazione Bruno Kessler, Trento, Italy
  2. Department of Engineering, University of Sannio, Benevento, Italy
  3. School of Architecture, Computing and Engineering, University of East London, London, UK
  4. DIBRIS, University of Genova, Genova, Italy
  5. Politecnico di Torino, Torino, Italy
