Search Based Clustering for Protecting Software with Diversified Updates

  • Mariano Ceccato
  • Paolo Falcarin (corresponding author)
  • Alessandro Cabutto
  • Yosief Weldezghi Frezghi
  • Cristian-Alexandru Staicu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9962)


Reverse engineering is usually the stepping stone for a variety of attacks that aim to identify sensitive information (keys, credentials, data, algorithms) or vulnerabilities and flaws for broader exploitation. Software applications are usually deployed as identical binary code installed on millions of computers, which lets an adversary develop a generic reverse-engineering strategy that, once it works on one code instance, can be applied to crack all the other instances. Software diversity mitigates this problem by creating several structurally different (but functionally equivalent) binary code versions out of the same source code, so that even if a successful attack is developed for one version, it should not work on a diversified version. In this paper, we address the problem of maximizing software diversity from a search-based optimization point of view. The program to protect is subjected to a catalogue of transformations to generate many candidate versions. The problem of selecting the subset of most diversified versions to be deployed is formulated as an optimization problem, which we tackle with different search heuristics. We show the applicability of this approach on some popular Android apps.
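
As an illustration only, the subset-selection step described above can be sketched in Python. The abstract names neither the distance measure nor the search heuristics, so everything here is an assumption: the distance is a normalized compression distance approximated with zlib, the fitness is the smallest pairwise distance within the selected subset, and the search is a plain hill climber. The function names `ncd`, `diversity` and `select_diverse` are hypothetical and do not come from the paper.

```python
import itertools
import random
import zlib


def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance, approximated with zlib (an assumption)."""
    ca, cb = len(zlib.compress(a)), len(zlib.compress(b))
    cab = len(zlib.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)


def diversity(subset, dist):
    """Fitness to maximize: the smallest pairwise distance inside the subset."""
    return min(dist[i][j] for i, j in itertools.combinations(sorted(subset), 2))


def select_diverse(versions, k, iterations=1000, seed=0):
    """Hill climbing over k-subsets; requires 2 <= k < len(versions).

    Each step swaps one selected version for an unselected one and keeps
    the swap only if it strictly improves the fitness.
    """
    rng = random.Random(seed)
    n = len(versions)
    # Pairwise distance matrix between all candidate versions.
    dist = [[ncd(versions[i], versions[j]) for j in range(n)] for i in range(n)]
    current = set(rng.sample(range(n), k))
    best = diversity(current, dist)
    for _ in range(iterations):
        out = rng.choice(sorted(current))
        inn = rng.choice([i for i in range(n) if i not in current])
        candidate = (current - {out}) | {inn}
        score = diversity(candidate, dist)
        if score > best:
            current, best = candidate, score
    return sorted(current), best
```

A call such as `select_diverse(candidate_binaries, k=5)` would return the indices of the selected versions together with their fitness. A hill climber can stall in local optima, so in practice it would likely be compared against other heuristics, as the abstract indicates the authors do.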


Keywords: Software diversity, Clustering, Obfuscation, Security



The authors want to thank Prof. Mark Harman, who was involved in the initial stages of this work and contributed by suggesting the use of clustering for this search problem. This research has been funded by the European Union 7th Framework Programme (FP7/2007-2013), under grant agreement number 609734, ASPIRE project (Advanced Software Protection: Integration Research and Exploitation).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Mariano Ceccato (1)
  • Paolo Falcarin (2, corresponding author)
  • Alessandro Cabutto (2)
  • Yosief Weldezghi Frezghi (1)
  • Cristian-Alexandru Staicu (3)
  1. Fondazione Bruno Kessler, Trento, Italy
  2. University of East London, London, UK
  3. Department of Computer Science, TU Darmstadt, Darmstadt, Germany
