Discriminating Traces with Time

  • Saeid Tizpaz-Niari
  • Pavol Černý
  • Bor-Yuh Evan Chang
  • Sriram Sankaranarayanan
  • Ashutosh Trivedi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10206)

Abstract

What properties of a program's internals explain the possible differences in its overall running time for different inputs? In this paper, we propose a formal framework, which we dub trace-set discrimination, for studying this question. We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing in on the relevant program internals. On a set of Java benchmarks, we find that compactly represented decision trees discriminate scalably and with high accuracy: they scale better than maximum likelihood discriminants while achieving comparable accuracy. We demonstrate on three larger case studies how decision-tree discriminants produced by our tool are useful for debugging timing side-channel vulnerabilities (i.e., where a malicious observer infers secrets simply by passively watching execution times) and availability vulnerabilities.
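
As a rough illustration of the decision-tree idea mentioned in the abstract, the sketch below learns a depth-1 tree (a decision stump) that picks the single trace attribute best separating slow runs from fast ones. It is not the authors' tool or algorithm. The function name `best_stump`, the attribute names, the running times, and the 500 ms threshold are all hypothetical and chosen only for this example.

```python
from collections import Counter

def best_stump(traces, labels):
    """Return (attribute, accuracy) for the single boolean attribute whose
    value best predicts the timing label (a depth-1 decision tree)."""
    attributes = sorted({a for t in traces for a in t})
    best = (None, 0.0)
    for attr in attributes:
        correct = 0
        for value in (True, False):
            # Labels of the traces that fall on this side of the split.
            side = [l for t, l in zip(traces, labels)
                    if t.get(attr, False) == value]
            if side:
                # Each leaf predicts the majority label of its traces.
                correct += Counter(side).most_common(1)[0][1]
        accuracy = correct / len(labels)
        if accuracy > best[1]:
            best = (attr, accuracy)
    return best

# Hypothetical traces: each records whether a given method was called.
traces = [
    {"calls_compress": True,  "calls_log": True},
    {"calls_compress": True,  "calls_log": False},
    {"calls_compress": False, "calls_log": True},
    {"calls_compress": False, "calls_log": False},
]
# Hypothetical running times, labeled with an assumed 500 ms threshold.
times_ms = [950, 910, 120, 135]
labels = ["slow" if t > 500 else "fast" for t in times_ms]

attr, acc = best_stump(traces, labels)
print(attr, acc)
```

On this toy data the stump selects `calls_compress`, since that attribute alone separates the slow traces from the fast ones. A full decision tree would apply the same split selection recursively to each side.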

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • Saeid Tizpaz-Niari (1)
  • Pavol Černý (1)
  • Bor-Yuh Evan Chang (1)
  • Sriram Sankaranarayanan (1)
  • Ashutosh Trivedi (1)

  1. University of Colorado Boulder, Boulder, USA
