Hybrid Statistical Estimation of Mutual Information for Quantifying Information Flow

  • Yusuke Kawamoto
  • Fabrizio Biondi
  • Axel Legay
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9995)


Analysis of a probabilistic system often requires learning the joint probability distribution of its random variables. Computing the exact distribution typically requires an exhaustive, precise analysis of all executions of the system. To avoid the high computational cost of such an exhaustive search, statistical analysis has been studied as a way to efficiently obtain approximate estimates by analyzing only a small but representative subset of the system's behavior. In this paper we propose a hybrid statistical estimation method that combines precise and statistical analyses to estimate mutual information and its confidence interval. We show how to combine analyses performed with different precision on different components of the system to obtain an estimate for the whole system. The new method performs weighted statistical analysis with different sample sizes over different components and dynamically finds their optimal sample sizes. Moreover, it can reduce sample sizes by using prior knowledge about systems and a new abstraction-then-sampling technique based on qualitative analysis. We show that the new method outperforms the state of the art in quantifying information leakage.
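As a rough illustration of the core idea only (this is not the paper's estimator: the weighted sample-size optimization, bias correction, confidence intervals, and abstraction-then-sampling technique described above are all omitted, and every function name here is an assumption), a plug-in estimate of mutual information that mixes exactly analysed components with empirically sampled ones might be sketched as:

```python
import math
from collections import Counter

def plug_in_mi(joint):
    """Plug-in mutual information (in bits) of a joint distribution
    given as a dict {(x, y): probability}."""
    px, py = Counter(), Counter()
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def empirical_joint(sampler, n):
    """Estimate a component's joint sub-distribution from n executions;
    sampler() runs the component once and returns an (x, y) pair."""
    counts = Counter(sampler() for _ in range(n))
    return {xy: c / n for xy, c in counts.items()}

def hybrid_mi(exact_components, sampled_components, n_per_component):
    """Combine components into one joint distribution and apply the
    plug-in estimator.  exact_components: list of (weight, joint dict)
    pairs obtained by precise analysis; sampled_components: list of
    (weight, sampler) pairs estimated statistically.  Weights are the
    components' execution probabilities and must sum to 1."""
    joint = Counter()
    for w, dist in exact_components:
        for xy, p in dist.items():
            joint[xy] += w * p
    for w, sampler in sampled_components:
        for xy, p in empirical_joint(sampler, n_per_component).items():
            joint[xy] += w * p
    return plug_in_mi(dict(joint))
```

For example, a system that with probability 0.5 runs a precisely analysed component always producing (0, 0), and with probability 0.5 runs a sampled component always producing (1, 1), yields the perfectly correlated joint distribution and hence an estimate of 1 bit. In the paper's setting the sampled components would instead use individually optimised sample sizes rather than a single `n_per_component`.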


Keywords: Mutual Information, Precise Analysis, Joint Probability Distribution, Information Leakage, Execution Trace



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. AIST, Tsukuba, Japan
  2. Inria, Rennes, France
