Abstract
In this paper, we explore the adaptation of techniques previously used in the domains of adversarial machine learning and differential privacy to mitigate the ML-powered analysis of streaming traffic. Our findings are twofold. First, constructing adversarial samples effectively confounds an adversary with a predetermined classifier but is less effective when the adversary can adapt to the defense by using alternative classifiers or training the classifier with adversarial samples. Second, differential-privacy guarantees are very effective against such statistical-inference-based traffic analysis, while remaining agnostic to the machine learning classifiers used by the adversary. We propose three mechanisms for enforcing differential privacy for encrypted streaming traffic and evaluate their security and utility. Our empirical implementation and evaluation suggest that the proposed statistical privacy approaches are promising solutions in the underlying scenarios.
Notes
The model converged after 40 epochs. Training for 1000 epochs improved the accuracy by only 0.024.
\(\textit{thres} \)(\(d^*\),0.25,30) = 0.0000111524321020, \(\textit{thres} \)(\(d_{\mathrm {L1}}\),0.25,30) = 0.0000024160161657.
A burst is the total size of all packets whose timestamps are no farther apart than a threshold. Here, the threshold is set to 0.5s.
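This burst definition can be sketched in code. The function below is a minimal illustration, not the paper's implementation; the name `bursts`, the `(timestamp, size)` pair representation, and the packet values are assumptions, with the 0.5 s gap threshold taken from the note above.

```python
def bursts(packets, gap=0.5):
    """Group packets into bursts and return the total size of each burst.

    Consecutive packets whose timestamps are no farther apart than `gap`
    seconds belong to the same burst; a larger gap starts a new burst.
    `packets` is an iterable of (timestamp, size) pairs.
    """
    totals = []
    prev_ts = None
    for ts, size in sorted(packets):
        if prev_ts is None or ts - prev_ts > gap:
            totals.append(0)  # gap exceeded: start a new burst
        totals[-1] += size
        prev_ts = ts
    return totals

# Packets at t = 0.0, 0.1, 0.3 fall within 0.5 s of each other (one burst);
# the packet at t = 1.0 is 0.7 s after the previous one (new burst).
print(bursts([(0.0, 100), (0.1, 200), (0.3, 50), (1.0, 400)]))
# → [350, 400]
```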
In previous sections, the evaluations were performed on the \( BPB \) feature; here, we show that extracting more features does not materially improve the classification.
Funding
This project is supported in part by NSF grants 1718084, 1750809, 1801494, and grant W911NF-17-1-0370 from the Army Research Office. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
Ethics declarations
Conflicts of interest
The authors have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Appendix A
Theorem 1
Over a domain \(\mathbb {D}\), suppose we have a method A that is \(\epsilon \)-private and a method B that is \((d^*, \epsilon )\)-private. We denote the maximum and minimum \(d^*\) distance in \(\mathbb {D}\) as \(d_{max}\) and \(d_{min}\). Then, we have:
(1) If B is \((d^*,\epsilon )\)-private, then B is \((\epsilon d_{max})\)-private.
(2) If A is \(\epsilon \)-private, then A is \((d^*,\frac{\epsilon }{d_{min}})\)-private.
Proof According to the definitions, for any inputs \(x, x' \in \mathbb {D}\) and any output set \(S\), we have:
\(\epsilon \)-privacy: \(\Pr [A(x)\in S] \le e^{\epsilon }\Pr [A(x')\in S]\);
\((d^*,\epsilon )\)-privacy: \(\Pr [B(x)\in S] \le e^{\epsilon d^*(x,x')}\Pr [B(x')\in S]\).
For B, we have: if B is \((d^*,\epsilon )\)-private, then since \(d^*(x,x') \le d_{max}\),
\(\Pr [B(x)\in S] \le e^{\epsilon d^*(x,x')}\Pr [B(x')\in S] \le e^{\epsilon d_{max}}\Pr [B(x')\in S].\)
So B is at least \((\epsilon d_{max})\)-private. Similarly, if A is \(\epsilon \)-private, let \(\epsilon =\epsilon ' \times d^*(x,x')\); since \(d^*(x,x') \ge d_{min}\), we have \(\epsilon ' \le \frac{\epsilon }{d_{min}}\), and therefore:
\(\Pr [A(x)\in S] \le e^{\epsilon }\Pr [A(x')\in S] = e^{\epsilon ' d^*(x,x')}\Pr [A(x')\in S] \le e^{\frac{\epsilon }{d_{min}} d^*(x,x')}\Pr [A(x')\in S].\)
So A is at least \((d^*,\frac{\epsilon }{d_{min}})\)-private \(\square \).
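The two bounds in the theorem can be checked numerically. The snippet below is only an illustrative sanity check; the values chosen for \(\epsilon \), \(d_{min}\), \(d_{max}\), and the sample distances are assumptions, not parameters from the paper.

```python
import math

# Illustrative (assumed) parameters: a privacy budget and distance bounds.
eps, d_min, d_max = 0.1, 0.5, 2.0
sample_distances = [0.5, 1.0, 1.5, 2.0]  # all within [d_min, d_max]

for d in sample_distances:
    # (1) A (d*, eps)-private mechanism bounds the likelihood ratio by
    #     exp(eps * d(x, x')), which never exceeds exp(eps * d_max):
    assert math.exp(eps * d) <= math.exp(eps * d_max)

    # (2) An eps-private mechanism bounds the ratio by exp(eps); since
    #     d(x, x') >= d_min, this is at most exp((eps / d_min) * d(x, x')):
    assert math.exp(eps) <= math.exp((eps / d_min) * d)

print("both privacy bounds hold on the sampled distances")
```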
Cite this article
Zhang, X., Hamm, J., Reiter, M.K. et al. Defeating traffic analysis via differential privacy: a case study on streaming traffic. Int. J. Inf. Secur. 21, 689–706 (2022). https://doi.org/10.1007/s10207-021-00574-3