Abstract
In recent years, deep neural networks (DNNs) have achieved great success in many areas and are now widely deployed as cloud services that bring convenience to people’s daily lives. However, the widespread use of DNNs in the cloud raises critical privacy concerns. Researchers have proposed many solutions to address the privacy concerns of deploying DNNs in the cloud, and one major category of solutions relies on a trusted execution environment (TEE). Nonetheless, accurate DNN inference requires extensive memory and computing resources, so it performs poorly inside a TEE with restricted memory space. This paper proposes a network pruning algorithm based on mean shift clustering to reduce the model size and improve inference performance in a TEE. The core idea of our design is to use the mean shift algorithm to aggregate the weight values into clusters automatically and to prune the network based on the distance between each weight and its cluster center. Our experiments prune three popular networks on the CIFAR-10 dataset. The results show that our algorithm reduces the network size without affecting accuracy, and accelerates inference in the TEE by 20%.
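To illustrate the idea described in the abstract, the sketch below clusters a layer's weight values with a simple one-dimensional mean shift and zeroes out weights whose cluster center lies close to zero. This is a minimal illustration under assumed parameters (a flat kernel, a hand-picked `bandwidth`, and a center-magnitude pruning criterion), not the authors' implementation.

```python
import numpy as np

def mean_shift_1d(values, bandwidth=0.05, iters=50):
    """Shift each point toward the mean of the data within its bandwidth window."""
    centers = values.astype(float).copy()
    for _ in range(iters):
        # Pairwise distances between current centers and the original values.
        dist = np.abs(centers[:, None] - values[None, :])
        mask = dist <= bandwidth
        # Each center moves to the mean of the original values in its window.
        centers = (mask * values).sum(axis=1) / mask.sum(axis=1)
    return centers

def prune_by_mean_shift(weights, bandwidth=0.05):
    """Zero out weights whose mean-shift cluster center is within `bandwidth` of zero."""
    flat = weights.ravel()
    centers = mean_shift_1d(flat, bandwidth)
    keep = np.abs(centers) > bandwidth
    return (flat * keep).reshape(weights.shape)

# Synthetic "layer": 50 near-zero weights and 10 weights clustered around 0.5.
rng = np.random.default_rng(0)
w = np.concatenate([rng.normal(0.0, 0.01, 50), rng.normal(0.5, 0.01, 10)])
pruned = prune_by_mean_shift(w, bandwidth=0.05)
# The near-zero cluster is zeroed; the cluster around 0.5 is kept.
```

In a full pipeline, such pruning would be applied per layer, followed by fine-tuning to recover any lost accuracy before the compact model is loaded into the enclave.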
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Xu, C., Lai, S. (2021). Accelerating TEE-Based DNN Inference Using Mean Shift Network Pruning. In: Yuan, X., Bao, W., Yi, X., Tran, N.H. (eds) Quality, Reliability, Security and Robustness in Heterogeneous Systems. QShine 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-030-91424-0_2
Print ISBN: 978-3-030-91423-3
Online ISBN: 978-3-030-91424-0