
Verifying the Quality of Outsourced Training on Clouds

  • Conference paper
Computer Security – ESORICS 2022 (ESORICS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13555)

Abstract

Deep learning training is often outsourced to clouds due to its high computation overhead. However, clouds may not perform model training correctly due to potential Service Level Agreement (SLA) violations or attacks, resulting in low-quality outsourced training. It is challenging for customers to assess the quality of outsourced training on clouds. They cannot measure the quality by simply testing the trained models, because the testing performance is impacted by various factors, e.g., the quality of the training and testing data. To address these issues, in this paper, we propose a novel framework that allows customers to verify the quality of outsourced training without modifying the model training process. In particular, our framework achieves black-box verification by utilizing an extra training task that can be learned by the model only after the model converges on the original training task. We construct well-designed extra training tasks according to the original tasks, and develop a training quality verification method that measures the model performance on the extra task against a hypothesis testing-based threshold. The experimental results show that models passing the quality verification achieve at least 96% of their best performance, with negligible accuracy loss, i.e., less than 0.25%.
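The exact hypothesis test used for the verification threshold is not specified in this excerpt. As a hedged sketch of one natural instantiation, the customer could test the model's accuracy on the $n$ extra-task samples against the null hypothesis that the model guesses uniformly at random, using a binomial tail test; all names below are hypothetical, not the paper's API:

```python
import math

def verification_threshold(n, num_classes, alpha=0.01):
    """Smallest correct-count t such that a model guessing uniformly at
    random over num_classes would score >= t on n extra-task samples
    with probability at most alpha (binomial tail test)."""
    p = 1.0 / num_classes

    def tail(t):
        # P(X >= t) for X ~ Binomial(n, p)
        return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(t, n + 1))

    for t in range(n + 1):
        if tail(t) <= alpha:
            return t
    return n + 1  # only reached if even t = n is not significant

def verify_quality(correct, n, num_classes, alpha=0.01):
    """Pass verification iff the extra-task accuracy clears the threshold."""
    return correct >= verification_threshold(n, num_classes, alpha)
```

A model that has merely been trained briefly (and so has not memorized the extra task) performs near chance on it and fails the test, while a model trained to convergence on the original task also fits the extra task and passes.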


Notes

  1. http://groups.di.unipi.it/~gulli/AG_corpus_of_news_articles.html.

References

  1. Algorithmia (2022). https://algorithmia.com/

  2. Amazon SageMaker (2022). https://aws.amazon.com/sagemaker/

  3. Google Vertex AI (2022). https://cloud.google.com/vertex-ai/

  4. Microsoft Azure (2022). https://azure.microsoft.com/en-us/services/machine-learning/

  5. Adi, Y., Baum, C., Cisse, M., Pinkas, B., Keshet, J.: Turning your weakness into a strength: watermarking deep neural networks by backdooring. In: USENIX (2018)

  6. An, J., Cho, S.: Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE 2(1), 1–18 (2015)

  7. Arpit, D., et al.: A closer look at memorization in deep networks. In: ICML (2017)

  8. Bugiel, S., Nürnberger, S., Pöppelmann, T., Sadeghi, A.R., Schneider, T.: AmazonIA: when elasticity snaps back. In: CCS (2011)

  9. Chen, H., Rohani, B.D., Koushanfar, F.: DeepMarks: a digital fingerprinting framework for deep neural networks. arXiv preprint arXiv:1804.03648 (2018)

  10. Deng, J., et al.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)

  11. Ghodsi, Z., Gu, T., Garg, S.: SafetyNets: verifiable execution of deep neural networks on an untrusted cloud. In: NeurIPS (2017)

  12. Gu, T., Dolan-Gavitt, B., Garg, S.: BadNets: identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733 (2017)

  13. Gu, Z., et al.: Securing input data of deep learning inference systems via partitioned enclave execution. arXiv preprint arXiv:1807.00969 (2018)

  14. Hashemi, H., Wang, Y., Annavaram, M.: DarKnight: an accelerated framework for privacy and integrity preserving deep learning using trusted hardware. In: MICRO (2021)

  15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)

  16. He, Z., Zhang, T., Lee, R.: Sensitive-sample fingerprinting of deep neural networks. In: CVPR (2019)

  17. He, Z., Zhang, T., Lee, R.B.: VeriDeep: verifying integrity of deep neural networks through sensitive-sample fingerprinting. arXiv preprint arXiv:1808.03277 (2018)

  18. Hunt, T., Song, C., Shokri, R., Shmatikov, V., Witchel, E.: Chiron: privacy-preserving machine learning as a service. arXiv preprint arXiv:1803.05961 (2018)

  19. Kim, Y.: Convolutional neural networks for sentence classification. In: EMNLP. ACL (2014)

  20. Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)

  21. Lee, T., et al.: Occlumency: privacy-preserving remote deep-learning inference using SGX. In: MobiCom (2019)

  22. Li, Z., Hu, C., Zhang, Y., Guo, S.: How to prove your model belongs to you: a blind-watermark based framework to protect intellectual property of DNN. In: ACSAC (2019)

  23. Liu, Y., et al.: Trojaning attack on neural networks. In: NDSS (2018)

  24. Maas, A., et al.: Learning word vectors for sentiment analysis. In: ACL (2011)

  25. Mirsky, Y., Doitshman, T., Elovici, Y., Shabtai, A.: Kitsune: an ensemble of autoencoders for online network intrusion detection. arXiv preprint arXiv:1802.09089 (2018)

  26. Nagai, Y., Uchida, Y., Sakazawa, S., Satoh, S.: Digital watermarking for deep neural networks. Int. J. Multimedia Inf. Retr. 7(1), 3–16 (2018). https://doi.org/10.1007/s13735-018-0147-1

  27. Ruder, S.: An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747 (2016)

  28. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  29. Somorovsky, J., Heiderich, M., Jensen, M., Schwenk, J., Gruschka, N., Lo Iacono, L.: All your clouds are belong to us: security analysis of cloud management interfaces. In: SCC@ASIACCS (2011)

  30. Song, C., Ristenpart, T., Shmatikov, V.: Machine learning models that remember too much. In: SIGSAC (2017)

  31. Tramer, F., Boneh, D.: Slalom: fast, verifiable and private execution of neural networks in trusted hardware. In: ICLR (2018)

  32. Uchida, Y., Nagai, Y., Sakazawa, S., Satoh, S.: Embedding watermarks into deep neural networks. In: ICMR (2017)

  33. Yao, Y., Rosasco, L., Caponnetto, A.: On early stopping in gradient descent learning. Constr. Approx. 26(2), 289–315 (2007)

  34. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires rethinking generalization. In: ICLR (2017)

  35. Zhang, X., Li, F., Zhang, Z., Li, Q., Wang, C., Wu, J.: Enabling execution assurance of federated learning at untrusted participants. In: IEEE INFOCOM (2020)

  36. Zhao, L., et al.: VeriML: enabling integrity assurances and fair payments for machine learning as a service (2019)

  37. Zhong, Z., Zheng, L., Kang, G., Li, S., Yang, Y.: Random erasing data augmentation. In: AAAI (2020)

  38. Zhou, P., et al.: Attention-based bidirectional long short-term memory networks for relation classification. In: ACL, pp. 207–212 (2016)

  39. Zhu, J., Gibson, B., Rogers, T.T.: Human Rademacher complexity. In: NeurIPS (2009)


Acknowledgement

This work is supported in part by the National Key R&D Project of China under Grant 2021ZD0110502, NSFC under Grants 62132011, U20B2049, U21B2018, and 62161160337, the Beijing Outstanding Young Scientist Program under Grant BJJWZYJH01201910003011, the China National Funds for Distinguished Young Scientists under Grant 61825204, and BNRist under Grant BNR2020RC01013.

Author information

Corresponding author

Correspondence to Qi Li.

A Proof of Theorem 1

Proof:

The main difference between cleanly labeled samples and randomly labeled samples is that the former can be generalized, while the latter can only be “memorized” by the model [7]. Specifically, while the model has not yet converged on the cleanly labeled samples, the gradients for optimizing one cleanly labeled sample also contribute to optimizing the other cleanly labeled samples. In contrast, optimizing any randomly labeled sample contributes almost nothing to the other randomly labeled samples or to the cleanly labeled samples. In other words, the gradient correlation of cleanly labeled samples is much stronger than that of randomly labeled samples:

$$\begin{aligned} \underset{\begin{array}{c} \boldsymbol{x}_i\in D_c, \boldsymbol{x}_j\in D_c , \boldsymbol{x}_i\ne \boldsymbol{x}_j \end{array}}{\text {average}} \frac{\boldsymbol{g}(\boldsymbol{x}_i) \cdot \boldsymbol{g}(\boldsymbol{x}_j)}{\left\| \boldsymbol{g}(\boldsymbol{x}_i)\right\| \left\| \boldsymbol{g}(\boldsymbol{x}_j)\right\| } \gg \underset{\begin{array}{c} \boldsymbol{x}_i\in D_r, \boldsymbol{x}_j\in D_r , \boldsymbol{x}_i\ne \boldsymbol{x}_j \end{array}}{\text {average}} \frac{\boldsymbol{g}(\boldsymbol{x}_i) \cdot \boldsymbol{g}(\boldsymbol{x}_j)}{\left\| \boldsymbol{g}(\boldsymbol{x}_i)\right\| \left\| \boldsymbol{g}(\boldsymbol{x}_j)\right\| }, \end{aligned}$$
(7)

where \(\boldsymbol{g}(\boldsymbol{x})\) denotes the gradient of \(\boldsymbol{x}\), i.e., \(\boldsymbol{g}(\boldsymbol{x}) := \nabla _{\theta } {\mathcal {L}(\boldsymbol{x}, f_{\theta })}\). Since the direction of a gradient has a more important effect than its norm, the squared norm of the gradient of an individual sample can be approximated by a constant C. Then, we have:

$$\begin{aligned} \underset{\begin{array}{c} \boldsymbol{x}_i\in D_c, \boldsymbol{x}_j\in D_c , \boldsymbol{x}_i\ne \boldsymbol{x}_j \end{array}}{\text {average}} {\boldsymbol{g}(\boldsymbol{x}_i) \cdot \boldsymbol{g}(\boldsymbol{x}_j)} \gg \underset{\begin{array}{c} \boldsymbol{x}_i\in D_r, \boldsymbol{x}_j\in D_r , \boldsymbol{x}_i\ne \boldsymbol{x}_j \end{array}}{\text {average}} {\boldsymbol{g}(\boldsymbol{x}_i) \cdot \boldsymbol{g}(\boldsymbol{x}_j)} . \end{aligned}$$
(8)
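The averaged cosine-similarity term in Eq. (7) can be computed directly from a matrix of per-sample gradient vectors; a minimal sketch (function name is illustrative, not from the paper):

```python
import numpy as np

def avg_pairwise_cosine(grads):
    """Average cosine similarity over all ordered pairs i != j of the
    per-sample gradient vectors given as rows of `grads` (Eq. (7) term)."""
    G = np.asarray(grads, dtype=float)
    G = G / np.linalg.norm(G, axis=1, keepdims=True)  # row-normalize
    n = G.shape[0]
    S = G @ G.T                                       # cosine similarities
    # Exclude the diagonal (i == j) and average over the n(n-1) pairs.
    return (S.sum() - np.trace(S)) / (n * (n - 1))
```

For gradients that all point the same way (as with mutually reinforcing clean samples) this returns 1.0; for unrelated directions (as with memorized random labels) it is close to 0.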

Suppose there are \(n_c\) and \(n_r\) samples in \(\mathcal {D}_c\) and \(\mathcal {D}_r\) respectively, i.e., \(\mathcal {D}_{c}=\{(\boldsymbol{x}_i,y_i)\}_{i=1}^{n_c}\) and \(\mathcal {D}_{r}=\{(\boldsymbol{x}_i,y_i)\}_{i=n_c+1}^{n_c+n_r}\). The left term of Eq. 2 can be rewritten as:

$$\begin{aligned} \left\| \nabla _{{\theta }} {\mathcal {L}(\mathcal {D}_c, {\theta })} \right\| ^2 =&\left\| \sum _{i=1}^{n_c} \boldsymbol{g}(\boldsymbol{x}_i) \right\| ^2 \end{aligned}$$
(9)
$$\begin{aligned} =&\sum _{i=1}^{n_c} \left\| \boldsymbol{g}(\boldsymbol{x}_i) \right\| ^2 + 2 \sum _{i=1}^{n_c}\sum _{j=i+1}^{n_c} \boldsymbol{g}(\boldsymbol{x}_i)\cdot \boldsymbol{g}(\boldsymbol{x}_j)\end{aligned}$$
(10)
$$\begin{aligned} =&n_c\cdot C + 2 \sum _{i=1}^{n_c}\sum _{j=i+1}^{n_c} \boldsymbol{g}(\boldsymbol{x}_i)\cdot \boldsymbol{g}(\boldsymbol{x}_j) . \end{aligned}$$
(11)

Similarly, for the right term of Eq. 2, we have:

$$\begin{aligned} \left\| \nabla _{{\theta }} {\mathcal {L}(\mathcal {D}_r, {\theta })} \right\| ^2 = n_r\cdot C + 2 \sum _{i=n_c+1}^{n_c+n_r}\sum _{j=i+1}^{n_c+n_r} \boldsymbol{g}(\boldsymbol{x}_i)\cdot \boldsymbol{g}(\boldsymbol{x}_j) . \end{aligned}$$
(12)
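The decomposition of the squared gradient norm into individual norms plus twice the sum over unordered pairs, as used above, can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(5, 8))  # 5 per-sample gradient vectors of dimension 8

# Left-hand side: squared norm of the summed (batch) gradient.
lhs = np.linalg.norm(G.sum(axis=0)) ** 2

# Right-hand side: individual squared norms plus 2x the cross terms,
# with each unordered pair (i, j), i < j, counted once.
norms = (np.linalg.norm(G, axis=1) ** 2).sum()
cross = sum(G[i] @ G[j] for i in range(5) for j in range(i + 1, 5))
rhs = norms + 2 * cross

assert np.isclose(lhs, rhs)
```

When the cross terms are large and positive (strongly correlated gradients), the batch gradient norm greatly exceeds the sum of individual norms, which is the mechanism the proof relies on.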

Since \(|\mathcal {D}_c|>|\mathcal {D}_r|\), we have \(n_c>n_r\). The conclusion then follows from Eqs. (8), (11), and (12).    \(\square \)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, P. et al. (2022). Verifying the Quality of Outsourced Training on Clouds. In: Atluri, V., Di Pietro, R., Jensen, C.D., Meng, W. (eds) Computer Security – ESORICS 2022. ESORICS 2022. Lecture Notes in Computer Science, vol 13555. Springer, Cham. https://doi.org/10.1007/978-3-031-17146-8_7


  • DOI: https://doi.org/10.1007/978-3-031-17146-8_7


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17145-1

  • Online ISBN: 978-3-031-17146-8

  • eBook Packages: Computer Science, Computer Science (R0)
