
The Robust and Harmless Model Watermarking

A chapter in Digital Watermarking for Machine Learning Model

Abstract

Obtaining well-performing deep neural networks usually requires expensive data collection and training procedures, which makes trained models valuable intellectual property of their owners. However, recent literature revealed that adversaries can easily “steal” a model by acquiring a function-similar copy, even when they have no training samples and no information about the victim model. In this chapter, we introduce a robust and harmless model watermark, based on which we design model ownership verification via hypothesis testing. In particular, our model watermark persists through complicated stealing processes and does not introduce additional security risks. Our defense consists of three main stages. First, we watermark the model by embedding external features, which is done by modifying some training samples via style transfer. Second, we train a meta-classifier on model gradients to determine whether a suspicious model was stolen from the victim. Finally, ownership is judged by a hypothesis test. Extensive experiments on the CIFAR-10 and ImageNet datasets verify the effectiveness of our defense under both centralized training and federated learning.
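The sketch below is a minimal, hypothetical illustration of only the final stage described above: hypothesis-test-based ownership verification. It assumes a meta-classifier has already been trained to map gradient features of a suspicious model to a "stolen" score; it is not the authors' implementation, and all names (verify_ownership, meta_scores, tau, alpha) are placeholders introduced for illustration.

```python
import numpy as np
from scipy import stats


def verify_ownership(meta_scores, tau=0.5, alpha=0.05):
    """Decide ownership from meta-classifier scores via a one-sided t-test.

    H0: the mean score on the suspicious model's gradient features is at most
        the chance level `tau` (the model is independent of the victim).
    H1: the mean score exceeds `tau` (the model was stolen from the victim).
    """
    meta_scores = np.asarray(meta_scores, dtype=float)
    t_stat, p_two_sided = stats.ttest_1samp(meta_scores, popmean=tau)
    # ttest_1samp is two-sided; convert to a one-sided p-value for H1: mean > tau.
    p_one_sided = p_two_sided / 2.0 if t_stat > 0 else 1.0 - p_two_sided / 2.0
    return p_one_sided < alpha, p_one_sided


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical scores a meta-classifier might assign to gradient features of
    # a stolen model (near 1) versus an independently trained model (near 0.5).
    stolen_scores = rng.normal(0.85, 0.10, size=32).clip(0.0, 1.0)
    benign_scores = rng.normal(0.50, 0.10, size=32).clip(0.0, 1.0)
    print(verify_ownership(stolen_scores))  # expected: (True, small p-value)
    print(verify_ownership(benign_scores))  # expected: (False, larger p-value)
```

The one-sided test controls the false-claim rate at the significance level alpha: an independent model is only (falsely) flagged as stolen with probability at most alpha.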




Acknowledgements

We sincerely thank Xiaojun Jia from the Chinese Academy of Sciences and Professor Xiaochun Cao from Sun Yat-sen University for their constructive comments and helpful suggestions on an early draft of this chapter.

Author information


Correspondence to Yiming Li or Shu-Tao Xia.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Li, Y., Zhu, L., Bai, Y., Jiang, Y., Xia, S.-T. (2023). The Robust and Harmless Model Watermarking. In: Fan, L., Chan, C.S., Yang, Q. (eds) Digital Watermarking for Machine Learning Model. Springer, Singapore. https://doi.org/10.1007/978-981-19-7554-7_4


  • DOI: https://doi.org/10.1007/978-981-19-7554-7_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-7553-0

  • Online ISBN: 978-981-19-7554-7

  • eBook Packages: Computer Science, Computer Science (R0)
