
Protecting ownership rights of ML models using watermarking in the light of adversarial attacks

  • Original Research
  • Published in: AI and Ethics (2024)

Abstract

In this paper, we present and analyze two novel and seemingly distant research trends in Machine Learning: ML watermarking and adversarial patches. First, we show how ML watermarking uses specially crafted inputs to provide a proof of model ownership. Second, we demonstrate how an attacker can craft adversarial samples that trigger abnormal behavior in a model and thereby mount an ambiguity attack on ML watermarking. Finally, we describe three countermeasures that can be applied to prevent ambiguity attacks. We illustrate our work using the example of a binary classification model for welding inspection.
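
As a concrete illustration of the first point, trigger-set watermarking can be verified by checking a suspect model's predictions on secret, specially crafted inputs. The sketch below is only a minimal illustration under assumed conventions: the function verify_watermark, the acceptance threshold, and the toy stand-in model are hypothetical and do not reproduce the scheme evaluated in the paper.

import numpy as np

def verify_watermark(model_predict, trigger_inputs, trigger_labels, threshold=0.9):
    # Proof of ownership: the owner keeps (trigger_inputs, trigger_labels) secret
    # and accepts the claim only if the suspect model reproduces enough of the
    # owner-chosen labels on these specially crafted inputs.
    predictions = np.asarray(model_predict(trigger_inputs))
    match_rate = float(np.mean(predictions == np.asarray(trigger_labels)))
    return match_rate >= threshold, match_rate

# Toy usage with a stand-in "watermarked" classifier that memorised the triggers.
rng = np.random.default_rng(0)
trigger_inputs = rng.normal(size=(20, 8))      # stand-in for crafted trigger images
trigger_labels = rng.integers(0, 2, size=20)   # owner-chosen labels (binary task, e.g. weld OK / defect)
watermarked_model = lambda x: trigger_labels   # placeholder: always recalls the secret labels

accepted, rate = verify_watermark(watermarked_model, trigger_inputs, trigger_labels)
print(f"ownership claim accepted: {accepted} (trigger match rate {rate:.2f})")

In practice the match rate of an independent, non-watermarked model on such triggers would be close to chance, which is what gives the test its evidential value; an ambiguity attack aims to undermine exactly this gap.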


Data availability

The datasets used and analysed during the current study are available from the Confiance.ai program on reasonable request.



Acknowledgements

This work has been supported by the French government under the "France 2030" program, as part of the SystemX Technological Research Institute within the Confiance.ai Program (www.confiance.ai).

Author information

Corresponding author

Correspondence to Boussad Addad.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kapusta, K., Mattioli, L., Addad, B. et al. Protecting ownership rights of ML models using watermarking in the light of adversarial attacks. AI Ethics 4, 95–103 (2024). https://doi.org/10.1007/s43681-023-00412-3

