Trust, but Verify: Using Self-supervised Probing to Improve Trustworthiness

Deng, Ailin; Li, Shen; Xiong, Miao; Chen, Zhirui; Hooi, Bryan

doi:10.1007/978-3-031-19778-9_21

Ailin Deng¹²,
Shen Li¹²,
Miao Xiong¹²,
Zhirui Chen¹² &
…
Bryan Hooi¹²

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13673))

Included in the following conference series:

European Conference on Computer Vision

1915 Accesses

Abstract

Trustworthy machine learning is of primary importance to the practical deployment of deep learning models. While state-of-the-art models achieve astonishingly good performance in terms of accuracy, recent literature reveals that their predictive confidence scores unfortunately cannot be trusted: e.g., they are often overconfident when wrong predictions are made, or so even for obvious outliers. In this paper, we introduce a new approach of self-supervised probing, which enables us to check and mitigate the overconfidence issue for a trained model, thereby improving its trustworthiness. We provide a simple yet effective framework, which can be flexibly applied to existing trustworthiness-related methods in a plug-and-play manner. Extensive experiments on three trustworthiness-related tasks (misclassification detection, calibration and out-of-distribution detection) across various benchmarks verify the effectiveness of our proposed probing framework.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Our code is available at https://github.com/d-ailin/SSProbing.

References

Adi, Y., Kermany, E., Belinkov, Y., Lavi, O., Goldberg, Y.: Fine-grained analysis of sentence embeddings using auxiliary prediction tasks. arXiv preprint arXiv:1608.04207 (2016)
Alain, G., Bengio, Y.: Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644 (2016)
Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D.: Weight uncertainty in neural networks. arXiv abs/1505.05424 (2015)
Google Scholar
Brier, G.W., et al.: Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78(1), 1–3 (1950)
Article Google Scholar
Charpentier, B., Zügner, D., Günnemann, S.: Posterior network: uncertainty estimation without OOD samples via density-based pseudo-counts. Adv. Neural. Inf. Process. Syst. 33, 1356–1367 (2020)
Google Scholar
Chen, J., Liu, F., Avci, B., Wu, X., Liang, Y., Jha, S.: Detecting errors and estimating accuracy on unlabeled data with self-training ensembles. In: Advances in Neural Information Processing Systems 34 (2021)
Google Scholar
Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 215–223. JMLR Workshop and Conference Proceedings (2011)
Google Scholar
Corbière, C., Thome, N., Bar-Hen, A., Cord, M., Pérez, P.: Addressing failure prediction by learning model confidence. arXiv preprint arXiv:1910.04851 (2019)
Darlow, L.N., Crowley, E.J., Antoniou, A., Storkey, A.J.: CINIC-10 is not imagenet or CIFAR-10. arXiv preprint arXiv:1810.03505 (2018)
Deng, W., Gould, S., Zheng, L.: What does rotation prediction tell us about classifier accuracy under varying testing environments? In: International Conference on Machine Learning, pp. 2579–2589. PMLR (2021)
Google Scholar
Doersch, C., Gupta, A., Efros, A.A.: Unsupervised visual representation learning by context prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1422–1430 (2015)
Google Scholar
Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning, pp. 1050–1059. PMLR (2016)
Google Scholar
Gidaris, S., Singh, P., Komodakis, N.: Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728 (2018)
Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: International Conference on Machine Learning, pp. 1321–1330. PMLR (2017)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Hein, M., Andriushchenko, M., Bitterwolf, J.: Why ReLu networks yield high-confidence predictions far away from the training data and how to mitigate the problem. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 41–50 (2019)
Google Scholar
Hendrycks, D., Carlini, N., Schulman, J., Steinhardt, J.: Unsolved problems in ML safety. arXiv preprint arXiv:2109.13916 (2021)
Hendrycks, D., Gimpel, K.: A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: Proceedings of International Conference on Learning Representations (2017)
Google Scholar
Hendrycks, D., Mazeika, M., Kadavath, S., Song, D.: Using self-supervised learning can improve model robustness and uncertainty. In: Advances in Neural Information Processing Systems 32 (2019)
Google Scholar
Hewitt, J., Liang, P.: Designing and interpreting probes with control tasks. arXiv preprint arXiv:1909.03368 (2019)
Jiang, H., Kim, B., Guan, M.Y., Gupta, M.: To trust or not to trust a classifier. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 5546–5557 (2018)
Google Scholar
Kendall, A., Gal, Y.: What uncertainties do we need in Bayesian deep learning for computer vision? In: Advances in Neural Information Processing Systems 30 (2017)
Google Scholar
Köhn, A.: What’s in an embedding? analyzing word embeddings through multilingual evaluation (2015)
Google Scholar
Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in Neural Information Processing Systems 30 (2017)
Google Scholar
Malinin, A., Gales, M.: Predictive uncertainty estimation via prior networks. In: Advances in Neural Information Processing Systems 31 (2018)
Google Scholar
Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P.H.S., Dokania, P.: Calibrating deep neural networks using focal loss. arXiv abs/2002.09437 (2020)
Google Scholar
Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 427–436 (2015)
Google Scholar
Noroozi, M., Favaro, P.: Unsupervised learning of visual representations by solving jigsaw puzzles. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 69–84. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_5
Chapter Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sohn, K., Lee, H., Yan, X.: Learning structured output representation using deep conditional generative models. Adv. Neural. Inf. Process. Syst. 28, 3483–3491 (2015)
Google Scholar
Tenney, I., et al.: What do you learn from context? probing for sentence structure in contextualized word representations. arXiv preprint arXiv:1905.06316 (2019)
Toreini, E., Aitken, M., Coopamootoo, K., Elliott, K., Zelaya, C.G., Van Moorsel, A.: The relationship between trust in AI and trustworthy machine learning technologies. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp. 272–283 (2020)
Google Scholar
Torralba, A., Fergus, R., Freeman, W.T.: 80 million tiny images: a large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30(11), 1958–1970 (2008)
Article Google Scholar
Yang, J., Zhou, K., Li, Y., Liu, Z.: Generalized out-of-distribution detection: a survey. arXiv preprint arXiv:2110.11334 (2021)
Zadrozny, B., Elkan, C.: Obtaining calibrated probability estimates from decision trees and Naive Bayesian classifiers. In: ICML, vol. 1, pp. 609–616. CiteSeer (2001)
Google Scholar
Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 649–666. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_40
Chapter Google Scholar

Download references

Acknowledgements

This work was supported in part by NUS ODPRT Grant R252-000-A81-133.

Author information

Authors and Affiliations

National University of Singapore, Singapore, Singapore
Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen & Bryan Hooi

Authors

Ailin Deng
View author publications
You can also search for this author in PubMed Google Scholar
Shen Li
View author publications
You can also search for this author in PubMed Google Scholar
Miao Xiong
View author publications
You can also search for this author in PubMed Google Scholar
Zhirui Chen
View author publications
You can also search for this author in PubMed Google Scholar
Bryan Hooi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ailin Deng .

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 638 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Deng, A., Li, S., Xiong, M., Chen, Z., Hooi, B. (2022). Trust, but Verify: Using Self-supervised Probing to Improve Trustworthiness. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13673. Springer, Cham. https://doi.org/10.1007/978-3-031-19778-9_21

Download citation

DOI: https://doi.org/10.1007/978-3-031-19778-9_21
Published: 03 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19777-2
Online ISBN: 978-3-031-19778-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Trust, but Verify: Using Self-supervised Probing to Improve Trustworthiness