Abstract
Neural-network classifiers achieve high accuracy when predicting the class of an input that they were trained to identify. Maintaining this accuracy in dynamic environments, where inputs frequently fall outside the fixed set of initially known classes, remains a challenge. We consider the problem of monitoring the classification decisions of neural networks in the presence of novel classes. For this purpose, we generalize our recently proposed abstraction-based monitor from binary output to real-valued quantitative output. This quantitative output enables new applications, two of which we investigate in the paper. As our first application, we introduce an algorithmic framework for active monitoring of a neural network, which allows us to learn new classes dynamically and yet maintain high monitoring performance. As our second application, we present an offline procedure to retrain the neural network to improve the monitor’s detection performance without deteriorating the network’s classification accuracy. Our experimental evaluation demonstrates both the benefits of our active monitoring framework in dynamic scenarios and the effectiveness of the retraining procedure.
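To illustrate the step from a binary to a real-valued monitor verdict, the following is a minimal sketch and not the construction used in the paper. It assumes a hypothetical monitor that keeps one axis-aligned box per class around the feature vectors seen during training and reports the Euclidean distance of a new feature vector to the box of the predicted class. The names `BoxMonitor`, `fit`, `distance`, and `accepts`, as well as the choice of distance, are assumptions made for this illustration only.

```python
import numpy as np


class BoxMonitor:
    """Hypothetical per-class box abstraction over feature vectors (illustration only)."""

    def __init__(self):
        # Maps a class label to the (lower, upper) bounds of its box.
        self.boxes = {}

    def fit(self, features, labels):
        """Build one bounding box per class from training feature vectors."""
        features = np.asarray(features, dtype=float)
        labels = np.asarray(labels)
        for c in np.unique(labels):
            points = features[labels == c]
            self.boxes[c.item()] = (points.min(axis=0), points.max(axis=0))

    def distance(self, feature, predicted_class):
        """Quantitative verdict: 0.0 inside the box, growing with the distance outside it."""
        lower, upper = self.boxes[predicted_class]
        x = np.asarray(feature, dtype=float)
        # Per-dimension overshoot beyond the box; zero where x lies within the bounds.
        overshoot = np.maximum(lower - x, 0.0) + np.maximum(x - upper, 0.0)
        return float(np.linalg.norm(overshoot))

    def accepts(self, feature, predicted_class, threshold=0.0):
        """Binary verdict recovered by thresholding the quantitative one."""
        return self.distance(feature, predicted_class) <= threshold


# Toy two-dimensional feature vectors for two known classes (made-up data).
monitor = BoxMonitor()
monitor.fit([[0, 0], [2, 2], [5, 5], [6, 7]], [0, 0, 1, 1])
inside = monitor.distance([1, 1], 0)
outside = monitor.distance([5, 6], 0)
```

With a threshold of zero, this sketch behaves like a binary monitor that only reports whether the input falls inside the box. A positive threshold or a ranking by distance is the kind of extra information a quantitative output makes available, for example to decide which rejected inputs to hand to a human for labeling.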
Funding
This work was supported in part by the ERC-2020-AdG 101020093, by DIREC - Digital Research Centre Denmark, and by the Villum Investigator Grant S4OS.
Author information
Contributions
Konstantin Kueffner, Anna Lukina and Christian Schilling contributed equally to this work.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kueffner, K., Lukina, A., Schilling, C. et al. Into the unknown: active monitoring of neural networks (extended version). Int J Softw Tools Technol Transfer 25, 575–592 (2023). https://doi.org/10.1007/s10009-023-00711-4
DOI: https://doi.org/10.1007/s10009-023-00711-4