
Backpropagated Gradient Representations for Anomaly Detection

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12366)

Abstract

Learning representations that clearly distinguish normal from abnormal data is key to the success of anomaly detection. Most existing anomaly detection algorithms characterize data using activation representations from forward propagation and do not exploit gradients from backpropagation. Gradients capture the model updates required to represent data, and anomalies require more drastic updates than normal data to be fully represented. Hence, we propose using backpropagated gradients as representations to characterize model behavior on anomalies and, consequently, to detect them. We show that the proposed gradient-based representations achieve state-of-the-art anomaly detection performance on benchmark image recognition datasets. We also highlight the computational efficiency and simplicity of the proposed method compared with state-of-the-art methods relying on adversarial networks or autoregressive models, which require at least 27 times more model parameters than the proposed method.
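The core idea above can be illustrated with a toy example: train a tiny reconstruction model on normal data only, then score a test sample by the norm of the backpropagated gradient its reconstruction loss induces on the model parameters. Normal samples are already well represented, so they induce small gradients; anomalies demand a larger model update and thus a larger gradient. The sketch below is purely illustrative (a hand-derived rank-1 linear autoencoder, not the paper's architecture or exact scoring function); the data, learning rate, and test points are all assumptions chosen for the demonstration.

```python
import math

def grad_wrt_v(v, x):
    """Gradient of the reconstruction loss ||x - (v.x)v||^2 w.r.t. the parameter vector v."""
    c = sum(vi * xi for vi, xi in zip(v, x))       # latent code c = v.x
    r = [xi - c * vi for vi, xi in zip(v, x)]      # reconstruction residual r = x - c*v
    rv = sum(ri * vi for ri, vi in zip(r, v))
    # dL/dv = -2 (r.v) x - 2 c r  (derived by hand for this rank-1 model)
    return [-2.0 * rv * xi - 2.0 * c * ri for xi, ri in zip(x, r)]

def anomaly_score(v, x):
    """Norm of the backpropagated gradient: large when x would require a big model update."""
    g = grad_wrt_v(v, x)
    return math.sqrt(sum(gi * gi for gi in g))

# Train on "normal" data lying along the first axis.
normal_data = [[1.0, 0.0], [2.0, 0.0], [0.5, 0.0], [1.5, 0.0]]
v, lr = [0.5, 0.5], 0.01
for _ in range(300):
    for x in normal_data:
        g = grad_wrt_v(v, x)
        v = [vi - lr * gi for vi, gi in zip(v, g)]

normal_score = anomaly_score(v, [1.2, 0.0])    # in-distribution: small gradient
abnormal_score = anomaly_score(v, [1.0, 1.0])  # off the normal manifold: large gradient
```

After training, the in-distribution point barely perturbs the learned parameters, while the off-manifold point produces a gradient norm orders of magnitude larger, so thresholding the gradient norm separates the two.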

Keywords

Gradient-based representations · Anomaly detection · Novelty detection · Image recognition

Supplementary material

Supplementary material 1 (PDF, 733 KB): 504479_1_En_13_MOESM1_ESM.pdf


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Georgia Institute of Technology, Atlanta, USA
