Fixing Bias in Reconstruction-based Anomaly Detection with Lipschitz Discriminators

Journal of Signal Processing Systems

Abstract

Anomaly detection is of great interest in fields where abnormalities need to be identified and corrected (e.g., medicine and finance). Deep learning methods for this task often rely on autoencoder reconstruction error, sometimes in conjunction with other penalties. We show that this approach exhibits intrinsic biases that lead to undesirable results. In particular, reconstruction-based methods can show low error on simple-to-reconstruct points that are not part of the training data, for example the all-black image. To address this, we introduce a new unsupervised Lipschitz anomaly discriminator (LAD) that does not suffer from these biases. Our anomaly discriminator is trained, similarly to the discriminator of a GAN, to detect the difference between the training data and corruptions of the training data. We show that this procedure successfully detects unseen anomalies, with guarantees on those that have a certain Wasserstein distance from the data or corrupted training set. These additions allow us to show improved performance on MNIST, CIFAR10, and health record data. Further, LAD does not require decoding back to the original data space, which makes anomaly detection possible in domains where it is difficult to define a decoder, such as irregular graph-structured data. Empirically, we show that this framework leads to improved performance on image, health record, and graph data.
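The training idea in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: the piecewise-linear critic, the fixed-offset corruptions, the grid, and all names (`cell_overlaps`, `critic`, `train_critic`, `score`) are assumptions made for illustration. The critic has one slope per grid cell, clipped to [-1, 1], so it is 1-Lipschitz by construction, and it is fit by projected gradient ascent on the gap between its mean value on corrupted points and its mean value on training points.

```python
# Toy 1-D sketch of a Lipschitz critic separating data from its corruptions.
# Everything here is a hypothetical illustration, not the paper's code.

def cell_overlaps(x, knots):
    """For each grid cell, the length of the cell lying to the left of x."""
    return [min(max(x - lo, 0.0), hi - lo)
            for lo, hi in zip(knots[:-1], knots[1:])]

def critic(x, slopes, knots):
    """f(x) = integral of the piecewise-constant slope from knots[0] to x."""
    return sum(s * o for s, o in zip(slopes, cell_overlaps(x, knots)))

def mean_overlaps(points, knots):
    rows = [cell_overlaps(p, knots) for p in points]
    return [sum(col) / len(rows) for col in zip(*rows)]

def train_critic(data, corrupted, knots, steps=50, lr=5.0):
    """Projected gradient ascent on mean f(corrupted) - mean f(data)."""
    # The objective is linear in the slopes, so its gradient is constant.
    grad = [c - d for c, d in zip(mean_overlaps(corrupted, knots),
                                  mean_overlaps(data, knots))]
    slopes = [0.0] * (len(knots) - 1)
    for _ in range(steps):
        # Ascent step, then projection onto [-1, 1] (the Lipschitz constraint).
        slopes = [max(-1.0, min(1.0, s + lr * g)) for s, g in zip(slopes, grad)]
    return slopes

# "Normal" training data: evenly spaced points in [-1, 1].
data = [i / 10 for i in range(-10, 11)]
# Corruptions: each training point shifted by fixed offsets (a stand-in for noise).
offsets = [-2.0, -1.0, 1.0, 2.0]
corrupted = [x + d for x in data for d in offsets]
# Grid of knots on [-4, 4] with cell width 0.25.
knots = [-4.0 + 0.25 * k for k in range(33)]

slopes = train_critic(data, corrupted, knots)
baseline = critic(0.0, slopes, knots)

def score(x):
    """Anomaly score relative to the centre of the training data."""
    return critic(x, slopes, knots) - baseline

for x in (0.0, 0.5, 2.5, -2.5):
    print(f"x = {x:+.1f}  score = {score(x):.3f}")
```

In this toy setting the learned score grows with distance from the centre of the training data, so points far from the data receive higher scores than points inside it, and no decoder or reconstruction step is involved. A practical detector would replace the grid critic with a neural network under some Lipschitz constraint and would choose a score threshold on held-out data.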

Acknowledgements

The authors would like to thank Matthew Amodio and Ronald Coifman for productive discussions and feedback on this project as well as the anonymous reviewers who helped improve this work. This research was partially funded by IVADO Professor funds, CIFAR AI Chair, and NSERC Discovery grant 03267 [G.W.]; Chan-Zuckerberg Initiative grants 182702 & CZF2019-002440 [S.K.]; NSF career grant 2047856 [S.K.]; Sloan Fellowship FG-2021-15883 [S.K.]; and NIH grants R01GM135929 & R01GM130847 [S.K.]. The content provided here is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.

Author information

Corresponding author

Correspondence to Smita Krishnaswamy.

Cite this article

Tong, A., Wolf, G. & Krishnaswamy, S. Fixing Bias in Reconstruction-based Anomaly Detection with Lipschitz Discriminators. J Sign Process Syst 94, 229–243 (2022). https://doi.org/10.1007/s11265-021-01715-6

  • DOI: https://doi.org/10.1007/s11265-021-01715-6
