Unsupervised Artificial Neural Networks for Outlier Detection in High-Dimensional Data

  • Daniel Popovic
  • Edouard Fouché
  • Klemens Böhm
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11695)


Outlier detection is an important field in data mining. For high-dimensional data, the task is particularly challenging because of the so-called “curse of dimensionality”: the notion of neighborhood becomes meaningless, and points typically show their outlying behavior only in subspaces. As a result, traditional approaches are ineffective. Because real-world data lacks a ground truth and there is no a priori knowledge about the characteristics of potential outliers, outlier detection should be considered an unsupervised learning problem. In this paper, we examine the usefulness of unsupervised artificial neural networks (autoencoders, self-organising maps and restricted Boltzmann machines) to detect outliers in high-dimensional data in a fully unsupervised way. Each of these approaches aims to learn an approximate representation of the data. We show that one can measure the “outlierness” of objects effectively by measuring their deviation from the learned representation. Our experiments show that neural-based approaches outperform the current state of the art in terms of both runtime and accuracy.
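
The idea of scoring objects by their deviation from a learned representation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear autoencoder, whose optimal weights span the top principal components and can therefore be obtained in closed form with an SVD instead of gradient-based training. The helper names (fit_linear_autoencoder, outlier_scores), the synthetic data and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear_autoencoder(X, n_hidden):
    """Fit a linear autoencoder in closed form (hypothetical helper).

    Assumption: the network is linear, so its optimal encoder/decoder
    spans the top principal components and an SVD replaces training.
    """
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_hidden]


def outlier_scores(X, mean, components):
    """Score each row by its squared reconstruction error."""
    centered = X - mean
    codes = centered @ components.T      # encode into the hidden layer
    reconstructed = codes @ components   # decode back to input space
    return np.sum((centered - reconstructed) ** 2, axis=1)


rng = np.random.default_rng(0)

# Synthetic inliers: 200 points close to a 2-dimensional subspace
# of a 20-dimensional space.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 20))
inliers = latent @ mixing + 0.05 * rng.normal(size=(200, 20))

# Synthetic outliers: 5 points that ignore the subspace structure.
outliers = rng.normal(scale=3.0, size=(5, 20))

# Fully unsupervised setting: the model is fit on all points together.
X = np.vstack([inliers, outliers])
mean, components = fit_linear_autoencoder(X, n_hidden=2)
scores = outlier_scores(X, mean, components)

# Indices of the five highest-scoring points.
top5 = np.argsort(scores)[-5:]
print(sorted(top5.tolist()))
```

A nonlinear autoencoder, a self-organising map or a restricted Boltzmann machine would replace the SVD step with its own training procedure and deviation measure; the scoring principle stays the same.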


Unsupervised learning · Outlier detection · Neural networks



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
