Knowledge and Information Systems, Volume 57, Issue 2, pp 413–435

Energy-based anomaly detection for mixed data

  • Kien Do
  • Truyen Tran
  • Svetha Venkatesh
Regular Paper


Anomalies are data points that deviate significantly from the norm. Thus, anomaly detection amounts to finding data points located far away from their neighbors, i.e., those lying in low-density regions. Classic anomaly detection methods are largely designed for a single data type, either continuous or discrete. However, real-world data is increasingly heterogeneous, where a data point can have both discrete and continuous attributes. Mixed data poses multiple challenges, including (a) capturing the inter-type correlation structures and (b) measuring deviation from the norm under multiple types. These challenges are exacerbated in (c) high-dimensional regimes. In this paper, we propose a new scalable unsupervised anomaly detection method for mixed data based on the Mixed-variate Restricted Boltzmann Machine (Mv.RBM). The Mv.RBM is a principled probabilistic method that estimates the density of mixed data. We propose to use the free energy derived from Mv.RBM as the anomaly score, since it equals the negative log-density of the data up to an additive constant. We then extend this method to detect anomalies across multiple levels of data abstraction, an effective approach for dealing with high-dimensional settings. The extension is dubbed \(\mathtt {MIXMAD}\), which stands for MIXed data Multilevel Anomaly Detection. In \(\mathtt {MIXMAD}\), we sequentially construct an ensemble of mixed-data Deep Belief Nets (DBNs) with varying depths. Each DBN is an energy-based detector at a predefined abstraction level. Predictions across the ensemble are finally combined via a simple rank aggregation method. The proposed methods are evaluated on a comprehensive suite of synthetic and real high-dimensional datasets. The results demonstrate that for anomaly detection, (a) proper handling of mixed types is necessary, (b) free energy is a powerful anomaly scoring method, (c) multilevel abstraction of data is important for high-dimensional data, and (d) empirically, Mv.RBM and \(\mathtt {MIXMAD}\) are superior to popular unsupervised detection methods for both homogeneous and mixed data.
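
The two ideas in the abstract, free energy as an anomaly score and rank aggregation across detectors, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a plain binary RBM stands in for the Mv.RBM (so mixed types are not handled), the parameters `a`, `b`, and `W` are hypothetical already-trained visible biases, hidden biases, and weights, and "simple rank aggregation" is read here as averaging ranks, which may differ from the authors' choice.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, a, b, W):
    """Free energy of a binary RBM (illustrative stand-in for Mv.RBM).

    F(v) = -sum_i a[i]*v[i] - sum_k softplus(b[k] + sum_i W[i][k]*v[i]).
    Since P(v) = exp(-F(v)) / Z, F(v) equals -log P(v) minus the constant log Z,
    so a higher free energy means a lower density and a more anomalous point.
    """
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = 0.0
    for k, b_k in enumerate(b):
        pre_activation = b_k + sum(W[i][k] * v_i for i, v_i in enumerate(v))
        hidden_term += softplus(pre_activation)
    return -visible_term - hidden_term


def ranks(scores):
    """Rank of each point by score (0 = lowest); ties are broken by index."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order):
        result[index] = position
    return result


def aggregate_ranks(score_lists):
    """Average each point's rank over several detectors (assumed aggregation rule)."""
    all_ranks = [ranks(scores) for scores in score_lists]
    n_points = len(score_lists[0])
    return [sum(r[j] for r in all_ranks) / len(all_ranks) for j in range(n_points)]
```

In this sketch, each detector in an ensemble would produce one list of free energies over the same data points, and `aggregate_ranks` would turn those lists into a single ordering in which larger values indicate more anomalous points. Working with ranks rather than raw energies avoids comparing scores whose unknown additive constants differ from one model to the next.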


Mixed data; Mixed-variate restricted Boltzmann machine; Deep belief net; Multilevel anomaly detection



This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Applied AI Institute, Deakin University, Waurn Ponds, Australia
