Novelty detection with CANDIES: a holistic technique based on probabilistic models

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

In this article, we propose CANDIES (Combined Approach for Novelty Detection in Intelligent Embedded Systems), a new approach to novelty detection in technical systems. We assume that a technical system observes an environment that can be regarded as being composed of several processes. When these processes are observed with sensors, features are extracted from the sensor signals, and the sample distribution in feature space can be modeled with a probabilistic model. In the ideal case, the components of the parametric mixture density model we use correspond to the processes in the real world. Eventually, novel processes emerge, e.g., in the case of an unpredictable failure. As a consequence, new kinds of samples are observed that require an adaptation of the model. Novelty detection in low-density and high-density regions of the feature space requires different detection strategies. We introduce a new technique that detects novel processes in high-density regions by means of a fast online goodness-of-fit test. For detection in low-density regions, we use 2SND (Two-Stage-Novelty-Detector), an approach we presented in preliminary work. With CANDIES, we combine both techniques into a holistic method for detecting novelty. The properties of CANDIES are evaluated using artificial data and benchmark data from the field of intrusion detection in computer networks.
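
To illustrate why a goodness-of-fit test can reveal novelty inside a high-density region, the following is a minimal sketch and not the test used in CANDIES, whose details are not given in this abstract. It assumes a single known 2-D Gaussian component and applies Pearson's chi-squared test to a window of samples: under the component, the squared Mahalanobis distance is chi-squared distributed with 2 degrees of freedom, so its CDF values should fill 10 equal-probability bins evenly. The function names, the window sizes, the number of bins, the significance level of 0.01, and the simulated novel cluster are all assumptions made for illustration.

```python
import math
import random

# Critical value of the chi-squared distribution with 9 degrees of freedom
# (10 bins - 1) at significance level 0.01, taken from standard tables.
CHI2_CRIT_DF9_ALPHA01 = 21.666


def sq_mahalanobis_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point to a Gaussian component."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of a 2x2 matrix [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def gof_statistic(window, mean, cov, n_bins=10):
    """Pearson statistic for the hypothesis 'window stems from this component'."""
    counts = [0] * n_bins
    for x in window:
        # For 2-D Gaussian data the squared distance follows a chi-squared
        # distribution with 2 degrees of freedom, whose CDF is 1 - exp(-d2 / 2).
        # Under the hypothesis, u is therefore uniform on [0, 1).
        u = 1.0 - math.exp(-sq_mahalanobis_2d(x, mean, cov) / 2.0)
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    expected = len(window) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


random.seed(0)
mean, cov = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))

# Samples of the known process.
known = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
# Hypothetical novel process: a tight cluster inside the high-density region.
novel = [(random.gauss(0.3, 0.2), random.gauss(0.3, 0.2)) for _ in range(250)]
mixed = known[:250] + novel

for name, window in (("known only", known), ("with novel process", mixed)):
    stat = gof_statistic(window, mean, cov)
    verdict = "novelty" if stat > CHI2_CRIT_DF9_ALPHA01 else "no novelty"
    print(f"{name}: statistic = {stat:.1f} -> {verdict}")
```

None of the novel samples in this sketch is an outlier on its own, since each lies close to the component mean. Only the distribution of the whole window deviates from the model, which is why a density threshold on single samples, the natural tool in low-density regions, would not be sufficient here.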

Acknowledgements

The authors would like to thank the German Research Foundation (DFG) for support within the DFG project CYPHOC (SI 674/9-1).

Author information

Corresponding author

Correspondence to Christian Gruhl.

Cite this article

Gruhl, C., Sick, B. Novelty detection with CANDIES: a holistic technique based on probabilistic models. Int. J. Mach. Learn. & Cyber. 9, 927–945 (2018). https://doi.org/10.1007/s13042-016-0618-8

  • DOI: https://doi.org/10.1007/s13042-016-0618-8
