Abstract
In this article, we propose Combined Approach for Novelty Detection in Intelligent Embedded Systems (CANDIES), a new approach to novelty detection in technical systems. We assume that a technical system observes an environment that can be regarded as a composition of several processes. Features are extracted from the sensor signals that capture these processes, and the distribution of the resulting samples in feature space is modeled with a probabilistic model. In the ideal case, the components of the parametric mixture density model we use correspond to the processes in the real world. Over time, novel processes may emerge, e.g., in the case of an unpredictable failure. As a consequence, new kinds of samples are observed that require an adaptation of the model. Novelty detection in low-density and high-density regions of the feature space requires different detection strategies. We introduce a new technique to detect novel processes in high-density regions by means of a fast online goodness-of-fit test. For detection in low-density regions, we use 2SND (Two-Stage-Novelty-Detector), an approach we presented in preliminary work. With CANDIES, we combine both techniques to provide a holistic method for detecting novelty. The properties of CANDIES are evaluated using artificial data and benchmark data from the field of intrusion detection in computer networks.
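To make the distinction between the two regions concrete, the following is a minimal sketch and not the CANDIES or 2SND implementation, whose details the abstract does not give. It assumes a mixture of 2-D Gaussian components, treats a sample as lying in a low-density region when it falls outside a chi-squared confidence region of every component, and uses a plain batch Pearson chi-squared test on the Mahalanobis distances as a stand-in for the goodness-of-fit idea. All function names, the significance level, and the number of bins are hypothetical choices made for illustration.

```python
import math

def sq_mahalanobis(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point to a Gaussian component."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det

def in_low_density(x, components, alpha=0.01):
    """True if x lies outside the (1 - alpha) region of every component.

    For 2 degrees of freedom the chi-squared quantile at 1 - alpha has the
    closed form -2 * ln(alpha), so no statistics library is needed.
    """
    threshold = -2.0 * math.log(alpha)
    return all(sq_mahalanobis(x, m, c) > threshold for m, c in components)

# Chi-squared critical value for 4 degrees of freedom at the 5% level
# (matches the 5 bins used below; assumption of this sketch).
CHI2_CRIT_DF4_95 = 9.488

def gof_statistic(samples, mean, cov, bins=5):
    """Pearson chi-squared statistic for samples assigned to one component.

    If the samples follow the component, u = 1 - exp(-d^2 / 2) is uniform
    on [0, 1], so every bin expects the same number of samples.
    """
    counts = [0] * bins
    for x in samples:
        u = 1.0 - math.exp(-0.5 * sq_mahalanobis(x, mean, cov))
        counts[min(int(u * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((o - expected) ** 2 / expected for o in counts)

def rejects_fit(samples, mean, cov):
    """True if the 5-bin test rejects the component at the 5% level."""
    return gof_statistic(samples, mean, cov, bins=5) > CHI2_CRIT_DF4_95
```

In this sketch, a sample far from all components is flagged by `in_low_density`, whereas a novel process that overlaps an existing component only becomes visible through `rejects_fit`, because its samples distort the distance distribution of that component without ever leaving its confidence region. The test looks only at distances, so it is a deliberately coarse check.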
Acknowledgements
The authors would like to thank the German Research Foundation (DFG) for support within the DFG project CYPHOC (SI 674/9-1).
Cite this article
Gruhl, C., Sick, B. Novelty detection with CANDIES: a holistic technique based on probabilistic models. Int. J. Mach. Learn. & Cyber. 9, 927–945 (2018). https://doi.org/10.1007/s13042-016-0618-8
DOI: https://doi.org/10.1007/s13042-016-0618-8