Abstract
Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called "normal" instances). Given a training set containing only normal data, the semi-supervised anomaly detection task is to identify anomalies in future data. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: given unlabeled, mostly-normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach to semi-supervised anomaly detection. FRaC uses normal instances to build an ensemble of feature models, and then identifies instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to well-known state-of-the-art anomaly detection methods, LOF and one-class support vector machines, and to an existing feature-modeling approach.
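The feature-modeling idea the abstract describes can be illustrated with a minimal sketch: for each feature, train a predictor that maps the remaining features to that feature using only normal data, then score a test instance by how much it disagrees with those predictions. This is a simplified, hypothetical illustration (using a 1-nearest-neighbour regressor per feature and a squared-error disagreement score), not the authors' implementation, which uses an ensemble of learners and an information-theoretic surprisal score.

```python
import math

def _dist(a, b, skip):
    """Euclidean distance over all features except index `skip`."""
    return math.sqrt(sum((x - y) ** 2
                         for k, (x, y) in enumerate(zip(a, b)) if k != skip))

def predict_feature(train, x, j):
    """Per-feature model (here, 1-NN): predict feature j of x by copying
    feature j of the normal training instance nearest in the other features."""
    nearest = min(train, key=lambda t: _dist(t, x, j))
    return nearest[j]

def anomaly_score(train, x):
    """Total squared disagreement between x and its feature models."""
    return sum((x[j] - predict_feature(train, x, j)) ** 2
               for j in range(len(x)))

# Normal data: feature 1 is roughly twice feature 0.
normal = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.0), (5.0, 9.9)]
print(anomaly_score(normal, (3.0, 6.0)))   # small: agrees with the models
print(anomaly_score(normal, (3.0, 0.0)))   # large: feature 1 disagrees
```

An instance that violates the learned cross-feature relationships receives a high score even if each of its individual feature values is within the normal range, which is the key property distinguishing this approach from simple per-feature outlier tests.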
References
Asuncion A, Newman D (2007) UCI machine learning repository. http://www.ics.uci.edu/~mlearn/MLRepository.html
Breunig M, Kriegel H, Ng R, Sander J (2000) LOF: identifying density-based local outliers. ACM SIGMOD Rec 29(2): 93–104
Byers S, Raftery AE (1998) Nearest-neighbor clutter removal for estimating features in spatial point processes. J Am Stat Assoc 93(442): 577–584
Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv 41(3): 1–58
Chang CC, Lin CJ (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm
Cohen WW (1995) Fast effective rule induction. In: Proceedings of the twelfth international conference on machine learning, pp 115–123
Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9: 1871–1874
Guttormsson SE, Marks RJ, El-Sharkawi MA, Kerszenbaum I (1999) Elliptical novelty grouping for on-line short-turn detection of excited running rotors. IEEE Trans Energy Convers 14(1): 16–22
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1): 10–18
Huang YA, Fan W, Lee W, Yu PS (2003) Cross-feature analysis for detecting ad-hoc routing anomalies. In: ICDCS ’03: proceedings of the 23rd international conference on distributed computing systems. IEEE Computer Society, Washington, DC, USA
John G, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp 338–345
Lazarevic A, Kumar V (2005) Feature bagging for outlier detection. In: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery and data mining, pp 157–166
Leon D, Podgurski A, Dickinson W (2005) Visualizing similarity between program executions. In: Proceedings of the 16th IEEE international symposium on software reliability engineering, pp 311–321
Mitchell T (1997) Machine learning. McGraw-Hill, New York
Noto K, Brodley C, Slonim D (2010) Anomaly detection using an ensemble of feature models. In: Proceedings of the 10th IEEE international conference on data mining (ICDM 2010)
Quinlan JR (1990) Probabilistic decision trees, vol 3, chap 5. Morgan Kaufmann, San Mateo, pp. 140–153
Quinlan J (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo
Schölkopf B, Smola AJ, Williamson RC, Bartlett PL (2000) New support vector algorithms. Neural Comput 12(5): 1207–1245
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27: 379–423 (Part I)
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3-4): 591–611
Smith R, Bivens A, Embrechts M, Palagiri C, Szymanski B (2002) Clustering approaches for anomaly based intrusion detection. In: Proceedings of intelligent engineering systems through artificial neural networks, pp 579–584
Spackman KA (1989) Signal detection theory: valuable tools for evaluating inductive learning. In: Proceedings of the sixth international workshop on machine learning. Morgan Kaufmann Publishers Inc, San Francisco, CA, USA, pp 160–163
Tang J, Chen Z, Fu A, Cheung D (2002) Enhancing effectiveness of outlier detections for low density patterns. Lecture notes in computer science. Springer, New York, pp 535–548
Responsible editor: Eamonn Keogh.
Noto, K., Brodley, C. & Slonim, D. FRaC: a feature-modeling approach for semi-supervised and unsupervised anomaly detection. Data Min Knowl Disc 25, 109–133 (2012). https://doi.org/10.1007/s10618-011-0234-x