Combining Support Vector Machines and Segmentation Algorithms for Efficient Anomaly Detection: A Petroleum Industry Application

  • Luis Martí
  • Nayat Sanchez-Pi
  • José Manuel Molina
  • Ana Cristina Bicharra García
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 299)

Abstract

Anomaly detection is the problem of finding patterns in data that do not conform to expected behavior. Similarly, when patterns are numerically distant from the rest of the sample, the anomalies are referred to as outliers. Anomaly detection has recently attracted the attention of the research community for real-world applications. The petroleum industry is one of the application contexts where these problems are present. Correctly detecting such unusual information gives the decision maker the capacity to act on the system in order to avoid, correct, or react to the associated situations. In that sense, heavy extraction machines used for pumping and generation operations, such as turbomachines, are intensively monitored by hundreds of sensors each, which send measurements at a high frequency for damage prevention. To deal with this stream of measurements and with the lack of labeled data, in this paper we propose combining a fast, high-quality segmentation algorithm with a one-class support vector machine approach for efficient anomaly detection in turbomachines. We perform empirical studies comparing our approach to other methods on benchmark problems and on a real-life application related to oil platform turbomachinery anomaly detection.
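
The general idea of segmenting a sensor series and then scoring each segment for how far it lies from the rest of the sample can be illustrated with a minimal sketch. This is not the method of the paper: a fixed-length window split stands in for the segmentation algorithm, and a robust median-based distance stands in for the one-class support vector machine. The function names, the window length, the threshold, and the synthetic readings are all assumptions made for illustration.

```python
from statistics import mean, median


def split_into_segments(series, length):
    """Cut the series into consecutive, non-overlapping windows of `length` samples.

    Assumption: a fixed-length split, used here only as a stand-in for a
    proper time series segmentation algorithm.
    """
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]


def robust_scores(values):
    """Distance of each value from the median, in units of the median absolute deviation."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    mad = max(mad, 1e-9)  # floor keeps the division defined when most values coincide
    return [abs(v - centre) / mad for v in values]


def anomalous_segments(series, length=5, threshold=5.0):
    """Return indices of segments whose mean is far from the other segment means.

    Assumption: a simple outlier score replaces the one-class classifier.
    """
    segments = split_into_segments(series, length)
    scores = robust_scores([mean(segment) for segment in segments])
    return [index for index, score in enumerate(scores) if score > threshold]


# Synthetic sensor trace (hypothetical data): a small oscillation around 10.0
# with one disturbed window covering samples 20 to 24.
readings = [10.0 + 0.1 * ((i % 5) - 2) for i in range(50)]
readings[20:25] = [14.0, 14.2, 13.9, 14.1, 14.0]

flagged = anomalous_segments(readings)
print(flagged)
```

On this synthetic trace only the disturbed window (segment index 4) is flagged. No labels are used at any point, which mirrors the unlabeled setting described in the abstract; in a closer reproduction the scoring step would be replaced by a one-class classifier trained on per-segment features.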

Keywords

Anomaly detection; support vector machines; time series segmentation; oil industry application

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Luis Martí (1)
  • Nayat Sanchez-Pi (2)
  • José Manuel Molina (3)
  • Ana Cristina Bicharra García (4)

  1. Dept. of Electrical Engineering, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, Brazil
  2. Instituto de Lógica, Filosofia e Teoria da Ciência (ILTC), Niterói, Brazil
  3. Dept. of Informatics, Universidad Carlos III de Madrid, Madrid, Spain
  4. ADDLabs, Fluminense Federal University, Niterói, Brazil
