YASA: Yet Another Time Series Segmentation Algorithm for Anomaly Detection in Big Data Problems
Time series pattern analysis has recently attracted the attention of the research community for real-world applications. The petroleum industry is one application context where these problems arise, for instance in anomaly detection. Offshore petroleum platforms rely on heavy turbomachines for their extraction, pumping and generation operations. These machines are often intensively monitored by hundreds of sensors each, which send measurements at high frequency to a concentration hub. Handling these data calls for a holistic approach, as sensor data are frequently noisy, unreliable, inconsistent with a priori problem axioms, and massive in volume. For anomaly detection problems in turbomachinery, it is essential to segment the available dataset in order to automatically discover the operational regime of the machine in the recent past. In this paper we propose a novel time series segmentation algorithm that is adaptable to big data problems and capable of handling the high volume of data involved in such problem contexts. We describe our proposal and analyze its computational complexity. We also perform empirical studies comparing our algorithm with similar approaches on benchmark problems and on a real-life application related to oil platform turbomachinery anomaly detection.
Keywords: time series segmentation, anomaly detection, big data, oil industry application