Abstract
Time series data is pervasive in many applications, and detecting anomalies in it is important because it provides early warning of unexpected patterns. In this paper, we propose a method for detecting anomalous subsequences based on multiple similarity measures; the method is unsupervised and requires no domain knowledge. First, to improve time efficiency, we introduce an anomaly candidate selection scheme based on locality sensitive hashing (LSH), which treats a subsequence that does not collide with any other as a potential anomaly. However, when the raw time series is noisy and the anomaly is subtle, the performance of LSH degrades. To address this problem, we present a smoothing method that removes noise and highlights the anomalous part of a time series, which helps decrease the collision probability between an anomaly and the other subsequences. Second, because real applications contain different types of anomalies and a single similarity measure is unlikely to perform consistently well on all of them, we employ Pareto analysis to incorporate multiple similarity measures. Third, we provide a new anomaly scoring scheme that evaluates each anomaly candidate based on the number of non-dominated vectors. Finally, we conduct extensive experiments on benchmark datasets from diverse domains and compare our method with state-of-the-art approaches. The results show that our method achieves higher accuracy.
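The candidate selection idea in the abstract (a subsequence that collides with no other subsequence is a potential anomaly) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes non-overlapping windows, omits the smoothing step, and uses a random-projection hash in the style of p-stable LSH. The function name `lsh_anomaly_candidates` and the parameters `n_tables`, `n_hashes`, and `bucket_width` are hypothetical choices made for illustration only.

```python
from collections import defaultdict

import numpy as np


def lsh_anomaly_candidates(series, window, n_tables=4, n_hashes=4,
                           bucket_width=1.0, seed=0):
    """Return indices of non-overlapping windows that collide with no other window.

    Illustrative sketch only; every parameter here is an assumption,
    not a value taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(series, dtype=float)
    n_windows = len(x) // window
    # Cut the series into non-overlapping subsequences (one per row).
    subs = x[: n_windows * window].reshape(n_windows, window)

    collided = np.zeros(n_windows, dtype=bool)
    for _ in range(n_tables):
        # One hash table: n_hashes Gaussian projections, each quantized
        # into buckets of width bucket_width after a random offset.
        a = rng.normal(size=(window, n_hashes))
        b = rng.uniform(0.0, bucket_width, size=n_hashes)
        keys = np.floor((subs @ a + b) / bucket_width).astype(np.int64)

        buckets = defaultdict(list)
        for idx, key in enumerate(keys):
            buckets[tuple(key)].append(idx)

        # A window sharing a bucket with any other window is treated as normal.
        for members in buckets.values():
            if len(members) > 1:
                collided[members] = True

    # Windows that never collided in any table are the anomaly candidates.
    return [int(i) for i in np.flatnonzero(~collided)]


# Toy usage: a strictly periodic signal with one large spike at index 18.
series = np.tile([0.0, 1.0, 2.0, 1.0], 10)
series[18] += 100.0
print(lsh_anomaly_candidates(series, window=4))
```

In this toy signal the repeated windows are identical, so they always share a bucket, and the only window expected to remain as a candidate is the one containing the spike. On noisy data with a subtle anomaly the separation is far less clean, which is the situation the smoothing step described above is meant to address.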
Change history
25 September 2020
A Correction to this paper has been published: https://doi.org/10.1007/s10845-020-01644-4
Funding
The funding was provided by the Key Laboratory for Fault Diagnosis and Maintenance of Spacecraft in Orbit of China (SDML_OF2015008).
About this article
Cite this article
Wang, W., Bao, J. & Li, T. Bound smoothing based time series anomaly detection using multiple similarity measures. J Intell Manuf 32, 1711–1727 (2021). https://doi.org/10.1007/s10845-020-01583-0
DOI: https://doi.org/10.1007/s10845-020-01583-0