Skip to main content

Advertisement

Log in

MERLIN++: parameter-free discovery of time series anomalies

  • Published:
Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Abstract

The burgeoning age of IoT has reinforced the need for robust time series anomaly detection. While there are hundreds of anomaly detection methods in the literature, one definition, time series discords, has emerged as a competitive and popular choice for practitioners. Time series discords are subsequences of a time series that are maximally far away from their nearest neighbors. Perhaps the most attractive feature of discords is their simplicity. Unlike many of the parameter-laden methods proposed, discords require only a single parameter to be set by the user: the subsequence length. We believe that the utility of discords is reduced by sensitivity to even this single user choice. The obvious solution to this problem, computing discords of all lengths then selecting the best anomalies (under some measure), appears at first glance to be computationally untenable. However, in this work we discuss MERLIN, a recently introduced algorithm that can efficiently and exactly find discords of all lengths in massive time series archives. By exploiting computational redundancies, MERLIN is two orders of magnitude faster than comparable algorithms. Moreover, we show that by exploiting a little-known indexing technique called Orchard’s algorithm, we can create a new algorithm called MERLIN++, which is an order of magnitude faster than MERLIN, yet produces identical results. We demonstrate the utility of our ideas on a large and diverse set of experiments and show that MERLIN++ can discover subtle anomalies that defy existing algorithms or even careful human inspection. We further compare to five state-of-the-art rival methods, on the largest benchmark dataset for this task, and show that MERLIN++ is superior in terms of accuracy and speed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24

Similar content being viewed by others

Notes

  1. Note that some algorithms that discover discords may have other parameters, the discord representation itself requires just a single parameter.

  2. Note that this DST anomaly is misidentified in the original work that introduced this dataset as the NY-Marathon anomaly (Ahmad et al. 2017). This misidentification has since been repeated in dozens of papers. We are confident that our labeling is correct. If we correctly process the data with the standard DST algorithm count(1–2am) = ½ apparent count(1–am), then the apparent anomaly disappears.

  3. This name is a play on the fact that the first paper on time series discords was titled “Approximations to Magic” (Lin et al. 2005). Merlin was the magician of the Arthurian legend. In addition, Mitsubishi Electric Corporation’s subsidiary in the USA is called MERL (Mitsubishi Electric Research Laboratory, Boston).

  4. Orchard’s algorithm is obscure enough that when googling for “Orchard’s algorithm”, almost half the results are about algorithms for literal fruit processing.

  5. An order of magnitude simpler in terms of number of parameters to set, of the number of lines of code written etc.

  6. To preempt confusion, we note that Page is an author.

References

  • Alireza A, Sara A, Shima I, Murillo AC, Gerry AC, Hickle L, Keogh EJ (2020) Fitbit for chickens?: time series data mining can increase the productivity of poultry farms. KDD 2020:3328–3336

    Google Scholar 

  • Ahmed M, and Abdun Naser M (2014) "Network traffic pattern analysis using improved information theoretic co-clustering based collective anomaly detection." In International conference on security and privacy in communication networks, pp. 204–219. Springer, Cham

  • Ahmad S, Lavin A, Purdy S, Agha Z (2017) Unsupervised real-time anomaly detection for streaming data. Neurocomputing 262:134–147

    Article  Google Scholar 

  • Audibert J, Marti S, Guyard F, Zuluaga MA (2021) From univariate to multivariate time series anomaly detection with non-local information. In: Lemaire V, Malinowski S, Bagnall A et al (eds) Advanced analytics and learning on temporal data. Springer International Publishing

    Google Scholar 

  • Barz B, Yanira Guanche G, Erik R, and Joachim D (2017) "Maximally divergent intervals for extreme weather event detection." In OCEANS 2017-Aberdeen, pp. 1–9. IEEE

  • Beyer K, Jonathan G, Raghu R, and Uri S (1999) "When is “nearest neighbor” meaningful?" In International conference on database theory, pp. 217–235. Springer, Berlin, Heidelberg

  • Boniol P, Linardi M, Roncallo F et al (2021) Unsupervised and scalable subsequence anomaly detection in large data series. VLDB J. https://doi.org/10.1007/s00778-021-00655-8

    Article  Google Scholar 

  • Bu Y, Lei C, Ada W-CF, and Dawei L (2009) "Efficient anomaly monitoring over moving object trajectory streams." In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 159–168. 2009.

  • Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. In ACM Comput Surv (CSUR) 41(3):1–58

    Article  Google Scholar 

  • Daigavane A, Wagstaff KL, Doran G, Cochrane C, Jackman C, and Rymer A (2020). Detection of Environment Transitions in Time Series Data for Responsive Science. In MileTS ’20: 6th KDD Workshop on Mining and Learning from Time Series, August 24th, 2020, San Diego, California, USA. ACM, New York, NY, USA, 5 pages.

  • Däubener Sina, Sebastian Schmitt, Hao Wang, Thomas Bäck, Peter krause, “Large Anomaly Detection in Univariate Time Series: An Empirical Comparison of Machine Learning Algorithms,” 19th Industrial Conference on Data Mining ICDM, 2019.

  • Ding H, Trajcevski G, Scheuermann P et al (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. Proc VLDB Endow 1:1542–1552. https://doi.org/10.14778/1454159.1454226

    Article  Google Scholar 

  • Doan Minh T, Sutharshan R, Mahsa S, Masud M, and Christopher L (2015) "Profiling pedestrian distribution and anomaly detection in a dynamic environment." In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 1827–1830. 2015.

  • Filonov P, Andrey L, and Artem V (2016) "Multivariate industrial time series with cyber-attack simulation: Fault detection using an lstm-based predictive data model."http.arxiv.org./1612.06676

  • Heldt T, Oefinger MB, Hoshiyama M, Mark RG (2003) Circulatory response topassive and active changes in posture. Comput Cardiol 30:263–266

    Google Scholar 

  • Huet A, Navarro JM, Rossi D (2022) Local Evaluation of Time Series Anomaly Detection Algorithms. arXiv preprint arXiv:2206.13167.

  • Hundman K, Valentino C, Christopher L, Ian C, and Tom S (2018) "Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding." In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 387–395. 2018.

  • Hwang W-S, Yun J-H, Kim J, Min BG (2022) Do you know existing accuracy metrics overrate time-series anomaly detections? In: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing. Association for Computing Machinery, New York, NY, USA, pp 403–412

  • Imamura M, Takaaki N, Eamonn K (2020). Matrix Profile XXI: A Geometric Approach to Time Series Chains Improves Robustness. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pages 1114–1122

  • Keogh E, Chakrabarti K, Pazzani M, Mehrotra S (2001) Dimensionality reduction for fast similarity search in large time series databases. Knowl Inf Syst 3:263–286. https://doi.org/10.1007/PL00011669

    Article  MATH  Google Scholar 

  • Keogh E, Jessica L, and Ada F (2005) "Hot sax: Efficiently finding the most unusual time series subsequence." In Fifth IEEE International Conference on Data Mining (ICDM'05), pp. 8-pp. Ieee

  • Kim S, Choi K, Choi H-S et al (2022) Towards a rigorous evaluation of time-series anomaly detection. Proc AAAI Conf Artif Intell 36:7194–7201. https://doi.org/10.1609/aaai.v36i7.20680

    Article  Google Scholar 

  • Laptev N and Saeed A, “S5 - A Labeled Anomaly Detection Dataset, version 1.0(16M).” 2015. Distributed by Yahoo Research. https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70.

  • Lin J, Eamonn K, Ada F, and Helga Van H (2005) "Approximations to magic: Finding unusual medical time series." In 18th IEEE Symposium on Computer-Based Medical Systems (CBMS'05), pp. 329–334. IEEE

  • Linardi M, Zhu Y, Palpanas T, Keogh E (2020) Matrix profile goes MAD: variable-length motif and discord discovery in data series. Data Min Knowl Discov 34(4):1022–1071

    Article  MathSciNet  Google Scholar 

  • Marcus M, and Minc H (1992). A survey of matrix theory and matrix inequalities (Vol. 14). Courier Corporation.

  • McRae M, Melbourne D- Bourke St Mall Flash Mob - 29th June 2013. (Jul. 20, 2013). Accessed: May 22, 2020. [Online Video]. Available: www.youtube.com/watch?v=gLzDFjiRQE8.

  • Mueen A (2015). The Fastest Similarity Search Algorithm for Time Series Subsequences under Euclidean Distance, URL: http://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html

  • Murray D, Stankovic L, Stankovic V, and Liao J (2015) University of Strathclyde, PURE http://dx.doi.org/https://doi.org/10.15129/31da3ece-f902-4e95-a093-e0a9536983c4

  • Nakamura T (2021) MERLIN Webpage: https://sites.google.com/view/merlin-find-anomalies.

  • Nakamura T, Imamura M, Mercer R, Keogh E (2020) “MERLIN: Parameter-Free Discovery of Arbitrary Length Anomalies in Massive Time Series Archives”. ICDM: 1190–1195

  • Nichiforov C, Stancu I, Stamatescu I, Stamatescu G (2020) Information Extraction Approach for Energy Time Series Modelling, 24th International Conference on System Theory, Control and Computing, ICSTCC 2020, October 8–10, 2020, Sinaia, Romania.

  • Orchard MT (1991) A fast nearest-neighbor search algorithm. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2297–2300, IEEE Computer Society Press.

  • Page, E. S. "On problems in which a change in a parameter occurs at an unknown point." Biometrika 44, no. 1/2 (1957): 248–252.

  • Park D, Hoshi Y, Kemp CC (2018) A multimodal anomaly detector for robot-assisted feeding using an LSTM-Based variational autoencoder. IEEE Robot Autom Lett 3:1544–1551. https://doi.org/10.1109/LRA.2018.2801475

    Article  Google Scholar 

  • Pedestrian Counting System (2013) City of Melbourne—Pedestrian counting system. In: Pedestrian Counting System. http://www.pedestrian.melbourne.vic.gov.au/#date=28-10-2021&time=8. Accessed May 22, 2020.

  • Thanawin R, Bilson JLC, Abdullah M, Gustavo EAPAB, Brandon Westover M, Qiang Z, Jesin Z, Eamonn JK (2013) Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping. ACM Trans Knowl Discov Data 7(3):1–31

    Article  Google Scholar 

  • Thirey B and Hickman R (2015) Distribution of euclidean distances between randomly distributed gaussian points in n-space. arXiv:1508.02238.

  • Vasheghani-Farahani I, Alex C, Russell EK, Michael GK, and Brad K (2019) "Time Series Anomaly Detection from a Markov Chain Perspective." In 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1000–1007. IEEE

  • Wang JT-L, Wang X, Lin K-I, et al (1999) Evaluating a class of distance-mapping algorithms for data mining and clustering. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’99. ACM Press, San Diego, California, United States, pp 307–311

  • Wei, Li, Keogh, Eamonn, Xi, Xiaopeng. SAXually Explicit Images: Finding Unusual Shapes. ICDM 2006: 711–720

  • Wikipedia (2022) "January 2017 Melbourne car attack.", last modified July 3, 2022, 1:21, https://en.wikipedia.org/wiki/January_2017_Melbourne_car_attack.

  • Wu, R and Keogh Eamonn (2021) Current Time Series Anomaly Detection Benchmarks are Flawed and are Creating the Illusion of Progress. CoRR abs/2009.13807 (2020)

  • Yankov D, Keogh E, Rebbapragada U (2008) Disk aware discord discovery: finding unusual time series in terabyte sized datasets. Knowl Inf Syst 17(2S):241–262

    Article  Google Scholar 

  • Yeh C-CM, Yan Zhu LU, Nurjahan B, Yifei D, Hoang AD, Diego FS, Abdullah M, and Eamonn K (2016) "Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets." In 2016 IEEE 16th international conference on data mining (ICDM), pp. 1317–1322. Ieee.

  • Zhang V (2019), “A tour of AI technologies in time series prediction.” Society of Actuaries.

Download references

Funding

Funding was provided by National Science Foundation (Grant no. 1631776)

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ryan Mercer.

Additional information

Responsible Editor: Panagiotis Papapetrou.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Some benchmark datasets are trivial

In the main text we noted that some fraction of the benchmark data yield to simple algorithms from the 1950s (Page 1957). Here we demonstrate that claim. This is important because it confounds any comparison of algorithms. For example, suppose we find that Olympic powerlifter Long Qingquan can lift 1, 2, 3 and 300 kg, and that the current author can lift 1, 2 and 3 kg. It would be foolish to conclude that because they agree on ¾ of the lifting tasks, that they are almost equally strong.

A further simplified version of the sixty-three-year-old algorithm in (Page 1957) is:

flag  = zeros(size(T));                       %% Code can be run in Matlab

for i    =   4  :   length(T)-4

     if    std(T(i + 1:i + 4)) − td(T(i-3:i))  >  1,      flag(i)    =   1;,     end;

end;

In Fig. 25 we show the results of running this code on two benchmark datasets that yield to such simple algorithms.

Fig. 25
figure 25

Two (of many) examples of benchmark datasets that yield to the trivial hard-coded algorithm shown above. Top From NASA (Hundman et al. 2018). Bottom From Yahoo (Laptev and Amizadeh, 2015)

Appendix B : choosing a value for L

A reasonable value for L is about one period, say one heartbeat, one day, one machine cycle etc. The following snippet of Matlab code will return one period.

[autocor,lags] = xcorr(T,'coeff');

[~ ,m] = findpeaks(autocor(length(T) + 10:length(T) + 1000),…

lags(length(T) + 10:length(T) + 1000),'SortStr','descend','NPeaks',1);

m(isempty(m)) = 1000;

m = floor(m);

While this is reasonable heuristic, it is not perfect. For example, an anomaly may show up only at a fraction of a period length, but then get swamped if you consider a full period. This is why the multiscale property of MERLIN++ is so useful, it frees the user from needing to be careful in their choice of a parameter.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Nakamura, T., Mercer, R., Imamura, M. et al. MERLIN++: parameter-free discovery of time series anomalies. Data Min Knowl Disc 37, 670–709 (2023). https://doi.org/10.1007/s10618-022-00876-7

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10618-022-00876-7

Keywords

Navigation