Abstract
A motif is a pair of subsequences of a longer time series that are very similar to each other. Motif discovery is applied in a wide range of subject areas involving time series, including medicine, biology, entertainment, and weather prediction. In this paper, we propose a novel parallel algorithm for motif discovery on Intel MIC (Many Integrated Core) accelerators for the case when the time series fits in main memory. Parallelization is based on thread-level parallelism and implemented with OpenMP. The algorithm employs a set of matrix data structures to store and index the subsequences of the time series and to enable efficient vectorization of computations on the Intel MIC platform. The experimental evaluation shows the high scalability of the proposed algorithm.
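To make the motif definition concrete, the following is a minimal serial sketch of a naive motif search. It is not the parallel algorithm proposed in the paper. The function names `znorm` and `find_motif`, the use of z-normalized Euclidean distance, and the exclusion of overlapping subsequence pairs are assumptions made for illustration. The list `subs`, which stores one subsequence per row, only loosely mirrors the idea of keeping subsequences in a matrix.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence (assumed similarity preprocessing)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0.0:
        # A constant subsequence has no shape; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def find_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    subsequences of length m, found by exhaustive comparison."""
    count = len(series) - m + 1
    # One normalized subsequence per row.
    subs = [znorm(series[i:i + m]) for i in range(count)]
    best = (math.inf, -1, -1)
    for i in range(count):
        # Start at i + m so that overlapping (trivial) matches are skipped.
        for j in range(i + m, count):
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best


if __name__ == "__main__":
    # Hypothetical toy series with the pattern 0, 1, 3, 1, 0 planted twice.
    series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 9, 6, 0, 1, 3, 1, 0, 8, 3]
    print(find_motif(series, 5))
```

The exhaustive search compares every admissible pair, so its cost grows quadratically with the number of subsequences. That cost is the usual motivation for parallel and vectorized approaches such as the one the abstract describes.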
Funding
This work was financially supported by the Russian Foundation for Basic Research (grant no. 17-07-00463) and by the Ministry of Science and Higher Education of the Russian Federation (government orders 2.7905.2017/8.9 and 14.578.21.0265).
Additional information
Submitted by A. M. Elizarov
About this article
Cite this article
Zymbler, M.L., Kraeva, Y.A. Discovery of Time Series Motifs on Intel Many-Core Systems. Lobachevskii J Math 40, 2124–2132 (2019). https://doi.org/10.1134/S199508021912014X
DOI: https://doi.org/10.1134/S199508021912014X