Abstract
Many time series algorithms reduce computation cost by pruning unpromising candidates with lower-bound distance functions. In this paper, we focus on an orthogonal research direction that further boosts performance by unlocking the potential of modern commodity CPUs. First, we profile existing algorithms to understand where the time goes. Second, we design vectorized implementations of lower-bound and distance functions that exploit CPU characteristics (e.g., data parallelism, caching, branch prediction). Third, our vectorized methods are general and applicable to many time series problems such as subsequence search, motif discovery, and kNN classification. Our experimental study on real datasets shows that our proposal achieves a speedup of up to 6 times.
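To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes all series have equal length and are stored as float arrays. The names squared_distance, endpoint_lower_bound, and nearest_neighbor are hypothetical, and the endpoint bound is a deliberately simple stand-in for the lower-bound functions the paper studies. The distance loop keeps four independent partial sums and contains no data-dependent branch, a shape that compilers can typically map onto SIMD lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean distance using four independent accumulators.
// The main loop has no data-dependent branch, so the partial sums
// can be kept in separate SIMD lanes by an auto-vectorizing compiler.
float squared_distance(const float* a, const float* b, std::size_t n) {
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (std::size_t lane = 0; lane < 4; ++lane) {
            const float d = a[i + lane] - b[i + lane];
            acc[lane] += d * d;
        }
    }
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; ++i) {  // scalar tail for the last n % 4 values
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical cheap lower bound (an assumption for illustration):
// the first and last terms of the sum. Every term is non-negative,
// so this never exceeds the full squared distance.
float endpoint_lower_bound(const float* a, const float* b, std::size_t n) {
    if (n == 0) return 0.0f;
    const float d0 = a[0] - b[0];
    if (n == 1) return d0 * d0;
    const float d1 = a[n - 1] - b[n - 1];
    return d0 * d0 + d1 * d1;
}

struct SearchResult {
    std::size_t index;    // position of the best candidate found
    float squared_dist;   // its squared distance to the query
    std::size_t pruned;   // candidates skipped by the lower bound
};

// 1-NN search: compute the full distance only when the lower bound
// cannot rule the candidate out.
SearchResult nearest_neighbor(const std::vector<std::vector<float>>& candidates,
                              const std::vector<float>& query) {
    SearchResult best{0, std::numeric_limits<float>::infinity(), 0};
    for (std::size_t c = 0; c < candidates.size(); ++c) {
        assert(candidates[c].size() == query.size());
        const float* cand = candidates[c].data();
        if (endpoint_lower_bound(cand, query.data(), query.size()) >= best.squared_dist) {
            ++best.pruned;  // cannot beat the best-so-far, skip it
            continue;
        }
        const float d = squared_distance(cand, query.data(), query.size());
        if (d < best.squared_dist) {
            best.index = c;
            best.squared_dist = d;
        }
    }
    return best;
}
```

The search compares squared distances throughout, so no square root is ever taken. Whether the inner loop is actually vectorized depends on the compiler and its flags; explicit intrinsics would give finer control at the cost of portability.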
Notes
- 3. The squared distance preserves the relative ordering of distances, and it avoids expensive square root calculations.
- 9. For consistency, we use the ‘float’ data type to represent time series values in all evaluated methods.
Acknowledgement
This project was supported by grant GRF 152043/15E from the Hong Kong RGC, grant MYRG2014-00106-FST from the UMAC Research Committee, and grant NSFC 61502548 from the National Natural Science Foundation of China.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tang, B., Yiu, M.L., Li, Y., U, L.H. (2017). Exploit Every Cycle: Vectorized Time Series Algorithms on Modern Commodity CPUs. In: Blanas, S., Bordawekar, R., Lahiri, T., Levandoski, J., Pavlo, A. (eds) Data Management on New Hardware. ADMS 2016, IMDM 2016. Lecture Notes in Computer Science, vol 10195. Springer, Cham. https://doi.org/10.1007/978-3-319-56111-0_2
DOI: https://doi.org/10.1007/978-3-319-56111-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56110-3
Online ISBN: 978-3-319-56111-0
eBook Packages: Computer Science, Computer Science (R0)