Abstract
Ubiquitous devices and applications generate data that are naturally ordered by time, so elementary data items form sequences. The most popular way of analyzing sequences is to search them for patterns. To this end, sequential pattern discovery techniques have been proposed in research contributions and implemented in a few database systems, e.g., Oracle Database, Teradata Aster, and Apache Hive. The goal of this work is to assess the functionality of these systems and to evaluate their performance on pattern queries.
Acknowledgement
The research of G. Bakkalian has been funded by the European Commission through the “Erasmus Mundus Joint Doctorate Information Technologies for Business Intelligence Doctoral College (IT4BI-DC)”. The research of W. Andrzejewski, B. Bębel, and R. Wrembel has been funded by the Polish National Science Center, grant “Analytical processing and mining of sequential data: models, algorithms, and data structures”.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Andrzejewski, W., Bębel, B., Kłosowski, S., Łukaszewski, B., Wrembel, R., Bakkalian, G. (2016). Searching for Patterns in Sequential Data: Functionality and Performance Assessment of Commercial and Open-Source Systems. In: Link, S., Trujillo, J. (eds) Advances in Conceptual Modeling. ER 2016. Lecture Notes in Computer Science, vol 9975. Springer, Cham. https://doi.org/10.1007/978-3-319-47717-6_8
DOI: https://doi.org/10.1007/978-3-319-47717-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47716-9
Online ISBN: 978-3-319-47717-6
eBook Packages: Computer Science (R0)