Sequential Data Analytics by Means of Seq-SQL Language

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9261)

Abstract

Ubiquitous devices and applications generate data whose natural feature is order. Most commercial software and research prototypes for data analytics support the analysis of set-oriented data while neglecting its order. However, by analyzing both the data and their order dependencies, one can discover new business knowledge. Few solutions in this field have been proposed so far, and all of them lack a comprehensive approach to organizing and processing such data in a data warehouse-like manner. In this paper, we contribute an SQL-like query language for analyzing sequential data in an OLAP-like manner, its prototype implementation, and a performance evaluation.
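
To make the contrast between set-oriented and order-aware analysis concrete, the following is a minimal Python sketch. It is not Seq-SQL and does not reproduce the paper's prototype; the event log, the function names, and the "immediately followed by" pattern are all hypothetical, chosen only to illustrate why order can carry information that a set-oriented summary loses.

```python
from collections import Counter

# Hypothetical event log: (customer, timestamp, action). Not data from the paper.
events = [
    ("c1", 1, "view"), ("c1", 2, "cart"), ("c1", 3, "buy"),
    ("c2", 1, "cart"), ("c2", 2, "view"), ("c2", 3, "buy"),
]

def set_view(events):
    """Set-oriented summary: count actions, ignoring their order."""
    return Counter(action for _, _, action in events)

def sequences(events):
    """Group events per customer and order each group by timestamp."""
    seqs = {}
    for customer, _, action in sorted(events, key=lambda e: (e[0], e[1])):
        seqs.setdefault(customer, []).append(action)
    return seqs

def count_pattern(seqs, first, second):
    """Count sequences in which `first` is immediately followed by `second`."""
    return sum(
        any(a == first and b == second for a, b in zip(s, s[1:]))
        for s in seqs.values()
    )

if __name__ == "__main__":
    print(set_view(events))
    print(count_pattern(sequences(events), "cart", "buy"))
```

Both customers perform the same set of actions, so the set-oriented counts cannot tell them apart, whereas the ordered view shows that only one of them goes directly from "cart" to "buy".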

Author information

Corresponding author

Correspondence to Robert Wrembel.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bebel, B., Cichowicz, T., Morzy, T., Rytwiński, F., Wrembel, R., Koncilia, C. (2015). Sequential Data Analytics by Means of Seq-SQL Language. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015. Lecture Notes in Computer Science, vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_28

  • DOI: https://doi.org/10.1007/978-3-319-22849-5_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22848-8

  • Online ISBN: 978-3-319-22849-5

  • eBook Packages: Computer Science, Computer Science (R0)
