Abstract
With the development of data mining technologies for sensor data streams, more sophisticated methods for complex event processing are in demand. Because event recognition results may contain errors, the uncertainty of events must be handled. We therefore consider probabilistic event data streams, in which each event carries an occurrence probability, and develop a pattern matching method based on regular expressions. In this paper, we first analyze the semantics of pattern matching over non-probabilistic data streams and then formulate the problem of top-k pattern matching over probabilistic data streams. We introduce an information-theoretic criterion for selecting appropriate matches as the result of pattern matching. We then present an efficient algorithm for detecting the top-k matches and evaluate the effectiveness of our approach using real and synthetic datasets.
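To make the problem setting concrete, the following is a minimal brute-force sketch in Python of ranking regular-expression matches over a probabilistic event stream. It is not the algorithm or the information-theoretic criterion of the paper: it simply ranks candidate matches by their occurrence probability, and it assumes that events at different time steps are independent. The function name `top_k_matches`, the `max_len` bound, and the stream representation (one dictionary of symbol probabilities per time step) are all hypothetical choices made for illustration.

```python
import heapq
import re
from itertools import product


def top_k_matches(stream, pattern, k, max_len=4):
    """Rank candidate matches of `pattern` over a probabilistic stream.

    stream:  list of dicts, one per time step, mapping an event symbol
             (a single character) to its occurrence probability.
             Independence across time steps is an assumption of this sketch.
    pattern: regular expression over the event symbols.
    Returns up to k tuples (probability, start, end, symbols), best first.
    """
    regex = re.compile(pattern)
    candidates = []
    n = len(stream)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            window = stream[start:end]
            # Enumerate every possible symbol sequence in this window.
            for symbols in product(*(step.keys() for step in window)):
                text = "".join(symbols)
                if regex.fullmatch(text):
                    # Probability of the sequence under independence.
                    prob = 1.0
                    for step, sym in zip(window, symbols):
                        prob *= step[sym]
                    candidates.append((prob, start, end, text))
    return heapq.nlargest(k, candidates)


# Hypothetical four-step stream over the event symbols a, b, c.
example_stream = [
    {"a": 0.9, "b": 0.1},
    {"b": 0.8, "c": 0.2},
    {"b": 0.5, "c": 0.5},
    {"c": 0.7, "a": 0.3},
]
results = top_k_matches(example_stream, "ab+c", k=3)
```

On this example stream only two candidates satisfy `ab+c`, and the shorter match `abc` ranks ahead of `abbc` because every additional event multiplies in another probability below one. Ranking by raw probability therefore favors short matches, which is one reason a different selection criterion may be preferable.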
Acknowledgment
This research was partially supported by the Center of Innovation Program from Japan Science and Technology Agency (JST) and KAKENHI (16H01722, 26540043).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sugiura, K., Ishikawa, Y. (2017). Top-k Pattern Matching Using an Information-Theoretic Criterion over Probabilistic Data Streams. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_39
DOI: https://doi.org/10.1007/978-3-319-63579-8_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63578-1
Online ISBN: 978-3-319-63579-8
eBook Packages: Computer Science (R0)