Rare Pattern Mining from Data Streams Using SRP-Tree and Its Variants

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 9260)

Abstract

There has been some research in the area of rare pattern mining, which aims to capture patterns involving events that are unusual in a dataset. In some domains, such as the detection of computer attacks or fraudulent credit transactions, these patterns are considered more useful than frequent patterns. Until now, most research in this area has concentrated on finding rare rules in a static dataset. However, a growing number of applications, such as network logs and banking transactions, generate data streams, and techniques designed for static datasets are not practical for them. We propose a novel approach called Streaming Rare Pattern Tree (SRP-Tree), together with its variants, which finds rare rules in a data stream environment using a sliding window, and we show that it finds the complete set of itemsets with fast execution time.
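To make the problem setting concrete, the following is a minimal brute-force sketch of rare itemset mining over a sliding window. It is not the SRP-Tree algorithm, whose details are not given in the abstract. The function names, the two thresholds `min_rare` and `min_freq` (an itemset counts as rare here when its support count in the window is at least `min_rare` but below `min_freq`), and the `max_size` cap are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def rare_itemsets(window, min_rare, min_freq, max_size=2):
    """Return itemsets whose support count in `window` lies in [min_rare, min_freq).

    Hypothetical helper: enumerates every itemset up to `max_size` items,
    which is only feasible for small transactions.
    """
    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))  # canonical order, no duplicate items
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {itemset: c for itemset, c in counts.items() if min_rare <= c < min_freq}


def mine_stream(stream, window_size, min_rare, min_freq, max_size=2):
    """Yield the rare itemsets of the current window each time it is full."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out automatically
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            yield rare_itemsets(window, min_rare, min_freq, max_size)


if __name__ == "__main__":
    stream = [["a", "b"], ["a", "c"], ["a", "b"], ["a", "d"]]
    for step, result in enumerate(mine_stream(stream, 3, min_rare=1, min_freq=2)):
        print(step, result)
```

Recounting the whole window at every step is what makes this approach impractical on real streams; a tree-based structure that is updated incrementally as transactions enter and leave the window avoids that repeated work.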

Notes

  1. http://fimi.ua.ac.be/data/

Author information

Corresponding author

Correspondence to David Tse Jung Huang.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Huang, D.T.J., Koh, Y.S., Dobbie, G. (2015). Rare Pattern Mining from Data Streams Using SRP-Tree and Its Variants. In: Hameurlain, A., Küng, J., Wagner, R., Cuzzocrea, A., Dayal, U. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXI. Lecture Notes in Computer Science, vol 9260. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47804-2_7

  • DOI: https://doi.org/10.1007/978-3-662-47804-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47803-5

  • Online ISBN: 978-3-662-47804-2

  • eBook Packages: Computer Science, Computer Science (R0)
