Efficient Frequent Itemset Mining from Dense Data Streams

  • Alfredo Cuzzocrea
  • Fan Jiang
  • Wookey Lee
  • Carson K. Leung
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8709)

Abstract

Due to advances in technology, high volumes of valuable data can be produced at high velocity in many real-life applications. Hence, efficient data mining techniques for discovering implicit, previously unknown, and potentially useful frequent itemsets from data streams are in demand. Many existing stream mining algorithms capture important stream data and assume that the captured data can fit into main memory. However, problems arise when the available memory is so limited that such an assumption does not hold. In this paper, we present a data structure to capture important data from the streams onto the disk. In addition, we present two algorithms—which use this data structure—to mine frequent itemsets from these dense (or sparse) data streams.
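
The abstract does not describe the data structure or the two algorithms in detail, so the following is only a minimal sketch of the general idea: keep a sliding window of stream batches on disk rather than in main memory, then mine frequent itemsets by re-reading the window. It is not the authors' method. The names `DiskWindow`, `mine_frequent`, `max_batches`, and `minsup`, the one-file-per-batch layout, and the level-wise (Apriori-style) counting are all assumptions made for illustration.

```python
import os
import tempfile
from collections import Counter, deque
from itertools import combinations


class DiskWindow:
    """Hypothetical sliding window: each stream batch is one text file on disk."""

    def __init__(self, max_batches, directory):
        self.max_batches = max_batches
        self.directory = directory
        self.paths = deque()   # oldest batch file first
        self.next_id = 0

    def add_batch(self, transactions):
        """Write a batch to disk and evict the oldest batch if the window is full."""
        path = os.path.join(self.directory, f"batch_{self.next_id}.txt")
        self.next_id += 1
        with open(path, "w") as f:
            for t in transactions:
                # Items are assumed to be whitespace-free strings;
                # sorting gives every itemset one canonical order.
                f.write(" ".join(sorted(set(t))) + "\n")
        self.paths.append(path)
        if len(self.paths) > self.max_batches:
            os.remove(self.paths.popleft())

    def transactions(self):
        """Stream transactions back from disk, one at a time."""
        for path in self.paths:
            with open(path) as f:
                for line in f:
                    yield line.split()


def mine_frequent(window, minsup):
    """Level-wise mining: one pass over the on-disk window per itemset size."""
    frequent = {}
    allowed = None  # items that occur in some frequent itemset of the previous size
    k = 1
    while True:
        counts = Counter()
        for t in window.transactions():
            items = t if allowed is None else [i for i in t if i in allowed]
            for combo in combinations(items, k):
                counts[combo] += 1
        level = {c: n for c, n in counts.items() if n >= minsup}
        if not level:
            return frequent
        frequent.update(level)
        allowed = {i for c in level for i in c}
        k += 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        w = DiskWindow(max_batches=2, directory=d)
        w.add_batch([["a", "b", "c"], ["a", "b"]])
        w.add_batch([["a", "c"], ["b", "c"]])
        w.add_batch([["a", "b", "c"]])  # evicts the first batch
        for itemset, support in sorted(mine_frequent(w, minsup=2).items()):
            print(itemset, support)
```

Only the per-pass counters live in memory here, at the cost of one disk scan per itemset size. On dense streams the number of candidate itemsets grows quickly, which is one reason a purpose-built structure such as the one the paper proposes would be preferable to this naive layout.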

Keywords

Data Stream; Frequent Pattern; Frequent Itemsets; Uncertain Data; Streaming Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Alfredo Cuzzocrea (1)
  • Fan Jiang (2)
  • Wookey Lee (3)
  • Carson K. Leung (2)

  1. ICAR-CNR & University of Calabria, Rende, Italy
  2. University of Manitoba, Winnipeg, Canada
  3. Inha University, Incheon, South Korea
