Efficient Frequent Itemset Mining from Dense Data Streams

  • Conference paper
Web Technologies and Applications (APWeb 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8709)

Abstract

Due to advances in technology, high volumes of valuable data can be produced at high velocity in many real-life applications. Hence, efficient data mining techniques for discovering implicit, previously unknown, and potentially useful frequent itemsets from data streams are in demand. Many existing stream mining algorithms capture important stream data and assume that the captured data can fit into main memory. However, problems arise when the available memory is so limited that such an assumption does not hold. In this paper, we present a data structure to capture important data from the streams onto the disk. In addition, we present two algorithms—which use this data structure—to mine frequent itemsets from these dense (or sparse) data streams.
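To make the term "frequent itemset" concrete, the following is a minimal sketch of the underlying definition only: an itemset is frequent when it occurs in at least a minimum number of transactions. The function name `frequent_itemsets`, the parameter `minsup`, and the sample `window` are all assumptions introduced for illustration. The brute-force, in-memory enumeration below is not the disk-based data structure or either of the two algorithms presented in the paper, whose details are not given in this abstract.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for itemsets in at least `minsup` transactions.

    Illustrative brute force: exponential in transaction length and held
    entirely in memory, so it is NOT the paper's on-disk approach.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= minsup}


# Hypothetical window of four transactions captured from a stream.
window = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "b"],
    ["b", "c"],
]
result = frequent_itemsets(window, minsup=2)
```

With this sample window and `minsup=2`, every single item and every pair is frequent, while the triple `("a", "b", "c")` appears in only one transaction and is excluded. In dense streams many items co-occur, so the number of such candidate itemsets grows quickly, which is the pressure on main memory that the abstract refers to.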

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cuzzocrea, A., Jiang, F., Lee, W., Leung, C.K. (2014). Efficient Frequent Itemset Mining from Dense Data Streams. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds) Web Technologies and Applications. APWeb 2014. Lecture Notes in Computer Science, vol 8709. Springer, Cham. https://doi.org/10.1007/978-3-319-11116-2_56

  • DOI: https://doi.org/10.1007/978-3-319-11116-2_56

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11115-5

  • Online ISBN: 978-3-319-11116-2

  • eBook Packages: Computer Science, Computer Science (R0)
