Probabilistic Frequent Pattern Mining by PUH-Mine

  • Conference paper
Web Technologies and Applications (APWeb 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9313)

Abstract

To mine frequent itemsets from uncertain data, many existing algorithms rely on expected-support-based mining. An alternative approach is probabilistic mining, which captures the frequentness probability of each itemset. Although possible world semantics are widely used, the number of possible worlds grows exponentially, which makes probabilistic mining computationally more challenging than expected-support-based mining. In this paper, we propose two efficient approximate algorithms based on a hyperlinked structure. They generate a collection of all potentially probabilistic frequent itemsets by using a novel upper bound, and then verify whether these candidates are truly probabilistic frequent. Experimental results show the efficiency of our algorithms in mining probabilistic frequent itemsets from uncertain data.
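
To make the distinction between the two mining criteria concrete, the following minimal sketch computes both the expected support and the frequentness probability of one itemset. It is not the PUH-Mine algorithm and does not use the paper's upper bound; it only illustrates the two definitions. It assumes that items within a transaction and the transactions themselves are independent, and all function names and the toy database are hypothetical. The frequentness probability is computed by a standard dynamic program over the support distribution, which avoids enumerating the exponentially many possible worlds.

```python
from math import prod


def itemset_probabilities(db, itemset):
    """Probability that each transaction contains the whole itemset.

    Assumes items occur independently; a missing item has probability 0.
    """
    return [prod(t.get(item, 0.0) for item in itemset) for t in db]


def expected_support(probs):
    """Expected number of transactions that contain the itemset."""
    return sum(probs)


def frequentness_probability(probs, minsup):
    """P(support >= minsup), assuming independent transactions.

    dist[k] holds the probability that exactly k of the transactions
    processed so far contain the itemset; every count >= minsup is
    lumped into dist[minsup], so the table has only minsup + 1 cells.
    """
    dist = [1.0] + [0.0] * minsup
    for p in probs:
        new = [0.0] * (minsup + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)            # itemset absent
            new[min(k + 1, minsup)] += mass * p   # itemset present
        dist = new
    return dist[minsup]


# Toy uncertain database: each transaction maps an item to its
# existential probability (illustrative values, not from the paper).
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "b": 0.6},
    {"a": 0.7},
]
probs = itemset_probabilities(db, ("a", "b"))
exp_sup = expected_support(probs)
freq_prob = frequentness_probability(probs, minsup=2)
```

For this toy database, the itemset {a, b} has an expected support of about 1.02, while the probability that its support reaches 2 is about 0.216. An itemset can therefore pass or fail the two criteria differently, depending on the chosen thresholds.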

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Tong, W., Leung, C.K., Liu, D., Yu, J. (2015). Probabilistic Frequent Pattern Mining by PUH-Mine. In: Cheng, R., Cui, B., Zhang, Z., Cai, R., Xu, J. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9313. Springer, Cham. https://doi.org/10.1007/978-3-319-25255-1_63

  • DOI: https://doi.org/10.1007/978-3-319-25255-1_63

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25254-4

  • Online ISBN: 978-3-319-25255-1

  • eBook Packages: Computer Science (R0)
