Abstract
To mine frequent itemsets from uncertain data, many existing algorithms rely on expected-support-based mining. An alternative approach is probabilistic-based mining, which captures the frequentness probability of an itemset, that is, the probability that its support meets the minimum support threshold. Although possible world semantics are widely used, the number of possible worlds grows exponentially, which makes probabilistic-based mining computationally more challenging than expected-support-based mining. In this paper, we propose two efficient approximate algorithms based on a hyperlinked structure. They generate a collection of all potentially probabilistic frequent itemsets using a novel upper bound and then verify whether these itemsets are truly probabilistic frequent. Experimental results show the efficiency of our algorithms in mining probabilistic frequent itemsets from uncertain data.
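To make the two notions concrete, the following minimal sketch contrasts expected support with frequentness probability on a toy uncertain database. It is not the algorithm proposed in the paper: it assumes that items are independent within a transaction and that transactions are independent of each other, and the function names (`itemset_probs`, `expected_support`, `frequentness_probability`) and the sample data are hypothetical. The frequentness probability is computed with a standard dynamic program over the support distribution, which avoids enumerating the possible worlds.

```python
from math import prod


def itemset_probs(db, itemset):
    """Per-transaction probability that the transaction contains every item
    of `itemset` (assumes items in a transaction are independent)."""
    return [prod(t.get(item, 0.0) for item in itemset) for t in db]


def expected_support(probs):
    """Expected support: the sum of the per-transaction probabilities."""
    return sum(probs)


def frequentness_probability(probs, minsup):
    """P(support >= minsup), assuming independent transactions."""
    # dist[k] = probability that exactly k of the transactions processed
    # so far contain the itemset.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # itemset absent from this transaction
            nxt[k + 1] += mass * p       # itemset present in this transaction
        dist = nxt
    return sum(dist[minsup:])


# Hypothetical uncertain database: each transaction maps an item to its
# existential probability.
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "b": 0.6},
    {"a": 0.4},
]

probs = itemset_probs(db, ("a", "b"))
print(expected_support(probs))             # about 1.02
print(frequentness_probability(probs, 2))  # about 0.216
```

In this toy case the itemset {a, b} has an expected support slightly above 1, yet the probability that it appears in at least two transactions is only about 0.216, which illustrates why the two mining approaches can disagree on whether an itemset is frequent.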
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Tong, W., Leung, C.K., Liu, D., Yu, J. (2015). Probabilistic Frequent Pattern Mining by PUH-Mine. In: Cheng, R., Cui, B., Zhang, Z., Cai, R., Xu, J. (eds) Web Technologies and Applications. APWeb 2015. Lecture Notes in Computer Science, vol 9313. Springer, Cham. https://doi.org/10.1007/978-3-319-25255-1_63
DOI: https://doi.org/10.1007/978-3-319-25255-1_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25254-4
Online ISBN: 978-3-319-25255-1
eBook Packages: Computer Science (R0)