A Practice Probability Frequent Pattern Mining Method over Transactional Uncertain Data Streams

  • Guoqiong Liao
  • Linqing Wu
  • Changxuan Wan
  • Naixue Xiong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6905)

Abstract

In recent years, large amounts of uncertain data have emerged with the widespread adoption of new technologies such as wireless sensor networks, RFID, and privacy protection. Because uncertain data streams are incomplete, noisy, non-uniform, and mutable, this paper presents a probability frequent pattern tree, called PFP-tree, and a method, called PFP-growth, to mine probability frequent patterns based on probability damped windows. The main characteristics of the proposed method are: (1) adopting a time-based probability damped window model to improve the accuracy of the mined frequent patterns; (2) maintaining an item index table and a transaction index table to speed up retrieval on the PFP-tree; and (3) pruning the tree to remove items that cannot become frequent patterns. The experimental results demonstrate that the PFP-growth method outperforms the main existing schemes in terms of accuracy, processing time, and storage space.
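
The idea of a time-based damped window can be illustrated with a minimal sketch. It is not the paper's PFP-growth algorithm and builds no PFP-tree; it only assumes the common definition of expected support for uncertain transactions (the product of the item existence probabilities, summed over transactions) and an exponential decay weight for older transactions. The function name `damped_expected_support`, the `decay` parameter, and the sample stream are hypothetical.

```python
from math import prod


def damped_expected_support(transactions, pattern, now, decay=0.9):
    """Expected support of `pattern`, with older transactions down-weighted.

    transactions: list of (timestamp, {item: existence probability}).
    The weight decay ** (now - timestamp) is an assumed damping scheme,
    not necessarily the one used by the paper.
    """
    total = 0.0
    for timestamp, items in transactions:
        # Only transactions containing every item of the pattern contribute.
        if all(item in items for item in pattern):
            weight = decay ** (now - timestamp)
            total += weight * prod(items[item] for item in pattern)
    return total


# Hypothetical uncertain stream: each item carries an existence probability.
stream = [
    (1, {"a": 0.9, "b": 0.5}),
    (2, {"a": 0.8, "c": 0.4}),
    (3, {"a": 1.0, "b": 0.5}),
]

support_ab = damped_expected_support(stream, ("a", "b"), now=3, decay=0.5)
support_a = damped_expected_support(stream, ("a",), now=3, decay=0.5)
```

A pattern would then be reported as frequent when its damped expected support reaches a chosen minimum threshold; patterns that cannot reach it are the candidates for pruning.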

Keywords

Data Stream, Frequent Pattern, Uncertain Data, Average Processing Time, Mining Frequent Itemsets
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Guoqiong Liao (1, 2)
  • Linqing Wu (1, 2)
  • Changxuan Wan (1, 2)
  • Naixue Xiong (3)
  1. School of Information Technology, Jiangxi University of Finance and Economics, Nanchang, China
  2. Jiangxi Key Laboratory of Data and Knowledge Engineering, Nanchang, China
  3. Department of Computer Science, Georgia State University, USA
