Efficient Mining of Frequent Itemsets in Distorted Databases

  • Jinlong Wang
  • Congfu Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)

Abstract

Recently, the data perturbation approach has been applied to data mining: original data values are modified so that the values of any individual transaction are difficult to reconstruct. However, mining such distorted databases incurs enormous overhead compared with mining undistorted data sets. This paper presents GrC-FIM, an algorithm that uses granular computing (GrC) to address the efficiency problem of frequent itemset mining in distorted databases. Using the key granule concept and granule inference, the support counts of candidate non-key frequent itemsets can be inferred from the counts of their frequent sub-itemsets obtained earlier in the mining process. This eliminates the tedious support reconstruction for these itemsets. Accuracy improves on dense data sets and is unchanged on sparse ones.
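The inference step described in the abstract can be illustrated with a minimal sketch. It assumes the counting-inference rule of Bastide et al.: if any (k-1)-subset of a candidate k-itemset is a non-key itemset, the candidate is also non-key and its support equals the minimum support of its (k-1)-subsets. The abstract does not give the details of GrC-FIM, so the function names `count_support` and `infer_support` and the toy transactions are hypothetical, and the supports here are exact counts rather than estimates reconstructed from a distorted database.

```python
from itertools import combinations


def count_support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def infer_support(candidate, known_support, non_keys):
    """Infer the support of `candidate` from its (k-1)-subsets.

    Returns the minimum subset support when at least one subset is a
    non-key itemset; returns None when every subset is a key itemset,
    in which case the support must be obtained from the data.
    """
    k = len(candidate)
    subsets = [frozenset(s) for s in combinations(sorted(candidate), k - 1)]
    if any(s in non_keys for s in subsets):
        return min(known_support[s] for s in subsets)
    return None


# Toy data (assumption): four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("abc"), frozenset("c")]

# Supports of the 2-itemsets, as obtained during an earlier pass.
known_support = {
    frozenset(s): count_support(frozenset(s), transactions)
    for s in ("ab", "ac", "bc")
}

# {a, b} is non-key: its support (3) equals that of its subsets {a} and {b}.
non_keys = {frozenset("ab")}

inferred = infer_support(frozenset("abc"), known_support, non_keys)
```

In this sketch the support of {a, b, c} is inferred without touching the transactions. In a distorted database the entries of `known_support` would be reconstructed estimates, and the saving is that no further reconstruction is needed for the inferred itemset.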

Keywords

Association Rule, Minimum Support, Frequent Itemsets, Original Database, Support Count
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jinlong Wang (1)
  • Congfu Xu (1)
  1. Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China
