Abstract
In the literature, hash-based association rule mining algorithms are reported to be more efficient than Apriori-based algorithms because they use hash functions to generate candidate itemsets efficiently. However, when the dataset is updated, the whole hash table must be reconstructed. In this paper, we propose an incremental mining algorithm based on minimal perfect hashing. In our algorithm, each candidate itemset is hashed into a hash table, and its support can be checked against the minimum support threshold directly through the hash function in the later mining process. Even when new items are added, the structure of the proposed hash table does not need to be reconstructed. Experimental results show that, when the database is dynamically updated, the proposed algorithm is more efficient than other hash-based association rule mining algorithms and than other Apriori-based incremental mining algorithms for association rules.
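The abstract does not state the hash function itself, so the following is only an illustrative sketch of the general idea, not the authors' scheme. It assumes items are numbered 0, 1, 2, ... in order of first appearance and uses the combinatorial number system (colexicographic rank) as a stand-in minimal perfect hash for 2-itemsets. The names `colex_rank` and `PairCounter` are hypothetical. Under this assumption, the slot of an itemset depends only on its own item ids, so adding a new item extends the table without moving existing entries.

```python
from itertools import combinations
from math import comb


def colex_rank(itemset):
    """Map a k-itemset of 0-based item ids to a unique slot.

    For n items the slots of all k-itemsets fill 0 .. C(n, k) - 1 exactly,
    so the mapping is collision-free and leaves no empty slots.
    """
    return sum(comb(item, pos) for pos, item in enumerate(sorted(itemset), start=1))


class PairCounter:
    """Support counts of 2-itemsets, stored in a flat table (assumed design)."""

    def __init__(self):
        self.counts = []          # counts[slot] = support count of that 2-itemset
        self.n_transactions = 0

    def add_transactions(self, transactions):
        """Incrementally count the pairs in a batch of new transactions."""
        for transaction in transactions:
            self.n_transactions += 1
            for pair in combinations(sorted(set(transaction)), 2):
                slot = colex_rank(pair)
                if slot >= len(self.counts):
                    # A new item only appends slots; old slots keep their address.
                    self.counts.extend([0] * (slot + 1 - len(self.counts)))
                self.counts[slot] += 1

    def support(self, pair):
        """Look up the support count of a pair directly by its hash address."""
        slot = colex_rank(pair)
        return self.counts[slot] if slot < len(self.counts) else 0

    def is_frequent(self, pair, min_support):
        """Check a pair against a minimum support given as a fraction."""
        return self.support(pair) >= min_support * self.n_transactions


counter = PairCounter()
counter.add_transactions([[0, 1, 2], [0, 2], [1, 2]])
before = list(counter.counts)
counter.add_transactions([[2, 3], [0, 3]])   # item 3 is new
```

In this sketch the second batch introduces item 3, and the table simply grows from 3 to 6 slots while the counts recorded for the earlier pairs stay at the same addresses. Extending the same ranking to larger itemsets, and adding the pruning step named in the paper's title, is left out here.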
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chiou, CK., Tseng, J.C.R. (2012). An Incremental Mining Algorithm for Association Rules Based on Minimal Perfect Hashing and Pruning. In: Wang, H., et al. Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29426-6_14
DOI: https://doi.org/10.1007/978-3-642-29426-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29425-9
Online ISBN: 978-3-642-29426-6
eBook Packages: Computer Science (R0)