TidFP: Mining Frequent Patterns in Different Databases with Transaction ID
Since transaction identifiers (ids) are unique and would not usually be frequent, mining frequent patterns together with the transaction ids of the records they occur in provides an efficient way to mine frequent patterns in many types of databases, including multiple-table and distributed databases. Existing work has not focused on mining frequent patterns with the transaction ids they occur in. Many applications require finding strong associations between a transaction id (e.g., a certain drug) and itemsets (e.g., certain adverse effects) to help deduce pertinent missing information (such as how many people use the product in total) and other information (such as how many people have the adverse effects).
This paper proposes a set of algorithms, TidFPs, for mining frequent patterns with their transaction ids in a single transaction database, in a multiple-table database, and in a distributed database. The proposed technique scans the database records only once, even with level-wise Apriori-based mining techniques; stores frequent 1-items with their transaction id bitmaps; outperforms traditional approaches; and is extensible to other tree-based mining techniques as well as to sequential mining.
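The idea of scanning the records once and then working only on per-item transaction id bitmaps can be illustrated with a minimal sketch. This is not the paper's TidFP algorithm: the function names (`build_tid_bitmaps`, `mine_with_tids`), the `(tid, items)` input format, and the use of Python integers as bitmaps are all assumptions made for illustration. It covers only the single-database, Apriori-style case, where the support of a candidate is the number of set bits in the intersection of its parents' bitmaps.

```python
from itertools import combinations


def build_tid_bitmaps(transactions):
    """Single pass over the records: map each item to an integer bitmap.

    Bit i is set when the item occurs in the i-th transaction.
    (Hypothetical helper; the input format is an assumption.)
    """
    bitmaps = {}
    for position, (_tid, items) in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << position)
    return bitmaps


def bitmap_to_tids(bitmap, tids):
    """Translate a bitmap back into the list of transaction ids it covers."""
    return [tid for position, tid in enumerate(tids) if (bitmap >> position) & 1]


def support(bitmap):
    """Support count = number of set bits in the bitmap."""
    return bin(bitmap).count("1")


def mine_with_tids(transactions, min_support):
    """Level-wise mining that never rescans the records after the first pass."""
    tids = [tid for tid, _ in transactions]
    item_bitmaps = build_tid_bitmaps(transactions)  # the only database scan

    # Frequent 1-itemsets, each stored with its tid bitmap.
    level = {(item,): bm for item, bm in item_bitmaps.items()
             if support(bm) >= min_support}
    frequent = {}
    while level:
        frequent.update(level)
        next_level = {}
        for a, b in combinations(sorted(level), 2):
            if a[:-1] != b[:-1]:
                continue  # join only itemsets that share a (k-1)-prefix
            candidate = a + (b[-1],)
            # Apriori pruning: every k-subset of the candidate must be frequent.
            if any(sub not in level for sub in combinations(candidate, len(a))):
                continue
            bm = level[a] & level[b]  # bitmap intersection replaces a rescan
            if support(bm) >= min_support:
                next_level[candidate] = bm
        level = next_level
    return {pattern: bitmap_to_tids(bm, tids) for pattern, bm in frequent.items()}
```

Calling `mine_with_tids([(1, ["a", "b", "c"]), (2, ["a", "b"]), (3, ["a", "c"]), (4, ["b", "c"])], 2)` returns each frequent itemset as a sorted tuple mapped to the ids of the transactions that contain it, for example `("a", "b")` mapped to `[1, 2]`. Items are assumed to be mutually comparable (e.g., all strings) so that itemsets can be kept in sorted order.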
Keywords: Data mining, Transaction id, Frequent Patterns, Distributed Mining, Multiple Table Mining