Fast Association Discovery in Derivative Transaction Collections
Association discovery from a transaction collection is an important data-mining task. We study a new problem in this area: association discovery in derivative transaction collections, whose solution can provide users with valuable association rules for collections related to those already mined. Given the association rules already discovered in two transaction collections D1 and D2, we aim to find the association rules in the derivative collections D1∖D2, D1∩D2, D2∖D1 and D1∪D2. Directly applying existing algorithms solves this problem, but at high cost. We propose an efficient solution that makes full use of already discovered information, exploits the relationships among the relevant collections, and avoids unnecessary, expensive support-counting scans of the databases. Experiments on well-known synthetic data show that our solution consistently outperforms the naive solution, by a factor of 2 to 3 in most cases. We also propose an efficient parallelization of our approach, since parallel algorithms are often necessary in data mining.
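To illustrate the kind of relationship among collections that such a solution can exploit, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes each collection is a mapping from a transaction identifier to an item set, and that a transaction appearing in both D1 and D2 carries the same identifier in each. Under that assumption, the support counts of an itemset in all four derivative collections follow from three counts: the counts in D1 and D2, which are already known from the earlier mining runs, and the count in D1∩D2. The function names `count_support` and `derived_counts` are hypothetical.

```python
def count_support(transactions, itemset):
    """Count the transactions (tid -> set of items) that contain every item of itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions.values() if items <= t)


def derived_counts(c1, c2, c_both):
    """Support counts in the four derivative collections, from three known counts.

    c1, c2  -- support counts in D1 and D2 (assumed already known)
    c_both  -- support count in the intersection of D1 and D2
    """
    return {
        "D1-D2": c1 - c_both,
        "D2-D1": c2 - c_both,
        "D1&D2": c_both,
        "D1|D2": c1 + c2 - c_both,  # inclusion-exclusion
    }


# Toy data (illustrative only): transactions 3 and 4 belong to both collections.
D1 = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "b", "c"}, 4: {"b"}}
D2 = {3: {"a", "b", "c"}, 4: {"b"}, 5: {"a", "b"}, 6: {"c"}}
common = {tid: D1[tid] for tid in D1.keys() & D2.keys()}

itemset = {"a", "b"}
c1 = count_support(D1, itemset)          # would come from the earlier run on D1
c2 = count_support(D2, itemset)          # would come from the earlier run on D2
c_both = count_support(common, itemset)  # the only new scan, over the intersection
counts = derived_counts(c1, c2, c_both)
print(counts)
```

In this sketch only the intersection is scanned again; the other three counts are obtained by arithmetic, which is one way redundant support-counting passes over the full databases can be avoided.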