Abstract.
In this paper, we extend the traditional association rule problem by allowing a weight to be associated with each item in a transaction to reflect the interest/intensity of that item within the transaction. In turn, this gives us an opportunity to associate a weight parameter with each item in a resulting association rule; we call such rules weighted association rules (WARs). One example of such a rule might be "80% of people buying more than three bottles of soda will also be likely to buy more than four packages of snack food", while a conventional association rule might just be "60% of people buying soda will also be likely to buy snack food". Thus WARs can not only improve the confidence of the rules, but also provide a mechanism for more effective target marketing by identifying or segmenting customers based on their potential degree of loyalty or volume of purchases. Our approach mines WARs by first ignoring the weights and finding the frequent itemsets (via a traditional frequent itemset discovery algorithm), and then introducing the weights during rule generation. Specifically, rule generation is achieved by partitioning the weight domain space of each frequent itemset into fine grids, and then identifying the popular regions within the domain space to derive WARs. This approach supports not only batch mode mining, i.e., finding WARs for the dataset, but also an interactive mode, i.e., finding and refining WARs for a given frequent itemset or set of frequent itemsets.
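The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names (`frequent_itemsets`, `dense_cells`), the parameters (`min_support`, `cell_width`, `min_density`), the brute-force itemset counting, and the representation of a transaction as a dictionary from item to weight are all assumptions made for illustration. The sketch also stops at reporting individual dense grid cells; it does not merge adjacent cells into regions or compute rule confidence.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Stage 1 (weights ignored): brute-force count of itemsets up to max_size.

    Each transaction is a dict {item: weight}; only the keys are used here.
    Returns {itemset_tuple: support} for itemsets meeting min_support.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(t)  # sorted so each itemset has one canonical tuple
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def dense_cells(transactions, itemset, cell_width, min_density):
    """Stage 2 (weights introduced): grid the weight space of one itemset.

    A transaction containing every item of `itemset` falls into the grid cell
    given by floor(weight / cell_width) along each item's axis. Cells whose
    share of all transactions reaches min_density are returned.
    """
    n = len(transactions)
    cells = Counter()
    for t in transactions:
        if all(item in t for item in itemset):
            cell = tuple(int(t[item] // cell_width) for item in itemset)
            cells[cell] += 1
    return {c: k / n for c, k in cells.items() if k / n >= min_density}


if __name__ == "__main__":
    # Hypothetical purchase quantities, used only to exercise the sketch.
    data = [
        {"soda": 4, "snack": 5},
        {"soda": 5, "snack": 5},
        {"soda": 4, "snack": 4},
        {"soda": 1, "snack": 1},
        {"soda": 1},
        {"milk": 2},
    ]
    print(frequent_itemsets(data, min_support=0.5))
    print(dense_cells(data, ("soda", "snack"), cell_width=2, min_density=0.3))
```

With a cell width of 2, a cell index of 2 along an axis covers weights from 4 up to (but not including) 6, so a dense cell such as `(2, 2)` for `("soda", "snack")` would correspond to a candidate rule relating purchases of 4 to 5 bottles of soda to purchases of 4 to 5 packages of snack food. Because the second stage takes a single itemset as input, the same function could serve either a batch loop over all frequent itemsets or an interactive query on one of them.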
Cite this article
Wang, W., Yang, J. & Yu, P. WAR: Weighted Association Rules for Item Intensities. Knowledge and Information Systems 6, 203–229 (2004). https://doi.org/10.1007/s10115-003-0108-7
DOI: https://doi.org/10.1007/s10115-003-0108-7