Abstract
Direct marketing is a modern business activity that aims to maximize the profit generated from marketing to a selected group of customers. The key is to select a subset of customers that maximizes the profit return while minimizing the cost. Achieving this goal is difficult because the data are extremely imbalanced and because the probability that a customer responds is inversely correlated with the dollar amount generated by a response. We present a solution to this problem based on a creative use of association rules. Association rule mining searches for all rules above an interestingness threshold, as opposed to only some rules, as in a heuristic-based search. Promising association rules are then selected based on the observed value of the customers they summarize, and the selected rules are used to build a model for predicting the value of a future customer. On the challenging KDD-CUP-98 dataset, this approach generates 41% more profit than the KDD-CUP winner and 35% more profit than the best result published thereafter, with 57.7% recall on responders and 78.0% recall on non-responders. The average profit per mail is 3.3 times that of the KDD-CUP winner.
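The three steps named in the abstract (exhaustive rule mining above a threshold, selection by observed customer value, and prediction for a future customer) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the toy `HISTORY` data, the function names, the support and profit thresholds, and the "highest-value matching rule" prediction policy are all assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical past campaign: (customer attributes, observed profit).
# A negative profit stands for a mailing that drew no response.
HISTORY = [
    ({"age=senior", "region=west"}, 14.0),
    ({"age=senior", "region=east"}, -1.0),
    ({"age=senior", "region=west"}, -1.0),
    ({"age=young", "region=west"}, -1.0),
    ({"age=young", "region=east"}, -1.0),
    ({"age=young", "region=east"}, 4.0),
]


def mine_rules(history, min_support=2, max_len=2):
    """Enumerate every itemset with enough support (no heuristic pruning)
    and record the mean observed profit of the customers it covers."""
    items = sorted(set().union(*(attrs for attrs, _ in history)))
    rules = {}
    for size in range(1, max_len + 1):
        for body in combinations(items, size):
            profits = [p for attrs, p in history if set(body) <= attrs]
            if len(profits) >= min_support:
                rules[frozenset(body)] = sum(profits) / len(profits)
    return rules


def select_rules(rules, min_avg_profit):
    """Keep only rules whose covered customers were profitable on average."""
    return {body: v for body, v in rules.items() if v > min_avg_profit}


def predict_value(customer, rules, default=0.0):
    """Assumed policy: use the matching rule with the highest average profit;
    a customer matched by no selected rule gets the default value."""
    matching = [v for body, v in rules.items() if body <= customer]
    return max(matching) if matching else default


all_rules = mine_rules(HISTORY)
selected = select_rules(all_rules, min_avg_profit=1.0)
print(predict_value({"age=senior", "region=west"}, selected))
print(predict_value({"age=other"}, selected))
```

Under this toy policy a customer would be mailed only when the predicted value is positive. A real model would also have to address the class imbalance and the inverse correlation between response probability and donation amount that the abstract highlights.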
Cite this article
Wang, K., Zhou, S., Yang, Q. et al. Mining Customer Value: From Association Rules to Direct Marketing. Data Min Knowl Disc 11, 57–79 (2005). https://doi.org/10.1007/s10618-005-1355-x