Data Mining and Knowledge Discovery, Volume 5, Issue 1–2, pp 33–58

Expert-Driven Validation of Rule-Based User Models in Personalization Applications

  • Gediminas Adomavicius
  • Alexander Tuzhilin
Article

Abstract

In many e-commerce applications, ranging from dynamic Web content presentation, to personalized ad targeting, to individual recommendations to customers, it is important to build personalized profiles of individual users from their transactional histories. These profiles constitute models of individual user behavior and can be specified with sets of rules learned from user transactional histories using various data mining techniques. Since many discovered rules can be spurious, irrelevant, or trivial, one of the main problems is how to perform post-analysis of the discovered rules, i.e., how to validate user profiles by separating “good” rules from the “bad.” This validation process should be done with the explicit participation of a human expert. However, complications may arise because very large numbers of rules can be discovered in applications that deal with many users, and the expert cannot perform the validation on a rule-by-rule basis in a reasonable period of time. This paper presents a framework for building behavioral profiles of individual users. It also introduces a new approach to expert-driven validation of a very large number of rules pertaining to these users. In particular, it presents several types of validation operators, including rule grouping, filtering, browsing, and redundant rule elimination operators, that allow a human expert to validate many individual rules at a time. By iteratively applying such operators, the human expert can validate a significant part of all the initially discovered rules in an acceptable time period. These validation operators were implemented as a part of a one-to-one profiling system. The paper also presents a case study of using this system for validating individual user rules discovered in a marketing application.

Keywords: personalization, profiling, rule discovery, post-analysis, validation
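
The idea of validating many rules at once, as described in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' system: the `Rule` fields, the function names, the sample data, and in particular the redundancy criterion (a rule is dropped when the same user has a more general rule with the same consequent and at least the same confidence) are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule representation; field names are illustrative only."""
    user_id: str
    antecedent: frozenset  # conditions such as "day=Sat"
    consequent: str        # a single condition such as "product=milk"
    support: float
    confidence: float


def group_by_structure(rules):
    """Grouping operator: bucket rules by the attribute names they use,
    ignoring attribute values and users, so a group can be judged as a whole."""
    groups = defaultdict(list)
    for r in rules:
        attrs = frozenset(c.split("=")[0] for c in r.antecedent)
        groups[(attrs, r.consequent.split("=")[0])].append(r)
    return dict(groups)


def apply_filter(pending, predicate, decision, accepted, rejected):
    """Filtering operator: move every pending rule that matches `predicate`
    into `accepted` or `rejected`; return the rules that remain unvalidated."""
    remaining = []
    for r in pending:
        if predicate(r):
            (accepted if decision == "accept" else rejected).append(r)
        else:
            remaining.append(r)
    return remaining


def drop_redundant(rules):
    """Redundancy operator (assumed criterion): drop a rule when the same user
    has a rule with the same consequent, a strictly smaller antecedent, and
    confidence at least as high."""
    return [
        r for r in rules
        if not any(
            o.user_id == r.user_id
            and o.consequent == r.consequent
            and o.antecedent < r.antecedent   # strict subset
            and o.confidence >= r.confidence
            for o in rules
        )
    ]


# Made-up sample rules for two users.
rules = [
    Rule("u1", frozenset({"day=Sat"}), "product=milk", 0.20, 0.90),
    Rule("u1", frozenset({"day=Sat", "store=A"}), "product=milk", 0.10, 0.85),
    Rule("u2", frozenset({"day=Mon"}), "product=beer", 0.01, 0.95),
    Rule("u2", frozenset({"store=B"}), "product=milk", 0.30, 0.40),
]

accepted, rejected = [], []
pending = drop_redundant(rules)
# The expert rejects all very low-support rules in one step ...
pending = apply_filter(pending, lambda r: r.support < 0.05, "reject", accepted, rejected)
# ... then accepts all remaining high-confidence rules in another.
pending = apply_filter(pending, lambda r: r.confidence >= 0.8, "accept", accepted, rejected)
```

Each call to `apply_filter` stands for one expert decision that covers every matching rule, which is how an iterative loop of this kind could shrink the unvalidated set far faster than rule-by-rule inspection. Whatever is left in `pending` would go through further rounds of grouping, filtering, or browsing.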

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Gediminas Adomavicius (1)
  • Alexander Tuzhilin (2)
  1. Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, USA
  2. Information Systems Department, Stern School of Business, New York University, New York, USA
