Discovering Exceptional Information from Customer Inquiry by Association Rule Miner

  • Keiko Shimazu
  • Atsuhito Momma
  • Koichi Furukawa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2843)

Abstract

This paper reports the results of an experimental study on a new method of applying an association rule miner to discover useful information from a text database. It has been claimed that association rule mining is not suited to text mining. To overcome this problem, we propose (1) generating a sequential data set of words with dependency structure from a Japanese text database, and (2) extracting meaningful association rules by applying a new rule selection criterion. Each inquiry was converted to a list of word pairs that have a dependency relationship in the original sentence. Association rules were then acquired by regarding each word pair as an item. The rule selection criterion is derived from our principle of giving heavier weight to the co-occurrence of multiple items than to the occurrence of a single item: we regarded a rule as important if the existence of the items in the rule body significantly affected the occurrence of the item in the rule head. Based on this method, we conducted experiments on a customer inquiry database from a company's call center and successfully acquired practically meaningful rules that were neither too general nor too rare. These rules could not be obtained by simple keyword retrieval alone. Additionally, inquiries with multiple aspects were properly classified into the corresponding multiple categories. Furthermore, we compared (i) rules obtained from the sequential data set of words with dependency structure proposed in this paper against rules obtained without dependency structure, and (ii) rules acquired through our association rule selection criterion against rules acquired through conventional criteria. In the first comparison, the discovery of meaningful rules increased 14.3-fold; in the second, we confirmed that our criterion yields rules that match the objectives more precisely.
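The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the inquiries have already been parsed into dependency word pairs (the toy English data below is hypothetical; the study used Japanese text), treats each pair as an item, and ranks rules by how much the rule body raises the probability of the rule head. The ratio used here is the standard lift measure, swapped in as a stand-in because the abstract does not give the authors' exact criterion; the function name `mine_rules` and the thresholds are likewise assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each inquiry is a set of (modifier, head word)
# dependency pairs. Producing such pairs from real text would require a
# morphological analyzer and a dependency parser, which are not shown here.
inquiries = [
    {("printer", "jams"), ("paper", "jams"), ("tray", "open")},
    {("printer", "jams"), ("paper", "jams")},
    {("printer", "jams"), ("toner", "replace")},
    {("toner", "replace"), ("order", "toner")},
    {("paper", "jams"), ("printer", "jams"), ("order", "toner")},
]


def mine_rules(transactions, min_support=0.4, min_gain=1.2):
    """Return one-item-body rules as (body, head, support, confidence, gain).

    gain = P(head | body) / P(head), i.e. lift. This is an assumed stand-in
    for the paper's selection criterion, not the criterion itself.
    """
    n = len(transactions)
    # Count how many inquiries contain each item (each dependency pair).
    item_count = Counter(item for t in transactions for item in t)
    # Count how many inquiries contain each unordered pair of items.
    pair_count = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), both in pair_count.items():
        support = both / n
        if support < min_support:
            continue  # drop rules that appear only rarely
        for body, head in ((a, b), (b, a)):
            confidence = both / item_count[body]  # P(head | body)
            prior = item_count[head] / n          # P(head)
            gain = confidence / prior
            if gain >= min_gain:
                rules.append((body, head, support, confidence, gain))
    return sorted(rules, key=lambda r: -r[4])


for body, head, support, confidence, gain in mine_rules(inquiries):
    print(f"{body} => {head}  support={support:.2f} "
          f"confidence={confidence:.2f} gain={gain:.2f}")
```

The support threshold filters out rules that occur too rarely, while the gain threshold filters out rules whose head is frequent regardless of the body, which roughly corresponds to the "neither too general nor too rare" property the abstract describes.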

Keywords

Association Rule · Call Center · Default Rule · Dependency Information · Exception Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Keiko Shimazu (1)
  • Atsuhito Momma (1)
  • Koichi Furukawa (2)
  1. Information Media Laboratory, Corporate Research Group, Fuji Xerox Co., Ltd., Kanagawa, Japan
  2. Graduate School of Media and Governance, Keio University, Kanagawa, Japan
