Chaining Patterns

  • Taneli Mielikäinen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2843)

Abstract

Finding condensed representations for pattern collections has been an active research topic in data mining recently and several representations have been proposed. In this paper we introduce chain partitions of partially ordered pattern collections as high-level condensed representations that can be applied to a wide variety of pattern collections including most known condensed representations and databases. We analyze the goodness of the approach, study the computational challenges and algorithms for finding the optimal chain partitions, and show empirically that this approach can simplify the pattern collections significantly.
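
To make the idea of a chain partition concrete, here is a minimal sketch, not taken from the paper. It assumes the patterns are distinct itemsets ordered by proper set inclusion, and it finds a partition into the fewest chains by the standard reduction to maximum bipartite matching (number of chains = number of patterns minus matching size). The function name `min_chain_partition` and the simple augmenting-path matching are illustrative choices; the paper's own algorithms and its notion of optimality may differ.

```python
from typing import FrozenSet, List, Optional, Set


def min_chain_partition(patterns: List[FrozenSet[str]]) -> List[List[FrozenSet[str]]]:
    """Split distinct itemsets into the fewest chains under proper inclusion.

    Illustrative sketch only: every comparable pair (p < q) becomes an edge of
    a bipartite graph, and each matched edge links p to its successor q in a chain.
    """
    n = len(patterns)
    # succ[i] lists every j such that patterns[i] is a proper subset of patterns[j].
    succ = [[j for j in range(n) if patterns[i] < patterns[j]] for i in range(n)]
    # pred[j] = i means pattern i is directly followed by pattern j in its chain.
    pred: List[Optional[int]] = [None] * n

    def try_augment(i: int, seen: Set[int]) -> bool:
        # Standard augmenting-path search: find a free successor for i,
        # re-routing earlier choices if necessary.
        for j in succ[i]:
            if j in seen:
                continue
            seen.add(j)
            if pred[j] is None or try_augment(pred[j], seen):
                pred[j] = i
                return True
        return False

    for i in range(n):
        try_augment(i, set())

    # Invert the matching so each pattern knows its successor, if any.
    nxt: List[Optional[int]] = [None] * n
    for j, i in enumerate(pred):
        if i is not None:
            nxt[i] = j

    # A pattern without a predecessor starts a chain; follow successors from it.
    chains = []
    for start in range(n):
        if pred[start] is None:
            chain = []
            cur: Optional[int] = start
            while cur is not None:
                chain.append(patterns[cur])
                cur = nxt[cur]
            chains.append(chain)
    return chains
```

For the five itemsets {a}, {b}, {c}, {a,b}, {a,b,c}, this returns three chains: {a}, {b} and {c} are pairwise incomparable, so no partition can use fewer. The search runs in polynomial time but is not tuned for large collections.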

Keywords

Partial Order, Bipartite Graph, Association Rule, Condensed Representation, Closed Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Taneli Mielikäinen (1)
  1. HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
