Efficiently Depth-First Minimal Pattern Mining

  • Arnaud Soulet
  • François Rioult
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8443)

Abstract

Condensed representations have been studied extensively for 15 years. In particular, the maximal patterns of the equivalence classes have received much attention, with very general proposals. In contrast, the minimal patterns have remained in the shadows, in particular because they are difficult to extract. In this paper, we present a generic framework for minimal pattern mining by introducing the concept of a minimizable set system. This framework addresses various languages, such as itemsets or strings, and, at the same time, different metrics, such as frequency. For instance, the free and the essential patterns are naturally handled by our approach, as are the minimal strings. Then, for any minimizable set system, we introduce a fast minimality check that is easy to incorporate into a depth-first search algorithm for mining the minimal patterns. We demonstrate that the resulting algorithm is polynomial-delay and polynomial-space. Experiments on traditional benchmarks complete our study.
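
To make the notion of a minimal pattern concrete for the itemset language and the frequency metric, the following Python sketch enumerates free itemsets (itemsets for which no proper subset has the same support) by depth-first search. It is an illustration under stated assumptions, not the algorithm of the paper: the minimality check here naively recounts supports, whereas the paper introduces a fast check, and the names `support`, `is_free` and `mine_free_itemsets` are hypothetical.

```python
from typing import FrozenSet, Iterator, List, Sequence

Itemset = FrozenSet[str]


def support(itemset: Itemset, transactions: Sequence[Itemset]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def is_free(itemset: Itemset, transactions: Sequence[Itemset]) -> bool:
    """Naive minimality check: removing any single item must strictly
    increase the support. Because support never increases when items are
    added, checking the immediate subsets is sufficient."""
    s = support(itemset, transactions)
    return all(support(itemset - {x}, transactions) > s for x in itemset)


def mine_free_itemsets(
    transactions: Sequence[Itemset], min_support: int = 1
) -> Iterator[Itemset]:
    """Depth-first enumeration of the non-empty frequent free itemsets.

    Every subset of a free itemset is itself free, so a candidate that
    fails the check can be pruned together with all of its extensions.
    """
    items: List[str] = sorted(set().union(*transactions))

    def dfs(prefix: Itemset, start: int) -> Iterator[Itemset]:
        for i in range(start, len(items)):
            candidate = prefix | {items[i]}
            if support(candidate, transactions) < min_support:
                continue  # infrequent: no extension can be frequent
            if not is_free(candidate, transactions):
                continue  # not minimal: no extension can be minimal
            yield candidate
            yield from dfs(candidate, i + 1)

    yield from dfs(frozenset(), 0)


if __name__ == "__main__":
    # Toy dataset (an assumption made for illustration only).
    toy = [frozenset("ab"), frozenset("abc"), frozenset("ac"), frozenset("a")]
    for pattern in mine_free_itemsets(toy):
        print(sorted(pattern), support(pattern, toy))
```

In the toy dataset, item `a` occurs in every transaction, so `{a}` has the same support as the empty set and is not free; the search therefore never explores any itemset containing `a`.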

Keywords

Pattern mining, condensed representation, minimal pattern

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Arnaud Soulet (1)
  • François Rioult (2)
  1. Université François Rabelais Tours, LI, Blois, France
  2. Université de Caen, GREYC, Caen Cédex, France
