Fast Extraction of Locally Optimal Patterns Based on Consistent Pattern Function Variations

  • Frédéric Pennerath
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6323)

Abstract

This article introduces the problem of searching for locally optimal patterns within a set of patterns constrained by an anti-monotonic predicate: given a pattern scoring function, a locally optimal pattern is one whose score is maximal (or minimal) among its neighboring patterns. Some instances of this problem have produced patterns of interest for knowledge discovery, since the locally optimal patterns extracted from datasets are few, informative and non-redundant compared to other pattern families derived from frequent patterns. The article then introduces the concept of variation consistency to characterize pattern functions and uses this notion to propose GALLOP, an algorithm that outperforms existing algorithms for extracting locally optimal itemsets. Finally, it shows how GALLOP can be applied generically to two classes of scoring functions useful in pattern mining problems for binary classification or clustering.
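
The definition in the abstract can be illustrated with a naive brute-force sketch. This is not GALLOP, whose details are not given here. The sketch assumes that patterns are itemsets, that the anti-monotonic predicate is a minimum support threshold, that two itemsets are neighbors when they differ by exactly one item, and that "locally optimal" means strictly higher score than every neighbor satisfying the predicate. The function names and the area score (support times size) are hypothetical choices made for illustration only.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force levelwise enumeration of non-empty frequent itemsets.

    Assumption: the anti-monotonic predicate is support >= min_support.
    """
    items = sorted(set().union(*transactions))
    result = []
    for size in range(1, len(items) + 1):
        level = [
            frozenset(c)
            for c in combinations(items, size)
            if support(frozenset(c), transactions) >= min_support
        ]
        if not level:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
        result.extend(level)
    return result


def neighbors(pattern, candidates):
    """Assumed neighborhood: candidates differing from pattern by one item."""
    return [q for q in candidates if len(q ^ pattern) == 1]


def locally_optimal(candidates, score):
    """Patterns whose score is strictly greater than all neighbors' scores."""
    scores = {p: score(p) for p in candidates}
    return [
        p
        for p in candidates
        if all(scores[p] > scores[q] for q in neighbors(p, candidates))
    ]


# Toy dataset (hypothetical).
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b"}),
    frozenset({"c"}),
]

candidates = frequent_itemsets(transactions, min_support=2)


def area(itemset):
    # Illustrative scoring function: support multiplied by itemset size.
    return support(itemset, transactions) * len(itemset)


result = locally_optimal(candidates, area)
```

On this toy dataset the frequent itemsets are {a}, {b}, {c} and {a, b}. The sketch keeps {a, b}, which outscores both of its neighbors, and {c}, which has no frequent neighbor and is therefore locally optimal only vacuously under the assumed definition. The quadratic neighbor scan is acceptable for a sketch of this size but would not scale to real datasets.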



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Frédéric Pennerath
    Supélec, 2 rue Edouard Belin, Metz, France
