Soft Threshold Constraints for Pattern Mining

  • Willy Ugarte
  • Patrice Boizumault
  • Samir Loudni
  • Bruno Crémilleux
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7569)

Abstract

Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc.). In practice, many constraints require threshold values whose choice is often arbitrary. The difficulty grows when several thresholds are required and have to be combined. Moreover, patterns that barely miss a threshold are not extracted, even though they may be relevant. In this paper, we use Constraint Programming to integrate soft threshold constraints into the pattern discovery process. We show the relevance and the efficiency of our approach through a case study in chemoinformatics for discovering toxicophores.
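
The effect of relaxing a threshold can be illustrated with a minimal sketch. It is not the Constraint Programming encoding proposed in the paper: it assumes a brute-force enumeration over a hypothetical toy dataset, and the names `TRANSACTIONS`, `violation`, `mine` and `max_violation` are invented for illustration. A pattern is kept when its shortfall below the minimum frequency does not exceed a tolerated amount.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is a set of items.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def frequency(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def violation(freq, min_freq):
    """Assumed violation measure: how far the frequency falls short of the threshold."""
    return max(0, min_freq - freq)


def mine(transactions, min_freq, max_violation=0):
    """Enumerate all itemsets and keep those whose violation is tolerated.

    max_violation=0 behaves like a hard threshold; a positive value
    softens it so that near-miss patterns are also returned.
    """
    items = sorted(set().union(*transactions))
    results = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            freq = frequency(itemset, transactions)
            if violation(freq, min_freq) <= max_violation:
                results[itemset] = freq
    return results


hard = mine(TRANSACTIONS, min_freq=3)                   # hard threshold
soft = mine(TRANSACTIONS, min_freq=3, max_violation=1)  # softened threshold
```

With these toy values, the itemset {a, b, c} has frequency 2, so the hard threshold of 3 rejects it, while the softened version keeps it with a violation of 1.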

Keywords

Pattern Mining; Constraint Satisfaction Problem; Soft Constraint; Interestingness Measure; Data Mining Task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Willy Ugarte (1)
  • Patrice Boizumault (1)
  • Samir Loudni (1)
  • Bruno Crémilleux (1)
  1. GREYC (CNRS UMR 6072), University of Caen Basse-Normandie, Caen, France
