
Extending the Soft Constraint Based Mining Paradigm

  • Stefano Bistarelli
  • Francesco Bonchi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4747)

Abstract

The paradigm of pattern discovery based on constraints has been recognized as a core technique in inductive querying: constraints give the user a tool to drive the discovery process towards potentially interesting patterns, with the positive side effect of a more efficient computation. So far, research on this paradigm has mainly focused on the latter aspect: the development of efficient algorithms for the evaluation of constraint-based mining queries. Due to the lack of research on methodological issues, the constraint-based pattern mining framework still suffers from many problems that limit its practical relevance. In our previous work [5], we analyzed these limitations and showed that they stem from the same source: in classical constraint-based mining, a constraint is a rigid boolean function that returns either true or false. To overcome these limitations we introduced the new paradigm of pattern discovery based on Soft Constraints, and instantiated our idea to fuzzy soft constraints. In this paper we extend the framework to deal with probabilistic and weighted soft constraints: we provide a theoretical basis and a detailed experimental analysis. We also discuss a straightforward solution to deal with top-k queries. Finally, we show how the ideas presented in this paper have been implemented in a real Inductive Database system.
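
As a rough illustration of the three instantiations named in the abstract, the sketch below uses the standard semiring readings from the soft-constraint literature (cf. [7]): fuzzy levels in [0, 1] combined with min, probabilistic levels in [0, 1] combined with multiplication, and weighted costs combined with addition, where lower is better. It is not the paper's algorithm: the pattern names, the levels, and the helpers `Semiring`, `interest`, and `top_k` are hypothetical, and the ranking shown is only one simple way a top-k query could be answered once every pattern has an interest level.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """Simplified c-semiring: how levels combine and which direction is better."""
    name: str
    combine: Callable[[float, float], float]  # the semiring "times" operation
    best: float                               # neutral element of combine
    higher_is_better: bool                    # ordering used to compare levels


# Standard instances (assumed here, not quoted from the paper).
FUZZY = Semiring("fuzzy", min, 1.0, True)
PROBABILISTIC = Semiring("probabilistic", lambda a, b: a * b, 1.0, True)
WEIGHTED = Semiring("weighted", lambda a, b: a + b, 0.0, False)


def interest(levels: List[float], sr: Semiring) -> float:
    """Combine the levels that each soft constraint assigns to one pattern."""
    return reduce(sr.combine, levels, sr.best)


def top_k(patterns: Dict[str, List[float]], sr: Semiring, k: int) -> List[Tuple[str, float]]:
    """Rank patterns by combined interest and keep the k best."""
    scored = [(name, interest(levels, sr)) for name, levels in patterns.items()]
    scored.sort(key=lambda pair: pair[1], reverse=sr.higher_is_better)
    return scored[:k]


# Hypothetical data: two soft constraints evaluated on three patterns.
PREFERENCES = {"p1": [0.9, 0.4], "p2": [0.6, 0.7], "p3": [0.8, 0.8]}  # levels in [0, 1]
COSTS = {"p1": [2.0, 5.0], "p2": [1.0, 3.0], "p3": [4.0, 4.0]}        # non-negative costs

if __name__ == "__main__":
    for sr, data in ((FUZZY, PREFERENCES), (PROBABILISTIC, PREFERENCES), (WEIGHTED, COSTS)):
        print(sr.name, top_k(data, sr, 2))
```

Unlike a boolean constraint, none of the three patterns is discarded outright here: each receives a graded interest level, and the choice of semiring determines how the per-constraint levels are aggregated and compared.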

Keywords

Association Rule; Frequent Itemsets; Regular Language; Soft Constraint; Pattern Discovery

References

  1. Antunes, C., Oliveira, A.L.: Constraint relaxations for discovering unknown sequential patterns. In: Goethals, B., Siebes, A. (eds.) KDID 2004. LNCS, vol. 3377, pp. 11–32. Springer, Heidelberg (2005)
  2. Bayardo, R.J.: The hows, whys, and whens of constraints in itemset and rule discovery. In: Boulicaut, J.-F., De Raedt, L., Mannila, H. (eds.) Constraint-Based Mining and Inductive Databases. LNCS (LNAI), vol. 3848, pp. 1–13. Springer, Heidelberg (2006)
  3. Bayardo, R.J., Agrawal, R.: Mining the most interesting rules. In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 145–154. ACM Press, New York (1999)
  4. Besson, J., Robardet, C., Boulicaut, J.F., Rome, S.: Constraint-based concept mining and its application to microarray data analysis. Intelligent Data Analysis journal, 59–82 (2005)
  5. Bistarelli, S., Bonchi, F.: Interestingness is not a dichotomy: Introducing softness in constrained pattern mining. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 22–33. Springer, Heidelberg (2005)
  6. Bistarelli, S., Codognet, P., Rossi, F.: Abstracting soft constraints: Framework, properties, examples. Artificial Intelligence 139(2), 175–211 (2002)
  7. Bistarelli, S., Montanari, U., Rossi, F.: Semiring-based Constraint Solving and Optimization. Journal of the ACM 44(2), 201–236 (1997)
  8. Bonchi, F., Giannotti, F., Lucchese, C., Orlando, S., Perego, R., Trasarti, R.: ConQueSt: a constraint-based querying system for exploratory pattern discovery. In: Proceedings of The 22nd IEEE International Conference on Data Engineering, pp. 22–33. IEEE Computer Society Press, Los Alamitos (2006)
  9. Bonchi, F., Lucchese, C.: Extending the state-of-the-art of constraint-based pattern discovery. Data and Knowledge Engineering (DKE) (to appear, 2006)
  10. Bonchi, F., Giannotti, F., Lucchese, C., Orlando, S., Perego, R., Trasarti, R.: On interactive pattern mining from relational databases. In: KDID 2006. LNCS, vol. 4747, pp. 42–62. Springer, Heidelberg (2007)
  11. Boulicaut, J.F., Jeudy, B.: Constraint-based data mining. In: Maimon, O., Rokach, L. (eds.) The Data Mining and Knowledge Discovery Handbook, pp. 399–416. Springer, Heidelberg (2005)
  12. Brin, S., Motwani, R., Silverstein, C.: Beyond market baskets: Generalizing association rules to correlations. In: Proceedings ACM SIGMOD International Conference on Management of Data, pp. 256–276. ACM Press, New York (1997)
  13. Ordonez, C., et al.: Mining constrained association rules to predict heart disease. In: Proceedings of the First IEEE International Conference on Data Mining, pp. 433–440. IEEE Computer Society Press, Los Alamitos (2001)
  14. Fargier, H., Lang, J.: Uncertainty in constraint satisfaction problems: a probabilistic approach. In: Moral, S., Kruse, R., Clarke, E. (eds.) ECSQARU 1993. LNCS, vol. 747, pp. 97–104. Springer, Heidelberg (1993)
  15. Hilderman, R.J., Hamilton, H.J.: Knowledge Discovery and Measures of Interest. Kluwer Academic Publishers, Boston (2002)
  16. Hipp, J., Güntzer, H.: Is pushing constraints deeply into the mining algorithms really what we want?: an alternative approach for association rule mining. SIGKDD Explorations 4(1), 50–55 (2002)
  17. Hofmann, H., Siebes, A., Wilhelm, A.F.X.: Visualizing association rules with interactive mosaic plots. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 227–235. ACM Press, New York (2000)
  18. Lau, A., Ong, S., Mahidadia, A., Hoffmann, A., Westbrook, J., Zrimec, T.: Mining patterns of dyspepsia symptoms across time points using constraint association rules. In: Whang, K.-Y., Jeon, J., Shim, K., Srivastava, J. (eds.) PAKDD 2003. LNCS (LNAI), vol. 2637, pp. 124–135. Springer, Heidelberg (2003)
  19. Mitasiunaite, I., Boulicaut, J.-F.: About softness for inductive querying on sequence databases. In: Proceedings 7th International Baltic Conference on Databases and Information Systems DB IS 2006, July 3-6 2006, Vilnius (Lithuania) (2006)
  20. Silberschatz, A., Tuzhilin, A.: On subjective measures of interestingness. In: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pp. 275–281 (1995)
  21. Tan, P.-N., Kumar, V., Srivastava, J.: Selecting the right interestingness measure for association patterns. In: Proc. of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD’2002). ACM Press, New York (2002)
  22. Tan, P.-N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Addison-Wesley, Reading (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Stefano Bistarelli (1, 2)
  • Francesco Bonchi (3)

  1. Dipartimento di Scienze, Università degli Studi “G. D’Annunzio”, Pescara, Italy
  2. Istituto di Informatica e Telematica, CNR, Pisa, Italy
  3. Pisa KDD Laboratory, ISTI - C.N.R., Pisa, Italy
