Combining Constraint Programming and Constraint-Based Mining for Pattern Discovery

  • Mehdi Khiari
  • Patrice Boizumault
  • Bruno Crémilleux
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 398)

Abstract

The large outputs of data mining methods hamper both the individual and the global analysis performed by data analysts, which is why discovering higher-level patterns is an active research field. In this paper, by investigating the relationship between constraint-based mining and constraint satisfaction problems, we propose an approach to model and mine queries involving several local patterns (n-ary patterns). First, the user expresses a query as a set of constraints involving n-ary patterns. Second, the constraints are formulated using constraint programming and solved by a constraint solver, which generates the correct and complete set of solutions. This separation allows the user to express a large set of queries declaratively, without being concerned with how they are solved. Our approach also benefits from recent progress in mining local patterns by pushing, with a local-pattern solver, all local constraints that can be inferred from the query. The approach makes it possible to model flexibly any set of constraints combining several local patterns, and it leads to the discovery of higher-level patterns. Experiments show the feasibility and the value of our approach.
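
To make the two-step idea concrete, the following is a minimal sketch in Python and not the authors' implementation. It replaces the constraint solver with brute-force enumeration over a toy dataset, which is feasible only for a handful of items. The dataset, the function names (`freq`, `all_patterns`, `solve_pair_query`) and the query itself (two disjoint frequent itemsets whose union is also frequent) are all assumptions chosen for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
ITEMS = sorted(set().union(*TRANSACTIONS))


def freq(pattern):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in TRANSACTIONS if pattern <= t)


def all_patterns():
    """Yield every non-empty itemset over ITEMS."""
    for size in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            yield frozenset(combo)


def solve_pair_query(min_freq):
    """Enumerate all unordered pairs (X, Y) satisfying a hypothetical 2-ary query.

    Local constraints: freq(X) >= min_freq and freq(Y) >= min_freq.
    n-ary constraints: X and Y are disjoint and freq(X | Y) >= min_freq.
    """
    # Step 1: apply the local constraints first to shrink the candidate set.
    candidates = [p for p in all_patterns() if freq(p) >= min_freq]

    # Step 2: check the constraints that link the two patterns together.
    solutions = []
    for x, y in combinations(candidates, 2):
        if x.isdisjoint(y) and freq(x | y) >= min_freq:
            solutions.append((x, y))
    return solutions


if __name__ == "__main__":
    for x, y in solve_pair_query(min_freq=3):
        print(sorted(x), sorted(y))
```

Because the enumeration is exhaustive, the result is correct and complete for this toy query, but the cost grows exponentially with the number of items. A constraint solver would instead prune the search space by propagating the constraints.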

Keywords

Local Pattern · Constraint Satisfaction Problem · Local Constraint · Pattern Discovery · Condensed Representation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2012

Authors and Affiliations

  • Mehdi Khiari (1)
  • Patrice Boizumault (1)
  • Bruno Crémilleux (1)
  1. GREYC (CNRS - UMR 6072), Université de Caen Basse-Normandie, Caen Cedex, France
