Mining (Soft-) Skypatterns Using Constraint Programming

  • Willy Ugarte
  • Patrice Boizumault
  • Samir Loudni
  • Bruno Crémilleux
  • Alban Lepailleur
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 615)

Abstract

Within the pattern mining area, skypatterns make it possible to express a user-preference point of view according to a dominance relation. In this paper, we introduce softness into the skypattern mining problem. First, we show how softness can provide useful patterns that would otherwise be missed. Then, using Constraint Programming, we propose a generic and efficient method to mine skypatterns as well as soft ones. Finally, we show the relevance and the effectiveness of our approach through experiments on UCI benchmarks and a case study in chemoinformatics for discovering toxicophores.
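As a rough illustration of the dominance relation mentioned in the abstract, the sketch below computes the non-dominated (skyline) elements of a small set of patterns using standard Pareto dominance, where higher values are assumed to be better on every measure. The pattern names, the two measures, and the helper names `dominates` and `skyline` are hypothetical and chosen only for illustration. This is not the Constraint Programming method of the chapter, and it does not model the soft variants.

```python
from typing import Dict, Set, Tuple

Measures = Tuple[float, ...]


def dominates(a: Measures, b: Measures) -> bool:
    """Return True if a is at least as good as b on every measure
    and strictly better on at least one (higher is assumed better)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(scores: Dict[str, Measures]) -> Set[str]:
    """Return the names of the patterns that no other pattern dominates."""
    return {
        name
        for name, values in scores.items()
        if not any(
            dominates(other_values, values)
            for other_name, other_values in scores.items()
            if other_name != name
        )
    }


# Hypothetical toy data: pattern -> (measure 1, measure 2).
toy_scores = {
    "AB": (4, 8),
    "ABC": (2, 6),
    "A": (5, 5),
    "B": (4, 4),
}

result = skyline(toy_scores)
print(sorted(result))
```

In this toy data, "ABC" and "B" are both dominated by "AB", while "AB" and "A" are incomparable, so both remain. The pairwise comparison is quadratic in the number of patterns and is meant only to make the definition concrete.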

Keywords

Constraint Programming; Pattern Mining; Pareto Frontier; Skyline Query; Skyline Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

This work is partly supported by the project FiCOLOFO ANR-10-BLA-0214, funded by the ANR (French National Research Agency). The authors would like to thank Arnaud Soulet (University François Rabelais of Tours, France) for providing the Aetheris program and for his highly valuable comments.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Willy Ugarte (1)
  • Patrice Boizumault (1)
  • Samir Loudni (1)
  • Bruno Crémilleux (1)
  • Alban Lepailleur (2)

  1. GREYC (CNRS UMR 6072) – University of Caen, Caen, France
  2. CERMN (UPRES EA 4258 - FR CNRS 3038 INC3M) – University of Caen, Boulevard Becquerel, Caen Cedex, France
