Mining (Soft-) Skypatterns Using Constraint Programming

Advances in Knowledge Discovery and Management

Part of the book series: Studies in Computational Intelligence (SCI, volume 615)

Abstract

Within the pattern mining area, skypatterns make it possible to express a user-preference point of view according to a dominance relation. In this paper, we introduce softness into the skypattern mining problem. First, we show how softness can provide useful patterns that would otherwise be missed. Then, using Constraint Programming, we propose a generic and efficient method to mine skypatterns as well as soft ones. Finally, we show the relevance and effectiveness of our approach through experiments on UCI benchmarks and a case study in chemoinformatics for discovering toxicophores.
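
To make the dominance relation concrete, here is a minimal sketch, assuming each pattern is scored by a tuple of measures that are all to be maximised. The pattern names, the two measures (frequency and growth rate), and the values are hypothetical, and the naive pairwise filter is only an illustration of dominance, not the constraint programming method proposed in the chapter.

```python
from typing import Dict, List, Tuple

Measures = Tuple[float, ...]


def dominates(a: Measures, b: Measures) -> bool:
    """True if a is at least as good as b on every measure and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(patterns: Dict[str, Measures]) -> List[str]:
    """Names of the patterns that no other pattern dominates (naive pairwise check)."""
    return sorted(
        name
        for name, m in patterns.items()
        if not any(dominates(other, m) for other in patterns.values())
    )


# Hypothetical patterns scored by (frequency, growth rate); higher is better.
toy = {"A": (6.0, 1.0), "AB": (4.0, 3.0), "ABC": (2.0, 3.0), "B": (5.0, 2.0)}
print(skyline(toy))  # ['A', 'AB', 'B']
```

In this toy data, ABC is dominated by AB (same growth rate, lower frequency), so it is excluded; the other three patterns are mutually incomparable and form the skyline.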

Notes

  1. http://www.gecode.org/.

  2. The closed constraint is used to reduce pattern redundancy: closed skypatterns form an exact condensed representation of the whole set of skypatterns (Soulet et al. 2011).

  3. http://www.ics.uci.edu/~mlearn/MLRepository.html.

  4. Obviously, it is the same for both methods.

  5. They correspond to edge-skypatterns that are not hard skypatterns.

  6. They correspond to \(\delta \)-skypatterns that are neither hard skypatterns nor edge-skypatterns.

  7. Lethal concentration of a substance required to kill half the members of a tested population after a specified test duration.

  8. A fragment denotes a connected part of a chemical structure containing at least one chemical bond.

  9. European Chemicals Bureau: http://echa.europa.eu/.

  10. A chemical Ch contains an item A if Ch supports A, and A is a frequent subgraph of \(\mathscr {T}\).

References

  • Auer, J., and J. Bajorath. 2006. Emerging chemical patterns: A new methodology for molecular classification and compound selection. Journal of Chemical Information and Modeling (JCIM) 46(6): 2502–2514.

  • Bistarelli, S., and F. Bonchi. 2007. Soft constraint based pattern mining. Data & Knowledge Engineering (DKE) 62(1): 118–137.

  • Börzsönyi, S., D. Kossmann, and K. Stocker. 2001. The skyline operator. In Proceedings of the 17th International Conference on Data Engineering (ICDE’2001), 421–430. IEEE Computer Society.

  • De Raedt, L., T. Guns, and S. Nijssen. 2008. Constraint programming for itemset mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’2008), 204–212. ACM.

  • De Raedt, L., and A. Zimmermann. 2007. Constraint-based pattern set mining. In Proceedings of the Seventh SIAM International Conference on Data Mining (SDM’2007), 237–248. SIAM.

  • Drugan, M.M., and D. Thierens. 2012. Stochastic Pareto local search: Pareto neighbourhood exploration and perturbation strategies. Journal of Heuristics 18(5): 727–766.

  • Gavanelli, M. 2002. An algorithm for multi-criteria optimization in CSPs. In Proceedings of the 15th European Conference on Artificial Intelligence (ECAI’2002), 136–140. IOS Press.

  • Guns, T., S. Nijssen, and L. De Raedt. 2011. Itemset mining: A constraint programming perspective. Artificial Intelligence 175(12–13): 1951–1983.

  • Jin, W., J. Han, and M. Ester. 2004. Mining thick skylines over large databases. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’2004), vol. 3202 of Lecture Notes in Computer Science, 255–266. Springer.

  • Khiari, M., P. Boizumault, and B. Crémilleux. 2010. Constraint programming for mining n-ary patterns. In Proceedings of the 16th International Conference in Principles and Practice of Constraint Programming (CP’2010), vol. 6308 of Lecture Notes in Computer Science, 552–567. Springer.

  • Kung, H.T., F. Luccio, and F.P. Preparata. 1975. On finding the maxima of a set of vectors. Journal of the ACM 22(4): 469–476.

  • Lin, X., Y. Yuan, Q. Zhang, and Y. Zhang. 2007. Selecting stars: The k most representative skyline operator. In Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE’2007), 86–95.

  • Mannila, H., and H. Toivonen. 1997. Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery 1(3): 241–258.

  • Matousek, J. 1991. Computing dominances in \(E^n\). Information Processing Letters (IPL) 38(5): 277–278.

  • Novak, P.K., N. Lavrac, and G.I. Webb. 2009. Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining. Journal of Machine Learning Research (JMLR) 10: 377–403.

  • Papadias, D., Y. Tao, G. Fu, and B. Seeger. 2005. Progressive skyline computation in database systems. ACM Transactions on Database Systems (TODS) 30(1): 41–82.

  • Papadias, D., M.L. Yiu, N. Mamoulis, and Y. Tao. 2008. Nearest neighbor queries in network databases. In Encyclopedia of GIS, 772–776. Springer.

  • Papadopoulos, A.N., A. Lyritsis, and Y. Manolopoulos. 2008. Skygraph: an algorithm for important subgraph discovery in relational graphs. Data Mining and Knowledge Discovery 17(1): 57–76.

  • Poezevara, G., B. Cuissart, and B. Crémilleux. 2011. Extracting and summarizing the frequent emerging graph patterns from a dataset of graphs. Journal of Intelligent Information Systems (JIIS) 37(3): 333–353.

  • Shelokar, P., A. Quirin, and O. Cordón. 2013. MOSubdue: a Pareto dominance-based multiobjective Subdue algorithm for frequent subgraph mining. Knowledge and Information Systems (KAIS) 34(1): 75–108.

  • Soulet, A., C. Raïssi, M. Plantevit, and B. Crémilleux. 2011. Mining dominant patterns in the sky. In Proceedings of the 11th IEEE International Conference on Data Mining (ICDM’2011), 655–664. IEEE Computer Society.

  • Steuer, R.E. 1992. Multiple Criteria Optimization: Theory, Computation and Application, 504. Moscow: Radio e Svyaz. (in Russian).

  • Tan, K., P. Eng, and B.C. Ooi. 2001. Efficient progressive skyline computation. In Proceedings of 27th International Conference on Very Large Data Bases (VLDB’2001), 301–310. Morgan Kaufmann.

  • Ugarte, W., P. Boizumault, S. Loudni, and B. Crémilleux. 2012. Soft threshold constraints for pattern mining. In Proceedings of the 15th International Conference in Discovery Science (DS’2012), vol. 7569 of Lecture Notes in Computer Science, 313–327. Springer.

  • van Leeuwen, M., and A. Ukkonen. 2013. Discovering skylines of subgroup sets. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’2013), vol. 8190 of Lecture Notes in Computer Science, 272–287. Springer.

  • Verfaillie, G., and N. Jussien. 2005. Constraint solving in uncertain and dynamic environments: A survey. Constraints 10(3): 253–281.

Acknowledgments

This work is partly supported by the ANR (French National Research Agency) funded project FiCOLOFO ANR-10-BLA-0214. The authors would like to thank Arnaud Soulet (University François Rabelais of Tours, France) for providing the Aetheris program and for his highly valuable comments.

Corresponding author

Correspondence to Willy Ugarte.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Ugarte, W., Boizumault, P., Loudni, S., Crémilleux, B., Lepailleur, A. (2016). Mining (Soft-) Skypatterns Using Constraint Programming. In: Guillet, F., Pinaud, B., Venturini, G., Zighed, D. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 615. Springer, Cham. https://doi.org/10.1007/978-3-319-23751-0_6

  • DOI: https://doi.org/10.1007/978-3-319-23751-0_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23750-3

  • Online ISBN: 978-3-319-23751-0

  • eBook Packages: Engineering, Engineering (R0)
