
Modeling in MiningZinc

  • Anton Dries
  • Tias Guns
  • Siegfried Nijssen
  • Behrouz Babaki
  • Thanh Le Van
  • Benjamin Negrevergne
  • Sergey Paramonov
  • Luc De Raedt (email author)
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10101)

Abstract

MiningZinc offers a framework for modeling and solving constraint-based mining problems. The language used is MiniZinc, a high-level declarative language for modeling combinatorial (optimisation) problems. This language is augmented with a library of functions and predicates that help to model data mining problems, and with facilities for interfacing with databases. We show how MiningZinc can be used to model constraint-based itemset mining problems, for which it was originally designed, as well as sequence mining, Bayesian pattern mining, linear regression, clustering, data factorization and ranked tiling. The underlying framework can use any existing MiniZinc solver. We also showcase how the framework and modeling capabilities can be integrated into an imperative language, for example as part of a greedy algorithm.
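
To make the notion of a constraint-based itemset mining problem concrete, the following is a minimal Python sketch of frequent itemset mining stated as a constraint on the cover of an itemset. It is not MiningZinc code and does not use the MiningZinc library; the names `cover`, `frequent_itemsets`, `min_freq` and the toy transactions are assumptions made for illustration, and the exhaustive enumeration merely stands in for the search a constraint solver would perform.

```python
from itertools import combinations


def cover(itemset, transactions):
    """Indices of the transactions that contain every item of `itemset`."""
    return {t for t, items in enumerate(transactions) if itemset <= items}


def frequent_itemsets(transactions, min_freq):
    """Enumerate all non-empty itemsets whose cover has at least `min_freq` transactions.

    Brute-force enumeration (exponential in the number of items); it only
    illustrates the constraint, not how a solver would explore the space.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = len(cover(itemset, transactions))
            if support >= min_freq:  # the minimum-frequency constraint
                result[itemset] = support
    return result


# Hypothetical toy database of four transactions over items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
]
found = frequent_itemsets(transactions, min_freq=2)
```

In this toy database every single item and every pair of items satisfies the constraint, while the full set of three items is covered by only one transaction and is therefore rejected. Adding further constraints (for example on itemset size) would amount to extra conditions next to the frequency test.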

Keywords

Bayesian Network; Mining Problem; Constraint Programming; Pattern Mining; Constraint Satisfaction Problem
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Anton Dries
    • 1
  • Tias Guns
    • 1
  • Siegfried Nijssen
    • 1
    • 2
  • Behrouz Babaki
    • 1
  • Thanh Le Van
    • 1
  • Benjamin Negrevergne
    • 1
  • Sergey Paramonov
    • 1
  • Luc De Raedt
    • 1
    Email author
  1. DTAI, KU Leuven, Leuven, Belgium
  2. LIACS, Universiteit Leiden, Leiden, The Netherlands
