Data Mining and Knowledge Discovery, Volume 31, Issue 2, pp 424–464

Survey on using constraints in data mining

Abstract

This paper provides an overview of the current state of the art in using constraints in knowledge discovery and data mining. Using constraints in a data mining task requires specific tools for defining them and for satisfying them during knowledge extraction. This survey groups the studies by task (classification, clustering and pattern mining) and by whether the constraints are placed on the data, the models or the measures. We consider the distinction between hard and soft constraint satisfaction, and between the knowledge extraction phases in which constraints are considered. In addition to discussing how constraints can be used in data mining, we show how constraint-based languages can be used throughout the data mining process.

Keywords

Data mining, Constraints, Background knowledge

  156. Srikant R, Agrawal R (1995) Mining generalized association rules. In: Proceedings of the 21th conference on very large data bases (VLDB), pp 407–419Google Scholar
  157. Srikant R, Agrawal R (1996) Mining sequential patterns: generalizations and performance improvements. In: Proceedings of the 5th international conference on extending database technology (EDBT), pp 3–17Google Scholar
  158. Srikant R, Vu Q, Agrawal R (1997) Mining association rules with item constraints. In: Proceedings of the third international conference on knowledge discovery and data mining (KDD), pp 67–73Google Scholar
  159. Sriphaew K, Theeramunkong T (2002) A new method for finding generalized frequent itemsets in generalized association rule mining. In: Proceedings of the 7th IEEE symposium on computers and communications (ISCC), pp 1040–1045Google Scholar
  160. Strehl A, Ghosh J (2003) Relationship-based clustering and visualization for high-dimensional data mining. INFORMS J Comput 15(2):208–230MATHGoogle Scholar
  161. Tan PN, Steinbach M, Kumar V (2006) Introduction to data mining. Addison Wesley, BostonGoogle Scholar
  162. Tao F, Murtagh F (2003) Weighted association rule mining using weighted support and significance framework. In: Proceedings of the 9th ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 661–666Google Scholar
  163. Trasarti R, Bonchi F, Goethals B (2008) Sequence mining automata: a new technique for mining frequent sequences under regular expressions. In: Proceedings of the 8th IEEE international conference on data mining (ICDM), pp 1061–1066Google Scholar
  164. Tseng VS, Shie BE, Wu CW, Yu PS (2013) Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans Knowl Data Eng 25(8):1772–1786. doi:10.1109/TKDE.2012.59 Google Scholar
  165. Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6:1453–1484MathSciNetMATHGoogle Scholar
  166. Vanderlooy S, Sprinkhuizen-Kuyper IG, Smirnov EN, van den Herik HJ (2009) The roc isometrics approach to construct reliable classifiers. Intell Data Anal 13(1):3–37. http://dl.acm.org/citation.cfm?id=1551758.1551760
  167. Vens C, Struyf J, Schietgat L, Dzeroski S, Blockeel H (2008) Decision trees for hierarchical multi-label classification. Mach Learn 73(2):185–214Google Scholar
  168. von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416. doi:10.1007/s11222-007-9033-z MathSciNetGoogle Scholar
  169. Vu V, Labroche N, Bouchon-Meunier B (2010) An efficient active constraint selection algorithm for clustering. In: 20th international conference on pattern recognition, ICPR 2010, Istanbul, Turkey, 23–26 August 2010. IEEE Computer Society, pp 2969–2972. doi:10.1109/ICPR.2010.727
  170. Wagstaff K, Basu S, Davidson I (2006) When is constrained clustering beneficial, and why? In: Proceedings, the twenty-first national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference (AAAI)Google Scholar
  171. Wagstaff K, Cardie C (2000) Clustering with instance-level constraints. In: Proceedings of the seventeenth national conference on artificial intelligence and twelfth conference on Innovative applications of artificial intelligence (AAAI/IAAI), pp 1103–1110Google Scholar
  172. Wagstaff K, Cardie C, Rogers S, Schrödl S (2001) Constrained k-means clustering with background knowledge. In: Proceedings of the eighteenth international conference on machine learning, ICML ’01. Morgan Kaufmann Publishers Inc., San Francisco, pp 577–584. http://dl.acm.org/citation.cfm?id=645530.655669
  173. Wang K, Jiang Y, Yu JX, Dong G, Han J (2005) Divide-and-approximate: a novel constraint push strategy for iceberg cube mining. IEEE Trans Knowl Data Eng 17(3):354–368Google Scholar
  174. Wang X, Rostoker C, Hamilton HJ (2012) A density-based spatial clustering for physical constraints. J Intell Inf Syst 38(1):269–297Google Scholar
  175. Wang X, Qian B, Davidson I (2014) On constrained spectral clustering and its applications. Data Min Knowl Disc 28(1):1–30. doi:10.1007/s10618-012-0291-9 MathSciNetMATHGoogle Scholar
  176. Wang X, Davidson I (2010) Flexible constrained spectral clustering. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 563–572Google Scholar
  177. Wang F, Ding CHQ, Li T (2009) Integrated kl (k-means—laplacian) clustering: a new clustering approach by combining attribute data and pairwise relations. In: Proceedings of the SIAM international conference on data mining (SDM), pp 38–48Google Scholar
  178. Wang W, Yang J, Yu PS (2000) Efficient mining of weighted association rules (WAR). In: Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 270–274Google Scholar
  179. Wei JT, Lin SY, Wu HH (2010) A review of the application of rfm model. Afr J Bus Manag 4(19):4199–4206Google Scholar
  180. Witten IH, Frank E, Hall M (2011) Data mining, pratical machine learning tools and techiniques, 3rd edn. Morgan Kaufmann, San FranciscoGoogle Scholar
  181. Wu CM, Huang YF (2011) Generalized association rule mining using an efficient data structure. Expert Syst Appl 38(6):7277–7290Google Scholar
  182. Xing EP, Ng AY, Jordan MI, Russell SJ (2002) Distance metric learning with application to clustering with side-information. In: Advances in neural information processing systems (NIPS), pp 505–512Google Scholar
  183. Xing EP, Ng AY, Jordan MI, Russell S (2002) Distance metric learning, with application to clustering with side-information. Advances in neural information processing systems 15. MIT Press, CambridgeGoogle Scholar
  184. Yan R, Zhang J, Yang J, Hauptmann AG (2006) A discriminative learning framework with pairwise constraints for video object classification. IEEE Trans Pattern Anal Mach Intell 28(4):578–593. doi:10.1109/TPAMI.2006.65 Google Scholar
  185. Yan W, Goebel KF (2004) Designing classifier ensembles with constrained performance requirements. In: Proceedings of the SPIE defense security symposium, multisensor multisource information fusion: architectures, algorithms, and applications (2004), pp 78-87Google Scholar
  186. Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the fourth SIAM international conference on data mining (SDM), pp 482–486Google Scholar
  187. Yun U (2008) A new framework for detecting weighted sequential patterns in large sequence databases. Knowl Based Syst 21(2):110–122Google Scholar
  188. Yun U, Shin H, Ryu KH, Yoon E (2012) An efficient mining algorithm for maximal weighted frequent patterns in transactional databases. Knowl Based Syst 33:53–64Google Scholar
  189. Yun U, Leggett JJ (2005) WFIM: weighted frequent itemset mining with a weight range and a minimum weight. In: Kargupta et al., pp 636–640. doi:10.1137/1.9781611972757.76
  190. Yun U, Ryu KH (2010) Discovering important sequential patterns with length-decreasing weighted support constraints. Int J Inf Technol Decis Mak 9(4):575–599MATHGoogle Scholar
  191. Yun U, Ryu KH (2011) Approximate weighted frequent pattern mining with/without noisy environments. Knowl Based Syst 24(1):73–82Google Scholar
  192. Zaidan O, Eisner J (2008) Modeling annotators: a generative approach to learning from annotator rationales. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), pp 31–40Google Scholar
  193. Zaki MJ (2000) Sequence mining in categorical domains: incorporating constraints. In: Proceedings of the 9th international conference on information and knowledge management (CIKM), pp 422–429Google Scholar
  194. Zhang J, Yan R (2007) On the value of pairwise constraints in classification and consistency. In: Proceedings of the 24th international conference on machine learning, ICML ’07. ACM, New York, pp 1111–1118. doi:10.1145/1273496.1273636
  195. Zhang C, Zhang S (2002) Association rule mining, models and algorithms, lecture notes in computer science. Springer, New YorkGoogle Scholar
  196. Zhang Y, Zhang L, Nie G, Shi Y (2009) A survey of interestingness measures for association rules. In: Proceedings of the second international conference on business intelligence and financial engineering, (BIFE), pp 460–463Google Scholar
  197. Zhong S, Ghosh J (2003) Scalable, balanced model-based clustering. In: Proceedings of the third SIAM international conference on data mining (SDM), San Francisco, pp 71–82Google Scholar

Copyright information

© The Author(s) 2016

Authors and Affiliations

  1. Department of Computer Science, University of Pisa, Pisa, Italy
