Abstract

Data mining focuses on the development of methods and algorithms for such tasks as classification, clustering, rule induction, and discovery of associations. In the database field, the view of data mining as advanced querying has recently stimulated much research into the development of data mining query languages. In the field of machine learning, inductive logic programming has broadened its scope toward extending standard data mining tasks from the usual attribute-value setting to a multirelational setting. After a concise description of data mining, the contribution of logic to both fields is discussed. At the end, we indicate the potential use of logic for unifying different existing data mining formalisms.


Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Giannotti, F., Manco, G., Wijsen, J. (2004). Logical Languages for Data Mining. In: Chomicki, J., van der Meyden, R., Saake, G. (eds) Logics for Emerging Applications of Databases. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18690-5_9

  • DOI: https://doi.org/10.1007/978-3-642-18690-5_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-62248-9

  • Online ISBN: 978-3-642-18690-5

  • eBook Packages: Springer Book Archive