Machine Learning, Volume 55, Issue 2, pp 175–210

Inducing Multi-Level Association Rules from Multiple Relations

  • Francesca A. Lisi
  • Donato Malerba
Article

Abstract

Recently there has been growing interest both in extending ILP to description logics and in applying it to knowledge discovery in databases. In this paper we present a novel approach to association rule mining which deals with multiple levels of description granularity. It relies on the hybrid language \(\mathcal{A}\mathcal{L}\)-log, which allows a unified treatment of both the relational and structural features of data. A generality order and a downward refinement operator for \(\mathcal{A}\mathcal{L}\)-log pattern spaces are defined on the basis of query subsumption. This framework has been implemented in SPADA, an ILP system for mining multi-level association rules from spatial data. As an illustrative example, we report experimental results obtained by running the new version of SPADA on geo-referenced census data of Manchester Stockport.

inductive logic programming, description logics, spatial data mining

Copyright information

© Kluwer Academic Publishers 2004
