Journal of Intelligent Information Systems, Volume 5, Issue 3, pp 229–248

Knowledge discovery from structural data

  • Diane J. Cook
  • Lawrence B. Holder
  • Surnjani Djoko


Discovering repetitive substructure in a structural database improves the ability to interpret and compress the data. This paper describes the Subdue system, which uses domain-independent and domain-dependent heuristics to find interesting and repetitive structures in structural data. This substructure discovery technique can be used to discover fuzzy concepts, compress the data description, and formulate hierarchical substructure definitions. Examples from the domains of scene analysis, chemical compound analysis, computer-aided design, and program analysis demonstrate the benefits of the discovery technique.
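The link between repetition and compression can be illustrated with a small sketch. It is not the Subdue algorithm: it assumes a toy labeled graph, considers only one-edge patterns, matches them exactly rather than inexactly, and measures description size crudely as the number of vertices plus edges. All names (`most_frequent_pattern`, `compress`, the scene labels) are hypothetical and chosen only for illustration.

```python
from collections import Counter

def size(labels, edges):
    """Crude description size (an assumption): vertex count plus edge count."""
    return len(labels) + len(edges)

def most_frequent_pattern(labels, edges):
    """Return the most common (source label, relation, target label) and its count."""
    counts = Counter((labels[u], rel, labels[v]) for u, rel, v in edges)
    return counts.most_common(1)[0]

def compress(labels, edges, pattern, name="SUB"):
    """Replace non-overlapping instances of a one-edge pattern with single vertices."""
    used, replaced, instance_edges = set(), {}, set()
    for i, (u, rel, v) in enumerate(edges):
        is_match = (labels[u], rel, labels[v]) == pattern
        if is_match and u != v and u not in used and v not in used:
            new_id = f"{name}{len(instance_edges)}"
            used.update((u, v))
            replaced[u] = replaced[v] = new_id
            instance_edges.add(i)
    # Vertices outside any instance survive; each instance becomes one new vertex.
    new_labels = {v: lab for v, lab in labels.items() if v not in used}
    new_labels.update({new_id: name for new_id in replaced.values()})
    # Edges inside an instance disappear; all others are re-pointed at the new vertices.
    new_edges = [(replaced.get(u, u), rel, replaced.get(v, v))
                 for i, (u, rel, v) in enumerate(edges) if i not in instance_edges]
    return new_labels, new_edges

# Toy scene (hypothetical data): three triangles on squares, plus one circle.
labels = {"t1": "triangle", "s1": "square", "t2": "triangle", "s2": "square",
          "t3": "triangle", "s3": "square", "c": "circle"}
edges = [("t1", "on", "s1"), ("t2", "on", "s2"), ("t3", "on", "s3"),
         ("s1", "left_of", "s2"), ("s2", "left_of", "s3"), ("c", "on", "t3")]

pattern, count = most_frequent_pattern(labels, edges)
new_labels, new_edges = compress(labels, edges, pattern)
original_size = size(labels, edges)
# Charge for the substructure definition itself: two vertices and one edge.
compressed_size = size(new_labels, new_edges) + 3
```

In this toy scene the repeated "triangle on square" pattern is defined once and each of its three instances collapses to a single vertex, so the graph plus the definition is smaller than the original graph. A fuller treatment would search over larger substructures and tolerate small variations between instances.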


machine discovery, data mining, data compression, inexact graph match, scene analysis, chemical analysis





Copyright information

© Kluwer Academic Publishers 1995

Authors and Affiliations

  • Diane J. Cook (1)
  • Lawrence B. Holder (1)
  • Surnjani Djoko (1)
  1. Department of Computer Science Engineering, University of Texas at Arlington, USA
