Abstract
Many algorithms in machine learning, knowledge discovery, pattern recognition and classification are based on estimating the similarity or the distance between the analysed objects. Objects with higher structural complexity often cannot be described by feature vectors without losing important structural information. Such objects can be represented adequately in the language of logic or by labeled graphs, but the similarity of these descriptions is difficult to define and to compute. In this paper, a connectionist approach for determining the similarity of arbitrary labeled graphs is introduced. Using an example from organic chemistry, the application of the approach within one distance-based and one generalisation-based classification algorithm is demonstrated. The generalisation-based algorithm forms clusters or subclasses of similar examples of the same class and extracts the parts of the objects which determine the class of the object. The algorithms perform very satisfactorily in comparison with recent logical and feature vector approaches. Moreover, because they handle structural data directly, the algorithms need only a subset of the given features of the objects for classification.
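The idea of distance-based classification over labeled graphs can be illustrated with a minimal sketch. It is not the connectionist method of the paper: the similarity below is a brute-force search over node mappings, usable only for very small graphs, and it stands in for whatever matcher actually supplies the similarity values. The graph encoding, the function names `graph`, `similarity` and `classify`, the normalisation, and the toy class names are all assumptions made for illustration.

```python
from itertools import permutations


def graph(labels, bonds):
    """Hypothetical encoding: (node -> label dict, set of undirected edges)."""
    return labels, {frozenset(b) for b in bonds}


def similarity(g1, g2):
    """Brute-force structural similarity in [0, 1] for tiny labeled graphs.

    Counts label-matching nodes plus edges preserved by the best injective
    node mapping, normalised by the size of the larger graph.
    This is a stand-in, not the paper's connectionist matcher.
    """
    (l1, e1), (l2, e2) = g1, g2
    if len(l1) > len(l2):  # always map the graph with fewer nodes into the other
        (l1, e1), (l2, e2) = (l2, e2), (l1, e1)
    n1, n2 = list(l1), list(l2)
    best = 0
    for image in permutations(n2, len(n1)):
        # keep only the node pairs whose labels agree
        m = {u: v for u, v in zip(n1, image) if l1[u] == l2[v]}
        # an edge is preserved if both ends are matched and the image is an edge
        common = sum(
            1 for e in e1
            if all(u in m for u in e) and frozenset(m[u] for u in e) in e2
        )
        best = max(best, len(m) + common)
    size = max(len(l1) + len(e1), len(l2) + len(e2))
    return best / size if size else 1.0


def classify(query, training):
    """1-nearest-neighbour: return the class of the most similar example."""
    return max(training, key=lambda ex: similarity(query, ex[0]))[1]


# Toy fragments with atom labels; the class names are made up.
co = graph({1: "C", 2: "O"}, [(1, 2)])
cn = graph({1: "C", 2: "N"}, [(1, 2)])
query = graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])

training = [(co, "oxygenated"), (cn, "nitrogenous")]
predicted = classify(query, training)
```

The exhaustive search grows factorially with the number of nodes, which is the kind of cost that motivates approximate matchers such as the connectionist approach the abstract refers to.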
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schädler, K., Wysotzki, F. (1997). A connectionist approach to structural similarity determination as a basis of clustering, classification and feature detection. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_124
DOI: https://doi.org/10.1007/3-540-63223-9_124
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2
eBook Packages: Springer Book Archive