Abstract
This paper presents a unified framework for defining similarity measures across various formalisms (attribute-value, first-order logic, ...). The underlying idea is that the similarity between two objects depends not only on their attribute values but, more importantly, on the set of concept definitions that are potentially relevant to the problem considered.
In our framework, the user defines a language, by means of a grammar, to specify the similarity measure. Each term of the language represents a property of the objects. The similarity between two objects is the probability that the two objects either both satisfy or both reject a property of the given language. When this probability cannot be computed, we approximate it with a stochastic generation procedure.
This measure can be applied to both clustering and classification tasks. An empirical evaluation on common classification problems shows very good accuracy.
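The stochastic approximation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy language (threshold tests of the form "attribute <= value" on numeric attributes), the uniform sampling of terms, and the names make_sampler and estimate_similarity are all assumptions made for illustration. The sketch draws random properties from the language and reports the fraction on which the two objects agree.

```python
import random
from typing import Callable, Dict, Tuple

Obj = Dict[str, float]              # an attribute-value object
Property = Callable[[Obj], bool]    # a term of the language, seen as a test


def make_sampler(ranges: Dict[str, Tuple[float, float]]) -> Callable[[random.Random], Property]:
    """Build a generator of random properties for a toy language (assumption).

    Each generated property has the form `obj[attr] <= threshold`, with the
    attribute and the threshold drawn uniformly.
    """
    attributes = sorted(ranges)

    def sample(rng: random.Random) -> Property:
        attr = rng.choice(attributes)
        low, high = ranges[attr]
        threshold = rng.uniform(low, high)
        return lambda obj: obj[attr] <= threshold

    return sample


def estimate_similarity(x: Obj, y: Obj,
                        sampler: Callable[[random.Random], Property],
                        n_samples: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(property holds for both x and y, or for neither)."""
    rng = random.Random(seed)
    agreements = 0
    for _ in range(n_samples):
        prop = sampler(rng)
        if prop(x) == prop(y):      # both satisfy or both reject
            agreements += 1
    return agreements / n_samples


if __name__ == "__main__":
    sampler = make_sampler({"a": (0.0, 1.0)})
    print(estimate_similarity({"a": 0.25}, {"a": 0.75}, sampler))
```

In this toy language the two objects disagree exactly when the sampled threshold falls between their values, so the estimate approaches one minus the normalized gap between them. A richer grammar would change only the sampler, not the estimation loop.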
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martin, L., Moal, F. (2001). A Language-Based Similarity Measure. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_29
DOI: https://doi.org/10.1007/3-540-44795-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5
eBook Packages: Springer Book Archive