A Language-Based Similarity Measure

Martin, Lionel; Moal, Frédéric

doi:10.1007/3-540-44795-4_29


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2167)

Included in the following conference series:

European Conference on Machine Learning


Abstract

This paper presents a unified framework for defining similarity measures across various formalisms (attribute-value, first-order logic, etc.). The underlying idea is that the similarity between two objects depends not only on their attribute values but, more importantly, on the set of potentially relevant concept definitions for the problem considered.

In our framework, the user defines a language, by means of a grammar, to specify the similarity measure. Each term of the language represents a property of the objects. The similarity between two objects is the probability that a property of the given language is either satisfied by both objects or rejected by both. When this probability cannot be computed exactly, we use a stochastic generation procedure to approximate it.
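As an illustration only, the stochastic approximation can be sketched as a Monte Carlo estimate: draw random terms of the language and count how often the two objects agree on them. The toy language below (conjunctions of one or two attribute-value equality tests), the attribute names, and the uniform sampling scheme are all assumptions made for this sketch; the grammar and generation procedure used in the paper may differ.

```python
import random

# Hypothetical attribute-value domain, chosen only for illustration.
ATTRIBUTES = {
    "color": ["red", "green", "blue"],
    "shape": ["circle", "square"],
    "size": ["small", "large"],
}


def random_property(rng):
    """Draw one term of the toy language: a conjunction of 1 or 2
    tests of the form attribute == value (assumed uniform sampling)."""
    n_tests = rng.choice([1, 2])
    attrs = rng.sample(sorted(ATTRIBUTES), n_tests)
    return [(attr, rng.choice(ATTRIBUTES[attr])) for attr in attrs]


def satisfies(obj, prop):
    """An object satisfies a property if it passes every test in it."""
    return all(obj[attr] == value for attr, value in prop)


def estimate_similarity(x, y, n_samples=10_000, seed=0):
    """Estimate the probability that a randomly generated property is
    satisfied by both objects or rejected by both."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        prop = random_property(rng)
        if satisfies(x, prop) == satisfies(y, prop):
            agree += 1
    return agree / n_samples


a = {"color": "red", "shape": "circle", "size": "small"}
b = {"color": "red", "shape": "circle", "size": "large"}
c = {"color": "blue", "shape": "square", "size": "large"}

sim_ab = estimate_similarity(a, b)
sim_ac = estimate_similarity(a, c)
```

In this sketch, `a` and `b` differ on one attribute while `a` and `c` differ on all three, so `sim_ab` comes out higher than `sim_ac`. Changing the language (for example, allowing longer conjunctions) changes which properties are generated and therefore changes the measure itself.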

This measure can be applied to both clustering and classification tasks. An empirical evaluation on common classification problems shows very good accuracy.



Author information

Authors and Affiliations

LIFO - Université d’Orléans, rue Léonard de Vinci, BP 6759, 45067, Orleans cedex 2, FRANCE
Lionel Martin & Frédéric Moal


Editor information

Editors and Affiliations

Department of Computer Science, Albert-Ludwigs University Freiburg, Georges Köhler-Allee, Geb. 079, 79110, Freiburg, Germany
Luc De Raedt
Department of Computer Science, University of Bristol, Merchant Venturers Bldg., Woodland Road, Bristol, BS8 1UB, UK
Peter Flach


Cite this paper

Martin, L., Moal, F. (2001). A Language-Based Similarity Measure. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_29


DOI: https://doi.org/10.1007/3-540-44795-4_29
Published: 30 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5
eBook Packages: Springer Book Archive
