The term similarity measure refers to a function used to compare objects of any type, such as data structures, database records, or multimedia objects (audio, video, etc.). A similarity measure takes two objects as input and, in general, returns a number between 0 and 1: 0 means that the objects are completely dissimilar, and 1 means that they are identical. Similarity is inversely related to distance: the more similar two objects are, the smaller the distance between them. In particular, a similarity of 1 implies a distance of 0 between two objects.
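As a minimal sketch of the definition above, the following Python code shows one possible similarity measure and one possible way of turning a distance into a similarity. Both choices are assumptions made for illustration and are not prescribed by this entry: the Jaccard coefficient on sets is used as the example measure, and the mapping 1 / (1 + d) is used as the example conversion. The function names are hypothetical.

```python
def jaccard_similarity(a, b):
    """Illustrative similarity measure on two collections, returning a value in [0, 1].

    Assumption: the objects are compared as sets of hashable items.
    """
    a, b = set(a), set(b)
    if not a and not b:
        # Convention chosen here: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def distance_to_similarity(d):
    """Map a non-negative distance to a similarity in (0, 1].

    Assumption: 1 / (1 + d) is only one of several possible mappings;
    it sends distance 0 to similarity 1 and decreases as d grows.
    """
    if d < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + d)


if __name__ == "__main__":
    print(jaccard_similarity({1, 2, 3}, {1, 2, 3}))  # identical sets
    print(jaccard_similarity({1, 2}, {3, 4}))        # disjoint sets
    print(distance_to_similarity(0.0))               # zero distance
```

Identical sets receive the maximum similarity and disjoint sets the minimum, which matches the endpoints described in the text. Other measures and other distance-to-similarity mappings satisfy the same endpoints.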
Motivation and Background
Similarity measures are typically used for quantifying the affinity between objects in search operations, where the user presents an object (query) and requests other objects “similar” to the given query. Therefore, a similarity measure is a mathematical abstraction for comparing objects, assigning a single...
- Agrawal, R., Faloutsos, C., & Swami, A. (1993). Efficient similarity search in sequence databases. In Proceedings of foundations of data organization and algorithms (FODO), (pp. 69–84). Chicago, Illinois, USA.
- Keogh, E., Lonardi, S., & Ratanamahatana, A. (2004). Towards parameter-free data mining. Proceedings of International Conference on Knowledge Discovery and Data Mining (SIGKDD) (pp. 206–215). Seattle, Washington, USA.
- Zezula, P., Amato, G., Dohnal, V., & Batko, M. (2005). Similarity search: the metric approach. Advances in Database Systems, Springer.