Distance and Similarity Measures

  • Reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Comparison criteria

Glossary

BCE:

Before Common Era

Binary Data:

Data that take only two possible values, such as “yes” or “no” answers to a question

Compact Space:

An abstract mathematical (topological) space that has the property of compactness, that is, every open cover of the space has a finite subcover

Contingency Table:

A table with r rows and c columns that gives the joint frequency distribution of observations classified by two criteria

Cosine:

A trigonometric function. For a given angle in a right triangle, it is equal to the length of the side adjacent to the angle divided by the length of the hypotenuse

Cosine Similarity:

A measure of similarity between two vectors expressed in terms of the cosine of the angle between the vectors

Distance Metric:

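A function that defines a distance between the elements of a set; it is non-negative, equals zero only for identical elements, is symmetric, and satisfies the triangle inequality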

Earth Mover's Distance:

A measure of distance between two probability distributions, equal to the minimum cost of turning one pile of dirt into the other when each distribution is viewed as a pile of dirt

Edit Distance:

The number of operations required to transform one string of characters...
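Two of the glossary entries above, cosine similarity and Earth Mover's Distance, can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the entry's own method: the function names cosine_similarity and earth_movers_distance_1d are hypothetical, and the Earth Mover's Distance is computed only for the special case of two histograms defined on the same equally spaced one-dimensional bins, with equal total mass and a ground distance of one unit between adjacent bins.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def earth_movers_distance_1d(p, q):
    """Earth Mover's Distance for the 1-D special case (assumption:
    same bins, unit spacing between adjacent bins, equal total mass)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if not math.isclose(sum(p), sum(q)):
        raise ValueError("histograms must have equal total mass")
    carried = 0.0  # surplus "dirt" that must be moved on to the next bin
    cost = 0.0     # total mass-times-distance moved so far
    for a, b in zip(p, q):
        carried += a - b
        cost += abs(carried)
    return cost


if __name__ == "__main__":
    print(cosine_similarity([1, 0], [0, 1]))
    print(earth_movers_distance_1d([1, 0, 0], [0, 0, 1]))
```

In the one-dimensional case the minimum-cost transport reduces to a running sum of the surplus carried from bin to bin, which is why no optimization step is needed here; the general case requires solving a transportation problem.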



© 2014 Springer Science+Business Media New York

Cite this entry

Mulekar, M.S., Brown, C.S. (2014). Distance and Similarity Measures. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6170-8_141
