Text Comparison Using Soft Cardinality

  • Sergio Jimenez
  • Fabio Gonzalez
  • Alexander Gelbukh
Conference paper

DOI: 10.1007/978-3-642-16321-0_31

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6393)
Cite this paper as:
Jimenez S., Gonzalez F., Gelbukh A. (2010) Text Comparison Using Soft Cardinality. In: Chavez E., Lonardi S. (eds) String Processing and Information Retrieval. SPIRE 2010. Lecture Notes in Computer Science, vol 6393. Springer, Berlin, Heidelberg

Abstract

Classical set theory provides a method for comparing objects using cardinality and intersection, in combination with well-known resemblance coefficients such as Dice, Jaccard, and cosine. However, set operations are intrinsically crisp: they do not take into account similarities between elements. We propose a new general-purpose method for comparing objects using a soft cardinality function that takes the similarity between elements into account via an auxiliary affinity (similarity) measure. Our experiments with 12 text matching datasets suggest that the soft cardinality method is superior to known approximate string comparison methods in text comparison tasks.
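The abstract does not state the soft cardinality formula, so the following is only a minimal sketch under stated assumptions. It assumes a commonly cited formulation in which each element contributes the reciprocal of its summed affinity (raised to a power p) to all elements of the set, and it derives the soft intersection from the soft cardinalities of the two sets and their union. The names `affinity`, `soft_cardinality`, `soft_jaccard`, and the default `p=2.0` are hypothetical, and the affinity measure here (Python's `difflib.SequenceMatcher` ratio) is a stand-in, not necessarily the measure used in the paper.

```python
from difflib import SequenceMatcher


def affinity(a: str, b: str) -> float:
    """Hypothetical auxiliary similarity in [0, 1]; 1.0 for identical strings."""
    return SequenceMatcher(None, a, b).ratio()


def soft_cardinality(elements, sim=affinity, p=2.0):
    """Assumed formulation: sum over elements of 1 / sum_j sim(a_i, a_j)**p.

    With a crisp similarity (1 if equal, else 0) this reduces to the
    classical cardinality of the set.
    """
    items = list(set(elements))
    return sum(1.0 / sum(sim(a, b) ** p for b in items) for a in items)


def soft_jaccard(x, y, sim=affinity, p=2.0):
    """Jaccard coefficient with soft cardinalities in place of crisp ones."""
    a, b = set(x), set(y)
    card_a = soft_cardinality(a, sim, p)
    card_b = soft_cardinality(b, sim, p)
    card_union = soft_cardinality(a | b, sim, p)
    # Soft intersection inferred from |A| + |B| - |A union B|.
    card_inter = card_a + card_b - card_union
    return card_inter / card_union if card_union else 0.0


if __name__ == "__main__":
    # Near-duplicate tokens count as less than two distinct elements.
    print(soft_cardinality(["color", "colour"]))
    print(soft_jaccard("the colour red".split(), "the color red".split()))
```

In this sketch, a crisp similarity recovers ordinary set cardinality, while a graded similarity lets near-identical tokens such as "color" and "colour" count as little more than one element, which is the behaviour the abstract contrasts with crisp set operations.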

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sergio Jimenez (1)
  • Fabio Gonzalez (1)
  • Alexander Gelbukh (2)

  1. National University of Colombia
  2. CIC-IPN, Mexico
