Generalized Mongue-Elkan Method for Approximate Text String Comparison
- Cite this paper as:
- Jimenez S., Becerra C., Gelbukh A., Gonzalez F. (2009) Generalized Mongue-Elkan Method for Approximate Text String Comparison. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg
The Monge-Elkan method is a general text string comparison method that combines an internal character-based similarity measure (e.g., edit distance) with a token-level (i.e., word-level) similarity measure. We propose a generalization of this method that replaces the simple average in the Monge-Elkan expression with the generalized arithmetic mean. Experiments on 12 well-known name-matching data sets show that the proposed approach outperforms the original Monge-Elkan method when character-based measures are used to compare tokens.
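The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each token of the first string is matched to its most similar token in the second string, and that the best-match scores are then combined with a generalized (power) mean. The exponent name `m`, the function names, whitespace tokenization, and the normalization of edit distance by the longer token's length are all assumptions made for this example. With `m = 1` the power mean reduces to the simple average of the original method.

```python
def levenshtein(s, t):
    """Edit distance between s and t (unit-cost insert, delete, substitute)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(s, t):
    """Assumed normalization: 1 - distance / length of the longer token."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(s, t) / longest


def generalized_monge_elkan(a, b, sim=edit_similarity, m=2.0):
    """Power mean (exponent m > 0) of each a-token's best match among b-tokens.

    m = 1 gives the simple average; larger m gives more weight to
    well-matched tokens. The exponent name `m` is an assumption of this sketch.
    """
    if m <= 0:
        raise ValueError("this sketch only supports m > 0")
    tokens_a = a.split()
    tokens_b = b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(sim(x, y) for y in tokens_b) for x in tokens_a]
    return (sum(v ** m for v in best) / len(best)) ** (1.0 / m)


if __name__ == "__main__":
    for m in (1, 2, 5):
        score = generalized_monge_elkan("peter christen", "christian pedro", m=m)
        print(m, round(score, 4))
```

Note that the score is not symmetric in this form: swapping the two strings changes which tokens are averaged over, so the result can differ.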