Result Aggregation for Knowledge-Intensive Multicultural Name Matching
In this paper, we describe a metasearch tool that grew out of experiments in aggregating the results of different name matching algorithms on a knowledge-intensive multicultural name matching task. Three retrieval engines that match romanized names were tested on a noisy, predominantly Arabic dataset: one is based on a generic string matching algorithm, another is designed specifically for Arabic names, and the third uses culturally specific matching strategies for multiple cultures. We show that even a relatively naïve method for aggregating results significantly increases effectiveness over each of the individual algorithms: it nearly triples the F-score of the worst-performing algorithm included in the aggregate and improves on the F-score of the single best-performing algorithm by 6 points.
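To make the idea of aggregating engine output concrete, the following is a minimal sketch of one simple way to combine scored results from several name matchers and evaluate the result. The abstract does not specify the aggregation method, so this is not the authors' procedure: the score-averaging rule, the min-max normalization, the 0.5 threshold, the function names, and the sample names and scores are all illustrative assumptions. The F-score is assumed here to be the balanced harmonic mean of precision and recall.

```python
from collections import defaultdict


def normalize(scores):
    """Min-max normalize one engine's scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tied: treat them as equally strong matches.
        return {name: 1.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def aggregate(engine_results, threshold=0.5):
    """Average normalized scores across engines (hypothetical rule).

    A candidate that an engine did not return contributes 0 for that
    engine. Candidates whose average reaches `threshold` are kept.
    """
    totals = defaultdict(float)
    for scores in engine_results:
        for name, s in normalize(scores).items():
            totals[name] += s
    n = len(engine_results)
    return {name for name, total in totals.items() if total / n >= threshold}


def f_score(returned, relevant):
    """Balanced F-score: harmonic mean of precision and recall."""
    tp = len(returned & relevant)
    if tp == 0:
        return 0.0
    precision = tp / len(returned)
    recall = tp / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Invented scores for one query from three hypothetical engines.
generic = {"Mohammed Al-Rashid": 0.9, "Muhammad Alrashid": 0.6, "Mohamed Rashad": 0.4}
arabic = {"Muhammad Alrashid": 0.95, "Mohammed Al-Rashid": 0.9, "Mahmoud Rashidi": 0.2}
multicultural = {"Mohammed Al-Rashid": 0.8, "Muhammad Alrashid": 0.7, "Mohamed Rashad": 0.3}

matches = aggregate([generic, arabic, multicultural])
relevant = {"Mohammed Al-Rashid", "Muhammad Alrashid"}
print(sorted(matches), f_score(matches, relevant))
```

In this toy case the generic matcher alone ranks the second true variant only weakly, while the averaged score keeps both true variants and drops the two distractors. Real engines would need score calibration that this sketch glosses over.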
Keywords: Information Retrieval, Name Matching, System Combination