Result Aggregation for Knowledge-Intensive Multicultural Name Matching

  • Keith J. Miller
  • Mark Arehart
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5603)


In this paper, we describe a metasearch tool resulting from experiments in aggregating the results of different name matching algorithms on a knowledge-intensive multicultural name matching task. Three retrieval engines that match romanized names were tested on a noisy and predominantly Arabic dataset. One is based on a generic string matching algorithm; another is designed specifically for Arabic names; and the third makes use of culturally-specific matching strategies for multiple cultures. We show that even a relatively naïve method for aggregating results significantly increases effectiveness over each of the individual algorithms, nearly tripling the F-score of the worst-performing algorithm included in the aggregate and improving on the F-score of the single best-performing algorithm by 6 points.
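The abstract does not say which aggregation method was used, so the following is only an illustrative sketch of one naive way to combine three engines' results: keep a candidate name when at least two engines return it, then score the result with the balanced F-score. The function names, the voting threshold, and all example names and result sets are hypothetical and are not taken from the paper.

```python
from collections import Counter


def aggregate_by_vote(result_sets, min_votes=2):
    """Keep candidates returned by at least `min_votes` engines (hypothetical rule)."""
    votes = Counter(name for results in result_sets for name in set(results))
    return {name for name, count in votes.items() if count >= min_votes}


def f_score(returned, relevant):
    """Balanced F-score: harmonic mean of precision and recall."""
    true_positives = len(returned & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(returned)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Made-up result sets for a single query; not data from the paper.
generic = {"Mohammed Al-Rashid", "Muhammad Rashed", "Michael Rush"}
arabic = {"Muhammad Rashed", "Mohamed Rachid", "Mahmoud Rashad"}
multicultural = {"Mohammed Al-Rashid", "Mohamed Rachid"}
relevant = {"Mohammed Al-Rashid", "Muhammad Rashed", "Mohamed Rachid"}

combined = aggregate_by_vote([generic, arabic, multicultural])
print(f_score(generic, relevant), f_score(combined, relevant))
```

In this toy case each engine misses or adds a name on its own, while the vote recovers all three relevant names. It shows only the general idea that agreement between engines can filter out individual errors; it makes no claim about the paper's reported figures.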


Keywords: Information Retrieval, Name Matching, System Combination





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Keith J. Miller (1)
  • Mark Arehart (1)

  1. MITRE Corporation, McLean, USA
