Metaphone-pt_BR: The Phonetic Importance on Search and Correction of Textual Information

  • Carlos C. Jordão
  • João Luís G. Rosa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7182)

Abstract

The increasing automation of communication among systems produces a volume of information that exceeds the human capacity to handle it in time. Mechanisms are required to detect inconsistent information and to support decision-making. A phonetic algorithm (Metaphone) adapted to Brazilian Portuguese proved to be a valuable tool for searching name and address fields in support of automatic decisions, substantially improving on the information retrieval performance obtainable with regular database queries.

Keywords

Lower Case Letter; United States Census Bureau; Human Language Technology; Word Cluster; Automatic Decision
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Carlos C. Jordão (1)
  • João Luís G. Rosa (2)
  1. Department of Information Technology, São Carlos City Hall, São Carlos, Brazil
  2. Computer Science Department, University of São Paulo, São Carlos, Brazil
