Abstract
In this chapter we focus on an important type of query understanding: query spelling correction, especially for web search queries. Queries issued by web search engine users often contain errors and misused words or phrases. Although a user might have a clear intent in mind, inferring that intent becomes difficult because of edit errors or a vocabulary gap between the user's ideal query and the query actually issued to the search engine. Query spelling correction is therefore a crucial component of modern search engines, and its performance affects all other parts of the search engine. We first introduce early work on query spelling correction based on edit distance. We then discuss the noisy channel model approach to the problem. After that, we introduce modern approaches to a more complex and realistic problem setup that involves multiple types of spelling errors. Finally, we summarize the other components needed to support a modern large-scale query spelling correction system.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Li, Y. (2020). Query Spelling Correction. In: Chang, Y., Deng, H. (eds) Query Understanding for Search Engines. The Information Retrieval Series, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-030-58334-7_5
DOI: https://doi.org/10.1007/978-3-030-58334-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58333-0
Online ISBN: 978-3-030-58334-7
eBook Packages: Computer Science (R0)