Query Spelling Correction

Query Understanding for Search Engines

Part of the book series: The Information Retrieval Series ((INRE,volume 46))

Abstract

In this chapter we focus on an important type of query understanding: query spelling correction, especially for web search queries. Queries issued by web search engine users often contain spelling errors and misused words or phrases. Although a user may have a clear intent in mind, inferring that intent becomes difficult because of edit errors or a vocabulary gap between the user's ideal query and the query actually issued to the search engine. Query spelling correction is therefore a crucial component of modern search engines, and its performance affects all other parts of the search engine. We first introduce early work on query spelling correction based on edit distance. We then discuss the noisy channel model approach to the problem. After that, we introduce modern approaches to a more complex and realistic problem setup that involves multiple types of spelling errors. Finally, we summarize the other components needed to support a modern large-scale query spelling correction system.
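
To make the two classical ideas named in the abstract concrete, here is a minimal sketch, not taken from the chapter itself. It computes Levenshtein edit distance with the standard dynamic program and then ranks candidate corrections in a noisy-channel style, combining a prior over candidates with a crude error model. The vocabulary `TOY_COUNTS`, the function names, and the `error_penalty` parameter are all hypothetical and chosen only for illustration. A real system would estimate both models from query logs.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical toy counts standing in for statistics from a real query log.
TOY_COUNTS = {"weather": 50, "whether": 30, "leather": 5, "feather": 5}


def correct(word: str, counts=TOY_COUNTS, max_dist: int = 2,
            error_penalty: float = 3.0) -> str:
    """Return the candidate c maximizing log P(c) + log P(word | c).

    Assumption: P(word | c) is taken to be proportional to
    exp(-error_penalty * edit_distance(word, c)), a deliberately crude
    stand-in for a learned error model.
    """
    total = sum(counts.values())
    best, best_score = word, -math.inf
    for cand, n in counts.items():
        d = edit_distance(word, cand)
        if d > max_dist:
            continue  # too far away to be a plausible correction
        score = math.log(n / total) - error_penalty * d
        if score > best_score:
            best, best_score = cand, score
    return best  # falls back to the input when no candidate is close enough
```

With these toy counts, `correct("wether")` picks `"weather"` over `"whether"`: both are one edit away, so the higher prior decides. This is the kind of trade-off the noisy channel formulation is meant to capture.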


Author information

Corresponding author

Correspondence to Yanen Li.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Li, Y. (2020). Query Spelling Correction. In: Chang, Y., Deng, H. (eds) Query Understanding for Search Engines. The Information Retrieval Series, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-030-58334-7_5

  • DOI: https://doi.org/10.1007/978-3-030-58334-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58333-0

  • Online ISBN: 978-3-030-58334-7

  • eBook Packages: Computer Science, Computer Science (R0)
