WEB Information Retrieval Models

  • Reference work entry

Synonyms

Web search engines

Definition

The Web can be considered a large-scale document collection to which classical text retrieval techniques can be applied. However, its unique features and structure offer new sources of evidence that can be used to enhance the effectiveness of Information Retrieval (IR) systems. Generally, Web IR examines the combination of evidence from both the textual content of documents and the structure of the Web, as well as the search behavior of users and issues related to the evaluation of retrieval effectiveness in the Web setting.

Web Information Retrieval models are ways of integrating many sources of evidence about documents, such as the links, the structure, the actual content, and the quality of a document, so that an effective Web search engine can be achieved. In contrast with the traditional library-type settings of IR systems, the Web is a hostile environment, where Web search engines have to deal with...
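As an illustration of what integrating several sources of evidence can look like, the following minimal Python sketch combines a toy content score over two document fields with a query-independent link prior. It is not the method of any particular model covered by this entry: the `Doc` fields, the term-overlap scoring, the log-scaled in-link prior, and the weights `w_body`, `w_title`, and `w_link` are all assumptions chosen for readability.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    body: str     # textual content of the page
    title: str    # one structural field of the page
    inlinks: int  # number of incoming hyperlinks (link evidence)


def field_score(query_terms, text):
    """Toy content evidence: fraction of query terms found in the text."""
    tokens = set(text.lower().split())
    if not query_terms:
        return 0.0
    return sum(1 for t in query_terms if t in tokens) / len(query_terms)


def combined_score(query, doc, w_body=1.0, w_title=2.0, w_link=0.5):
    """Linear combination of field-weighted content evidence and a link prior.

    The weights are hypothetical; a real system would tune them on
    training queries.
    """
    terms = query.lower().split()
    content = (w_body * field_score(terms, doc.body)
               + w_title * field_score(terms, doc.title))
    if content == 0.0:
        # Without any content match, link evidence alone must not
        # make a document retrievable.
        return 0.0
    prior = math.log1p(doc.inlinks)  # dampen very large in-link counts
    return content + w_link * prior


def rank(query, docs):
    """Return the documents sorted by decreasing combined score."""
    return sorted(docs, key=lambda d: combined_score(query, d), reverse=True)


docs = [
    Doc("c", "cooking recipes", "recipes", 5000),
    Doc("b", "a blog post about glasgow university", "my blog", 2),
    Doc("a", "glasgow university home page", "University of Glasgow", 1000),
]
ranking = [d.doc_id for d in rank("glasgow university", docs)]
```

In this sketch the heavily linked page `c` still ranks last for the query because it matches no query term, while `a` outranks `b` thanks to its title match and its larger number of in-links.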

Author information

Corresponding author

Correspondence to Craig MacDonald.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

MacDonald, C., Ounis, I. (2018). WEB Information Retrieval Models. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_928
