WEB Information Retrieval Models
Web search engines
The Web can be considered a large-scale document collection to which classical text retrieval techniques can be applied. However, its unique features and structure offer new sources of evidence that can be used to enhance the effectiveness of Information Retrieval (IR) systems. Generally, Web IR examines how to combine evidence from both the textual content of documents and the structure of the Web. It also covers the search behavior of users and issues related to the evaluation of retrieval effectiveness in the Web setting.
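As a rough illustration of combining textual and structural evidence, the following minimal Python sketch mixes a toy content score with a toy link score. Everything in it is an assumption made for illustration: the function names, the log-scaled term-frequency and in-link counts, and the `weight` parameter are hypothetical and do not correspond to any specific model from the literature.

```python
import math
from collections import Counter


def content_score(query_terms, doc_text):
    """Toy textual evidence: sum of log-scaled query term frequencies."""
    counts = Counter(doc_text.lower().split())
    return sum(math.log1p(counts[t.lower()]) for t in query_terms)


def link_score(doc_id, inlinks):
    """Toy structural evidence: log-scaled number of incoming links."""
    return math.log1p(len(inlinks.get(doc_id, ())))


def combined_score(query_terms, doc_id, docs, inlinks, weight=0.5):
    """Linear mix of both scores; `weight` is an assumed tuning parameter."""
    text_part = content_score(query_terms, docs[doc_id])
    link_part = link_score(doc_id, inlinks)
    return text_part + weight * link_part


def rank(query_terms, docs, inlinks, weight=0.5):
    """Return document ids ordered by decreasing combined score."""
    return sorted(
        docs,
        key=lambda d: combined_score(query_terms, d, docs, inlinks, weight),
        reverse=True,
    )


# Two documents with identical text; only "b" has incoming links.
docs = {"a": "web search engine", "b": "web search engine"}
inlinks = {"b": ["a", "c"]}
ranking = rank(["web"], docs, inlinks)
```

With identical text in both documents, the link evidence alone decides the order, so the linked document is ranked first. Setting `weight=0.0` reduces the sketch to content-only ranking.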
Web Information Retrieval models are ways of integrating many sources of evidence about documents, such as their links, structure, actual content, and quality, so that an effective Web search engine can be achieved. In contrast with the traditional library-type settings of IR systems, the Web is a hostile environment, where Web search engines have to
- Reference Work Title: Encyclopedia of Database Systems
- Pages: 3479-3482
- Publisher: Springer US
- Copyright Holder: Springer US
- Editor Affiliations
- 1. College of Computing, Georgia Institute of Technology
- 2. Database Research Group David R. Cheriton School of Computer Science, University of Waterloo
- Author Affiliations
- 1. University of Glasgow, Glasgow, UK