Search Engines: Applications of ML
Search engines provide users with Internet resources – links to web sites, documents, text snippets, images, videos, etc. – in response to queries. They use techniques from the field of information retrieval and rely on statistical and pattern-matching methods. Search engines have to take into account several key aspects and requirements of this specific instance of the information retrieval problem. First, they have to process hundreds of millions of searches a day and answer each query in a matter of milliseconds. Second, the resources on the World Wide Web are constantly updated, with information being continuously added, removed, or changed – the overall contents changing by up to 8% a week – in a pool consisting of billions of documents. Third, users express possibly semantically complex queries in a language with limited expressive power, and often do not make use, or proper use, of the available syntactic features of that language – for...