RANDOM 2001, APPROX 2001: Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques, pp 6-6
Web Search via Hub Synthesis
Abstract
We present a probabilistic generative model for web search that captures, in a unified manner, three critical components of web search: how the link structure of the web is generated, how the content of a web document is generated, and how a human searcher generates a query. The key to this unification lies in capturing the correlations among these components in terms of proximity in latent semantic space. Given such a combined model, the correct answer to a search query is well defined, and thus it becomes possible to evaluate web search algorithms rigorously. We present a new web search algorithm, based on spectral techniques, and prove that it is guaranteed to produce an approximately correct answer in our model. The algorithm assumes no knowledge of the model and is well defined regardless of the accuracy of the model.
Joint work with D. Achlioptas, A. Fiat and F. McSherry
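
The abstract does not spell out the algorithm, so the following is not the authors' method. It is only a minimal sketch of the general idea of "proximity in latent semantic space", using a plain truncated SVD of a term-document matrix in the style of latent semantic indexing. The function name latent_rank, the toy vocabulary, and the choice k = 2 are all assumptions made for illustration, and link structure is ignored entirely.

```python
import numpy as np

def latent_rank(term_doc, query, k):
    """Rank documents by cosine similarity to a query in a rank-k latent space.

    Hypothetical helper for illustration only; not the algorithm from the talk.
    term_doc: (n_terms, n_docs) array of term counts.
    query:    (n_terms,) array of term counts for the query.
    """
    # Truncated SVD: keep the top-k left singular vectors as the latent basis.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]

    # Project documents and the query into the k-dimensional latent space.
    docs_latent = U_k.T @ term_doc    # shape (k, n_docs)
    query_latent = U_k.T @ query      # shape (k,)

    # Cosine similarity, guarding against division by zero.
    norms = np.linalg.norm(docs_latent, axis=0) * np.linalg.norm(query_latent)
    scores = (query_latent @ docs_latent) / np.maximum(norms, 1e-12)

    # Highest-scoring documents first.
    order = np.argsort(-scores)
    return order, scores

# Toy vocabulary (assumed): car, automobile, engine, flower, petal.
# Columns are documents d0..d3.
term_doc = np.array([
    [1, 0, 0, 0],   # car
    [0, 1, 0, 0],   # automobile
    [1, 1, 0, 0],   # engine
    [0, 0, 1, 1],   # flower
    [0, 0, 1, 0],   # petal
], dtype=float)

# Query containing only the term "car".
query = np.array([1, 0, 0, 0, 0], dtype=float)

order, scores = latent_rank(term_doc, query, k=2)
print(order, np.round(scores, 3))
```

In this toy example, document d1 never contains the word "car", yet it lands next to d0 in the latent space because both share "engine", so it scores far above the two flower documents. That is the kind of correlation a latent semantic view is meant to capture; how the talk's model additionally ties in link generation and query generation is not described in the abstract and is not reflected here.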