Abstract
This article surveys techniques for ranking results in search engines, with emphasis on link-based ranking methods and the PageRank algorithm. The problem of selecting, in response to a user's search query, the most relevant documents from an unstructured source such as the WWW is discussed in detail. The need to extend classical information retrieval techniques, such as Boolean searching and vector space models, with link-based ranking methods is demonstrated. The PageRank algorithm is introduced, and its numerical and spectral properties are discussed. The article concludes with an alternative means of computing PageRank, along with some example applications of this new method.
Cite this article
Andersson, F.K., Silvestrov, S.D. The Mathematics of Internet Search Engines. Acta Appl Math 104, 211–242 (2008). https://doi.org/10.1007/s10440-008-9254-y
DOI: https://doi.org/10.1007/s10440-008-9254-y