Acta Applicandae Mathematicae, Volume 104, Issue 2, pp 211–242

The Mathematics of Internet Search Engines

  • Fredrik K. Andersson
  • Sergei D. Silvestrov

Abstract

This article presents a survey of techniques for ranking results in search engines, with emphasis on link-based ranking methods and the PageRank algorithm. The problem of selecting, in relation to a user search query, the most relevant documents from an unstructured source such as the WWW is discussed in detail. The need for extending classical information retrieval techniques such as Boolean searching and vector space models with link-based ranking methods is demonstrated. The PageRank algorithm is introduced, and its numerical and spectral properties are discussed. The article concludes with an alternative means of computing PageRank, along with some example applications of this new method.

Keywords

Citation ranking, PageRank, Search engines, Information retrieval, Text indexing, Ranking, Markov chains, Power method, Power series

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  1. WorldLight.com AB, Halmstad, Sweden
  2. Centre for Mathematical Sciences, Lund University, Lund, Sweden
