World Wide Web, Volume 20, Issue 6, pp 1153–1177

Wikiometrics: a Wikipedia based ranking system


Abstract

We present a new concept, Wikiometrics: the derivation of metrics and indicators from Wikipedia. Wikipedia provides an accurate representation of the real world due to its size, structure, editing policy and popularity. We demonstrate an innovative “mining” methodology, in which different elements of Wikipedia (content, structure, editorial actions and reader reviews) are used to rank items in a manner that is by no means inferior to rankings produced by experts or other methods. We test the proposed method by applying it to two real-world ranking problems: top world universities and academic journals. The resulting rankings were compared with leading, widely accepted benchmarks and were found to be highly correlated with them, with the added advantage that the underlying data are publicly available.

Keywords

Wikipedia, Ranking


Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. University of California, Berkeley, USA
  2. Ben-Gurion University of the Negev, Beersheba, Israel
