
Webometrics: evolution of social media presence of universities


This paper computes the webometrics university ranking for the period 2005–2016 and investigates whether it correlates with the rankings published by prominent world university rankers, such as the QS World University Rankings. The webometrics portal provides the required data only for recent years, starting from 2012, which is insufficient for such an investigation. The remaining data can be obtained from the Internet Archive, but existing data extraction tools cannot extract it because of the archive's unusual link structure, which consists of the web archive link, year, date, and target link. We developed an Internet Archive scraper and used it to extract the required data for the period 2012–2016. After extraction, we quantified the webometrics indicators and ranked the universities accordingly. We then measured the relationship between our computed ranking and the original webometrics university ranking using the Spearman and Pearson correlation coefficients. Our findings indicate a strong correlation between the two rankings, which indicates that the applied methodology can be used to compute the webometrics university ranking for the years in which it is not available, i.e., 2005 to 2011. We compute the webometrics ranking of the top 30 universities of North America, Europe, and Asia for the period 2005–2016. Our findings indicate a positive correlation for North American and European universities but a weak correlation for Asian universities. This can be explained by the fact that Asian universities paid less attention to their websites than North American and European universities did. Overall, North American and European universities rank higher than Asian universities. To the best of our knowledge, this is the first investigation of its kind, and no comparable work has been reported.
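The two technical steps named in the abstract, composing an archive link and correlating two rankings, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scraper: the function names, the target address `http://www.example.edu/`, and the sample ranks are hypothetical, and the link layout assumed here is the public Wayback Machine form (archive prefix, 14-digit timestamp, target link).

```python
from math import sqrt

# Assumed public Wayback Machine prefix; the paper's exact link format is not shown.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(target, year, month=1, day=1):
    """Compose an archive link: prefix, 14-digit timestamp, then the target link."""
    timestamp = f"{year:04d}{month:02d}{day:02d}000000"
    return f"{WAYBACK_PREFIX}/{timestamp}/{target}"


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ranks for five universities (not data from the paper).
computed_rank = [1, 2, 3, 4, 5]
published_rank = [1, 3, 2, 4, 5]

example_link = snapshot_url("http://www.example.edu/", 2012, 7, 1)
rho = spearman(computed_rank, published_rank)
r = pearson(computed_rank, published_rank)
```

With these sample ranks, only two adjacent universities swap places, so both coefficients come out close to 1. Because the inputs are already ranks without ties, Spearman and Pearson coincide here; they would differ if raw indicator scores were compared instead.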





Author information



Corresponding author

Correspondence to Raheem Sarwar.


About this article


Cite this article

Sarwar, R., Zia, A., Nawaz, R. et al. Webometrics: evolution of social media presence of universities. Scientometrics 126, 951–967 (2021).



  • Webometrics university ranking
  • University rankers
  • Web impact indicators
  • Higher education
  • Internet archive