Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources

  • Włodzimierz Lewoniewski
  • Ralf-Christian Härting
  • Krzysztof Węcel
  • Christopher Reichstein
  • Witold Abramowicz
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 920)

Abstract

The leading online encyclopedia, Wikipedia, struggles with inconsistent article quality, a consequence of its collaborative editing model. While Wikipedia contains many helpful articles with consistent information, it also contains many questionable articles whose information is unclear or incomplete. The quality of an article may vary over time as different users repeatedly edit its content. References are among the most important elements of Wikipedia articles: they allow readers to verify content and see its source. Because most of these references are web pages, citation analysis tools can provide more information about their quality. For science and practice, empirical evidence of the quality of Wikipedia articles could also have a signal effect, since citing Wikipedia articles is not yet accepted, especially in scientific practice. This paper presents general results of an analysis of Wikipedia using metrics from the SISTRIX Toolbox, one of the leading providers of indicators for Search Engine Optimization (SEO). In addition to a preliminary analysis of Wikipedia articles as separate web pages, we extracted data from more than 30 million references in different language versions of Wikipedia and analyzed over 180 thousand of the most popular hosts. We also compared the same sources from different geographical perspectives using country-specific visibility indices.
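
The step of reducing reference URLs to hosts and ranking them by frequency might look roughly like the following sketch. It is only an illustration under assumptions, not the pipeline used in the paper: the function names `host_of` and `top_hosts` are hypothetical, the input is assumed to be a plain iterable of URL strings already extracted from article references, and no SISTRIX metric is queried here.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host of a URL, or None if none can be parsed.

    Hypothetical helper: the paper does not describe how hosts were derived.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. a broken IPv6 literal).
        return None
    return host or None


def top_hosts(reference_urls, n):
    """Count how often each host occurs and return the n most frequent.

    The result is a list of (host, count) pairs, most frequent first.
    """
    counts = Counter()
    for url in reference_urls:
        host = host_of(url)
        if host is not None:
            counts[host] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Tiny made-up sample; a real run would stream millions of references.
    sample = [
        "https://doi.org/10.1007/978-3-319-19027-3_2",
        "https://doi.org/10.1007/s12525-017-0277-y",
        "http://www.sistrix.com",
        "not a url",
    ]
    print(top_hosts(sample, 2))
```

Each host in the resulting ranking could then be looked up in an SEO tool to obtain indicators such as a country-specific visibility index; that lookup is deliberately left out because its interface is not described in the text.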

Keywords

Data quality, Wikipedia, References, SEO, SISTRIX, Sources, Visibility Index, Search engine

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Włodzimierz Lewoniewski (1)
  • Ralf-Christian Härting (2)
  • Krzysztof Węcel (1)
  • Christopher Reichstein (2)
  • Witold Abramowicz (1)

  1. Poznan University of Economics and Business, Poznan, Poland
  2. Aalen University of Applied Science, Business Information Systems, Aalen, Germany
