Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources
Abstract
Wikipedia, the leading online encyclopedia, struggles with inconsistent article quality, a consequence of its collaborative editing model. While it offers many helpful articles with consistent information, it also contains many questionable articles whose information is unclear or still incomplete. The quality of an article may vary over time as different users repeatedly edit its content. Among the most important elements of a Wikipedia article are its references, which allow readers to verify the content and identify its source. Because most of these references are web pages, more information about their quality can be obtained with citation analysis tools. For science and practice, empirical evidence of the quality of Wikipedia articles could also have a signal effect, since citing Wikipedia articles is not yet accepted, especially in scientific practice. This paper presents general results of a Wikipedia analysis using metrics from the SISTRIX Toolbox, one of the leading providers of indicators for Search Engine Optimization (SEO). In addition to a preliminary analysis of Wikipedia articles as separate web pages, we extracted data from more than 30 million references in different language versions of Wikipedia and analyzed over 180 thousand of the most popular hosts. We also compared the same sources from different geographical perspectives using country-specific visibility indices.
Keywords
Data quality, Wikipedia, References, SEO, SISTRIX, Sources, Visibility Index, Search engine