Relative Quality Assessment of Wikipedia Articles in Different Languages Using Synthetic Measure

Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 303)


The online encyclopedia Wikipedia is one of the most popular sources of knowledge, yet it is often criticized for poor information quality. Articles can be created and edited, even by anonymous users, independently in almost 300 languages. As a result, the information quality of articles on the same topic differs between language versions. The Wikipedia community has created a system for assessing article quality, which can help in deciding which language version is more complete and correct. However, there are issues with this system: each Wikipedia language version can use its own grading scheme, and a large number of articles are usually unevaluated. In this paper, we propose a synthetic measure for automatic quality evaluation of articles in different languages, based on important features.
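To make the idea of a synthetic measure concrete, here is a minimal Python sketch. The abstract does not state the formula or the feature set, so everything specific below is an assumption made for illustration: the feature names, the per-feature caps, the equal weighting, and the helper names `synthetic_measure` and `best_language_version`. It clips each feature at a cap, scales it to [0, 1], and averages the results into a single 0-100 score that can be compared across language versions regardless of their local grading schemes.

```python
from typing import Dict

# Hypothetical features and per-feature caps (assumptions, not values from the paper).
FEATURE_CAPS: Dict[str, float] = {
    "text_length": 50000.0,
    "references": 100.0,
    "images": 20.0,
    "sections": 30.0,
}


def synthetic_measure(features: Dict[str, float],
                      caps: Dict[str, float] = FEATURE_CAPS) -> float:
    """Return a 0-100 score: the mean of features clipped at a cap and scaled to [0, 1]."""
    normalized = []
    for name, cap in caps.items():
        # Missing features count as zero; negative values are treated as zero.
        value = max(0.0, features.get(name, 0.0))
        normalized.append(min(value, cap) / cap)
    return 100.0 * sum(normalized) / len(normalized)


def best_language_version(versions: Dict[str, Dict[str, float]]) -> str:
    """Return the language code whose article has the highest score."""
    return max(versions, key=lambda lang: synthetic_measure(versions[lang]))


# Made-up feature counts for the same topic in two language versions.
versions = {
    "en": {"text_length": 50000, "references": 100, "images": 20, "sections": 30},
    "pl": {"text_length": 25000, "references": 50, "images": 10, "sections": 15},
}
```

Calling `best_language_version(versions)` on this toy input selects the version with the higher score. Equal weights and fixed caps are only one possible design; a real measure could weight features differently or derive the caps from the distribution of each feature within a language version.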


Keywords: Wikipedia, Article quality, Synthetic measure, Wikirank


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Poznań University of Economics and Business, Poznań, Poland
