Modelling the Quality of Attributes in Wikipedia Infoboxes

Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 228)


The quality of data in DBpedia depends on the underlying information provided in Wikipedia’s infoboxes. Different language editions can provide different information about a given subject, both in the set of attributes and in the values of those attributes. Our research question is which language edition provides the correct value for each attribute, so that data fusion can be carried out. Initial experiments showed that the quality of attributes is correlated with the overall quality of the Wikipedia article providing them. Wikipedia offers functionality to assign a quality class to an article, but unfortunately the majority of articles have not been graded by the community, or their grades are not reliable. In this paper we analyse the features and models that can be used to evaluate the quality of articles, providing a foundation for the relative quality assessment of infobox attributes, with the aim of improving the quality of DBpedia.


Keywords: Data quality, Information quality, DBpedia, Wikipedia, Infobox, Data mining, Wikirank



Copyright information

© Springer International Publishing Switzerland 2015

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Poznań University of Economics, Poznań, Poland
