Modelling the Quality of Attributes in Wikipedia Infoboxes

  • Conference paper
Business Information Systems Workshops (BIS 2015)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 228)

Abstract

The quality of data in DBpedia depends on the underlying information provided in Wikipedia’s infoboxes. Different language editions can provide different information about a given subject, both in the set of attributes and in the values of those attributes. Our research question is which language editions provide correct values for each attribute, so that data fusion can be carried out. Initial experiments showed that the quality of attributes is correlated with the overall quality of the Wikipedia article providing them. Wikipedia offers functionality to assign a quality class to an article, but unfortunately the majority of articles have not been graded by the community, or their grades are not reliable. In this paper we analyse the features and models that can be used to evaluate the quality of articles, providing a foundation for the relative quality assessment of infobox attributes, with the aim of improving the quality of DBpedia.
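
The data fusion idea mentioned above can be illustrated with a minimal sketch. It assumes that each language edition's article already has some numeric quality score and simply picks the attribute value from the best-scored edition. The function name `fuse_attribute`, the score scale, and all sample values are hypothetical placeholders for illustration and are not taken from the paper.

```python
from typing import Dict, Optional


def fuse_attribute(values: Dict[str, str], quality: Dict[str, float]) -> Optional[str]:
    """Pick the attribute value from the language edition with the best article score.

    values  -- attribute value per language edition (empty string = missing)
    quality -- hypothetical article quality score per language edition
    """
    # Keep only editions that provide a value and have a known quality score.
    candidates = {lang: v for lang, v in values.items() if v and lang in quality}
    if not candidates:
        return None
    # Choose the edition whose article has the highest quality score.
    best_lang = max(candidates, key=lambda lang: quality[lang])
    return candidates[best_lang]


# Placeholder data: three editions, one of them lacking the attribute.
values = {"en": "value_en", "pl": "value_pl", "de": ""}
quality = {"en": 0.72, "pl": 0.91, "de": 0.95}

print(fuse_attribute(values, quality))
```

Here the German edition has the highest score but no value, so the sketch falls back to the best-scored edition that actually provides the attribute. A real fusion step would probably also need to handle ties and conflicting value formats.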

Notes

  1. Except those edited by multi-lingual editors and resulting from translation.

  2. https://meta.wikimedia.org/wiki/List_of_Wikipedias.

  3. https://en.wikipedia.org/wiki/Template:Grading_scheme.

  4. Alpha version available at http://wikirank.net.

  5. This is expected: with a reduced number of classes, misclassification within the combined classes is avoided.

Author information

Correspondence to Krzysztof Węcel.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Węcel, K., Lewoniewski, W. (2015). Modelling the Quality of Attributes in Wikipedia Infoboxes. In: Abramowicz, W. (ed.) Business Information Systems Workshops. BIS 2015. Lecture Notes in Business Information Processing, vol 228. Springer, Cham. https://doi.org/10.1007/978-3-319-26762-3_27

  • DOI: https://doi.org/10.1007/978-3-319-26762-3_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26761-6

  • Online ISBN: 978-3-319-26762-3

  • eBook Packages: Computer Science, Computer Science (R0)
