Modelling the Quality of Attributes in Wikipedia Infoboxes

Conference paper

DOI: 10.1007/978-3-319-26762-3_27

Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 228)
Cite this paper as:
Węcel K., Lewoniewski W. (2015) Modelling the Quality of Attributes in Wikipedia Infoboxes. In: Abramowicz W. (eds) Business Information Systems Workshops. BIS 2015. Lecture Notes in Business Information Processing, vol 228. Springer, Cham

Abstract

The quality of data in DBpedia depends on the underlying information provided in Wikipedia's infoboxes. Different language editions can provide different information about a given subject, both in the set of attributes they cover and in the values of those attributes. Our research question is which language editions provide correct values for each attribute, so that data fusion can be carried out. Initial experiments showed that the quality of attributes is correlated with the overall quality of the Wikipedia article providing them. Wikipedia offers functionality to assign a quality class to an article, but the majority of articles have not been graded by the community, or their grades are not reliable. In this paper we analyse the features and models that can be used to evaluate the quality of articles, providing a foundation for the relative quality assessment of infobox attributes, with the aim of improving the quality of DBpedia.
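
To illustrate the kind of data fusion the research question points at, the following is a minimal sketch, not the authors' method. It assumes that each language edition's article already has a numeric quality score and simply takes the attribute value from the best-scored edition that supplies a non-empty value. The function name `fuse_attribute`, the edition codes, the scores, and the placeholder values are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def fuse_attribute(
    values_by_edition: Dict[str, str],
    article_quality: Dict[str, float],
) -> Optional[Tuple[str, str]]:
    """Pick one infobox attribute value across language editions.

    Hypothetical helper: returns (edition, value) from the edition whose
    article has the highest quality score, or None if no edition offers
    a usable value.
    """
    candidates = [
        (article_quality[edition], edition, value)
        for edition, value in values_by_edition.items()
        # Skip editions without a score and editions with a blank value.
        if edition in article_quality and value.strip()
    ]
    if not candidates:
        return None
    # Highest score wins; on equal scores the lexicographically larger
    # edition code wins, which keeps the result deterministic.
    _, edition, value = max(candidates, key=lambda c: (c[0], c[1]))
    return edition, value


# Made-up scores and placeholder values, for illustration only.
values = {"en": "value A", "pl": "value B", "de": " "}
quality = {"en": 0.62, "pl": 0.81, "de": 0.90}
print(fuse_attribute(values, quality))
```

In this sketch the German edition has the highest score but no usable value, so the Polish value is selected. A real fusion step would also need to normalise units and formats before values from different editions can be compared.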

Keywords

Data quality, Information quality, DBpedia, Wikipedia, Infobox, Data mining, Wikirank

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Poznań University of Economics, Poznań, Poland
