An Analysis of the Quality Issues of the Properties Available in the Spanish DBpedia

  • Nandana Mihindukulasooriya
  • Mariano Rico
  • Raúl García-Castro
  • Asunción Gómez-Pérez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9422)


DBpedia exposes data from Wikipedia as machine-readable Linked Data. The DBpedia data extraction process generates RDF data in two ways: (a) using mappings that map the data from Wikipedia infoboxes to the DBpedia ontology and other vocabularies, and (b) using infobox-properties, i.e., properties that are not defined in the DBpedia ontology but are auto-generated from the infobox attribute-value pairs. The work presented in this paper inspects the quality issues of the properties used in the Spanish DBpedia dataset along four quality dimensions: conciseness, consistency, syntactic validity, and semantic accuracy. The main contribution of the paper is the identification of quality issues in the Spanish DBpedia and of their possible causes. The findings can be used as feedback to improve the DBpedia extraction process and so eliminate such quality issues from DBpedia.
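
The distinction between the two kinds of properties can be illustrated with a minimal sketch that groups property URIs by namespace. The namespace prefixes, the helper names `classify_property` and `count_by_kind`, and the sample URIs below are assumptions made for illustration; they are not taken from the paper and do not reproduce its analysis method.

```python
from collections import Counter

# Assumed namespace prefixes (illustrative, not stated in the paper):
# ontology properties come from the mappings, while infobox-properties
# are auto-generated from infobox attribute-value pairs.
ONTOLOGY_NS = "http://dbpedia.org/ontology/"
INFOBOX_NS = "http://es.dbpedia.org/property/"


def classify_property(uri: str) -> str:
    """Label a property URI by the namespace it belongs to."""
    if uri.startswith(ONTOLOGY_NS):
        return "mapping-based"
    if uri.startswith(INFOBOX_NS):
        return "infobox-property"
    return "other"


def count_by_kind(uris):
    """Count how many property URIs fall into each kind."""
    return dict(Counter(classify_property(u) for u in uris))


# Hypothetical sample of property URIs, for demonstration only.
sample = [
    "http://dbpedia.org/ontology/birthPlace",
    "http://es.dbpedia.org/property/lugarDeNacimiento",
    "http://es.dbpedia.org/property/nombre",
    "http://xmlns.com/foaf/0.1/name",
]

if __name__ == "__main__":
    print(count_by_kind(sample))
```

A grouping of this kind is only a starting point: assessing dimensions such as conciseness or syntactic validity would require inspecting the property names and values within each group.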


Keywords: DBpedia, Spanish DBpedia, Data quality, Conciseness, Consistency, Syntactic validity, Semantic accuracy



This work was funded by the BES-2014-068449 grant under the 4V project (TIN2013-46238-C4-2-R), the LIDER project (EU FP7 610782), and the JCI-2012-12719 contract.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Nandana Mihindukulasooriya (1)
  • Mariano Rico (1)
  • Raúl García-Castro (1)
  • Asunción Gómez-Pérez (1)
  1. Ontology Engineering Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid, Madrid, Spain
