Abstract
DBpedia exposes data from Wikipedia as machine-readable Linked Data. The DBpedia data extraction process generates RDF data in two ways: (a) using mappings that relate the data in Wikipedia infoboxes to the DBpedia ontology and other vocabularies, and (b) using infobox-properties, i.e., properties that are not defined in the DBpedia ontology but are auto-generated from the infobox attribute-value pairs. The work presented in this paper inspects the quality issues of the properties used in the Spanish DBpedia dataset along four quality dimensions: conciseness, consistency, syntactic validity, and semantic accuracy. The main contribution of the paper is the identification of quality issues in the Spanish DBpedia and of their possible causes. The findings can be used as feedback to improve the DBpedia extraction process and eliminate such quality issues from DBpedia.
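To make the distinction between the two kinds of properties concrete, the following is a minimal sketch, not the authors' tooling. It assumes that ontology properties live under http://dbpedia.org/ontology/ and that Spanish infobox-properties live under http://es.dbpedia.org/property/ (the namespace that appears in the paper's notes). The function names and the first two sample URIs are hypothetical, chosen only for illustration. The sketch separates the two groups and flags infobox-properties whose local name contains a slash, one kind of issue a syntactic-validity check might look for.

```python
# Illustrative sketch only; names and sample URIs are assumptions, not the paper's code.
ONTOLOGY_NS = "http://dbpedia.org/ontology/"       # DBpedia ontology namespace
INFOBOX_NS = "http://es.dbpedia.org/property/"     # Spanish infobox-properties namespace


def classify_property(uri: str) -> str:
    """Return 'ontology', 'infobox', or 'other' based on the URI namespace."""
    if uri.startswith(ONTOLOGY_NS):
        return "ontology"
    if uri.startswith(INFOBOX_NS):
        return "infobox"
    return "other"


def has_slash_in_local_name(uri: str) -> bool:
    """Flag infobox-properties whose local name (after the namespace) contains a slash."""
    if not uri.startswith(INFOBOX_NS):
        return False
    return "/" in uri[len(INFOBOX_NS):]


if __name__ == "__main__":
    sample = [
        "http://dbpedia.org/ontology/language",      # hypothetical sample
        "http://es.dbpedia.org/property/idioma",     # hypothetical sample
        "http://es.dbpedia.org/property/idioma/s",   # example taken from the paper's notes
    ]
    for uri in sample:
        print(classify_property(uri), has_slash_in_local_name(uri), uri)
```

A namespace-prefix check is the simplest way to separate the two groups; a real analysis would more likely run over the full dataset, for instance with SPARQL queries against the Spanish DBpedia endpoint.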
Notes
- 3. See statistics for Spanish at http://wiki.dbpedia.org/services-resources/datasets/dataset-statistics.
- 4. See the datasets loaded at http://wiki.dbpedia.org/services-resources/datasets/data-set-loaded-2014.
- 8. The Spanish DBpedia 2014 dataset was the latest publicly available version as of July 2015.
- 13. In these cases, the infobox label contains a slash. For example, the label ‘idioma/s’ generates the property ‘http://es.dbpedia.org/property/idioma/s’.
Acknowledgments
This work was funded by the BES-2014-068449 grant under the 4V project (TIN2013-46238-C4-2-R), the LIDER project (EU FP7 610782), and the JCI-2012-12719 contract.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mihindukulasooriya, N., Rico, M., García-Castro, R., Gómez-Pérez, A. (2015). An Analysis of the Quality Issues of the Properties Available in the Spanish DBpedia. In: Puerta, J., et al. (eds.) Advances in Artificial Intelligence. CAEPIA 2015. Lecture Notes in Computer Science, vol 9422. Springer, Cham. https://doi.org/10.1007/978-3-319-24598-0_18
DOI: https://doi.org/10.1007/978-3-319-24598-0_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24597-3
Online ISBN: 978-3-319-24598-0
eBook Packages: Computer Science, Computer Science (R0)