Evaluating Data Quality in Europeana: Metrics for Multilinguality
Europeana.eu aggregates metadata describing more than 50 million cultural heritage objects from libraries, museums, archives and audiovisual archives across Europe. The need for high-quality metadata is motivated in particular by its impact on user experience, information retrieval and data re-use in other contexts. One of the key goals of Europeana is to enable users to retrieve cultural heritage resources irrespective of their origin and of the language of the metadata describing them. The presence of multilingual metadata descriptions is therefore essential for successful cross-language retrieval. Quantitatively determining Europeana’s cross-lingual reach is a prerequisite for enhancing the quality of metadata in various languages. Capturing multilingual aspects of the data requires us to take into account the full lifecycle of data aggregation, including data enhancement processes such as automatic data enrichment. This paper presents an approach for assessing multilinguality as part of four data quality dimensions, namely completeness, consistency, conformity and accessibility. We describe the measures defined and implemented, and provide initial results and recommendations.
Keywords: Metadata quality, Multilinguality, Digital cultural heritage, Europeana, Data quality dimensions
This work was partially supported by Portuguese national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UID/CEC/50021/2013.