Abstract
The aggregation of heterogeneous data from different institutions in cultural heritage and e-science has the potential to create rich data resources useful for a range of purposes, from research to education and public interest. In this paper, we present the X3ML framework, an information integration framework that effectively and efficiently handles the steps involved in schema mapping, uniform resource identifier (URI) definition and generation, data transformation, provision and aggregation. The framework is based on the X3ML mapping definition language, which describes both schema mappings and URI generation policies, and offers several advantages over other relevant frameworks. We describe the architecture of the framework and detail its available components. We also discuss usability aspects and report performance metrics. The impact of our work is reflected in the increasing number of international projects that adopt and use this framework.
Notes
Through the 3M component.
Some feedback from the USA workshop as published in the call of the UK workshop can be found at http://www.researchspace.org/home/project-updates/cidoccrmmappingworkshopatoxforduniversity.
The experiments were carried out on a PC with an Intel i7 processor and 8 GB RAM, running 32-bit Windows 7.
Acknowledgments
This work was partially supported by the following projects: ARIADNE (FP7 Research Infrastructures, 2013–2017), PARTHENOS (H2020 Research Infrastructures, 2015–2019), BlueBRIDGE (H2020 Research Infrastructures, 2015–2018), and VRE4EIC (H2020 Research Infrastructures, 2015–2018). The authors would also like to thank Nikos Anyfantis for working with the Source and Target Analyzer components and Korina Doerr for designing the user interfaces of the X3ML framework.
Cite this article
Marketakis, Y., Minadakis, N., Kondylakis, H. et al. X3ML mapping framework for information integration in cultural heritage and beyond. Int J Digit Libr 18, 301–319 (2017). https://doi.org/10.1007/s00799-016-0179-1
DOI: https://doi.org/10.1007/s00799-016-0179-1