X3ML mapping framework for information integration in cultural heritage and beyond

  • Yannis Marketakis
  • Nikos Minadakis
  • Haridimos Kondylakis
  • Konstantina Konsolaki
  • Georgios Samaritakis
  • Maria Theodoridou
  • Giorgos Flouris
  • Martin Doerr

Abstract

The aggregation of heterogeneous data from different institutions in cultural heritage and e-science has the potential to create rich data resources useful for a range of purposes, from research to education and public interest. In this paper, we present the X3ML framework for information integration, which effectively and efficiently handles the steps involved in schema mapping, uniform resource identifier (URI) definition and generation, data transformation, provision and aggregation. The framework is based on the X3ML mapping definition language, which describes both schema mappings and URI generation policies, and offers several advantages over comparable frameworks. We describe the architecture of the framework and detail its available components. We also discuss usability aspects and report performance metrics. The impact of our work is evidenced by the growing number of international projects that have adopted the framework.

Keywords

Data mappings · Schema matching · Data aggregation · URI generation · Information integration

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Yannis Marketakis (1)
  • Nikos Minadakis (1)
  • Haridimos Kondylakis (1)
  • Konstantina Konsolaki (1)
  • Georgios Samaritakis (1)
  • Maria Theodoridou (1)
  • Giorgos Flouris (1)
  • Martin Doerr (1)

  1. Institute of Computer Science, FORTH-ICS, Heraklion, Greece
