Interoperability of Open Science Metadata: What About the Reality?

  • Conference paper
Research Challenges in Information Science: Information Science and the Connected World (RCIS 2023)

Abstract

Open Science aims to share results and data widely across research domains. Interoperability is one of the keys to exchanging and cross-referencing data between research communities. In this paper, we assess the state of interoperability of Open Science datasets from various communities. Because these datasets come from different sources with diverse metadata schemata, they are not natively interoperable, which highlights the need for matching tools. The question is whether current metadata schema matching tools are effective enough to achieve interoperability between existing datasets. In our study, we first define our vision of interoperability by jointly considering the technical and semantic aspects that arise when dealing with metadata schemata from various domains. We then evaluate the interoperability of several datasets from the medical domain and the Earth system study domain using acknowledged matching tools. We evaluate the performance of the tools, then investigate the correlation between various metrics characterizing the schemata and the performance of their mapping. This study identifies complementary ways to improve dataset interoperability: (1) adapting mapping algorithms to the issues raised by metadata schema matching; (2) adapting metadata schemata, for instance by sharing a core vocabulary and/or reusing existing standards; (3) combining these approaches in a more comprehensive interoperability strategy that would also make the (RDA) crosswalks between schemata available and operational and would promote good practices in metadata labeling and documentation.
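As an illustration of what evaluating the performance of a matching tool can involve, the sketch below scores a predicted schema mapping against a reference mapping using precision, recall, and F1. It is a minimal sketch and not the authors' evaluation code: the function name `evaluate_mapping`, the representation of a mapping as a set of (source field, target field) pairs, and the field names in the example are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# A correspondence links one field of a source schema to one field of a
# target schema. This pair representation is an assumption of the sketch.
Pair = Tuple[str, str]


def evaluate_mapping(predicted: Iterable[Pair], reference: Iterable[Pair]) -> Dict[str, float]:
    """Score predicted correspondences against a reference mapping."""
    pred: Set[Pair] = set(predicted)
    ref: Set[Pair] = set(reference)
    true_positives = len(pred & ref)  # correspondences found in both sets

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical field names, chosen only to exercise the function.
reference = {("title", "name"), ("creator", "author"), ("date", "issued")}
predicted = {("title", "name"), ("creator", "author"), ("subject", "issued")}

scores = evaluate_mapping(predicted, reference)
print(scores)
```

Here two of the three predicted pairs appear in the reference mapping, so precision, recall, and F1 all equal 2/3. Scores of this kind, computed per schema pair, could then be compared with schema-level metrics when looking for correlations.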

Notes

  1. www.rd-alliance.org/groups/research-metadata-schemas-wg
  2. schema.org
  3. oaei.ontologymatching.org/
  4. guides.lib.utexas.edu/metadata-basics/standards
  5. www.dublincore.org/
  6. www.loc.gov/marc/
  7. www.loc.gov/ead/EAD3taglib/index.html
  8. rs.tdwg.org/dwc.htm
  9. www.foaf-project.org/
  10. www.ogc.org/docs/is
  11. www.iso.org
  12. www.data-terra.org/
  13. www.aeris-data.fr/
  14. www.iso.org/standard/32579.html
  15. www.odatis-ocean.fr/
  16. www.seadatanet.org/
  17. https://www.iso.org/ics/35.100/x/
  18. www.fairsfair.eu
  19. www.odatis-ocean.fr/
  20. www.aeris-data.fr/
  21. gitlab.com/smilecdr-public/cda2r4
  22. github.com/jmandel/sample_ccdas
  23. www.hl7.org/fhir/index.html
  24. gitlab.com/smilecdr-public/cda2r4
  25. github.com/vincentnam/OS_data_interop_RCIS_2023
  26. github.com/IlyaSemenov/wikipedia-word-frequency

Author information

Corresponding author

Correspondence to Vincent-Nam Dang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dang, VN., Aussenac-Gilles, N., Megdiche, I., Ravat, F. (2023). Interoperability of Open Science Metadata: What About the Reality? In: Nurcan, S., Opdahl, A.L., Mouratidis, H., Tsohou, A. (eds) Research Challenges in Information Science: Information Science and the Connected World. RCIS 2023. Lecture Notes in Business Information Processing, vol 476. Springer, Cham. https://doi.org/10.1007/978-3-031-33080-3_28

  • DOI: https://doi.org/10.1007/978-3-031-33080-3_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33079-7

  • Online ISBN: 978-3-031-33080-3

  • eBook Packages: Computer Science, Computer Science (R0)
