Abstract
The availability of high-quality metadata is key to facilitating discovery in the large variety of scientific datasets that are increasingly becoming publicly available. However, despite the recent focus on metadata, the diversity of metadata representation formats and the poor support for semantic markup typically result in metadata that are of poor quality. There is a pressing need for a metadata representation format that provides strong interoperation capabilities together with robust semantic underpinnings. In this paper, we describe such a format, together with open-source Web-based tools that support the acquisition, search, and management of metadata. We outline an initial evaluation using metadata from a variety of biomedical repositories.
Notes
1. The @-prefixed notation follows JSON-LD; see Sect. 3.2.
2. A complete template model specification can be found at http://metadatacenter.org/cedar-template-model.
3. A JSON Schema validator can be found at http://www.jsonschemavalidator.net.
4. A useful online JSON-LD tool can be found at http://json-ld.org/playground.
5. The CEDAR Workbench is available at https://cedar.metadatacenter.net.
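As a rough illustration of the @-prefixed JSON-LD notation mentioned in the notes, the following Python sketch builds a small metadata instance and separates its JSON-LD keywords from its ordinary fields. The field names, the `example.org` IRIs, and the helper functions `split_keywords` and `expand_fields` are all hypothetical and are not taken from the CEDAR template model specification. The field-to-IRI mapping shown is a deliberately simplified stand-in for full JSON-LD expansion.

```python
import json

# Hypothetical metadata instance. Field names and IRIs are illustrative
# assumptions only, not part of the CEDAR template model.
instance = {
    "@context": {
        "title": "http://purl.org/dc/terms/title",
        "organism": "http://example.org/vocab/organism",
    },
    "@id": "http://example.org/instances/sample-001",
    "@type": "http://example.org/types/BioSample",
    "title": "Liver biopsy, patient cohort A",
    "organism": "Homo sapiens",
}


def split_keywords(node):
    """Separate JSON-LD keywords (keys starting with '@') from ordinary fields."""
    keywords = {k: v for k, v in node.items() if k.startswith("@")}
    fields = {k: v for k, v in node.items() if not k.startswith("@")}
    return keywords, fields


def expand_fields(node):
    """Replace each plain field name with the IRI its @context assigns, if any.

    This is a simplification: real JSON-LD expansion also handles nested
    contexts, typed values, and compact IRIs.
    """
    context = node.get("@context", {})
    _, fields = split_keywords(node)
    return {context.get(name, name): value for name, value in fields.items()}


keywords, fields = split_keywords(instance)
expanded = expand_fields(instance)
print(json.dumps(expanded, indent=2))
```

Here the `@context` entry is what gives each plain field name a semantic anchor, while the field values themselves remain ordinary JSON that a JSON Schema validator could check independently.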
Acknowledgments
CEDAR is supported by the National Institutes of Health through an NIH Big Data to Knowledge program under grant 1U54AI117925. NCBO is supported by the NIH Common Fund under grant U54HG004028. We appreciate the collaborations offered by the ImmPort, BioSharing, HIPC, and LINCS communities.
© 2016 Springer International Publishing AG
About this paper
Cite this paper
O’Connor, M.J., Martínez-Romero, M., Egyedi, A.L., Willrett, D., Graybeal, J., Musen, M.A. (2016). An Open Repository Model for Acquiring Knowledge About Scientific Experiments. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_49
DOI: https://doi.org/10.1007/978-3-319-49004-5_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49003-8
Online ISBN: 978-3-319-49004-5
eBook Packages: Computer Science (R0)