Encoding Provenance Metadata for Social Science Datasets
Abstract
Recording provenance is a key requirement for data-centric scholarship: it allows researchers to evaluate the integrity of source data sets and to reproduce, and thereby validate, results. Provenance has become even more critical in the web environment, in which data from distributed sources and of varying integrity can be combined and derived. Recent work by the W3C on the PROV model provides the foundation for semantically rich, interoperable, and web-compatible provenance metadata. We apply that model to complex but characteristic provenance examples of social science data, describe scenarios that make scholarly use of those provenance descriptions, and propose a way to encode this provenance metadata within the widely used DDI metadata standard.
Keywords
Metadata, Provenance, DDI, eSocial Science