Building Trust in Information, pp. 3-45
Provenance: Past, Present and Future in Interdisciplinary and Multidisciplinary Perspective
Abstract
This chapter presents a multi- and interdisciplinary synthesis of ideas about the definition and theoretical conceptualization of provenance, drawing from disciplines such as archival science, law, computer science, library and information science, and visual analytics. Through the lens of these distinct domains, the chapter explores the different purposes served by provenance; the various ways that diverse fields capture, represent and use provenance information; provenance standards and specifications; and a range of open research challenges relating to theorizing about provenance and to capturing, representing and using provenance information in increasingly distributed, heterogeneous information ecosystems that combine machine and human intelligence. From this blending of perspectives on provenance from different disciplines and ‘interdisciplines’, a rich picture emerges of provenance as a dynamic construct and an evolving focus of research.
Keywords
Metadata, Provenance, Sense-making, Trust, Trusted computing