User Trust and Judgments in a Curated Database with Explicit Provenance
We focus on human-in-the-loop information-integration settings where users gather and evaluate data from a broad variety of sources and where the levels of trust in sources and users change dynamically. In such settings, users must exercise judgment as they collect and modify data. For example, a battlefield information officer preparing a report to inform his or her superiors about the current state of affairs must gather and integrate data from many sources, including non-computerized ones. By tracking multiple sources for individual values, the officer can eliminate a value from the current state once every source in which that value was found is no longer trusted. We define a conceptual model for a curated database with provenance for such settings, the Multi-granularity, Multi-provenance Model (MMP). The model supports multiple insertions and multiple (copy-and-)paste operations for a single database element, captures the external source for all operations, and includes a Data Confidence Language that lets users confirm or doubt values in order to record their atomic judgments about the data. In this paper, we briefly summarize the MMP model and show how it can be extended to support potentially complex operations, including compound judgment operators (such as merging tuples to achieve entity resolution), while capturing a complete record of data provenance.
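The source-tracking idea in the officer example can be illustrated with a minimal sketch. This is not the MMP model itself: the names `TrackedValue` and `current_state`, the sample values, and the source labels are all hypothetical, chosen only to show a value dropping out of the current state once none of its recorded sources is trusted.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedValue:
    """Hypothetical record: a value plus every external source it was found in."""
    value: str
    sources: set[str] = field(default_factory=set)


def current_state(values, trusted_sources):
    """Keep only the values that still have at least one trusted source."""
    return [v.value for v in values if v.sources & trusted_sources]


# Illustrative data only; none of these values or sources come from the paper.
values = [
    TrackedValue("bridge intact", {"scout report", "radio intercept"}),
    TrackedValue("convoy delayed", {"radio intercept"}),
]

trusted = {"scout report", "radio intercept"}
print(current_state(values, trusted))  # both values are supported

# Trust in one source is withdrawn; a value survives only if another source remains.
trusted.discard("radio intercept")
print(current_state(values, trusted))  # only "bridge intact" is still supported
```

Recording a set of sources per value, rather than a single source, is what allows a value to survive the loss of trust in one source as long as another trusted source remains.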