Uncertainty That Counts
Uncertainty is modeled by a multibase (db,μ) where db is a database with zero or more primary key violations, and μ associates a multiplicity (a positive integer) to each fact of db. In data integration, the multiplicity of a fact g can indicate the number of data sources in which g was found. In planning databases, facts with the same primary key value are alternatives to one another, and the multiplicity of a fact g can denote the number of people in favor of g.
A repair of db is obtained by selecting a maximal number of facts without ever selecting two distinct facts of the same relation that agree on their primary key. Every repair has a support count, which is the product of the multiplicities of its facts.
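The definition of repairs and support counts can be made concrete with a small sketch. The multibase below is hypothetical (the relation name `Meeting` and its facts and multiplicities are invented for illustration), and each fact is assumed to be a triple whose second component is the primary key. Facts that share a relation and key form a block, and a repair picks exactly one fact from each block.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Hypothetical multibase: fact (relation, key, value) -> multiplicity.
# The two "mon" facts violate the primary key of Meeting.
MULTIBASE = {
    ("Meeting", "mon", "room_a"): 3,
    ("Meeting", "mon", "room_b"): 2,
    ("Meeting", "tue", "room_a"): 1,
}

def repairs(multibase):
    """Yield every repair as a tuple of facts, one fact per block."""
    blocks = defaultdict(list)
    for relation, key, value in multibase:
        # Facts of the same relation with the same key are alternatives.
        blocks[(relation, key)].append((relation, key, value))
    # Choosing one fact from each block gives a maximal consistent subset.
    yield from product(*blocks.values())

def support_count(repair, multibase):
    """Product of the multiplicities of the facts in the repair."""
    return prod(multibase[fact] for fact in repair)

for repair in repairs(MULTIBASE):
    print(repair, support_count(repair, MULTIBASE))
```

Under these assumptions there are two repairs, with support counts 3 and 2, depending on which room is kept for the "mon" key.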
For a fixed Boolean query q, we define σ CERTAINTY(q) as the following counting problem: Given a multibase (db,μ), determine the weighted number of repairs of db that satisfy q. Here, every repair is weighted by its support count. We illustrate the practical significance of this problem by means of examples.
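As a rough illustration of the counting problem, the sketch below computes the weighted number of repairs by brute-force enumeration. It is not the method of the paper: it takes time exponential in the number of key violations and only serves to make the definition concrete. The relations `R` and `S`, their facts and multiplicities, and the query `q` are all hypothetical, and each fact is assumed to be a triple whose second component is the primary key.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Hypothetical multibase: fact (relation, key, value) -> multiplicity.
MULTIBASE = {
    ("R", "a", "b"): 2,
    ("R", "a", "c"): 1,
    ("S", "b", "ok"): 1,
    ("S", "b", "no"): 3,
}

def repairs(multibase):
    """Yield every repair: one fact from each (relation, key) block."""
    blocks = defaultdict(list)
    for relation, key, value in multibase:
        blocks[(relation, key)].append((relation, key, value))
    yield from product(*blocks.values())

def q(repair):
    """Hypothetical Boolean query: exists x, y with R(x, y) and S(y, 'ok')."""
    facts = set(repair)
    return any(
        relation == "R" and ("S", value, "ok") in facts
        for relation, _, value in facts
    )

def weighted_count(multibase, query):
    """Sum of the support counts of the repairs that satisfy the query."""
    return sum(
        prod(multibase[fact] for fact in repair)
        for repair in repairs(multibase)
        if query(repair)
    )

print(weighted_count(MULTIBASE, q))  # 2
```

Only the repair containing R(a, b) and S(b, ok) satisfies the query, and its support count is 2 × 1 = 2. The total weight over all four repairs is (2 + 1) × (1 + 3) = 12, so the ratio 2/12 could be read as the weighted fraction of repairs in which the query holds.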
For conjunctive queries q without self-join, we provide a syntactic characterization of the class of queries q such that σ CERTAINTY(q) is in P; for queries not in this class, σ CERTAINTY(q) is \(\sharp\)P-hard (and hence highly intractable).
Keywords: Conjunctive Query, Weighted Number, Support Count, Source Database, Query Answering