From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

Theory of Computing Systems

Abstract

In this work we establish and investigate connections between causes for query answers in databases, database repairs with respect to denial constraints, and consistency-based diagnosis. The first two are relatively new research areas in databases, and the third one is an established subject in knowledge representation. We show how to obtain database repairs from causes, and the other way around. Causality problems are formulated as diagnosis problems, and the diagnoses provide causes and their responsibilities. The vast body of research on database repairs can be applied to the newer problems of computing actual causes for query answers and their responsibilities. These connections are interesting per se. They also allow us, after a transition inspired by consistency-based diagnosis to computational problems on hitting-sets and vertex covers in hypergraphs, to obtain several new algorithmic and complexity results for database causality.

Notes

  1. In contrast with general causal claims, such as “smoking causes cancer”, which refer to some sort of related events, actual causation specifies a particular instantiation of a causal relationship, e.g., “Joe’s smoking is a cause for his cancer”.

  2. Although not in the context of repairs, consistency-based diagnosis has been applied to consistency restoration of a database with respect to integrity constraints [30].

  3. As opposed to built-in predicates (e.g. ≠) that we assume do not appear, unless explicitly stated otherwise.

  4. In this work, we will assume, unless otherwise explicitly said, that CQs may contain inequality atoms (equality atoms are not an issue, because they can always be eliminated).

  5. In general, in the context of repairs, partitions on instances are not considered. However, in Section 7.3 we will bring them into the repair scene.

  6. Here, and as usual, the atom Ab(c) expresses that component c is (behaving) abnormal(ly).

  7. Cf. [4] for an example of the latter that uses key constraints, which are DCs with inequalities (with violation views that contain inequality).

  8. For a precise formulation, see Definition 5.

  9. Actually, [47] presents a PTIME algorithm for computing responsibilities for a restricted class of CQs.

  10. If \(\mathcal{C}\) is a collection of non-empty subsets of a set S, a subset \(S' \subseteq S\) is a hitting-set for \(\mathcal{C}\) if, for every \(C \in \mathcal{C}\), \(C \cap S' \neq \emptyset\). \(S'\) is an S-minimal hitting-set if no proper subset of it is also a hitting-set. \(S'\) is a minimum hitting-set if it has minimum cardinality.

  11. The other direction is beyond the scope of this work. More importantly, logic-based diagnosis in general is a much richer scenario than that of database causality. In the former, we can have arbitrary logical specifications, whereas under data causality, we have only monotone queries at hand.

  12. Notice that these can also be seen as DCs, since they can be written as \(\forall \bar{x}\, \neg Ab_{P}(\bar{x})\).

  13. Notice that these are not denial constraints.

  14. In a hypergraph \(\mathcal{H}\), a set of vertices is a vertex cover if it intersects every hyperedge. A minimal vertex cover has no proper subset that is also a vertex cover. A minimum vertex cover has minimum cardinality among the vertex covers. Similarly, an independent set of \(\mathcal{H}\) is a set of vertices such that no pair of them is contained in a hyperedge. Maximal and maximum independent sets are defined in an obvious manner.

  15. We recall that repairs of databases with respect to DCs can be characterized as maximal independent sets of conflict hypergraphs (conflict graphs in the case of FDs) whose vertices are the database tuples, and hyperedges connect tuples that together violate a DC [4, 18].

  16. This construction is inspired by [43, Lemma 1]. More details can be found in [44].

  17. Or any other “abducible” predicates that are different from those in \(\mathcal{S}\).

  18. This condition is clearly satisfied by the logical reconstruction of a relational database, but can be relaxed in several ways.

  19. We could say that the efforts in [35, 36] to modify the original Halpern-Pearl (HP) definition of causality are about considering more appropriate restrictions on contingencies. Since in some cases the original HP definition does not provide intuitive results regarding causality, the modifications avoid this by recognizing some contingencies as “unreasonable” or “farfetched”.

  20. We can say that {t, t′} is a conflict, i.e., the two tuples jointly participate in the violation of one of the DCs in Σ.

  21. Pairs of conflicting tuples would inherit the priority relationships from the general priority relation.

  22. Of course, we could use other optimality criteria at this point, but considering all possibilities is beyond the scope of this work.

  23. An alternative, but equivalent formulation can be found in [8].
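
The definitions in notes 10, 14 and 15 can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions, not the algorithms of the article: it is restricted to the conflict-graph case (every conflict involves exactly two tuples, as with FDs), it enumerates all subsets and so is usable only on tiny instances, and the function names, the tuple identifiers t1 to t4 and the two conflicts are all hypothetical. It computes S-minimal and minimum hitting-sets of the set of conflicts, and checks that deleting an S-minimal hitting-set from the instance leaves a maximal independent set of the conflict graph, i.e., a repair in the sense of note 15.

```python
from itertools import combinations


def hitting_sets(universe, collection):
    """Enumerate every subset of `universe` that intersects each set in `collection`."""
    items = sorted(universe)
    found = []
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if all(candidate & c for c in collection):
                found.append(candidate)
    return found


def minimal_hitting_sets(universe, collection):
    """S-minimal hitting-sets: no proper subset is also a hitting-set."""
    all_hs = hitting_sets(universe, collection)
    return [h for h in all_hs if not any(other < h for other in all_hs)]


def minimum_hitting_sets(universe, collection):
    """Minimum hitting-sets: hitting-sets of least cardinality."""
    all_hs = hitting_sets(universe, collection)
    if not all_hs:
        return []
    least = min(len(h) for h in all_hs)
    return [h for h in all_hs if len(h) == least]


def maximal_independent_sets(universe, edges):
    """Graph case only: vertex sets that contain no edge and cannot be extended."""
    items = sorted(universe)
    independent = []
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if not any(edge <= candidate for edge in edges):
                independent.append(candidate)
    return [s for s in independent if not any(s < other for other in independent)]


# Hypothetical instance: t1 conflicts with t2 and with t3; t4 is conflict-free.
TUPLES = frozenset({"t1", "t2", "t3", "t4"})
CONFLICTS = [frozenset({"t1", "t2"}), frozenset({"t1", "t3"})]

# Deleting an S-minimal hitting-set of the conflicts leaves a repair.
REPAIRS = {TUPLES - h for h in minimal_hitting_sets(TUPLES, CONFLICTS)}

if __name__ == "__main__":
    print("S-minimal hitting-sets:", minimal_hitting_sets(TUPLES, CONFLICTS))
    print("minimum hitting-sets:", minimum_hitting_sets(TUPLES, CONFLICTS))
    print("repairs:", REPAIRS)
```

On this instance the S-minimal hitting-sets are {t1} and {t2, t3}, and only {t1} is a minimum one. The corresponding complements {t2, t3, t4} and {t1, t4} are exactly the maximal independent sets of the conflict graph.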

References

  1. Afrati, F., Kolaitis, P.: Repair checking in inconsistent databases: Algorithms and complexity. Proc. ICDT, 31–41 (2009)

  2. Arenas, M., Bertossi, L., Chomicki, J.: Consistent query answers in inconsistent databases. Proc. ACM PODS, 68–79 (1999)

  3. Arenas, M., Bertossi, L., Chomicki, J.: Answer sets for consistent query answers. Theory Pract. Logic Programm. 3(4–5), 393–424 (2003)

  4. Arenas, M., Bertossi, L., Chomicki, J., He, X., Raghavan, V., Spinrad, J.: Scalar aggregation in inconsistent databases. Theor. Comput. Sci. 296, 405–434 (2003)

  5. Arieli, O., Denecker, M., Van Nuffelen, B., Bruynooghe, M.: Coherent integration of databases by abductive logic programming. J. Artif Intell. Res. 21, 245–286 (2004)

  6. Barcelo, P., Bertossi, L., Bravo, L.: Characterizing and computing semantically correct answers from databases with annotated logic and answer sets. In: Semantics of Databases, Springer LNCS 2582, pp 1–27 (2003)

  7. Bertossi, L.: Consistent query answering in databases. ACM SIGMOD Rec. 35(2), 68–76 (2006)

  8. Bertossi, L., Li, L.: Achieving data privacy through secrecy views and null-based virtual updates. IEEE Trans Knowl. Data Eng. 25(5), 987–1000 (2013)

  9. Bertossi, L.: Database repairing and consistent query answering, Morgan & Claypool, Synthesis Lectures on Data Management (2011)

  10. Bertossi, L., Salimi, B.: Unifying causality, diagnosis, repairs and view-updates in databases. Presented at the First International Workshop on Big Uncertain Data (BUDA 2014). Posted at: arXiv:1405.4228 [cs.DB]

  11. Brankovic, L., Fernau, H.H.: Parameterized approximation algorithms for hitting set. In: Approximation and Online Algorithms, pp 63–76. Springer LNCS 7164 (2012)

  12. Bravo, L., Bertossi, L.: Semantically correct query answers in the presence of null values. In: Chomicki, J., Wijsen, J. (eds.) Proceedings EDBT WS on Inconsistency and Incompleteness in Databases (IIDB 06), pp 336–357. Springer LNCS 4254 (2006)

  13. Buneman, P., Khanna, S., Tan, W.C.: Why and where: A characterization of data provenance. Proc. ICDT, 316–330 (2001)

  14. Buneman, P., Tan, W.C.: Provenance in databases. Proc. ACM SIGMOD, 1171–1173 (2007)

  15. Cheney, J., Chiticariu, L., Tan, W.C.: Provenance in databases: why, how, and where. Found. Trends Databases 1(4), 379–474 (2009)

  16. Cheney, J., Chong, S., Foster, N., Seltzer, M.I., Vansummeren, S.: Provenance: a future history. OOPSLA Companion (Onward!), 957–964 (2009)

  17. Cheney, J.: Is provenance logical? Proc. LID, 2–6 (2011)

  18. Chomicki, J., Marcinkowski, J.: Minimal-change integrity maintenance using tuple deletions. Inf. Comput. 197(1-2), 90–121 (2005)

  19. Chockler, H., Halpern, J.Y.: Responsibility and blame: a structural-model approach. J. Artif. Intell. Res. 22, 93–115 (2004)

  20. Console, L., Torasso, P.: A spectrum of logical definitions of model-based diagnosis. Comput. Intell. 7, 133–141 (1991)

  21. Console, L., Sapino, M.L., Theseider-Dupre, D.: The role of abduction in database view updating. J. Intell. Inf. Syst. 4(3), 261–280 (1995)

  22. Cui, Y., Widom, J., Wiener, J.L.: Tracing the lineage of view data in a warehousing environment. ACM Trans. Database Syst. 25(2), 179–227 (2000)

  23. Eiter, T., Gottlob, G., Leone, N.: Abduction from logic programs: semantics and complexity. Theor. Comput. Sci. 189(1-2), 129–177 (1997)

  24. Eiter, Th., Faber, W., Leone, N., Pfeifer, G.: The diagnosis frontend of the DLV system. AI Commun. 12(1-2), 99–111 (1999)

  25. Fagin, R., Kimelfeld, B., Kolaitis, Ph.: Dichotomies in the complexity of preferred repairs. Proc. ACM PODS, 3–15 (2015)

  26. Fernau, H.: Parameterized algorithmics for d-hitting set. Int. J. Comput. Math. 87(14), 3157–3174 (2010)

  27. Feldman, A., Provan, G., Gemund, A.V.: Approximate model-based diagnosis using greedy stochastic search. J. Artif. Intell. Res. (JAIR) 38, 371–413 (2010)

  28. Flum, J., Grohe, M.: Parameterized complexity theory. Texts in Theoretical Computer Science, Springer Verlag (2006)

  29. Garey, M., Johnson, D.S.: Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman (1979)

  30. Gertz, M.: Diagnosis and repair of constraint violations in database systems. PhD Thesis, Universität Hannover (1996)

  31. Greco, S., Pijcke, F., Wijsen, J.: Certain query answering in partially consistent databases. PVLDB 7(5), 353–364 (2014)

  32. Halpern, J., Pearl, J.: Causes and explanations: a structural-model approach: part 1. Proc. UAI, 194–202 (2001)

  33. Halpern, J., Pearl, J.: Causes and explanations: a structural-model approach: part 1. British J. Philos. Sci. 56, 843–887 (2005)

  34. Halperin, E.: Improved approximation algorithms for the vertex cover problem in graphs and hyper-graphs. Proceedings ACM-SIAM Symposium on Discrete Algorithms, 329–337 (2000)

  35. Halpern, J.: Appropriate causal models and stability of causation. Proc. KR’14 (2014)

  36. Halpern, J.: A modification of the Halpern-Pearl definition of causality. Proc. IJCAI (2015)

  37. Kakas, A.C., Mancarella, P.: Database updates through abduction. Proc. VLDB, 650–661 (1990)

  38. Karvounarakis, G., Green, T.J.: Semiring-annotated data: queries and provenance? SIGMOD Rec. 41(3), 5–14 (2012)

  39. Krentel, M.: The complexity of optimization problems. J. Comput. Syst. Sci. 36, 490–509 (1988)

  40. Karvounarakis, G., Ives, Z.G., Tannen, V.: Querying provenance. Proc. ACM SIGMOD, 951–962 (2010)

  41. Kimelfeld, B.: A dichotomy in the complexity of deletion propagation with functional dependencies. Proc. ACM PODS (2012)

  42. Kimelfeld, B., Vondrak, J., Williams, R.: Maximizing conjunctive views in deletion propagation. ACM Trans. Database Syst. 37(4), 24 (2012)

  43. Lopatenko, A., Bertossi, L.: Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. Proc. ICDT, 2007, Springer LNCS 4353, pp. 179–193. Proofs of results are found in [44]

  44. Lopatenko, A., Bertossi, L.: Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. Extended version of [43], including proofs. Posted at: arXiv:1605.07159 [cs.DB]

  45. Meliou, A., Gatterbauer, W., Suciu, D.: Reverse data management. PVLDB 4(12), 1490–1493 (2011)

  46. Meliou, A., Gatterbauer, W., Suciu, D.: Bringing provenance to its full potential using causal reasoning. Proc. TaPP (2011)

  47. Meliou, A., Gatterbauer, W., Moore, K.F., Suciu, D.: The complexity of causality and responsibility for query answers and non-answers. Proc. VLDB, 34–41 (2010)

  48. Meliou, A., Gatterbauer, W., Halpern, J.Y., Koch, C., Moore, K.F., Suciu, D.: Causality in databases. IEEE Data Eng. Bull. 33(3), 59–67 (2010)

  49. Mozetic, I., Holzbaur, C.: Controlling the complexity in model-based diagnosis. Ann. Math. Artif. Intell. 11(1-4), 297–314 (1994)

  50. Niedermeier, R., Rossmanith, P.: An efficient fixed-parameter algorithm for 3-hitting set. J. Discret. Algorithms 1(1), 89–102 (2003)

  51. Okun, M.: On approximation of the vertex cover problem in hypergraphs. Discret. Optim. 2(1), 101–111 (2005)

  52. Papadimitriou, C.H.: Computational complexity. Addison-Wesley (1994)

  53. Reiter, R.: A theory of diagnosis from first principles. Artif. Intell. 32(1), 57–95 (1987)

  54. Reiter, R.: Towards a logical reconstruction of relational database theory. In: On Conceptual Modelling, pp 191–233. Springer (1984)

  55. Salimi, B., Bertossi, L.: Causality in databases: the diagnosis and repair connections. Presented at The 15th International Workshop on Non-Monotonic Reasoning (NMR 2014). Posted at: arXiv:1404.6857 [cs.DB]

  56. Salimi, B., Bertossi, L.: Causes for query answers from databases, datalog abduction and view-updates: the presence of integrity constraints. Proc. FLAIRS, 2016. Posted as CoRR arXiv:1602.06458 [cs.DB]

  57. Salimi, B., Bertossi, L.: Query-answer causality in databases: abductive diagnosis and view-updates. In: Proceedings UAI Causal Inference Workshop, 2015. CEUR-WS Proceedings Vol-1504 (2015)

  58. Salimi, B., Bertossi, L.: From causes for database queries to repairs and model-based diagnosis and back. In: Proceedings 18th International Conference on Database Theory (ICDT 2015)

  59. Staworko, S., Chomicki, J., Marcinkowski, J.: Prioritized repairing and consistent query answering in relational databases. Ann. Math. Artif. Intell. 64(2-3), 209–246 (2012)

  60. Struss, P.: Model-based problem solving. In: Handbook of Knowledge Representation, chap. 10. Elsevier (2008)

  61. Tannen, V.: Provenance propagation in complex queries. In: Buneman Festschrift, 2013, Springer LNCS 8000, pp. 483–493

Acknowledgments

Research funded by NSERC Discovery, and the NSERC Strategic Network on Business Intelligence (BIN). Conversations with Alexandra Meliou during Leo Bertossi’s visit to U. of Washington in 2011 are much appreciated. He is also grateful to Dan Suciu and Wolfgang Gatterbauer for their hospitality. L. Bertossi is grateful to Benny Kimelfeld for stimulating conversations. Part of the research was developed by L. Bertossi during partial sabbatical stays at LogicBlox and The Center for Semantic Web Research (Chile). Their support is much appreciated. We appreciate the comments from the anonymous reviewers.

Author information

Corresponding author

Correspondence to Leopoldo Bertossi.

About this article

Cite this article

Bertossi, L., Salimi, B. From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back. Theory Comput Syst 61, 191–232 (2017). https://doi.org/10.1007/s00224-016-9718-9

  • DOI: https://doi.org/10.1007/s00224-016-9718-9
