# Causal identifiability and piecemeal experimentation

## Abstract

In medicine and the social sciences, researchers often measure only a handful of variables simultaneously. The underlying assumption behind this methodology is that combining the results of dozens of smaller studies can, in principle, yield as much information as one large study in which dozens of variables are measured simultaneously. Mayo-Wilson (Philos Sci 78(5):864–874, 2011; Br J Philos Sci 65(2):213–249, 2013. https://doi.org/10.1093/bjps/axs030) shows that this assumption is false when causal theories are inferred from *observational* data. This paper extends Mayo-Wilson’s results to cases in which *experimental* data is available. I prove several new theorems showing that, as the number of variables under investigation grows, experiments do not improve, in the worst case, one’s ability to identify the true causal model if one can measure only a few variables at a time. However, stronger statistical assumptions (e.g., Gaussianity) significantly aid causal discovery in piecemeal inquiry, even if such assumptions are unhelpful when all variables can be measured simultaneously.

## Keywords

- Causation
- Experimentation
- Induction
- Randomized controlled trials (RCTs)
- Piecemeal inquiry
- Problem of piecemeal induction

## Notes

### Acknowledgements

Thanks to David Danks, Clark Glymour, and Peter Spirtes for asking the questions that led to the results reported in this paper. While writing the paper, I benefited from several discussions with Frederick Eberhardt. Finally, thanks to three anonymous reviewers for their detailed comments and feedback.

## References

- Cartwright, N. (2002). Against modularity, the causal Markov condition, and any link between the two: Comments on Hausman and Woodward. *The British Journal for the Philosophy of Science*, *53*(3), 411–453.
- Cartwright, N. (2007). *Hunting causes and using them: Approaches in philosophy and economics*. Cambridge: Cambridge University Press.
- Danks, D., & Glymour, C. (2001). Linearity properties of Bayes nets with binary variables. In *Proceedings of the seventeenth conference on uncertainty in artificial intelligence* (pp. 98–104).
- Eaton, D., & Murphy, K. (2007). Exact Bayesian structure learning from uncertain interventions. *Artificial Intelligence and Statistics*, 107–114.
- Eberhardt, F. (2007). *Causation and intervention*. Unpublished doctoral dissertation, Carnegie Mellon University.
- Eberhardt, F., & Scheines, R. (2007). Interventions and causal inference. *Philosophy of Science*, *74*(5), 981–995.
- Eberhardt, F., Glymour, C., & Scheines, R. (2006). N-1 experiments suffice to determine the causal relations among n variables. In D. E. Holmes & L. C. Jain (Eds.), *Innovations in machine learning* (pp. 97–112). Berlin, Heidelberg: Springer.
- Freedman, D., & Humphreys, P. (1999). Are there algorithms that discover causal structure? *Synthese*, *121*(1), 29–54.
- Geiger, D., Verma, T., & Pearl, J. (1990). Identifying independence in Bayesian networks. *Networks*, *20*(5), 507–534.
- Hauser, A., & Bühlmann, P. (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. *Journal of Machine Learning Research*, *13*(Aug), 2409–2464.
- Hausman, D., & Woodward, J. (2004). Manipulation and the causal Markov condition. *Philosophy of Science*, *71*(5), 846–856.
- Hoyer, P. O., Janzing, D., Mooij, J. M., Peters, J., & Schölkopf, B. (2009). Nonlinear causal discovery with additive noise models. In *Advances in neural information processing systems* (pp. 689–696).
- Hyttinen, A., Eberhardt, F., & Hoyer, P. O. (2010). Causal discovery for linear cyclic models with latent variables. In *Proceedings of the 5th European workshop on probabilistic graphical models (PGM 2010)*.
- Kadane, J. B., & Seidenfeld, T. (1990). Randomization in a Bayesian perspective. *Journal of Statistical Planning and Inference*, *25*(3), 329–345.
- Lauritzen, S. L., Dawid, A. P., Larsen, B. N., & Leimer, H. G. (1990). Independence properties of directed Markov fields. *Networks*, *20*(5), 491–505.
- Mayo-Wilson, C. (2011). The problem of piecemeal induction. *Philosophy of Science*, *78*(5), 864–874.
- Mayo-Wilson, C. (2012). *Combining causal theories and dividing scientific labor*. Doctoral dissertation, Carnegie Mellon University.
- Mayo-Wilson, C. (2013). The limits of piecemeal causal inference. *The British Journal for the Philosophy of Science*, *65*(2), 213–249. https://doi.org/10.1093/bjps/axs030
- Meek, C. (1995). Strong completeness and faithfulness in Bayesian networks. In *Proceedings of the eleventh conference on uncertainty in artificial intelligence* (pp. 411–418).
- Nyberg, E., & Korb, K. (2006). Informative interventions. In P. McKay Illari, F. Russo, & J. Williamson (Eds.), *Causality and probability in the sciences*. London: College Publications.
- Pearl, J. (2000). *Causality: Models, reasoning, and inference* (Vol. 47). Cambridge: Cambridge University Press.
- Pearl, J., & Verma, T. S. (1995). A theory of inferred causation. In D. Prawitz, B. Skyrms, & D. Westerståhl (Eds.), *Logic, methodology and philosophy of science IX* (Studies in logic and the foundations of mathematics, Vol. 134, pp. 789–811). New York City: Elsevier.
- Richardson, T., & Spirtes, P. (2002). Ancestral graph Markov models. *The Annals of Statistics*, *30*(4), 962–1030.
- Shimizu, S., Hoyer, P. O., Hyvärinen, A., & Kerminen, A. (2006). A linear non-Gaussian acyclic model for causal discovery. *The Journal of Machine Learning Research*, *7*, 2003–2030.
- Spirtes, P., Glymour, C. N., & Scheines, R. (2000). *Causation, prediction, and search*. Cambridge: The MIT Press.
- Steel, D. (2005). Indeterminism and the causal Markov condition. *The British Journal for the Philosophy of Science*, *56*(1), 3–26.
- Tillman, R. E., & Eberhardt, F. (2014). Learning causal structure from multiple datasets with similar variable sets. *Behaviormetrika*, *41*(1), 41–64.
- Tillman, R. E., & Spirtes, P. (2011). Learning equivalence classes of acyclic models with latent and selection variables from multiple datasets with overlapping variables. In *Proceedings of the 14th international conference on artificial intelligence and statistics (AISTATS 2011)*.
- Triantafillou, S., & Tsamardinos, I. (2015). Constraint-based causal discovery from multiple interventions over overlapping variable sets. *Journal of Machine Learning Research*, *16*, 2147–2205.
- Tsamardinos, I., Triantafillou, S., & Lagani, V. (2012). Towards integrative causal analysis of heterogeneous data sets and studies. *Journal of Machine Learning Research*, *13*, 1097–1157.
- Worrall, J. (2007). Why there’s no cause to randomize. *The British Journal for the Philosophy of Science*, *58*(3), 451–488.