# Fast causal inference with non-random missingness by test-wise deletion

## Abstract

Many real datasets contain values missing not at random (MNAR). In this scenario, investigators often perform list-wise deletion, that is, they delete samples with *any* missing values, before applying causal discovery algorithms. List-wise deletion is a sound and general strategy when paired with algorithms such as FCI and RFCI, but it also eliminates otherwise good samples that contain only a few missing values. In this report, we show that we can utilize the observed values more efficiently with *test-wise deletion* while still maintaining algorithmic soundness. Here, test-wise deletion refers to list-wise deleting samples only among the variables required for each conditional independence (CI) test used in constraint-based searches. Test-wise deletion therefore often retains more samples than list-wise deletion for each CI test, especially when the underlying graph is sparse. Our theoretical results show that test-wise deletion is sound under the justifiable assumption that none of the missingness mechanisms causally affect each other in the underlying causal graph. We also find that FCI and RFCI with test-wise deletion outperform their list-wise deletion and imputation counterparts on average when MNAR holds in both synthetic and real data.

## Keywords

Causal inference, Missing values, Missing not at random, MNAR

## Notes

### Acknowledgements

Research reported in this publication was supported by Grant U54HG008540 awarded by the National Human Genome Research Institute through funds provided by the trans-NIH Big Data to Knowledge initiative. The research was also supported by the National Library of Medicine of the National Institutes of Health under award numbers T15LM007059 and R01LM012095. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
