Abstract
Methods of causal discovery and direction of dependence, which evaluate causal properties of variable relations, have developed rapidly. Most causal discovery methods, however, rely on the assumption of causal effect homogeneity, that is, the identified causal structure is expected to hold for the entire population. Because causal mechanisms can vary across subpopulations, we propose combining model-based recursive partitioning with non-Gaussian causal discovery to identify such subpopulations. The resulting algorithm can discover subpopulations that potentially differ in the magnitude and causal direction of effects under mild parameter inequality assumptions. Feasibility conditions are described, and results from synthetic data experiments suggest that large effects and large sample sizes are beneficial for detecting causally competing subgroups with acceptable statistical performance. A real-world data example illustrates the extraction of meaningful subgroups that differ in the causal mechanism underlying the development of numerical cognition. Potential extensions and recommendations for best practice applications are discussed.
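To make the idea of causally competing subgroups concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes the two subgroups are already known (no recursive partitioning is performed), a linear model with a skewed cause and a symmetric Gaussian error, and a simple third-moment rule in the spirit of Dodge and Rousson (2000): under these assumptions the effect is less skewed than the cause. All function names, the slope, the sample size, and the seed are hypothetical choices made for illustration.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def simulate_group(n, rng, slope=0.8):
    """Draw (cause, effect) pairs: skewed cause, symmetric Gaussian error (assumed setup)."""
    cause = [rng.expovariate(1.0) for _ in range(n)]
    effect = [slope * c + rng.gauss(0.0, 1.0) for c in cause]
    return cause, effect


def likely_direction(x, y):
    """Third-moment rule: with a symmetric error, the effect is less skewed than the cause."""
    return "x -> y" if abs(skewness(x)) > abs(skewness(y)) else "y -> x"


rng = random.Random(2023)  # fixed seed so the sketch is reproducible
n = 4000                   # hypothetical subgroup size

# Subgroup 0: x causes y. Subgroup 1: y causes x (roles swapped on purpose).
x0, y0 = simulate_group(n, rng)
y1, x1 = simulate_group(n, rng)

direction_group0 = likely_direction(x0, y0)
direction_group1 = likely_direction(x1, y1)
print("subgroup 0:", direction_group0)
print("subgroup 1:", direction_group1)

# Pooling both subgroups mixes the two mechanisms, so the skewness gap shrinks.
pooled_gap = abs(skewness(x0 + x1)) - abs(skewness(y0 + y1))
print("pooled skewness gap:", round(pooled_gap, 3))
```

In this sketch the subgroup indicator is given in advance. The approach described in the abstract instead searches for such subgroups with model-based recursive partitioning and evaluates direction of dependence within them, which the toy example does not attempt.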
Acknowledgements
The authors are indebted to Dr. Ingrid Koller and Dr. Michaela Pötscher-Gareiß for providing the data used for illustrative purposes, and to Dr. Ed Merkle for valuable comments on an earlier version of the article. DS was partially supported by an ASPIRE grant from the Office of the Vice President for Research at the University of South Carolina.
Open practices statements
DDA source code is available at www.ddaproject.com. A synthetic data example is given in the online supplement of the article.
Cite this article
Wiedermann, W., Zhang, B. & Shi, D. Detecting heterogeneity in the causal direction of dependence: A model-based recursive partitioning approach. Behav Res (2023). https://doi.org/10.3758/s13428-023-02253-8
DOI: https://doi.org/10.3758/s13428-023-02253-8