Abstract
Understanding causal relations is vital in scientific discovery. Causal structure learning identifies causal graphs from observational data to uncover such relations. The task is usually performed by a central server, but sharing data with that server poses privacy risks. Federated learning can address this problem, yet existing solutions for federated causal structure learning make unrealistic assumptions about data and lack convergence guarantees. \(\textsc {FedC}^{2}\textsc {SL}\) is a federated constraint-based causal structure learning scheme that learns causal graphs using a federated conditional independence test, which examines the conditional independence between two variables given a condition set without collecting raw data from clients. \(\textsc {FedC}^{2}\textsc {SL}\) requires weaker and more realistic assumptions about data and offers stronger resistance to data variability among clients. FedPC and FedFCI are two variants of \(\textsc {FedC}^{2}\textsc {SL}\) for causal structure learning under causal sufficiency and causal insufficiency, respectively. The study evaluates \(\textsc {FedC}^{2}\textsc {SL}\) on both synthetic datasets and real-world data against existing solutions and finds that it shows encouraging performance and strong resilience to data heterogeneity among clients.
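To make the idea of a federated conditional independence test concrete, the following is a minimal sketch and not the protocol used by \(\textsc {FedC}^{2}\textsc {SL}\). It assumes discrete variables and swaps in the simplest possible scheme: each client shares only per-stratum contingency counts, and the server pools them to compute a Pearson chi-square statistic for X ⟂ Y | Z. The function names (`local_counts`, `aggregate`, `chi_square_ci`) and the row format are hypothetical, and a practical system would likely protect even these counts, for example with secure aggregation.

```python
from collections import Counter
from itertools import product


def local_counts(rows, x, y, z):
    """Client side (hypothetical helper): count (z-config, x-value, y-value)
    occurrences. Only these counts, never the raw rows, are sent out."""
    return Counter((tuple(r[k] for k in z), r[x], r[y]) for r in rows)


def aggregate(client_counts):
    """Server side: sum the count tables received from all clients."""
    total = Counter()
    for counts in client_counts:
        total.update(counts)
    return total


def chi_square_ci(total):
    """Server side: Pearson chi-square statistic and degrees of freedom
    for X independent of Y given Z, summed over the strata of Z."""
    stat, dof = 0.0, 0
    strata = {zc for zc, _, _ in total}
    for zc in strata:
        # Contingency table of (x, y) restricted to this value of Z.
        cells = {(xv, yv): n for (s, xv, yv), n in total.items() if s == zc}
        xs = sorted({xv for xv, _ in cells})
        ys = sorted({yv for _, yv in cells})
        n = sum(cells.values())
        row = {xv: sum(cells.get((xv, yv), 0) for yv in ys) for xv in xs}
        col = {yv: sum(cells.get((xv, yv), 0) for xv in xs) for yv in ys}
        for xv, yv in product(xs, ys):
            expected = row[xv] * col[yv] / n
            stat += (cells.get((xv, yv), 0) - expected) ** 2 / expected
        dof += (len(xs) - 1) * (len(ys) - 1)
    return stat, dof


# Two clients, each holding rows for a different value of Z.
client_a = [{"X": a, "Y": b, "Z": 0} for a in (0, 1) for b in (0, 1)]
client_b = [{"X": a, "Y": b, "Z": 1} for a in (0, 1) for b in (0, 1)]
pooled = aggregate(local_counts(c, "X", "Y", ["Z"]) for c in (client_a, client_b))
stat, dof = chi_square_ci(pooled)
print(stat, dof)
```

Because counts are additive, the pooled table equals the one a central server would build from the combined raw data, so in this simplified setting the statistic is unaffected by how records are split across clients. The statistic would then be compared against a chi-square distribution with `dof` degrees of freedom to decide whether to reject independence.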
Acknowledgement
We thank the anonymous reviewers for their insightful comments. We also thank Qi Pang for valuable discussions. This research is supported in part by the HKUST 30 for 30 research initiative scheme under the contract Z1283 and the Academic Hardware Grant from NVIDIA.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z., Ma, P., Wang, S. (2023). Towards Practical Federated Causal Structure Learning. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_21
DOI: https://doi.org/10.1007/978-3-031-43415-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43414-3
Online ISBN: 978-3-031-43415-0
eBook Packages: Computer Science (R0)