Abstract
We consider the problem of learning when an underlying causal model can be inferred. Causal knowledge may make some approaches to a given problem feasible and rule out others. We formulate the hypothesis that semi-supervised learning can help in an anticausal setting (predicting a cause from its effect), but not in a causal setting, and corroborate it with empirical results.
Notes
1. There is no more dangerous mistake than confusing cause and effect: I call it the actual corruption of reason.
2. Note that we will use the term “mechanism” both for the function φ and for the conditional P(E|C), but not for P(C|E).
3. This “independence” condition is closely related to the concept of exogeneity in economics [10]. Given two variables C and E, we say C is exogenous if P(E|C) remains invariant to changes in the process that generates C.
4. Note that anticausal prediction has also been called inverse inference, as opposed to direct inference from causes to effects [4]. However, these terms have been used rather broadly, and may also refer to inference relating hypotheses and consequences [4], or inference from population to sample (direct) vs. the other way round (inverse) [16].
5. Note that a weak form of SSL could roughly work as follows: after learning a generative model for P(X, Y) from the first part of the sample, we can use the additional samples from P(X) to double-check whether our model generates the right distribution for P(X).
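The check described in note 5 can be sketched as follows. This is a minimal illustration, not the chapter's method: the toy data (a binary class Y generating a Gaussian feature X) and the per-class Gaussian generative model are our own assumptions, chosen only to make the double-checking step concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical anticausal-style toy data: the class label Y generates
# the feature X (so predicting Y from X goes against the causal arrow).
def sample(n):
    y = rng.integers(0, 2, n)
    x = rng.normal(loc=np.where(y == 0, -1.0, 2.0), scale=1.0)
    return x, y

x_lab, y_lab = sample(100)    # labeled part of the sample
x_unlab, _ = sample(1000)     # additional samples from P(X), labels unused

# Learn a simple generative model of P(X, Y) from the labeled data only:
# class priors plus one Gaussian per class.
priors = np.array([(y_lab == c).mean() for c in (0, 1)])
means = np.array([x_lab[y_lab == c].mean() for c in (0, 1)])
stds = np.array([x_lab[y_lab == c].std(ddof=1) for c in (0, 1)])

def log_marginal(x):
    """log P(X=x) implied by the model, marginalising over Y."""
    comp = [priors[c] * np.exp(-0.5 * ((x - means[c]) / stds[c]) ** 2)
            / (stds[c] * np.sqrt(2 * np.pi)) for c in (0, 1)]
    return np.log(comp[0] + comp[1])

# Double-check: if the model is right, the unlabeled sample should be
# roughly as likely under the implied marginal as the labeled features.
ll_lab = log_marginal(x_lab).mean()
ll_unlab = log_marginal(x_unlab).mean()
print(f"mean log-likelihood, labeled X:   {ll_lab:.3f}")
print(f"mean log-likelihood, unlabeled X: {ll_unlab:.3f}")
```

A large gap between the two mean log-likelihoods would flag that the fitted model does not generate the right distribution for P(X); a more rigorous version of the same idea would use a proper two-sample test between model samples and the unlabeled data.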
References
Brefeld, U., Gärtner, T., Scheffer, T., Wrobel, S.: Efficient co-regularised least squares regression. In: ICML, Pittsburgh (2006)
Chapelle, O., Schölkopf, B., Zien, A.: Semi-supervised Learning. MIT, Cambridge (2006)
Daniušis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.: Inferring deterministic causal relations. In: UAI, Catalina Island (2010)
Fetzer, J.H., Almeder, R.F.: Glossary of Epistemology/Philosophy of Science. Paragon House, New York (1993)
Guo, Y., Niu, X., Zhang, H.: An extensive empirical study on semi-supervised learning. In: ICDM, Sydney (2010)
Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. In: NIPS, Vancouver (2008)
Janzing, D., Schölkopf, B.: Causal inference using the algorithmic Markov condition. IEEE Trans. Inf. Theory 56(10), 5168–5194 (2010)
Lemeire, J., Dirkx, E.: Causal models as minimal descriptions of multivariate systems. http://parallel.vub.ac.be/~jan/ (2007)
Mooij, J., Janzing, D., Peters, J., Schölkopf, B.: Regression by dependence minimization and its application to causal inference in additive noise models. In: ICML, Montreal (2009)
Pearl, J.: Causality. Cambridge University Press, New York (2000)
Pearl, J., Bareinboim, E.: Transportability of causal and statistical relations: a formal approach. In: Proceedings of the 25th AAAI Conference on Artificial Intelligence, Menlo Park, pp. 247–254 (2011)
Reichenbach, H.: The Direction of Time. University of California Press, Berkeley (1956)
Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J.: On causal and anticausal learning. In: Langford, J., Pineau, J. (eds.) Proceedings of the 29th International Conference on Machine Learning, Edinburgh, pp. 1255–1262. Omnipress, New York (2012)
Schweikert, G., Widmer, C., Schölkopf, B., Rätsch, G.: An empirical analysis of domain adaptation algorithms for genomic sequence analysis. In: NIPS, Vancouver (2009)
Seeger, M.: Learning with labeled and unlabeled data. Technical report, University of Edinburgh (2001)
Seidenfeld, T.: Direct inference and inverse inference. J. Philos. 75(12), 709–730 (1978)
Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction, and Search. Springer, New York (1993). (2nd edn.: MIT, Cambridge, 2000)
Storkey, A.: When training and test sets are different: characterizing learning transfer. In: Dataset Shift in Machine Learning. MIT, Cambridge (2009)
Vapnik, V.: Estimation of Dependences Based on Empirical Data (in Russian). Nauka, Moscow (1979). English translation: Springer, New York, 1982
Zhang, K., Hyvärinen, A.: On the identifiability of the post-nonlinear causal model. In: UAI, Montreal (2009)
Zhu, X., Goldberg, A.: Introduction to semi-supervised learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 3, pp. 1–130. Morgan & Claypool, San Rafael (2009)
Acknowledgements
We thank Ulf Brefeld and Stefan Wrobel who kindly shared their detailed experimental results with us, allowing for our meta-analysis. We thank Bob Williamson, Vladimir Vapnik, and Jakob Zscheischler for helpful discussions.
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J. (2013). Semi-supervised Learning in Causal and Anticausal Settings. In: Schölkopf, B., Luo, Z., Vovk, V. (eds) Empirical Inference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41136-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41135-9
Online ISBN: 978-3-642-41136-6