Abstract
We consider the problem of learning in the case where an underlying causal model can be inferred. Causal knowledge may facilitate some approaches to a given problem, and rule out others. We formulate the hypothesis that semi-supervised learning can help in an anticausal setting, but not in a causal setting, and corroborate it with empirical results.
Notes
- 1. There is no more dangerous mistake than confusing cause and effect: I call it the actual corruption of reason.
- 2. Note that we will use the term “mechanism” both for the function \(\varphi\) and for the conditional P(E|C), but not for P(C|E).
- 3. This “independence” condition is closely related to the concept of exogeneity in economics [10]. Given two variables C and E, we say C is exogenous if P(E|C) remains invariant to changes in the process that generates C.
- 4. Note that anticausal prediction has also been called inverse inference, as opposed to direct inference from cause to effects [4]. However, these terms have been used rather broadly, and may also refer to inference relating hypotheses and consequences [4], or inference from population to sample (direct) vs. the other way round (inverse) [16].
- 5. Note that a weak form of SSL could roughly work as follows: after learning a generative model for P(X, Y) from the first part of the sample, we can use the additional samples from P(X) to double-check whether our model generates the right distribution for P(X).
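The “weak form of SSL” described in note 5 can be made concrete with a small sketch. The snippet below is an illustrative construction, not the authors' method: all variable names and the toy anticausal data (class Y causes feature X) are assumptions. It fits a simple class-conditional Gaussian mixture for P(X, Y) on labeled data, then evaluates the implied marginal P(X) on unlabeled samples as a consistency check.

```python
# Hypothetical sketch of the "weak SSL" check from note 5: learn a
# generative model for P(X, Y) on labeled data, then use unlabeled
# samples from P(X) to double-check the implied marginal.
import numpy as np

rng = np.random.default_rng(0)

# Toy anticausal data: the class Y causes the feature X (X = mean[Y] + noise).
means = np.array([-2.0, 2.0])
y_lab = rng.integers(0, 2, size=200)
x_lab = means[y_lab] + rng.normal(size=200)
x_unl = means[rng.integers(0, 2, size=1000)] + rng.normal(size=1000)

# Generative model of P(X, Y): class priors and per-class Gaussians,
# estimated from the labeled part of the sample only.
priors = np.array([(y_lab == k).mean() for k in (0, 1)])
mus = np.array([x_lab[y_lab == k].mean() for k in (0, 1)])
sigmas = np.array([x_lab[y_lab == k].std(ddof=1) for k in (0, 1)])

def marginal_log_likelihood(x):
    """Average log P(X) under the mixture sum_k P(Y=k) P(X | Y=k)."""
    dens = sum(
        p * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
        for p, m, s in zip(priors, mus, sigmas)
    )
    return float(np.log(dens).mean())

# Compare the model's fit on labeled vs. unlabeled inputs: a sharp drop
# on the unlabeled sample would flag a mismatch with the true P(X).
ll_lab = marginal_log_likelihood(x_lab)
ll_unl = marginal_log_likelihood(x_unl)
```

Under this sketch, `ll_unl` close to `ll_lab` supports the fitted generative model, whereas a markedly lower value would indicate that the model does not generate the right distribution for P(X).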
References
Brefeld, U., Gärtner, T., Scheffer, T., Wrobel, S.: Efficient co-regularised least squares regression. In: ICML, Pittsburgh (2006)
Chapelle, O., Schölkopf, B., Zien, A.: Semi-supervised Learning. MIT, Cambridge (2006)
Daniušis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.: Inferring deterministic causal relations. In: UAI, Catalina Island (2010)
Fetzer, J.H., Almeder, R.F.: Glossary of Epistemology/Philosophy of Science. Paragon House, New York (1993)
Guo, Y., Niu, X., Zhang, H.: An extensive empirical study on semi-supervised learning. In: ICDM, Sydney (2010)
Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. In: NIPS, Vancouver (2008)
Janzing, D., Schölkopf, B.: Causal inference using the algorithmic Markov condition. IEEE Trans. Inf. Theory 56(10), 5168–5194 (2010)
Lemeire, J., Dirkx, E.: Causal models as minimal descriptions of multivariate systems. http://parallel.vub.ac.be/~jan/ (2007)
Mooij, J., Janzing, D., Peters, J., Schölkopf, B.: Regression by dependence minimization and its application to causal inference in additive noise models. In: ICML, Montreal (2009)
Pearl, J.: Causality. Cambridge University Press, New York (2000)
Pearl, J., Bareinboim, E.: Transportability of causal and statistical relations: a formal approach. In: Proceedings of the 25th AAAI Conference on Artificial Intelligence, Menlo Park, pp. 247–254 (2011)
Reichenbach, H.: The Direction of Time. University of California Press, Berkeley (1956)
Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J.: On causal and anticausal learning. In: Langford, J., Pineau, J. (eds.) Proceedings of the 29th International Conference on Machine Learning, Edinburgh, pp. 1255–1262. Omnipress, New York (2012)
Schweikert, G., Widmer, C., Schölkopf, B., Rätsch, G.: An empirical analysis of domain adaptation algorithms for genomic sequence analysis. In: NIPS, Vancouver (2009)
Seeger, M.: Learning with labeled and unlabeled data. Technical Report (Tech. rep.), University of Edinburgh (2001)
Seidenfeld, T.: Direct inference and inverse inference. J. Philos. 75(12), 709–730 (1978)
Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction, and Search. Springer, New York (1993). (2nd edn.: MIT, Cambridge, 2000)
Storkey, A.: When training and test sets are different: characterizing learning transfer. In: Dataset Shift in Machine Learning. MIT, Cambridge (2009)
Vapnik, V.: Estimation of Dependences Based on Empirical Data (in Russian). Nauka, Moscow (1979). English translation: Springer, New York, 1982
Zhang, K., Hyvärinen, A.: On the identifiability of the post-nonlinear causal model. In: UAI, Montreal (2009)
Zhu, X., Goldberg, A.: Introduction to semi-supervised learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 3, pp. 1–130. Morgan & Claypool, San Rafael (2009)
Acknowledgements
We thank Ulf Brefeld and Stefan Wrobel who kindly shared their detailed experimental results with us, allowing for our meta-analysis. We thank Bob Williamson, Vladimir Vapnik, and Jakob Zscheischler for helpful discussions.
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., Mooij, J. (2013). Semi-supervised Learning in Causal and Anticausal Settings. In: Schölkopf, B., Luo, Z., Vovk, V. (eds) Empirical Inference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41136-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41135-9
Online ISBN: 978-3-642-41136-6